WorldWideScience

Sample records for accurate computational gene

  1. Accurate atom-mapping computation for biochemical reactions.

    Science.gov (United States)

    Latendresse, Mario; Malerich, Jeremiah P; Travers, Mike; Karp, Peter D

    2012-11-26

    The complete atom mapping of a chemical reaction is a bijection of the reactant atoms to the product atoms that specifies the terminus of each reactant atom. Atom mapping of biochemical reactions is useful for many applications of systems biology, in particular for metabolic engineering, where synthesizing new biochemical pathways must take into account the number of carbon atoms from a source compound that are conserved in the synthesis of a target compound. Rapid, accurate computation of the atom mapping(s) of a biochemical reaction remains elusive despite significant work on this topic. In particular, past researchers did not validate the accuracy of mapping algorithms. We introduce a new method for computing atom mappings called the minimum weighted edit-distance (MWED) metric. The metric is based on bond propensity to react and computes biochemically valid atom mappings for a large percentage of biochemical reactions. MWED models can be formulated efficiently as Mixed-Integer Linear Programs (MILPs). We have demonstrated this approach on 7501 reactions of the MetaCyc database, for which 87% of the models could be solved in less than 10 s. For 2.1% of the reactions, we found multiple optimal atom mappings. We show that the error rate is 0.9% (22 reactions) by comparing these atom mappings to 2446 atom mappings of the manually curated Kyoto Encyclopedia of Genes and Genomes (KEGG) RPAIR database. To our knowledge, our computational atom-mapping approach is the most accurate and among the fastest published to date. The atom-mapping data will be available in the MetaCyc database later in 2012; the atom-mapping software will be available within the Pathway Tools software later in 2012.
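
    The MWED metric itself is solved as a MILP with bond-propensity weights, but the underlying search can be illustrated without a solver. A minimal brute-force sketch (a hypothetical toy, with all bond edits costed equally) that finds an element-preserving bijection of reactant atoms onto product atoms minimizing the number of broken plus formed bonds:

```python
from itertools import permutations

def edit_distance(r_bonds, p_bonds, mapping):
    # number of bonds broken plus bonds formed under a candidate atom mapping
    mapped = {frozenset((mapping[a], mapping[b])) for a, b in r_bonds}
    return len(mapped ^ {frozenset(b) for b in p_bonds})

def best_mapping(r_elems, p_elems, r_bonds, p_bonds):
    # brute-force minimum-edit bijection; element labels must be preserved
    # (real reactions need the MILP formulation: permutations grow factorially)
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(r_elems))):
        if any(r_elems[i] != p_elems[perm[i]] for i in range(len(r_elems))):
            continue  # chemically invalid: maps an atom onto a different element
        cost = edit_distance(r_bonds, p_bonds, perm)
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost
```

    For a toy isomerization of a C-C-O chain into C-O-C, the oxygen is forced onto the oxygen and the minimum edit distance is two bond changes (one broken, one formed).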

  2. Fast and accurate methods for phylogenomic analyses

    Directory of Open Access Journals (Sweden)

    Warnow Tandy

    2011-10-01

    Full Text Available Abstract Background Species phylogenies are not estimated directly, but rather through phylogenetic analyses of different gene datasets. However, true gene trees can differ from the true species tree (and hence from one another) due to biological processes such as horizontal gene transfer, incomplete lineage sorting, and gene duplication and loss, so that no single gene tree is a reliable estimate of the species tree. Several methods have been developed to estimate species trees from estimated gene trees, differing in the specific algorithmic technique used and in the biological model used to explain differences between species and gene trees. Relatively little is known about the relative performance of these methods. Results We report on a study evaluating several different methods for estimating species trees from sequence datasets, simulating sequence evolution under a complex model including indels (insertions and deletions), substitutions, and incomplete lineage sorting. The most important finding of our study is that some fast and simple methods are nearly as accurate as the most accurate methods, which employ sophisticated statistical methods and are computationally quite intensive. We also observe that methods that explicitly consider errors in the estimated gene trees produce more accurate trees than methods that assume the estimated gene trees are correct. Conclusions Our study shows that highly accurate estimations of species trees are achievable, even when gene trees differ from each other and from the species tree, and that these estimations can be obtained using fairly simple and computationally tractable methods.
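
    The "fast and simple methods" family evaluated in studies like this one often reduces to combining clades observed across gene trees. A minimal, hypothetical sketch of majority-rule clade consensus, with each gene tree represented simply as a set of clades (frozensets of taxa):

```python
from collections import Counter

def majority_clades(gene_trees):
    # keep every clade that appears in more than half of the gene trees;
    # gene_trees is a list of sets of frozensets of taxon labels
    counts = Counter(clade for tree in gene_trees for clade in tree)
    n = len(gene_trees)
    return {clade for clade, k in counts.items() if k > n / 2}
```

    This is only the consensus idea, not any of the statistically sophisticated estimators the study compares; those additionally model gene-tree error and incomplete lineage sorting.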

  3. Accurate measurement of surface areas of anatomical structures by computer-assisted triangulation of computed tomography images

    Energy Technology Data Exchange (ETDEWEB)

    Allardice, J.T.; Jacomb-Hood, J.; Abulafi, A.M.; Williams, N.S. (Royal London Hospital (United Kingdom)); Cookson, J.; Dykes, E.; Holman, J. (London Hospital Medical College (United Kingdom))

    1993-05-01

    There is a need for accurate surface area measurement of internal anatomical structures in order to define light dosimetry in adjunctive intraoperative photodynamic therapy (AIOPDT). The authors investigated whether computer-assisted triangulation of serial sections generated by computed tomography (CT) scanning can give an accurate assessment of the surface area of the walls of the true pelvis after anterior resection and before colorectal anastomosis. They show that the technique of paper density tessellation is an acceptable method of measuring the surface areas of phantom objects, with a maximum error of 0.5%, and it is used as the gold standard. Computer-assisted triangulation of CT images of standard geometric objects and accurately constructed pelvic phantoms gives a surface area assessment with a maximum error of 2.5% compared with the gold standard. The CT images of 20 patients' pelves were analysed by computer-assisted triangulation, showing that the surface area of the walls varies from 143 cm² to 392 cm². (Author).
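
    Once a surface has been triangulated, the area measurement reduces to summing triangle areas. A minimal sketch (not the authors' code) using the cross-product formula for the area of each triangle:

```python
def triangle_area(p, q, r):
    # half the magnitude of the cross product of two edge vectors
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * (cx * cx + cy * cy + cz * cz) ** 0.5

def surface_area(vertices, triangles):
    # vertices: list of (x, y, z); triangles: list of vertex-index triples
    return sum(triangle_area(*(vertices[i] for i in tri)) for tri in triangles)
```
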

  4. Accurate, model-based tuning of synthetic gene expression using introns in S. cerevisiae.

    Directory of Open Access Journals (Sweden)

    Ido Yofe

    2014-06-01

    Full Text Available Introns are key regulators of eukaryotic gene expression and present a potentially powerful tool for the design of synthetic eukaryotic gene expression systems. However, intronic control over gene expression is governed by a multitude of complex, incompletely understood regulatory mechanisms. Despite this lack of detailed mechanistic understanding, here we show how a relatively simple model enables accurate and predictable tuning of a synthetic gene expression system in yeast using several predictive intron features such as transcript folding and sequence motifs. Using only natural Saccharomyces cerevisiae introns as regulators, we demonstrate fine and accurate control over gene expression spanning a 100-fold expression range. These results broaden the engineering toolbox of synthetic gene expression systems and provide a framework in which precise and robust tuning of gene expression is accomplished.

  5. Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics

    Science.gov (United States)

    Goodrich, John W.; Dyson, Rodger W.

    1999-01-01

    The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time-dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems is being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that

  6. Computational prediction of miRNA genes from small RNA sequencing data

    Directory of Open Access Journals (Sweden)

    Wenjing eKang

    2015-01-01

    Full Text Available Next-generation sequencing for the first time allows researchers to gauge the depth and variation of entire transcriptomes. However, as rare transcripts present in cells at single copies can now be detected, more advanced computational tools are needed to accurately annotate and profile them. miRNAs are ~22-nucleotide small RNAs (sRNAs) that post-transcriptionally reduce the output of protein-coding genes. They have established roles in numerous biological processes, including cancers and other diseases. During miRNA biogenesis, the sRNAs are sequentially cleaved from precursor molecules that have a characteristic hairpin RNA structure. The vast majority of new miRNA genes are discovered by mining small RNA sequencing (sRNA-seq) data, which can detect more than a billion RNAs in a single run. However, given that many of the detected RNAs are degradation products from all types of transcripts, the accurate identification of miRNAs remains a non-trivial computational problem. Here we review the tools available to predict animal miRNAs from sRNA sequencing data. We present tools for generalist and specialist use cases, including prediction from massively pooled data or in species without a reference genome. We also present wet-lab methods used to validate predicted miRNAs, and approaches to computationally benchmark prediction accuracy. For each tool, we reference validation experiments and benchmarking efforts. Last, we discuss the future of the field.
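
    The hairpin criterion mentioned above is, in real predictors, evaluated with RNA secondary-structure prediction (e.g. RNAfold); the following is a deliberately crude, hypothetical stand-in that only scores Watson-Crick and wobble pairing between the two arms of a presumed hairpin:

```python
# canonical and G-U wobble base pairs for RNA
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def hairpin_pairing_fraction(seq, loop=8):
    # split the sequence around a presumed terminal loop of `loop` nucleotides
    # and align the 5' arm against the reversed 3' arm
    arm = (len(seq) - loop) // 2
    five, three = seq[:arm], seq[-arm:][::-1]
    paired = sum((a, b) in PAIRS for a, b in zip(five, three))
    return paired / arm
```

    A high pairing fraction is only a weak necessary condition for a miRNA precursor; the reviewed tools combine structure with read-stacking patterns from the sRNA-seq data itself.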

  7. Computer-based personality judgments are more accurate than those made by humans.

    Science.gov (United States)

    Youyou, Wu; Kosinski, Michal; Stillwell, David

    2015-01-27

    Judging others' personalities is an essential skill in successful social living, as personality is a key driver behind people's interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants' Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy.

  8. Fast and accurate computation of projected two-point functions

    Science.gov (United States)

    Grasshorn Gebhardt, Henry S.; Jeong, Donghui

    2018-01-01

    We present the two-point function from the fast and accurate spherical Bessel transformation (2-FAST) algorithm (our code is available at https://github.com/hsgg/twoFAST) for a fast and accurate computation of integrals involving one or two spherical Bessel functions. These types of integrals occur when projecting the galaxy power spectrum P(k) onto configuration space, ξℓν(r), or spherical harmonic space, Cℓ(χ, χ′). First, we employ the FFTLog transformation of the power spectrum to divide the calculation into P(k)-dependent coefficients and P(k)-independent integrations of basis functions multiplied by spherical Bessel functions. We find analytical expressions for the latter integrals in terms of special functions, for which recursion provides a fast and accurate evaluation. The algorithm, therefore, circumvents direct integration of highly oscillating spherical Bessel functions.
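
    Under the conventions usual in large-scale-structure work (an assumption here, since the abstract does not spell them out), the two projection integrals have the schematic form

```latex
\xi_\ell(r) = \frac{1}{2\pi^2} \int_0^\infty \mathrm{d}k \, k^2 \, P(k) \, j_\ell(kr),
\qquad
C_\ell(\chi,\chi') = \frac{2}{\pi} \int_0^\infty \mathrm{d}k \, k^2 \, P(k) \, j_\ell(k\chi) \, j_\ell(k\chi'),
```

    with one spherical Bessel function jℓ in the configuration-space case and two in the spherical-harmonic case; the superscript ν in the abstract's ξℓν(r) generalizes the first integral with an extra power of k.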

  10. Evaluation of new reference genes in papaya for accurate transcript normalization under different experimental conditions.

    Directory of Open Access Journals (Sweden)

    Xiaoyang Zhu

    Full Text Available Real-time reverse transcription PCR (RT-qPCR) is a preferred method for rapid and accurate quantification of gene expression. Appropriate application of RT-qPCR requires accurate normalization through the use of reference genes. As no single reference gene is universally suitable for all experiments, validation of reference genes under different experimental conditions is crucial for RT-qPCR analysis. To date, only a few studies on reference genes have been done in other plants, but none in papaya. In the present work, we selected 21 candidate reference genes and evaluated their expression stability in 246 papaya fruit samples using three algorithms: geNorm, NormFinder and RefFinder. The samples consisted of 13 sets collected under different experimental conditions, including various tissues, different storage temperatures, different cultivars, developmental stages, postharvest ripening, modified atmosphere packaging, 1-methylcyclopropene (1-MCP) treatment, hot water treatment, biotic stress and hormone treatment. Our results demonstrated that expression stability varied greatly between reference genes and that suitable reference genes, or combinations of reference genes, should be validated for each set of experimental conditions. In general, the internal reference genes EIF (eukaryotic initiation factor 4A), TBP1 (TATA binding protein 1) and TBP2 (TATA binding protein 2) performed well under most experimental conditions, whereas the most widely used reference genes, ACTIN (Actin 2), 18S rRNA (18S ribosomal RNA) and GAPDH (glyceraldehyde-3-phosphate dehydrogenase), were not suitable under many experimental conditions. In addition, two commonly used programs, geNorm and NormFinder, proved sufficient for the validation. This work provides the first systematic analysis for the selection of superior reference genes for accurate transcript normalization in papaya under different experimental conditions.
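
    Of the validation programs named here, geNorm is the easiest to sketch: its published stability measure M for a gene is the average standard deviation of that gene's pairwise log-expression ratio with every other candidate, lower being more stable. A minimal re-implementation of that measure (not the geNorm program itself):

```python
import math
from statistics import stdev, mean

def genorm_m(expr):
    # expr: {gene: [expression values across samples]}; returns {gene: M}
    genes = list(expr)
    m = {}
    for g in genes:
        variations = []
        for h in genes:
            if h == g:
                continue
            # standard deviation of the log2 ratio across samples
            ratios = [math.log2(a / b) for a, b in zip(expr[g], expr[h])]
            variations.append(stdev(ratios))
        m[g] = mean(variations)
    return m
```

    Two genes whose expression rises and falls in proportion across all samples get low M; a gene that varies independently gets a high M and is rejected as a reference.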

  11. Accurate measurement of gene copy number for human alpha-defensin DEFA1A3.

    Science.gov (United States)

    Khan, Fayeza F; Carpenter, Danielle; Mitchell, Laura; Mansouri, Omniah; Black, Holly A; Tyson, Jess; Armour, John A L

    2013-10-20

    Multi-allelic copy number variants include examples of extensive variation between individuals in the copy number of important genes, most notably genes involved in immune function. The definition of this variation, and analysis of its impact on function, has been hampered by the technical difficulty of large-scale but accurate typing of genomic copy number. The copy-variable alpha-defensin locus DEFA1A3 on human chromosome 8 commonly varies between 4 and 10 copies per diploid genome, and presents considerable challenges for accurate high-throughput typing. In this study, we developed two paralogue ratio tests and three allelic ratio measurements that, in combination, provide an accurate and scalable method for measurement of DEFA1A3 gene number. We combined information from different measurements in a maximum-likelihood framework which suggests that most samples can be assigned to an integer copy number with high confidence, and applied it to typing 589 unrelated European DNA samples. Typing the members of three-generation pedigrees provided further reassurance that correct integer copy numbers had been assigned. Our results have allowed us to discover that the SNP rs4300027 is strongly associated with DEFA1A3 gene copy number in European samples. We have developed an accurate and robust method for measurement of DEFA1A3 copy number. Interrogation of rs4300027 and associated SNPs in Genome-Wide Association Study SNP data provides no evidence that alpha-defensin copy number is a strong risk factor for phenotypes such as Crohn's disease, type I diabetes, HIV progression and multiple sclerosis.
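
    The maximum-likelihood combination step can be illustrated with a toy model (not the paper's paralogue ratio test itself): assume each measurement is the true integer copy number plus independent Gaussian noise with a hypothetical sigma, and pick the integer that maximizes the log-likelihood:

```python
def assign_copy_number(measurements, candidates=range(2, 13), sigma=0.25):
    # log-likelihood of each candidate integer copy number, assuming
    # independent Gaussian measurement noise (sigma is a hypothetical value)
    def loglik(n):
        return sum(-((m - n) ** 2) / (2 * sigma ** 2) for m in measurements)
    return max(candidates, key=loglik)
```

    In the paper's framework the different assay types have different noise models and the likelihood also yields a confidence for the integer call; this sketch keeps only the core idea.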

  12. Models of natural computation : gene assembly and membrane systems

    NARCIS (Netherlands)

    Brijder, Robert

    2008-01-01

    This thesis is concerned with two research areas in natural computing: the computational nature of gene assembly and membrane computing. Gene assembly is a process occurring in unicellular organisms called ciliates. During this process genes are transformed through cut-and-paste operations. We

  13. Improved Patient Size Estimates for Accurate Dose Calculations in Abdomen Computed Tomography

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Chang-Lae [Yonsei University, Wonju (Korea, Republic of)

    2017-07-15

    The radiation dose of CT (computed tomography) is generally represented by the CTDI (CT dose index). CTDI, however, does not accurately predict the actual patient dose for different human body sizes because it relies on cylinder-shaped head (diameter: 16 cm) and body (diameter: 32 cm) phantoms. The purpose of this study was to eliminate the drawbacks of the conventional CTDI and to provide more accurate radiation dose information. Projection radiographs were obtained from water cylinder phantoms of various sizes, and the sizes of the water cylinder phantoms were calculated and verified using attenuation profiles. The effective diameter was also calculated using the attenuation of the abdominal projection radiographs of 10 patients. When the results of the attenuation-based and geometry-based methods were compared with those of the reconstructed-axial-CT-image-based method, the effective diameter of the attenuation-based method was similar to that of the reconstructed-axial-CT-image-based method, with a difference of less than 3.8%, but the geometry-based method showed a difference of less than 11.4%. This paper proposes a new method of accurately computing the radiation dose of CT based on patient size. This method computes and provides the patient dose before the CT scan, and can therefore be effectively used for imaging and dose control.
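
    The effective diameter that the comparison hinges on is conventionally defined from the patient's cross-sectional area A as d_eff = 2·sqrt(A/π). A minimal sketch of the reconstructed-axial-CT-image-based version (the paper's attenuation-based variant estimates the area from projection attenuation instead):

```python
import math

def effective_diameter(pixel_area_mm2, n_patient_pixels):
    # diameter of the circle with the same cross-sectional area as the
    # patient region segmented in one axial slice
    area = pixel_area_mm2 * n_patient_pixels
    return 2.0 * math.sqrt(area / math.pi)
```
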

  14. An Accurate liver segmentation method using parallel computing algorithm

    International Nuclear Information System (INIS)

    Elbasher, Eiman Mohammed Khalied

    2014-12-01

    Computed tomography (CT, or CAT scan) is a noninvasive diagnostic imaging procedure that uses a combination of X-rays and computer technology to produce horizontal, or axial, images (often called slices) of the body. A CT scan shows detailed images of any part of the body, including the bones, muscles, fat and organs. CT scans are more detailed than standard X-rays. CT scans may be done with or without "contrast". Contrast refers to a substance taken by mouth and/or injected into an intravenous (IV) line that causes the particular organ or tissue under study to be seen more clearly. CT scans of the liver and biliary tract are used in the diagnosis of many diseases of the abdominal structures, particularly when another type of examination, such as X-rays, physical examination, or ultrasound, is not conclusive. Unfortunately, the presence of noise and artifacts at the edges and fine details of CT images limits the contrast resolution and makes the diagnostic procedure more difficult. This experimental study was conducted at the College of Medical Radiological Science, Sudan University of Science and Technology and Fidel Specialist Hospital. The study sample included 50 patients. The main objective of this research was to study an accurate liver segmentation method using a parallel computing algorithm, and to segment the liver and adjacent organs using image processing techniques. The main segmentation technique used in this study was the watershed transform. The scope of image processing and analysis applied to medical applications is to improve the quality of the acquired image and extract quantitative information from medical image data in an efficient and accurate way. The results of this technique agreed with the results of Jarritt et al. (2010), Kratchwil et al. (2010), Jover et al. (2011), Yomamoto et al. (1996), Cai et al. (1999), and Saudha and Jayashree (2010), who used different segmentation filtering based on methods of enhancing the computed tomography images.

  15. An Accurate and Dynamic Computer Graphics Muscle Model

    Science.gov (United States)

    Levine, David Asher

    1997-01-01

    A computer based musculo-skeletal model was developed at the University in the departments of Mechanical and Biomedical Engineering. This model accurately represents human shoulder kinematics. The result of this model is the graphical display of bones moving through an appropriate range of motion based on inputs of EMGs and external forces. The need existed to incorporate a geometric muscle model in the larger musculo-skeletal model. Previous muscle models did not accurately represent muscle geometries, nor did they account for the kinematics of tendons. This thesis covers the creation of a new muscle model for use in the above musculo-skeletal model. This muscle model was based on anatomical data from the Visible Human Project (VHP) cadaver study. Two-dimensional digital images from the VHP were analyzed and reconstructed to recreate the three-dimensional muscle geometries. The recreated geometries were smoothed, reduced, and sliced to form data files defining the surfaces of each muscle. The muscle modeling function opened these files during run-time and recreated the muscle surface. The modeling function applied constant volume limitations to the muscle and constant geometry limitations to the tendons.

  16. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  17. A method for accurate computation of elastic and discrete inelastic scattering transfer matrix

    International Nuclear Information System (INIS)

    Garcia, R.D.M.; Santina, M.D.

    1986-05-01

    A method for accurate computation of elastic and discrete inelastic scattering transfer matrices is discussed. In particular, a partition scheme for the source energy range that avoids integration over intervals containing points where the integrand has a discontinuous derivative is developed. Five-figure accurate numerical results are obtained for several test problems with the TRAMA program, which incorporates the proposed method. A comparison with numerical results from existing processing codes is also presented. (author)

  18. Methods for Efficiently and Accurately Computing Quantum Mechanical Free Energies for Enzyme Catalysis.

    Science.gov (United States)

    Kearns, F L; Hudson, P S; Boresch, S; Woodcock, H L

    2016-01-01

    Enzyme activity is inherently linked to free energies of transition states, ligand binding, protonation/deprotonation, etc.; these free energies, and thus enzyme function, can be affected by residue mutations, allosterically induced conformational changes, and much more. Therefore, being able to predict free energies associated with enzymatic processes is critical to understanding and predicting their function. Free energy simulation (FES) has historically been a computational challenge, as it requires both the accurate description of inter- and intramolecular interactions and adequate sampling of all relevant conformational degrees of freedom. The hybrid quantum mechanical/molecular mechanical (QM/MM) framework is the current tool of choice when accurate computations of macromolecular systems are essential. Unfortunately, robust and efficient approaches that employ the high levels of computational theory needed to accurately describe many reactive processes (i.e., ab initio, DFT), while also including explicit solvation effects and accounting for extensive conformational sampling, are essentially nonexistent. In this chapter, we will give a brief overview of two recently developed methods that mitigate several major challenges associated with QM/MM FES: the QM non-Boltzmann Bennett's acceptance ratio method and the QM nonequilibrium work method. We will also describe usage of these methods to calculate free energies associated with (1) relative properties and (2) reaction paths, using simple test cases with relevance to enzymes. © 2016 Elsevier Inc. All rights reserved.
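
    Both methods named above build on classical free energy estimators. The simplest of these, Zwanzig's free energy perturbation formula ΔA = -kT ln⟨exp(-ΔU/kT)⟩, is easy to sketch; this is the underlying estimator, not the QM-NBB or nonequilibrium-work methods themselves:

```python
import math

def fep_delta_a(du_samples, kT=0.593):
    # Zwanzig free energy perturbation: du_samples are energy differences
    # U_target - U_reference evaluated on reference-ensemble configurations;
    # kT defaults to ~0.593 kcal/mol (about 298 K)
    avg = sum(math.exp(-du / kT) for du in du_samples) / len(du_samples)
    return -kT * math.log(avg)
```

    The exponential average is dominated by rare low-energy configurations, which is precisely the sampling problem that reweighting schemes such as non-Boltzmann Bennett are designed to mitigate.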

  19. Fast and Accurate Computation of Gauss--Legendre and Gauss--Jacobi Quadrature Nodes and Weights

    KAUST Repository

    Hale, Nicholas; Townsend, Alex

    2013-01-01

    An efficient algorithm for the accurate computation of Gauss-Legendre and Gauss-Jacobi quadrature nodes and weights is presented. The algorithm is based on Newton's root-finding method with initial guesses and function evaluations computed via asymptotic formulae. The n-point quadrature rule is computed in O(n) operations to an accuracy of essentially double precision for any n ≥ 100. © 2013 Society for Industrial and Applied Mathematics.
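
    The recipe in this abstract (Newton's method on the Legendre polynomial, with derivatives from the three-term recurrence) can be sketched directly. This version uses the simple Chebyshev-like initial guess rather than the paper's asymptotic formulae for guesses and evaluations, so it is an O(n²) illustration, not the O(n) algorithm:

```python
import math

def gauss_legendre(n):
    # nodes are the roots of P_n(x); weights are 2 / ((1 - x^2) P_n'(x)^2)
    nodes, weights = [], []
    for k in range(1, n + 1):
        x = math.cos(math.pi * (k - 0.25) / (n + 0.5))  # initial guess
        for _ in range(100):
            # evaluate P_n and P_{n-1} via the three-term recurrence
            p0, p1 = 1.0, x
            for j in range(2, n + 1):
                p0, p1 = p1, ((2 * j - 1) * x * p1 - (j - 1) * p0) / j
            dp = n * (x * p1 - p0) / (x * x - 1.0)  # derivative identity
            dx = p1 / dp
            x -= dx  # Newton step
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights
```

    An n-point rule integrates polynomials up to degree 2n - 1 exactly on [-1, 1], which makes for a convenient self-check.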

  20. Fast and Accurate Computation of Gauss--Legendre and Gauss--Jacobi Quadrature Nodes and Weights

    KAUST Repository

    Hale, Nicholas

    2013-03-06

    An efficient algorithm for the accurate computation of Gauss-Legendre and Gauss-Jacobi quadrature nodes and weights is presented. The algorithm is based on Newton's root-finding method with initial guesses and function evaluations computed via asymptotic formulae. The n-point quadrature rule is computed in O(n) operations to an accuracy of essentially double precision for any n ≥ 100. © 2013 Society for Industrial and Applied Mathematics.

  1. Assessing reference genes for accurate transcript normalization using quantitative real-time PCR in pearl millet [Pennisetum glaucum (L.) R. Br.].

    Directory of Open Access Journals (Sweden)

    Prasenjit Saha

    Full Text Available Pearl millet [Pennisetum glaucum (L.) R. Br.], a close relative of Panicoideae food crops and bioenergy grasses, offers an ideal system to perform functional genomics studies related to C4 photosynthesis and abiotic stress tolerance. Quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) provides a sensitive platform to conduct such gene expression analyses. However, the lack of suitable internal control reference genes for accurate transcript normalization during qRT-PCR analysis in pearl millet is a major limitation. Here, we conducted a comprehensive assessment of 18 reference genes on 234 samples, which included an array of different developmental tissues, hormone treatments and abiotic stress conditions from three genotypes, to determine appropriate reference genes for accurate normalization of qRT-PCR data. Analyses of Ct values using the Stability Index, BestKeeper, ΔCt, NormFinder, geNorm and RefFinder programs ranked PP2A, TIP41, UBC2, UBQ5 and ACT as the most reliable reference genes for accurate transcript normalization under different experimental conditions. Furthermore, we validated the specificity of these genes for precise quantification of relative gene expression and provided evidence that a combination of the best reference genes is required to obtain optimal expression patterns for both endogenous genes and transgenes in pearl millet.

  2. Defect correction and multigrid for an efficient and accurate computation of airfoil flows

    NARCIS (Netherlands)

    Koren, B.

    1988-01-01

    Results are presented for an efficient solution method for second-order accurate discretizations of the 2D steady Euler equations. The solution method is based on iterative defect correction. Several schemes are considered for the computation of the second-order defect. In each defect correction

  3. Genome-Wide Comparative Gene Family Classification

    Science.gov (United States)

    Frech, Christian; Chen, Nansheng

    2010-01-01

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221

  4. A comparative analysis of soft computing techniques for gene prediction.

    Science.gov (United States)

    Goel, Neelam; Singh, Shailendra; Aseri, Trilok Chand

    2013-07-01

    The rapid growth of genomic sequence data for both human and nonhuman species has made analyzing these sequences, especially predicting genes in them, very important; gene prediction is currently the focus of many research efforts. Besides its scientific interest to the molecular biology and genomics community, gene prediction is of considerable importance in human health and medicine. A variety of gene prediction techniques have been developed for eukaryotes over the past few years. This article reviews and analyzes the application of certain soft computing techniques to gene prediction. First, the problem of gene prediction and its challenges are described. This is followed by a description of different soft computing techniques and their application to gene prediction. In addition, a comparative analysis of different soft computing techniques for gene prediction is given. Finally, some limitations of the current research activities and future research directions are provided. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. How accurate are adolescents in portion-size estimation using the computer tool young adolescents' nutrition assessment on computer (YANA-C)?

    OpenAIRE

    Vereecken, Carine; Dohogne, Sophie; Covents, Marc; Maes, Lea

    2010-01-01

    Computer-administered questionnaires have received increased attention for large-scale population research on nutrition. In Belgium-Flanders, Young Adolescents' Nutrition Assessment on Computer (YANA-C) has been developed. In this tool, standardised photographs are available to assist in portion-size estimation. The purpose of the present study is to assess how accurate adolescents are in estimating portion sizes of food using YANA-C. A convenience sample, aged 11-17 years, estimated the amou...

  6. Fast and accurate algorithm for the computation of complex linear canonical transforms.

    Science.gov (United States)

    Koç, Aykut; Ozaktas, Haldun M; Hesselink, Lambertus

    2010-09-01

    A fast and accurate algorithm is developed for the numerical computation of the family of complex linear canonical transforms (CLCTs), which represent the input-output relationship of complex quadratic-phase systems. Allowing the linear canonical transform parameters to be complex numbers makes it possible to represent paraxial optical systems that involve complex parameters. These include lossy systems such as Gaussian apertures, Gaussian ducts, or complex graded-index media, as well as lossless thin lenses and sections of free space and any arbitrary combinations of them. Complex-ordered fractional Fourier transforms (CFRTs) are a special case of CLCTs, and therefore a fast and accurate algorithm to compute CFRTs is included as a special case of the presented algorithm. The algorithm is based on decomposition of an arbitrary CLCT matrix into real and complex chirp multiplications and Fourier transforms. The samples of the output are obtained from the samples of the input in approximately N log N time, where N is the number of input samples. A space-bandwidth product tracking formalism is developed to ensure that the number of samples is information-theoretically sufficient to reconstruct the continuous transform, but not unnecessarily redundant.
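
    The chirp/Fourier decomposition at the heart of the algorithm can be illustrated with a toy quadratic-phase system (illustrative only: the chirp rates a and b below are arbitrary, whereas the paper derives them from the complex ABCD parameters and tracks the space-bandwidth product to choose the sampling):

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) DFT; a real implementation would use an FFT to get
    the N log N cost quoted in the abstract."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            for j in range(n)]

def chirp_transform(x, a, b):
    """Toy quadratic-phase system: chirp multiplication, Fourier
    transform, chirp multiplication -- the structural skeleton of the
    CLCT decomposition described in the abstract."""
    n = len(x)
    pre = [x[k] * cmath.exp(1j * a * k * k) for k in range(n)]
    y = dft(pre)
    return [y[j] * cmath.exp(1j * b * j * j) for j in range(n)]
```

    Because a unit-magnitude chirp never changes magnitudes, a delta input comes out with a flat magnitude spectrum, exactly as it does under a plain Fourier transform.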

  7. Reference genes for accurate transcript normalization in citrus genotypes under different experimental conditions.

    Directory of Open Access Journals (Sweden)

    Valéria Mafra

    Full Text Available Real-time reverse transcription PCR (RT-qPCR) has emerged as an accurate and widely used technique for expression profiling of selected genes. However, obtaining reliable measurements depends on the selection of appropriate reference genes for gene expression normalization. The aim of this work was to assess the expression stability of 15 candidate genes to determine which set of reference genes is best suited for transcript normalization in citrus in different tissues and organs and in leaves challenged with five pathogens (Alternaria alternata, Phytophthora parasitica, Xylella fastidiosa and Candidatus Liberibacter asiaticus). We tested traditional genes used for transcript normalization in citrus and orthologs of Arabidopsis thaliana genes described as superior reference genes based on transcriptome data. The geNorm and NormFinder algorithms were used to find the best reference genes to normalize all samples and conditions tested. Additionally, each biotic stress was individually analyzed by geNorm. In general, FBOX (encoding a member of the F-box family) and GAPC2 (GAPDH) formed the most stable candidate gene set assessed under the different conditions and subsets tested, while CYP (cyclophilin), TUB (tubulin) and CtP (cathepsin) were the least stably expressed genes found. Validation of the best suitable reference genes for normalizing the expression level of the WRKY70 transcription factor in leaves infected with Candidatus Liberibacter asiaticus showed that arbitrary use of reference genes without previous testing could lead to misinterpretation of data. Our results revealed FBOX, SAND (a SAND family protein), GAPC2 and UPL7 (ubiquitin protein ligase 7) to be superior reference genes, and we recommend their use in studies of gene expression in citrus species and relatives. This work constitutes the first systematic analysis for the selection of superior reference genes for transcript normalization in different citrus organs and under biotic stress.

  8. Gene expression signatures of radiation response are specific, durable and accurate in mice and humans.

    Directory of Open Access Journals (Sweden)

    Sarah K Meadows

    2008-04-01

    Full Text Available Previous work has demonstrated the potential of peripheral blood (PB) gene expression profiling for the detection of disease or environmental exposures. We have sought to determine the impact of several variables on the PB gene expression profile of an environmental exposure, ionizing radiation, and to determine the specificity of the PB signature of radiation versus other genotoxic stresses. Neither genotype differences nor the time of PB sampling lessened the accuracy of PB signatures in predicting radiation exposure, but sex differences did influence the accuracy of the prediction of radiation exposure at the lowest level (50 cGy). A PB signature of sepsis was also generated, and both the PB signature of radiation and the PB signature of sepsis were found to be 100% specific at distinguishing irradiated from septic animals. We also identified human PB signatures of radiation exposure and chemotherapy treatment which distinguished irradiated patients and chemotherapy-treated individuals within a heterogeneous population with accuracies of 90% and 81%, respectively. We conclude that PB gene expression profiles can be identified in mice and humans that are accurate in predicting medical conditions, are specific to each condition and remain highly accurate over time.

  9. An Efficient Approach for Fast and Accurate Voltage Stability Margin Computation in Large Power Grids

    Directory of Open Access Journals (Sweden)

    Heng-Yi Su

    2016-11-01

    Full Text Available This paper proposes an efficient approach for the computation of the voltage stability margin (VSM) in a large-scale power grid. The objective is to accurately and rapidly determine the load power margin which corresponds to voltage collapse phenomena. The proposed approach is based on the impedance-match-based technique and the model-based technique. It combines the Thevenin equivalent (TE) network method with a cubic spline extrapolation technique and the continuation technique to achieve fast and accurate VSM computation for a bulk power grid. Moreover, the generator Q limits are taken into account for practical applications. Extensive case studies carried out on Institute of Electrical and Electronics Engineers (IEEE) benchmark systems and the Taiwan Power Company (Taipower, Taipei, Taiwan) system are used to demonstrate the effectiveness of the proposed approach.

  10. Accurate computations of monthly average daily extraterrestrial irradiation and the maximum possible sunshine duration

    International Nuclear Information System (INIS)

    Jain, P.C.

    1985-12-01

    The monthly average daily values of the extraterrestrial irradiation on a horizontal plane and the maximum possible sunshine duration are two important parameters that are frequently needed in various solar energy applications. These are generally calculated by solar scientists and engineers each time they are needed, often using approximate short-cut methods. Using the accurate analytical expressions developed by Spencer for the declination and the eccentricity correction factor, computations for these parameters have been made for all latitude values from 90 deg. N to 90 deg. S at intervals of 1 deg. and are presented in a convenient tabular form. Monthly average daily values of the maximum possible sunshine duration as recorded on a Campbell-Stokes sunshine recorder are also computed and presented. These tables avoid the need for repetitive and approximate calculations and serve as a useful ready reference providing accurate values to solar energy scientists and engineers.
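
    Both tabulated quantities follow in closed form from Spencer's series. A sketch using the standard textbook expressions (daily extraterrestrial irradiation on a horizontal plane and maximum possible sunshine duration; this reproduces the method, not the author's tables):

```python
import math

GSC = 1367.0  # solar constant, W/m^2

def spencer(day):
    """Spencer's Fourier-series declination (rad) and eccentricity
    correction factor for day-of-year `day` (1..365)."""
    g = 2.0 * math.pi * (day - 1) / 365.0
    decl = (0.006918 - 0.399912 * math.cos(g) + 0.070257 * math.sin(g)
            - 0.006758 * math.cos(2 * g) + 0.000907 * math.sin(2 * g)
            - 0.002697 * math.cos(3 * g) + 0.00148 * math.sin(3 * g))
    e0 = (1.000110 + 0.034221 * math.cos(g) + 0.001280 * math.sin(g)
          + 0.000719 * math.cos(2 * g) + 0.000077 * math.sin(2 * g))
    return decl, e0

def daily_h0_and_daylength(lat_deg, day):
    """Daily extraterrestrial irradiation (J/m^2) on a horizontal plane
    and the maximum possible sunshine duration (hours)."""
    phi = math.radians(lat_deg)
    decl, e0 = spencer(day)
    cos_ws = -math.tan(phi) * math.tan(decl)
    cos_ws = max(-1.0, min(1.0, cos_ws))  # clamp for polar day/night
    ws = math.acos(cos_ws)                # sunset hour angle (rad)
    h0 = (24 * 3600 * GSC / math.pi) * e0 * (
        math.cos(phi) * math.cos(decl) * math.sin(ws)
        + ws * math.sin(phi) * math.sin(decl))
    n_hours = 2.0 * math.degrees(ws) / 15.0
    return h0, n_hours
```

    The clamp on the sunset hour angle handles the polar latitudes, where the day length saturates at 0 or 24 hours; at the equator it is 12 hours year-round.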

  11. Computational challenges in modeling gene regulatory events.

    Science.gov (United States)

    Pataskar, Abhijeet; Tiwari, Vijay K

    2016-10-19

    Cellular transcriptional programs driven by genetic and epigenetic mechanisms could be better understood by integrating "omics" data and subsequently modeling the gene-regulatory events. Toward this end, computational biology should keep pace with evolving experimental procedures and data availability. This article gives an exemplified account of the current computational challenges in molecular biology.

  12. Fast and accurate three-dimensional point spread function computation for fluorescence microscopy.

    Science.gov (United States)

    Li, Jizhou; Xue, Feng; Blu, Thierry

    2017-06-01

    The point spread function (PSF) plays a fundamental role in fluorescence microscopy. A realistic and accurately calculated PSF model can significantly improve the performance of 3D deconvolution microscopy and also the localization accuracy in single-molecule microscopy. In this work, we propose a fast and accurate approximation of the Gibson-Lanni model, which has been shown to represent the PSF suitably under a variety of imaging conditions. We express Kirchhoff's integral in this model as a linear combination of rescaled Bessel functions, thus providing an integral-free way to perform the calculation. The explicit approximation error in terms of the parameters is given numerically. Experiments demonstrate that the proposed approach achieves a significantly smaller computational time than current state-of-the-art techniques at the same accuracy. This approach can also be extended to other microscopy PSF models.

  13. Computational integration of homolog and pathway gene module expression reveals general stemness signatures.

    Directory of Open Access Journals (Sweden)

    Martina Koeva

    Full Text Available The stemness hypothesis states that all stem cells use common mechanisms to regulate self-renewal and multi-lineage potential. However, gene expression meta-analyses at the single-gene level have failed to identify a significant number of genes selectively expressed by a broad range of stem cell types. We hypothesized that stemness may be regulated by modules of homologs. While the expression of any single gene within a module may vary from one stem cell type to the next, the expression of the module as a whole may be required, with different, yet functionally synonymous, homologs expressed in different stem cells. Thus, we developed a computational method to test for stem cell-specific gene expression patterns from a comprehensive collection of 49 murine datasets covering 12 different stem cell types. We identified 40 individual genes and 224 stemness modules with reproducible and specific up-regulation across multiple stem cell types. The stemness modules included families regulating chromatin remodeling, DNA repair, and Wnt signaling. Strikingly, the majority of modules represent evolutionarily related homologs. Moreover, a score based on the discovered modules could accurately distinguish stem cell-like populations from other cell types in both normal and cancer tissues. This scoring system revealed that both mouse and human metastatic populations exhibit higher stemness indices than non-metastatic populations, providing further evidence for a stem cell-driven component underlying the transformation to metastatic disease.
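
    The module-level idea is easy to make concrete: score each module by the mean expression of whichever members are measured, then average over modules, so a sample expressing any one homolog per module still scores high. This is a much-simplified, hypothetical stand-in for the paper's scoring system, meant only to show why module scores tolerate homolog swapping:

```python
def module_score(expression, modules):
    """Score a sample by gene modules: each module's score is the mean
    expression of its member genes present in the profile, and the
    sample score is the mean over modules.
    expression: dict gene -> expression value for one sample.
    modules: dict module name -> list of member gene names."""
    module_means = []
    for genes in modules.values():
        vals = [expression[g] for g in genes if g in expression]
        if vals:
            module_means.append(sum(vals) / len(vals))
    return sum(module_means) / len(module_means) if module_means else 0.0
```

    A sample that strongly expresses a different homolog in each module still receives a high score, whereas a single-gene analysis would see no shared marker at all.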

  14. A simplified approach to characterizing a kilovoltage source spectrum for accurate dose computation

    Energy Technology Data Exchange (ETDEWEB)

    Poirier, Yannick; Kouznetsov, Alexei; Tambasco, Mauro [Department of Physics and Astronomy, University of Calgary, Calgary, Alberta T2N 4N2 (Canada); Department of Physics and Astronomy and Department of Oncology, University of Calgary and Tom Baker Cancer Centre, Calgary, Alberta T2N 4N2 (Canada)

    2012-06-15

    2% for the homogeneous and heterogeneous block phantoms, and agreement for the transverse dose profiles was within 6%. Conclusions: The HVL and kVp are sufficient for characterizing a kV x-ray source spectrum for accurate dose computation. As these parameters can be easily and accurately measured, they provide for a clinically feasible approach to characterizing a kV energy spectrum to be used for patient specific x-ray dose computations. Furthermore, these results provide experimental validation of our novel hybrid dose computation algorithm.

  15. NINJA-OPS: Fast Accurate Marker Gene Alignment Using Concatenated Ribosomes.

    Directory of Open Access Journals (Sweden)

    Gabriel A Al-Ghalith

    2016-01-01

    Full Text Available The explosion of bioinformatics technologies in the form of next-generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short-read mapping is therefore a fundamental component of next-generation sequencing pipelines, which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the "concatesome," as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop.
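
    The "concatesome" trick reduces to simple coordinate bookkeeping: join the reference sequences into one artificial chromosome, hand that to a BW aligner, and map hit coordinates back to the source sequence by binary search on the recorded offsets. The sketch below shows only the bookkeeping (the alignment itself would be done by an external BW-based aligner):

```python
import bisect

def build_concatesome(refs):
    """Concatenate reference sequences into one artificial chromosome
    and record each sequence's start offset so alignment hits can be
    mapped back to their source.
    refs: dict name -> sequence string."""
    names, starts, pieces, pos = [], [], [], 0
    for name, seq in refs.items():
        names.append(name)
        starts.append(pos)
        pieces.append(seq)
        pos += len(seq)
    return "".join(pieces), names, starts

def locate(hit_pos, names, starts):
    """Map a coordinate on the concatenated reference back to
    (sequence name, offset within that sequence)."""
    i = bisect.bisect_right(starts, hit_pos) - 1
    return names[i], hit_pos - starts[i]
```

    Because `starts` is sorted by construction, each back-mapping is an O(log n) `bisect` lookup regardless of how many reference sequences were concatenated.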

  16. Identification of Importin 8 (IPO8 as the most accurate reference gene for the clinicopathological analysis of lung specimens

    Directory of Open Access Journals (Sweden)

    Pio Ruben

    2008-11-01

    Full Text Available Abstract Background The accurate normalization of differentially expressed genes in lung cancer is essential for the identification of novel therapeutic targets and biomarkers by real-time RT-PCR and microarrays. Although classical "housekeeping" genes, such as GAPDH, HPRT1, and beta-actin, have been widely used in the past, their accuracy as reference genes for lung tissues has not been proven. Results We have conducted a thorough analysis of a panel of 16 candidate reference genes for lung specimens and lung cell lines. Gene expression was measured by quantitative real-time RT-PCR, and expression stability was analyzed with the software packages geNorm and NormFinder, the mean of |ΔCt| (= |Ct Normal − Ct tumor|) ± SEM, and correlation coefficients among genes. Systematic comparison between candidates led us to the identification of a subset of suitable reference genes for clinical samples: IPO8, ACTB, POLR2A, 18S, and PPIA. Further analysis showed that IPO8 had a very low mean of |ΔCt| (0.70 ± 0.09), with no statistically significant differences between normal and malignant samples and with excellent expression stability. Conclusion Our data show that IPO8 is the most accurate reference gene for clinical lung specimens. In addition, we demonstrate that the commonly used genes GAPDH and HPRT1 are inappropriate for normalizing data derived from lung biopsies, although they are suitable as reference genes for lung cell lines. We thus propose IPO8 as a novel reference gene for lung cancer samples.

  17. Accurate Computation of Reduction Potentials of 4Fe−4S Clusters Indicates a Carboxylate Shift in Pyrococcus furiosus Ferredoxin

    DEFF Research Database (Denmark)

    Kepp, Kasper Planeta; Ooi, Bee Lean; Christensen, Hans Erik Mølager

    2007-01-01

    This work describes the computation and accurate reproduction of subtle shifts in reduction potentials for two mutants of the iron-sulfur protein Pyrococcus furiosus ferredoxin. The computational models involved only first-sphere ligands and differed with respect to one ligand, either acetate (as...

  18. Identification and evaluation of new reference genes in Gossypium hirsutum for accurate normalization of real-time quantitative RT-PCR data

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcio

    2010-03-01

    Full Text Available Abstract Background Normalization with reference genes, or housekeeping genes, can yield more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, selecting suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping genes have been carried out in plants. Therefore, qPCR studies on important crops such as cotton have been hampered by the lack of suitable reference genes. Results Using two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1α5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhβTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and the flower verticils. The expression of the GhPP2A1 and GhUBQ14 genes was the most stable across all samples and also when distinct plant organs were examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils, and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal controls for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes, we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. Conclusion We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of the GhUBQ14 and GhPP2A1 housekeeping genes as superior references.

  19. Identification and evaluation of new reference genes in Gossypium hirsutum for accurate normalization of real-time quantitative RT-PCR data.

    Science.gov (United States)

    Artico, Sinara; Nardeli, Sarah M; Brilhante, Osmundo; Grossi-de-Sa, Maria Fátima; Alves-Ferreira, Marcio

    2010-03-21

    Normalization with reference genes, or housekeeping genes, can yield more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, selecting suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping genes have been carried out in plants. Therefore, qPCR studies on important crops such as cotton have been hampered by the lack of suitable reference genes. Using two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1alpha5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhbetaTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and the flower verticils. The expression of the GhPP2A1 and GhUBQ14 genes was the most stable across all samples and also when distinct plant organs were examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils, and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal controls for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes, we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references for normalization of gene expression measures in

  20. Microarray-based cancer prediction using soft computing approach.

    Science.gov (United States)

    Wang, Xiaosheng; Gotoh, Osamu

    2009-05-26

    One of the difficulties in using gene expression profiles to predict cancer is how to effectively select a few informative genes to construct accurate prediction models from thousands or tens of thousands of genes. We screen highly discriminative genes and gene pairs to create simple prediction models involving single genes or gene pairs on the basis of a soft computing approach and rough set theory. Accurate cancer prediction is obtained when we apply these simple prediction models to four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer and DLBCL. Some genes closely correlated with the pathogenesis of specific or general cancers are identified. In contrast with other models, our models are simple, effective and robust. Meanwhile, our models are interpretable, as they are based on decision rules. Our results demonstrate that very simple models may perform well in molecular cancer prediction, and important gene markers of cancer can be detected if the gene selection approach is chosen reasonably.
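
    The flavor of a single-gene decision rule is easy to demonstrate. The sketch below is a much-simplified stand-in for the paper's rough-set rule induction: it exhaustively searches every (gene, threshold) rule of the form "expression above t implies class 1" (in either direction) and keeps the most accurate one on the training data.

```python
def best_single_gene_rule(data, labels):
    """Exhaustive search over single-gene threshold rules.
    data: list of dicts mapping gene name -> expression value.
    labels: list of 0/1 class labels, parallel to data.
    Returns (accuracy, gene, threshold, direction)."""
    best = (0.0, None, None, None)
    genes = data[0].keys()
    for g in genes:
        vals = sorted({row[g] for row in data})
        # candidate thresholds: midpoints between consecutive values
        cuts = [(a + b) / 2 for a, b in zip(vals, vals[1:])]
        for t in cuts:
            for direction in (1, -1):
                preds = [1 if direction * (row[g] - t) > 0 else 0
                         for row in data]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if acc > best[0]:
                    best = (acc, g, t, direction)
    return best
```

    On real expression data this training accuracy must of course be validated on held-out samples, since with thousands of genes some rule will fit the training set by chance.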

  1. Computational identification of putative cytochrome P450 genes in ...

    African Journals Online (AJOL)

    In this work, a computational study of expressed sequence tags (ESTs) of soybean was performed by data mining methods and bio-informatics tools and as a result 78 putative P450 genes were identified, including 57 new ones. These genes were classified into five clans and 20 families by sequence similarities and among ...

  2. Inferring biological functions of guanylyl cyclases with computational methods

    KAUST Repository

    Alquraishi, May Majed; Meier, Stuart Kurt

    2013-01-01

    A number of studies have shown that functionally related genes are often co-expressed and that computationally based co-expression analysis can be used to accurately identify functional relationships between genes and, by inference, their encoded proteins. Here we describe how a computationally based co-expression analysis can be used to link the function of a specific gene of interest to a defined cellular response. Using a worked example, we demonstrate how this methodology is used to link the function of the Arabidopsis Wall-Associated Kinase-Like 10 gene, which encodes a functional guanylyl cyclase, to host responses to pathogens. © Springer Science+Business Media New York 2013.

  3. Inferring biological functions of guanylyl cyclases with computational methods

    KAUST Repository

    Alquraishi, May Majed

    2013-09-03

    A number of studies have shown that functionally related genes are often co-expressed and that computationally based co-expression analysis can be used to accurately identify functional relationships between genes and, by inference, their encoded proteins. Here we describe how a computationally based co-expression analysis can be used to link the function of a specific gene of interest to a defined cellular response. Using a worked example, we demonstrate how this methodology is used to link the function of the Arabidopsis Wall-Associated Kinase-Like 10 gene, which encodes a functional guanylyl cyclase, to host responses to pathogens. © Springer Science+Business Media New York 2013.

  4. On Computing Breakpoint Distances for Genomes with Duplicate Genes.

    Science.gov (United States)

    Shao, Mingfu; Moret, Bernard M E

    2017-06-01

    A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of their higher-level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided last year a solution for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignments, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.
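
    For the easy duplicate-free case mentioned above, the breakpoint distance really is a few lines: count the adjacencies of one genome that do not survive, in either orientation, in the other. This sketch handles signed linear gene orders with identical gene content and ignores telomere conventions:

```python
def breakpoint_distance(g1, g2):
    """Breakpoint distance between two signed gene orders with the same
    gene content and no duplicate genes: the number of adjacencies of g1
    not conserved (in either reading direction) in g2."""
    def adjacencies(order):
        return {(a, b) for a, b in zip(order, order[1:])}
    adj2 = adjacencies(g2)
    # an adjacency read right-to-left flips order and signs
    adj2 |= {(-b, -a) for a, b in adj2}
    return sum((a, b) not in adj2 for a, b in adjacencies(g1))
```

    For example, inverting the internal segment of [1, 2, 3, 4] to give [1, -3, -2, 4] breaks exactly the two adjacencies at the inversion boundaries; the adjacency (2, 3) survives as (-3, -2) read backwards. It is the matching step in the presence of duplicates, not this count, that makes the general problem NP-hard.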

  5. A simple and accurate two-step long DNA sequences synthesis strategy to improve heterologous gene expression in pichia.

    Directory of Open Access Journals (Sweden)

    Jiang-Ke Yang

    Full Text Available In vitro chemical gene synthesis is a powerful tool to improve the expression of genes in heterologous systems. In this study, a two-step gene synthesis strategy that combines an assembly PCR and an overlap extension PCR (AOE) was developed. In this strategy, the chemically synthesized oligonucleotides were assembled into several 200-500 bp fragments with a 20-25 bp overlap at each end by assembly PCR, and then an overlap extension PCR was conducted to assemble all these fragments into a full-length DNA sequence. Using this method, we de novo designed and codon-optimized the Rhizopus oryzae lipase gene ROL (810 bp) and the Aspergillus niger phytase gene phyA (1404 bp). Compared with the original ROL gene and phyA gene, the codon-optimized genes were expressed at a significantly higher level in yeasts after methanol induction. We believe this AOE method to be of special interest as it is simple and accurate and has no limitation with respect to the size of the gene to be synthesized. Combined with de novo design, this method allows the rapid synthesis of a gene optimized for expression in the system of choice and production of sufficient biological material for molecular characterization and biotechnological application.
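
    The overlap-extension step has a direct in silico analogue that is useful for checking a fragment design before ordering oligos. A minimal sketch (assumes fragments are supplied 5'-to-3' in left-to-right order with exact terminal overlaps of at least `min_overlap` bases; real PCR assembly needs neither the ordering nor exactness):

```python
def assemble(fragments, min_overlap=20):
    """Stitch fragments that share an exact terminal overlap, mimicking
    the overlap-extension step: extend the growing sequence with each
    fragment whose prefix matches the current suffix."""
    seq = fragments[0]
    for frag in fragments[1:]:
        # try the longest plausible overlap first
        for k in range(min(len(seq), len(frag)), min_overlap - 1, -1):
            if seq.endswith(frag[:k]):
                seq += frag[k:]
                break
        else:
            raise ValueError("fragments do not overlap by >= min_overlap")
    return seq
```

    Running the designed fragments through such a check catches missing or mistyped overlap regions, the failure mode that would otherwise only surface as a failed extension reaction.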

  6. Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

    Science.gov (United States)

    Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

    2014-01-16

    To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. 
By coupling the parallel-model population-based optimization method with the parallel computational framework, high computational efficiency can be achieved in large-scale network inference.
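The hybrid GA-PSO idea the abstract describes can be sketched serially in a few lines. The operator choices, constants and the sphere test function below are illustrative assumptions, not the paper's actual encoding or its MapReduce parallelization:

```python
import random

def hybrid_ga_pso(fitness, dim, pop_size=20, iters=100, seed=0):
    """Minimize `fitness` with a toy GA-PSO hybrid: PSO velocity updates,
    then GA-style crossover/mutation replacing the worst half of the
    population. All constants are illustrative."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    vel = [[0.0] * dim for _ in range(pop_size)]
    pbest = [p[:] for p in pos]
    gbest = min(pos, key=fitness)[:]
    for _ in range(iters):
        for i in range(pop_size):
            for d in range(dim):  # standard PSO velocity/position update
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if fitness(pos[i]) < fitness(pbest[i]):
                pbest[i] = pos[i][:]
        # GA step: the worst half is replaced by offspring of the best half
        order = sorted(range(pop_size), key=lambda i: fitness(pbest[i]))
        elite = order[:pop_size // 2]
        for j in order[pop_size // 2:]:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(dim)          # one-point crossover
            child = pbest[a][:cut] + pbest[b][cut:]
            if rng.random() < 0.1:            # mutation
                child[rng.randrange(dim)] += rng.gauss(0, 0.5)
            pos[j] = child
            pbest[j] = child[:]
        gbest = min(pbest, key=fitness)[:]
    return gbest, fitness(gbest)

best, val = hybrid_ga_pso(lambda x: sum(v * v for v in x), dim=3)
```

In the MapReduce setting described by the paper, the fitness evaluations of such a population would be distributed across mappers, with a reducer collecting the global best.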

  7. A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

    Science.gov (United States)

    Bastani, Meysam; Vos, Larissa; Asgarian, Nasimeh; Deschenes, Jean; Graham, Kathryn; Mackey, John; Greiner, Russell

    2013-01-01

Background: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin-embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER status based on RNA expression can provide more objective, quantitative and reproducible test results. Methods: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin-fixed tumors. Results: This produced a three-gene classifier that can predict the ER status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER status. Conclusions: Our efficient and parsimonious classifier lends itself to highly accurate and low-cost RNA-based assessments of ER status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions. PMID:24312637

  8. Identification of a 251 gene expression signature that can accurately detect M. tuberculosis in patients with and without HIV co-infection.

    Directory of Open Access Journals (Sweden)

    Noor Dawany

Full Text Available BACKGROUND: Co-infection with tuberculosis (TB) is the leading cause of death in HIV-infected individuals. However, diagnosis of TB, especially in the presence of an HIV co-infection, can be difficult due to the low accuracy of conventional diagnostic methods. Here we report a gene signature that can identify a tuberculosis infection in patients co-infected with HIV as well as in the absence of HIV. METHODS: We analyzed global gene expression data from peripheral blood mononuclear cell (PBMC) samples of patients that were either mono-infected with HIV or co-infected with HIV/TB and used support vector machines to identify a gene signature that can distinguish between the two classes. We then validated our results using publicly available gene expression data from patients mono-infected with TB. RESULTS: Our analysis successfully identified a 251-gene signature that accurately distinguishes patients co-infected with HIV/TB from those infected with HIV only, with an overall accuracy of 81.4% (sensitivity = 76.2%, specificity = 86.4%). Furthermore, we show that our 251-gene signature can also accurately distinguish patients with active TB in the absence of an HIV infection from both patients with a latent TB infection and healthy controls (88.9-94.7% accuracy; 69.2-90% sensitivity and 90.3-100% specificity). We also demonstrate that the expression levels of the 251-gene signature diminish with the length of TB treatment. CONCLUSIONS: A 251-gene signature is described to (a) detect TB in the presence or absence of an HIV co-infection, and (b) assess response to treatment following anti-TB therapy.
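Once a signature is fixed, classification reduces to a score and a threshold. A minimal sketch with placeholder gene names and a trivially learned threshold (the paper's actual classifier is a support vector machine over 251 genes):

```python
from statistics import mean

def signature_score(expr, up_genes, down_genes):
    """Mean expression of the signature's up-regulated genes minus that of
    its down-regulated genes. The gene symbols used below are placeholders,
    not members of the actual 251-gene signature."""
    return mean(expr[g] for g in up_genes) - mean(expr[g] for g in down_genes)

def classify(expr, up_genes, down_genes, threshold=0.0):
    """Call a sample co-infected when the score exceeds a threshold
    that would normally be learned from training data (here simply 0)."""
    if signature_score(expr, up_genes, down_genes) > threshold:
        return "HIV/TB"
    return "HIV only"

# Toy expression profile (log-ratios vs. a reference), placeholder genes
sample = {"IFIT1": 2.5, "GBP5": 3.0, "CD52": -1.0, "BLK": -0.5}
label = classify(sample, ["IFIT1", "GBP5"], ["CD52", "BLK"])
```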

  9. An efficient and accurate method for computation of energy release rates in beam structures with longitudinal cracks

    DEFF Research Database (Denmark)

    Blasques, José Pedro Albergaria Amaral; Bitsche, Robert

    2015-01-01

This paper proposes a novel, efficient, and accurate framework for fracture analysis of beam structures with longitudinal cracks. The three-dimensional local stress field is determined using a high-fidelity beam model incorporating a finite element based cross section analysis tool. The Virtual Crack Closure Technique is used for computation of strain energy release rates. The devised framework was employed for analysis of cracks in beams with different cross section geometries. The results show that the accuracy of the proposed method is comparable to that of conventional three-dimensional solid finite element models while using only a fraction of the computation time.
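Once the crack-tip nodal force and the relative opening displacement behind the tip are extracted from the beam model, the VCCT step itself is a one-line formula. A sketch with illustrative values (the heavy lifting, the cross-section stress analysis, is not reproduced):

```python
def vcct_mode_i(f_y, delta_v, delta_a, width):
    """Mode-I strain energy release rate from the Virtual Crack Closure
    Technique: G_I = F_y * dv / (2 * da * b), where F_y is the crack-tip
    nodal force, dv the relative opening displacement one element behind
    the tip, da the element length along the crack and b the width."""
    return f_y * delta_v / (2.0 * delta_a * width)

# Illustrative numbers only (N, m): not from the paper's test cases
g_i = vcct_mode_i(f_y=100.0, delta_v=0.002, delta_a=0.01, width=0.05)  # J/m^2
```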

  10. [Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

    Science.gov (United States)

    Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

    2012-07-01

To screen important genes from the large volume of gene microarray data generated after nerve injury, we combined gene ontology (GO) analysis with computer pattern recognition to identify key genes responding to nerve injury, and then verified one of the screened-out genes. Data mining and gene ontology analysis of the gene chip dataset GSE26350 were carried out in MATLAB. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' GO terms and their positions on the principal component score map. Functional interference was used to disturb the normal binding of Cd44 to one of its ligands, chondroitin sulfate C (CSC), and neurite extension was observed. Gene ontology analysis showed that the top genes on the score map (marked by red *) were mainly distributed across molecular function GO terms such as molecular transducer activity, receptor activity and protein binding. Cd44, one of six effector protein genes, attracted our attention because of its functional diversity. After adding different reagents to the medium to interfere with the normal binding of CSC and Cd44, the inhibition of neurite extension by CSC was relieved to varying degrees. CSC can inhibit neurite extension by binding Cd44 on the neuronal membrane. This verifies that important genes in a given physiological process can be identified by gene ontology analysis of gene chip data.

  11. Gene cluster statistics with gene families.

    Science.gov (United States)

    Raghupathy, Narayanan; Durand, Dannie

    2009-05-01

Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data analysis.
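Under a random-gene-order null and a fixed window, this kind of cluster probability reduces to a hypergeometric tail. The sketch below is a deliberate simplification: it ignores the gene-family-size modeling that is the paper's actual contribution:

```python
from math import comb

def cluster_pvalue(n, h, m, k):
    """Probability that a fixed window of m genes from a genome of n genes
    contains at least k members of a set of h marked homologs, assuming
    uniformly random gene order (hypergeometric tail). A simplified
    stand-in for the paper's statistics, which additionally account for
    gene family size."""
    total = comb(n, m)
    hits = sum(comb(h, i) * comb(n - h, m - i)
               for i in range(k, min(h, m) + 1))
    return hits / total
```

For example, the chance that both members of a 2-gene family land in one particular 2-gene window of a 10-gene genome is 1/C(10,2) = 1/45.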

  12. Efficient reconfigurable hardware architecture for accurately computing success probability and data complexity of linear attacks

    DEFF Research Database (Denmark)

    Bogdanov, Andrey; Kavun, Elif Bilge; Tischhauser, Elmar

    2012-01-01

    An accurate estimation of the success probability and data complexity of linear cryptanalysis is a fundamental question in symmetric cryptography. In this paper, we propose an efficient reconfigurable hardware architecture to compute the success probability and data complexity of Matsui's Algorithm...... block lengths ensures that any empirical observations are not due to differences in statistical behavior for artificially small block lengths. Rather surprisingly, we observed in previous experiments a significant deviation between the theory and practice for Matsui's Algorithm 2 for larger block sizes...
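A software sketch of the statistic in question: the success probability of a majority-vote decision over N samples of a biased bit, estimated by Monte Carlo. This is only the abstract distinguishing experiment; the hardware architecture and Matsui's actual key-ranking procedure are not reproduced:

```python
import random

def success_probability(bias, n_samples, trials=2000, seed=1):
    """Monte Carlo estimate of the probability that a majority vote over
    n_samples of a bit with Pr[bit = 0] = 1/2 + bias recovers the correct
    (zero) value. Parameter names and the experiment itself are a generic
    illustration of linear-attack success probability."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        zeros = sum(1 for _ in range(n_samples) if rng.random() < 0.5 + bias)
        wins += zeros * 2 > n_samples      # strict majority votes "0"
    return wins / trials

# With bias 0.05 and 400 samples the vote is almost always correct;
# with zero bias (and odd n) it is a coin flip.
p_biased = success_probability(bias=0.05, n_samples=400)
p_null = success_probability(bias=0.0, n_samples=401)
```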

  13. A machine learned classifier that uses gene expression data to accurately predict estrogen receptor status.

    Directory of Open Access Journals (Sweden)

    Meysam Bastani

Full Text Available BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin-embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER status based on RNA expression can provide more objective, quantitative and reproducible test results. METHODS: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin-fixed tumors. RESULTS: This produced a three-gene classifier that can predict the ER status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER status. CONCLUSIONS: Our efficient and parsimonious classifier lends itself to highly accurate and low-cost RNA-based assessments of ER status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions.

  14. SELANSI: a toolbox for simulation of stochastic gene regulatory networks.

    Science.gov (United States)

    Pájaro, Manuel; Otero-Muras, Irene; Vázquez, Carlos; Alonso, Antonio A

    2018-03-01

Gene regulation is inherently stochastic. In many applications in systems and synthetic biology, such as the reverse engineering and the de novo design of genetic circuits, stochastic effects (though potentially crucial) are often neglected due to the high computational cost of stochastic simulations. With advances in these fields there is an increasing need for tools providing accurate approximations of the stochastic dynamics of gene regulatory networks (GRNs) with reduced computational effort. This work presents SELANSI (SEmi-LAgrangian SImulation of GRNs), a software toolbox for the simulation of stochastic multidimensional gene regulatory networks. SELANSI exploits intrinsic structural properties of gene regulatory networks to accurately approximate the corresponding Chemical Master Equation with a partial integro-differential equation that is solved by a semi-Lagrangian method with high efficiency. Networks under consideration might involve multiple genes with self- and cross-regulation, in which genes can be regulated by different transcription factors. Moreover, the validity of the method is not restricted to a particular type of kinetics. The tool offers total flexibility regarding network topology, kinetics and parameterization, as well as simulation options. SELANSI runs under the MATLAB environment and is available under a GPLv3 license at https://sites.google.com/view/selansi. Contact: antonio@iim.csic.es. © The Author(s) 2017. Published by Oxford University Press.
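For contrast with SELANSI's PIDE approximation, the exact but computationally costly baseline for the Chemical Master Equation is Gillespie's stochastic simulation algorithm. A minimal birth-death sketch with illustrative rates (the simplest possible gene-expression model, not a SELANSI example):

```python
import random

def gillespie_birth_death(k_prod, k_deg, t_end, seed=0):
    """Exact stochastic simulation (Gillespie SSA) of constitutive protein
    production (rate k_prod) and first-order degradation (rate k_deg * n).
    Returns the copy number at time t_end; its stationary mean is
    k_prod / k_deg."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    while True:
        a_prod, a_deg = k_prod, k_deg * n
        a_total = a_prod + a_deg
        t += rng.expovariate(a_total)        # time to next reaction
        if t > t_end:
            return n
        if rng.random() * a_total < a_prod:  # choose which reaction fires
            n += 1
        else:
            n -= 1
```

Averaging many such trajectories recovers the stationary mean, which is exactly the kind of repeated sampling whose cost motivates PIDE-based approximations.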

  15. Inference of cancer-specific gene regulatory networks using soft computing rules.

    Science.gov (United States)

    Wang, Xiaosheng; Gotoh, Osamu

    2010-03-24

    Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

  16. Computational Identification of Novel Genes: Current and Future Perspectives.

    Science.gov (United States)

    Klasberg, Steffen; Bitard-Feildel, Tristan; Mallet, Ludovic

    2016-01-01

While it has long been thought that all genomic novelties are derived from existing material, many genes lacking homology to known genes were found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, i.e. out of noncoding sequences, whereas some have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Beyond the theoretical breakthrough, increasing evidence has accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Their classification as novel is usually based on their similarity to known genes, or lack thereof, as detected by comparative genomics or searches against databases. Computational approaches are further prime methods, which can be based on existing models or can leverage biological evidence from experiments. Identifying novel genes nevertheless remains a challenging task. With constant software and technology updates, no gold standard, and no available benchmark, the evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented, and the methodological strategies and their limits are discussed along with perspective approaches for further studies.
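As a toy illustration of the sequence-scanning primitives that gene prediction tools (including de novo gene finders) build on, and not any specific tool's algorithm, here is a minimal forward-strand ORF scanner:

```python
def find_orfs(seq, min_len=30):
    """Scan the three forward reading frames of a DNA string for open
    reading frames (ATG ... stop) of at least min_len nucleotides.
    Returns (start, end) half-open coordinates including the stop codon.
    Real gene predictors add reverse-strand scanning, splice models,
    codon statistics, and homology evidence on top of this."""
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in stops and start is not None:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3))
                start = None
    return orfs

# One 36 nt ORF: ATG, ten lysine codons, TAA
demo = "ATG" + "AAA" * 10 + "TAA"
```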

  17. Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

    KAUST Repository

    Danani, Bob K.

    2012-07-01

    Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.

  18. How to perform RT-qPCR accurately in plant species? A case study on flower colour gene expression in an azalea (Rhododendron simsii hybrids) mapping population.

    Science.gov (United States)

    De Keyser, Ellen; Desmet, Laurence; Van Bockstaele, Erik; De Riek, Jan

    2013-06-24

Flower colour variation is one of the most crucial selection criteria in the breeding of a flowering pot plant, as is also the case for azalea (Rhododendron simsii hybrids). Flavonoid biosynthesis has been studied intensively in several species. In azalea, flower colour can be described by means of a 3-gene model; however, this model does not explain pink coloration. Over the last decade, gene expression studies have been widely used for studying flower colour, but the methods applied were often only semi-quantitative, or quantification was not done according to the MIQE guidelines. We aimed to develop an accurate RT-qPCR protocol and to validate it in a study of flower colour in an azalea mapping population. RNA quality was evaluated in a combined approach by means of different techniques, e.g. the SPUD assay and Experion analysis. We demonstrated the importance of testing noRT samples for all genes under study to detect contaminating DNA. In spite of the limited sequence information available, we prepared a set of 11 reference genes and validated it in flower petals; a combination of three reference genes proved optimal. Finally, we used plasmids for the construction of standard curves. This allowed us to calculate gene-specific PCR efficiencies for every gene to ensure accurate quantification. The validity of the protocol was demonstrated in a study of six genes of the flavonoid biosynthesis pathway. No correlations were found between flower colour and the individual expression profiles. However, the combined expression of the early pathway genes (CHS, F3H, F3'H and FLS) is clearly related to co-pigmentation with flavonols. The late pathway genes DFR and ANS are involved to a minor extent in differentiating between coloured and white flowers. Concerning pink coloration, we could demonstrate that the lower intensity in this type of flowers is correlated to the expression of F3'H.
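The efficiency-correction arithmetic such a protocol relies on is compact. Below are the generic textbook forms of the standard-curve efficiency and the efficiency-corrected (Pfaffl) expression ratio, with illustrative numbers rather than the azalea data:

```python
from math import log10

def _slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def pcr_efficiency(amounts, cq_values):
    """Amplification efficiency from a (e.g. plasmid) standard curve:
    regress Cq on log10(template amount); E = 10**(-1/slope) - 1,
    so E = 1 corresponds to perfect doubling each cycle."""
    slope = _slope([log10(a) for a in amounts], cq_values)
    return 10 ** (-1.0 / slope) - 1.0

def pfaffl_ratio(e_target, dcq_target, e_ref, dcq_ref):
    """Efficiency-corrected relative expression (Pfaffl method):
    (1+E_target)**dCq_target / (1+E_ref)**dCq_ref,
    with dCq = Cq(control) - Cq(sample) for each gene."""
    return (1 + e_target) ** dcq_target / (1 + e_ref) ** dcq_ref

# A perfect 10-fold dilution series loses ~3.32 cycles per decade (E = 1)
eff = pcr_efficiency([1, 10, 100, 1000],
                     [30.0, 26.678072, 23.356144, 20.034216])
```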

  19. FILMPAR: A parallel algorithm designed for the efficient and accurate computation of thin film flow on functional surfaces containing micro-structure

    Science.gov (United States)

    Lee, Y. C.; Thompson, H. M.; Gaskell, P. H.

    2009-12-01

Thin film flows on functional surfaces arise in a wide variety of engineering, industrial and physical applications. However, despite recent modelling advances, the accurate numerical solution of the equations governing such problems is still at a relatively early stage. Indeed, recent studies employing a simplifying long-wave approximation have shown that highly efficient numerical methods are necessary to solve the resulting lubrication equations in order to achieve the level of grid resolution required to accurately capture the effects of micro- and nano-scale topographical features. Solution method: A portable parallel multigrid algorithm has been developed for the above purpose, for the particular case of flow over submerged topographical features. Within the multigrid framework adopted, a W-cycle is used to accelerate convergence in respect of the time dependent nature of the problem, with relaxation sweeps performed using a fixed number of pre- and post-Red-Black Gauss-Seidel Newton iterations. In addition, the algorithm incorporates automatic adaptive time-stepping to avoid the computational expense associated with repeated time-step failure. Running time: 1.31 minutes using 128 processors on BlueGene/P with a problem size of over 16.7 million mesh points.
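A minimal 1-D analogue of the Red-Black Gauss-Seidel relaxation used inside each multigrid cycle, here applied to a Poisson model problem and run to convergence for illustration. FILMPAR itself solves time-dependent lubrication equations in 2-D with Newton linearization, which is not reproduced:

```python
def red_black_gs(f, n, sweeps):
    """Red-Black Gauss-Seidel sweeps for the 1-D Poisson problem
    u'' = f on (0, 1) with u(0) = u(1) = 0, discretized on n interior
    points. Odd-indexed ("red") points are updated first, then even
    ("black") ones; within each colour the updates are independent,
    which is what makes the scheme parallelizable."""
    h = 1.0 / (n + 1)
    u = [0.0] * (n + 2)                 # includes the two boundary nodes
    for _ in range(sweeps):
        for start in (1, 2):            # red pass, then black pass
            for i in range(start, n + 1, 2):
                u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * f(i * h))
    return u

# u'' = -2 has the exact solution u(x) = x(1 - x), which the
# second-order discretization reproduces exactly at the nodes.
u = red_black_gs(lambda x: -2.0, n=15, sweeps=2000)
```

In a real multigrid W-cycle only a few such sweeps are performed per level; the coarse grids remove the smooth error that relaxation alone reduces slowly.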

  20. Inference of Cancer-specific Gene Regulatory Networks Using Soft Computing Rules

    Directory of Open Access Journals (Sweden)

    Xiaosheng Wang

    2010-03-01

Full Text Available Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.

  1. Accurate computation of Mathieu functions

    CERN Document Server

    Bibby, Malcolm M

    2013-01-01

This lecture presents a modern approach for the computation of Mathieu functions. These functions find application in boundary value analysis such as electromagnetic scattering from elliptic cylinders and flat strips, as well as the analogous acoustic and optical problems, and many other applications in science and engineering. The authors review the traditional approach used for these functions, show its limitations, and provide an alternative "tuned" approach enabling improved accuracy and convergence. The performance of this approach is investigated for a wide range of parameters.

  2. Synthetic tetracycline-inducible regulatory networks: computer-aided design of dynamic phenotypes

    Directory of Open Access Journals (Sweden)

    Kaznessis Yiannis N

    2007-01-01

Full Text Available Abstract Background: Tightly regulated gene networks, precisely controlling the expression of protein molecules, have received considerable interest from the biomedical community due to their promising applications. Among the best-studied inducible transcription systems are the tetracycline regulatory expression systems based on the tetracycline resistance operon of Escherichia coli, Tet-Off (tTA) and Tet-On (rtTA). Despite their initial success and improved designs, limitations still persist, such as low inducer sensitivity. Instead of looking at these networks statically, and simply changing or mutating the promoter and operator regions by trial and error, a systematic investigation of the dynamic behavior of the network can result in rational design of regulatory gene expression systems. Sophisticated algorithms can accurately capture the dynamical behavior of gene networks. With computer-aided design, we aim to improve the synthesis of regulatory networks and propose new designs that enable tighter control of expression. Results: In this paper we engineer novel networks by recombining existing genes or parts of genes. We synthesize four novel regulatory networks based on the Tet-Off and Tet-On systems. We model all the known individual biomolecular interactions involved in transcription, translation, regulation and induction. With multiple time-scale stochastic-discrete and stochastic-continuous models we accurately capture the transient and steady-state dynamics of these networks. Important biomolecular interactions are identified and the strength of the interactions engineered to satisfy design criteria. A set of clear design rules is developed and appropriate mutants of regulatory proteins and operator sites are proposed. Conclusion: The complexity of biomolecular interactions is accurately captured through computer simulations, which allow us to look into the molecular level and portray the dynamic behavior of gene regulatory networks.
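The induction dose-response such models aim to improve can be summarized, at steady state, by a Hill function of inducer concentration. The parameter values below are illustrative placeholders, not fitted to the tet networks in the paper:

```python
def tet_on_response(inducer, v_max, k_d, n_hill, basal):
    """Steady-state expression from a Tet-On-style promoter as a Hill
    function of inducer (dox/aTc) concentration. v_max is fully induced
    expression, basal the leaky expression at zero inducer, k_d the
    half-maximal concentration and n_hill the cooperativity. All values
    used below are illustrative."""
    activation = inducer ** n_hill / (k_d ** n_hill + inducer ** n_hill)
    return basal + (v_max - basal) * activation

off = tet_on_response(0.0, v_max=1000.0, k_d=100.0, n_hill=2.0, basal=10.0)
half = tet_on_response(100.0, v_max=1000.0, k_d=100.0, n_hill=2.0, basal=10.0)
```

Low inducer sensitivity, the limitation the abstract mentions, corresponds to a large k_d; the redesigns aim to shift or steepen this curve while keeping basal leakage low.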

  3. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    Science.gov (United States)

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA), where millions of sequences are already publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA), used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high-throughput performance demands. SINA was evaluated in comparison with the commonly used high-throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3%, 97.6% and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9% and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, the user manual and a tutorial. SINA is made available under a personal use license.
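The first of SINA's two stages, k-mer-based search for similar reference sequences, can be sketched in a few lines; the POA-based alignment-transfer stage is omitted, and the function names here are illustrative, not SINA's API:

```python
def kmer_set(seq, k=8):
    """All overlapping k-mers of a sequence, as a set."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def best_reference(query, references, k=8):
    """Return the name of the reference sequence sharing the most k-mers
    with the query. `references` maps names to sequences. A toy stand-in
    for the k-mer search stage of a reference-based aligner."""
    q = kmer_set(query, k)
    return max(references, key=lambda name: len(q & kmer_set(references[name], k)))

refs = {"rRNA_A": "ACGTACGTACGT", "rRNA_B": "TTTTTTTTTTTT"}
hit = best_reference("ACGTACGTAC", refs, k=4)
```

Once the nearest references are found, a reference-based aligner transfers their (curated) alignment columns onto the query, which is what keeps the database alignment consistent as it grows.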

  4. Development of highly accurate approximate scheme for computing the charge transfer integral

    Energy Technology Data Exchange (ETDEWEB)

    Pershin, Anton; Szalay, Péter G. [Laboratory for Theoretical Chemistry, Institute of Chemistry, Eötvös Loránd University, P.O. Box 32, H-1518 Budapest (Hungary)

    2015-08-21

The charge transfer integral is a key parameter required by various theoretical models to describe charge transport properties, e.g., in organic semiconductors. The accuracy of this important property depends on several factors, which include the level of electronic structure theory and internal simplifications of the applied formalism. The goal of this paper is to identify the performance of various approximate approaches of the latter category, while using the high level equation-of-motion coupled cluster theory for the electronic structure. The calculations have been performed on the ethylene dimer as one of the simplest model systems. By studying different spatial perturbations, it was shown that while both the energy splitting in dimer and the fragment charge difference methods are equivalent with the exact formulation for symmetrical displacements, they are less efficient when describing the transfer integral along the asymmetric alteration coordinate. Since the “exact” scheme was found computationally expensive, we examine the possibility of obtaining the asymmetric fluctuation of the transfer integral by a Taylor expansion along the coordinate space. By exploring the efficiency of this novel approach, we show that the Taylor expansion scheme represents an attractive alternative to the “exact” calculations due to a substantial reduction of computational costs, when a considerably large region of the potential energy surface is of interest. Moreover, we show that the Taylor expansion scheme, irrespective of the dimer symmetry, is very accurate for the entire range of geometry fluctuations that cover the space the molecule accesses at room temperature.

  5. Beyond mean-field approximations for accurate and computationally efficient models of on-lattice chemical kinetics

    Science.gov (United States)

    Pineda, M.; Stamatakis, M.

    2017-07-01

    Modeling the kinetics of surface catalyzed reactions is essential for the design of reactors and chemical processes. The majority of microkinetic models employ mean-field approximations, which lead to an approximate description of catalytic kinetics by assuming spatially uncorrelated adsorbates. On the other hand, kinetic Monte Carlo (KMC) methods provide a discrete-space continuous-time stochastic formulation that enables an accurate treatment of spatial correlations in the adlayer, but at a significant computation cost. In this work, we use the so-called cluster mean-field approach to develop higher order approximations that systematically increase the accuracy of kinetic models by treating spatial correlations at a progressively higher level of detail. We further demonstrate our approach on a reduced model for NO oxidation incorporating first nearest-neighbor lateral interactions and construct a sequence of approximations of increasingly higher accuracy, which we compare with KMC and mean-field. The latter is found to perform rather poorly, overestimating the turnover frequency by several orders of magnitude for this system. On the other hand, our approximations, while more computationally intense than the traditional mean-field treatment, still achieve tremendous computational savings compared to KMC simulations, thereby opening the way for employing them in multiscale modeling frameworks.
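The mean-field baseline that the cluster expansions improve on is just a rate equation. Here is a forward-Euler sketch for Langmuir adsorption/desorption, an illustrative system rather than the paper's NO oxidation model with lateral interactions:

```python
def mean_field_coverage(k_ads, k_des, dt, steps):
    """Forward-Euler integration of the mean-field rate equation
    d(theta)/dt = k_ads * (1 - theta) - k_des * theta
    for adsorption onto and desorption from a lattice, assuming
    spatially uncorrelated adsorbates. Steady state: k_ads/(k_ads+k_des).
    Rates and step sizes below are illustrative."""
    theta = 0.0
    for _ in range(steps):
        theta += dt * (k_ads * (1 - theta) - k_des * theta)
    return theta

theta_ss = mean_field_coverage(k_ads=1.0, k_des=1.0, dt=0.01, steps=2000)
```

Cluster mean-field methods replace the single coverage variable with probabilities of small multi-site configurations, recovering the spatial correlations this equation throws away, at a cost still far below full KMC.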

  6. Computational Tools and Algorithms for Designing Customized Synthetic Genes

    Energy Technology Data Exchange (ETDEWEB)

    Gould, Nathan [Department of Computer Science, The College of New Jersey, Ewing, NJ (United States); Hendy, Oliver [Department of Biology, The College of New Jersey, Ewing, NJ (United States); Papamichail, Dimitris, E-mail: papamicd@tcnj.edu [Department of Computer Science, The College of New Jersey, Ewing, NJ (United States)

    2014-10-06

    Advances in DNA synthesis have enabled the construction of artificial genes, gene circuits, and genomes of bacterial scale. Freedom in de novo design of synthetic constructs provides significant power in studying the impact of mutations in sequence features, and verifying hypotheses on the functional information that is encoded in nucleic and amino acids. To aid this goal, a large number of software tools of variable sophistication have been implemented, enabling the design of synthetic genes for sequence optimization based on rationally defined properties. The first generation of tools dealt predominantly with singular objectives such as codon usage optimization and unique restriction site incorporation. Recent years have seen the emergence of sequence design tools that aim to evolve sequences toward combinations of objectives. The design of optimal protein-coding sequences adhering to multiple objectives is computationally hard, and most tools rely on heuristics to sample the vast sequence design space. In this review, we study some of the algorithmic issues behind gene optimization and the approaches that different tools have adopted to redesign genes and optimize desired coding features. We utilize test cases to demonstrate the efficiency of each approach, as well as identify their strengths and limitations.
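The simplest "first-generation" objective mentioned above, codon-usage optimization, reduces to a table lookup when done greedily. The codon table below is a small illustrative subset of E. coli preferences, not a complete or authoritative table:

```python
# Most-used E. coli codon per amino acid -- illustrative subset only
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "S": "AGC", "*": "TAA"}

def codon_optimize(protein):
    """Greedy single-objective back-translation: always pick the host's
    most frequent codon for each residue. Modern multi-objective tools
    instead search the combinatorial space of synonymous codons to also
    satisfy constraints such as restriction sites and GC content."""
    return "".join(PREFERRED[aa] for aa in protein)

dna = codon_optimize("MKL*")
```

The greedy rule is where the combinatorial hardness the review discusses comes from: each additional objective (forbidden motifs, secondary structure, GC windows) couples codon choices that the greedy lookup treats as independent.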

  7. Computational Tools and Algorithms for Designing Customized Synthetic Genes

    International Nuclear Information System (INIS)

    Gould, Nathan; Hendy, Oliver; Papamichail, Dimitris

    2014-01-01

    Advances in DNA synthesis have enabled the construction of artificial genes, gene circuits, and genomes of bacterial scale. Freedom in de novo design of synthetic constructs provides significant power in studying the impact of mutations in sequence features, and verifying hypotheses on the functional information that is encoded in nucleic acids and amino acids. To aid this goal, a large number of software tools of variable sophistication have been implemented, enabling the design of synthetic genes for sequence optimization based on rationally defined properties. The first generation of tools dealt predominantly with singular objectives such as codon usage optimization and unique restriction site incorporation. Recent years have seen the emergence of sequence design tools that aim to evolve sequences toward combinations of objectives. The design of optimal protein-coding sequences adhering to multiple objectives is computationally hard, and most tools rely on heuristics to sample the vast sequence design space. In this review, we study some of the algorithmic issues behind gene optimization and the approaches that different tools have adopted to redesign genes and optimize desired coding features. We utilize test cases to demonstrate the efficiency of each approach, as well as identify their strengths and limitations.

  8. DrugSig: A resource for computational drug repositioning utilizing gene expression signatures.

    Directory of Open Access Journals (Sweden)

    Hongyu Wu

    Full Text Available Computational drug repositioning has proved to be an effective approach to discovering new uses for existing drugs. However, current strategies rely heavily on drug response gene signatures that are scattered across separate experimental datasets, which limits their efficiency. A comprehensive database of drug response gene signatures would therefore be very helpful to these methods. We collected drug response microarray data and annotated related drug and target information from public databases and the scientific literature. By selecting the top 500 up-regulated and down-regulated genes as drug signatures, we manually established the DrugSig database. Currently DrugSig contains more than 1300 drugs, 7000 microarrays and 800 targets. Moreover, we developed signature-based and target-based functions to aid drug repositioning. The constructed database can serve as a resource to speed up computational drug repositioning. Database URL: http://biotechlab.fudan.edu.cn/database/drugsig/.
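
    The top-500 signature selection described above can be sketched in a few lines; the function and its inputs are hypothetical, standing in for DrugSig's curation pipeline.

```python
def drug_signature(fold_changes, n=500):
    """Given {gene: log2 fold change} from a drug-treatment microarray,
    return (up, down): the n most up- and down-regulated genes, mirroring
    DrugSig's top-500 selection. Names and cutoff are illustrative."""
    ranked = sorted(fold_changes, key=fold_changes.get, reverse=True)
    return ranked[:n], list(reversed(ranked[-n:]))
```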

  9. Spectrally accurate contour dynamics

    International Nuclear Information System (INIS)

    Van Buskirk, R.D.; Marcus, P.S.

    1994-01-01

    We present an exponentially accurate boundary integral method for calculating the equilibria and dynamics of piecewise constant distributions of potential vorticity. The method represents contours of potential vorticity as a spectral sum and solves the Biot-Savart equation for the velocity by spectrally evaluating a desingularized contour integral. We use the technique in both an initial-value code and a Newton continuation method. Our methods are tested by comparing the numerical solutions with known analytic results, and it is shown that for the same amount of computational work our spectral methods are more accurate than other contour dynamics methods currently in use.

  10. How accurate are adolescents in portion-size estimation using the computer tool Young Adolescents' Nutrition Assessment on Computer (YANA-C)?

    Science.gov (United States)

    Vereecken, Carine; Dohogne, Sophie; Covents, Marc; Maes, Lea

    2010-06-01

    Computer-administered questionnaires have received increased attention for large-scale population research on nutrition. In Belgium-Flanders, Young Adolescents' Nutrition Assessment on Computer (YANA-C) has been developed. In this tool, standardised photographs are available to assist in portion-size estimation. The purpose of the present study is to assess how accurate adolescents are in estimating portion sizes of food using YANA-C. A convenience sample, aged 11-17 years, estimated the amounts of ten commonly consumed foods (breakfast cereals, French fries, pasta, rice, apple sauce, carrots and peas, crisps, creamy velouté, red cabbage, and peas). Two procedures were followed: (1) short-term recall: adolescents (n 73) self-served their usual portions of the ten foods and estimated the amounts later the same day; (2) real-time perception: adolescents (n 128) estimated two sets (different portions) of pre-weighed portions displayed near the computer. Self-served portions were, on average, 8 % underestimated; significant underestimates were found for breakfast cereals, French fries, peas, and carrots and peas. Spearman's correlations between the self-served and estimated weights varied between 0.51 and 0.84, with an average of 0.72. The kappa statistics were moderate (>0.4) for all but one item. Pre-weighed portions were, on average, 15 % underestimated, with significant underestimates for fourteen of the twenty portions. Photographs of food items can serve as a good aid in ranking subjects; however, to assess the actual intake at a group level, underestimation must be considered.

  11. Indexed variation graphs for efficient and accurate resistome profiling.

    Science.gov (United States)

    Rowe, Will P M; Winn, Martyn D

    2018-05-14

    Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools. Our method combines a variation graph representation of gene sets with an LSH Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, GROOT, and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 minutes using a single CPU. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows. GROOT is written in Go and is available at https://github.com/will-rowe/groot (MIT license). will.rowe@stfc.ac.uk. Supplementary data are available at Bioinformatics online.
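
    GROOT's LSH Forest index builds on MinHash-style signatures of sequence k-mers. The following is a minimal pure-Python illustration of that idea (not GROOT's Go implementation); the k-mer length, signature size, and hashing scheme are arbitrary choices.

```python
import hashlib

def minhash(seq, k=7, num_hashes=64):
    """MinHash signature over the k-mers of a sequence: for each of
    num_hashes salted hash functions, keep the minimum hash value seen.
    Illustrative stand-in for the LSH Forest indexing used by GROOT."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    sig = []
    for h in range(num_hashes):
        sig.append(min(int(hashlib.md5(f"{h}:{kmer}".encode()).hexdigest(), 16)
                       for kmer in kmers))
    return sig

def similarity(sig_a, sig_b):
    """Estimated Jaccard similarity of the underlying k-mer sets."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

    Reads whose signatures are similar to a reference signature would then be routed to that reference for the local-alignment stage.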

  12. Computational prediction and experimental validation of Ciona intestinalis microRNA genes

    Directory of Open Access Journals (Sweden)

    Pasquinelli Amy E

    2007-11-01

    Full Text Available Abstract Background This study reports the first collection of validated microRNA genes in the sea squirt, Ciona intestinalis. MicroRNAs are processed from hairpin precursors to ~22 nucleotide RNAs that base pair to target mRNAs and inhibit expression. As a member of the subphylum Urochordata (Tunicata), whose larval form has a notochord, the sea squirt is situated at the emergence of vertebrates, and therefore may provide information about the evolution of molecular regulators of early development. Results In this study, computational methods were used to predict 14 microRNA gene families in Ciona intestinalis. The microRNA prediction algorithm utilizes configurable microRNA sequence conservation and stem-loop specificity parameters, grouping by miRNA family, and phylogenetic conservation to the related species, Ciona savignyi. The expression of 8 of the 9 putative microRNAs tested in adult tissue of Ciona intestinalis was validated by Northern blot analyses. Additionally, a target prediction algorithm was implemented, which identified a high-confidence list of 240 potential target genes. Over half of the predicted targets can be grouped into the gene ontology categories of metabolism, transport, regulation of transcription, and cell signaling. Conclusion The computational techniques implemented in this study can be applied to other organisms and serve to increase the understanding of the origins of non-coding RNAs, embryological and cellular developmental pathways, and the mechanisms for microRNA-controlled gene regulatory networks.
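
    A crude version of the stem-loop specificity idea is to score how well the 5' arm of a candidate precursor base-pairs with its 3' arm. The sketch below does only that, with a fixed arm length and no free-energy model, so it is a toy stand-in for the algorithm's hairpin filter rather than the published method.

```python
def stem_score(seq, arm=18):
    """Fraction of Watson-Crick pairs between the 5' arm and the 3' arm of a
    candidate hairpin (RNA alphabet). A perfect inverted repeat scores 1.0."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    five, three = seq[:arm], seq[-arm:][::-1]
    return sum(comp.get(a) == b for a, b in zip(five, three)) / arm
```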

  13. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

    Science.gov (United States)

    Cumbie, Jason S; Kimbrel, Jeffrey A; Di, Yanming; Schafer, Daniel W; Wilhelm, Larry J; Fox, Samuel E; Sullivan, Christopher M; Curzon, Aron D; Carrington, James C; Mockler, Todd C; Chang, Jeff H

    2011-01-01

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.

  14. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

    Directory of Open Access Journals (Sweden)

    Jason S Cumbie

    Full Text Available GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.

  15. Assessing smoking status in disadvantaged populations: is computer administered self report an accurate and acceptable measure?

    Directory of Open Access Journals (Sweden)

    Bryant Jamie

    2011-11-01

    Full Text Available Abstract Background Self report of smoking status is potentially unreliable in certain situations and in high-risk populations. This study aimed to determine the accuracy and acceptability of computer administered self-report of smoking status among a low socioeconomic (SES) population. Methods Clients attending a community service organisation for welfare support were invited to complete a cross-sectional touch screen computer health survey. Following survey completion, participants were invited to provide a breath sample to measure exposure to tobacco smoke in expired air. Sensitivity, specificity, positive predictive value and negative predictive value were calculated. Results Three hundred and eighty-three participants completed the health survey, and 330 (86%) provided a breath sample. Of participants included in the validation analysis, 59% reported being a daily or occasional smoker. Sensitivity was 94.4% and specificity 92.8%. The positive and negative predictive values were 94.9% and 92.0% respectively. The majority of participants reported that the touch screen survey was both enjoyable (79%) and easy (88%) to complete. Conclusions Computer administered self report is both acceptable and accurate as a method of assessing smoking status among low SES smokers in a community setting. Routine collection of health information using touch screen computers has the potential to identify smokers and increase provision of support and referral in the community setting.
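
    The four reported measures all come from the 2x2 table of self-reported status versus breath-sample classification. A minimal helper, with illustrative counts rather than the study's data:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV (as percentages) from a 2x2
    validation table, e.g. self-reported smoking vs. a breath CO criterion."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # true positives found
        "specificity": 100 * tn / (tn + fp),  # true negatives found
        "ppv": 100 * tp / (tp + fp),          # positive predictive value
        "npv": 100 * tn / (tn + fn),          # negative predictive value
    }
```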

  16. Accurate Computation of Periodic Regions' Centers in the General M-Set with Integer Index Number

    Directory of Open Access Journals (Sweden)

    Wang Xingyuan

    2010-01-01

    Full Text Available This paper presents two methods for accurately computing the centers of periodic regions. One method applies to general M-sets with integer index number, the other to general M-sets with negative integer index number. Both methods improve the precision of computation by transforming the polynomial equations that determine the centers of the periodic regions. We primarily discuss the general M-sets with negative integer index, and analyze the relationship between the number of centers of periodic regions on the principal symmetric axis and in the principal symmetric interior. We can obtain the centers' coordinates with at least 48 significant digits after the decimal point in both real and imaginary parts by applying Newton's method to the transformed polynomial equation that determines the centers of the periodic regions. In this paper, we list some centers' coordinates of the k-periodic regions (k = 3, 4, 5, 6) of general M-sets for the index numbers α = −25, −24, …, −1, all of which have high numerical accuracy.
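
    Centers of k-periodic regions of the classical M-set are roots of the k-th iterate of z → z² + c started at z = 0, and Newton's method applies once the derivative with respect to c is propagated through the iteration. The sketch below covers only this classical case (index 2), in double precision, not the generalized M-sets or the transformed equations of the paper.

```python
def period_center(k, c0, iters=60):
    """Newton's method on g(c) = f_c^k(0), whose roots are the centers of
    period-k hyperbolic components of the Mandelbrot set. The derivative
    dg/dc is propagated alongside the orbit via z' -> 2*z*z' + 1."""
    c = complex(c0)
    for _ in range(iters):
        z, dz = 0j, 0j
        for _ in range(k):
            z, dz = z * z + c, 2 * z * dz + 1
        c -= z / dz  # Newton step in the parameter c
    return c
```

    Starting near -1.8, the iteration converges to the real period-3 center, the real root of c³ + 2c² + c + 1 = 0.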

  17. Accurate Assessment of Computed Order Tracking

    Directory of Open Access Journals (Sweden)

    P.N. Saavedra

    2006-01-01

    Full Text Available Spectral vibration analysis using the Fourier transform is the most common technique for evaluating the mechanical condition of machinery working in a stationary regime. However, machinery operating in transient modes, such as variable speed equipment, generates spectra with distinct frequency content at each instant, and the standard approach is not directly applicable for diagnostics. The "order tracking" technique is a suitable tool for analyzing variable speed machines. We have studied computed order tracking (COT), and a new computational procedure is proposed for solving the indeterminate results generated by the traditional method at constant speed. The effect on the accuracy of the assumptions inherent in COT was assessed using data from various simulations. The use of these simulations allowed us to determine the effect on the overall true accuracy of the method of different user-defined factors: the signal and tachometric pulse sampling frequency, the method of amplitude interpolation, and the number of tachometric pulses per revolution. Tests on real data measured on the main transmissions of a mining shovel were carried out, and we concluded that the new method is appropriate for the condition monitoring of this type of machine.
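
    The core of COT is a two-step interpolation: tachometer pulse times define a time-to-angle map, and the vibration signal is then resampled at uniform angle increments. A minimal sketch assuming one pulse per revolution and linear amplitude interpolation (both among the user-defined factors the study examines):

```python
from bisect import bisect_right

def resample_to_angle(times, signal, tach_times, samples_per_rev=8):
    """Computed order tracking core: each tachometer pulse marks one full
    revolution; uniform angle steps are mapped back to time by linear
    interpolation, and the signal is interpolated at those times."""
    def interp(x, xs, ys):
        i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])
    revs = list(range(len(tach_times)))  # angle in revolutions at each pulse
    out = []
    n_steps = (len(tach_times) - 1) * samples_per_rev
    for j in range(n_steps + 1):
        angle = j / samples_per_rev
        t = interp(angle, revs, tach_times)   # angle -> time
        out.append(interp(t, times, signal))  # time -> amplitude
    return out
```

    At constant speed the angle-domain samples coincide with uniformly spaced time samples, which gives a simple way to validate the resampler.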

  18. A new computational method for the detection of horizontal gene transfer events.

    Science.gov (United States)

    Tsirigos, Aristotelis; Rigoutsos, Isidore

    2005-01-01

    In recent years, the increase in the amounts of available genomic data has made it easier to appreciate the extent to which organisms increase their genetic diversity through horizontally transferred genetic material. Such transfers have the potential to give rise to extremely dynamic genomes where a significant proportion of their coding DNA has been contributed by external sources. Because of the impact of these horizontal transfers on the ecological and pathogenic character of the recipient organisms, methods are continuously sought that are able to computationally determine which of the genes of a given genome are products of transfer events. In this paper, we introduce and discuss a novel computational method for identifying horizontal transfers that relies on a gene's nucleotide composition and obviates the need for knowledge of codon boundaries. In addition to being applicable to individual genes, the method can be easily extended to the case of clusters of horizontally transferred genes. With the help of an extensive and carefully designed set of experiments on 123 archaeal and bacterial genomes, we demonstrate that the new method exhibits significant improvement in sensitivity when compared to previously published approaches. In fact, it achieves an average relative improvement across genomes of between 11 and 41% compared to the Codon Adaptation Index method in distinguishing native from foreign genes. Our method's horizontal gene transfer predictions for 123 microbial genomes are available online at http://cbcsrv.watson.ibm.com/HGT/.
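
    A codon-boundary-free compositional score of the kind described can be built from overlapping k-mer frequencies: genes whose composition deviates strongly from the genome-wide average become transfer candidates. The sketch below is a generic illustration under that assumption, not the authors' metric.

```python
from itertools import product

def kmer_freqs(seq, k=4):
    """Overlapping k-mer frequency vector; no codon boundaries needed."""
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
    total = max(sum(counts.values()), 1)
    return {kmer: n / total for kmer, n in counts.items()}

def composition_distance(gene, genome, k=4):
    """Manhattan distance between gene and genome k-mer compositions;
    unusually high values flag horizontal-transfer candidates."""
    fg, fw = kmer_freqs(gene, k), kmer_freqs(genome, k)
    return sum(abs(fg[m] - fw[m]) for m in fg)
```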

  19. Computational Tools and Algorithms for Designing Customized Synthetic Genes

    Directory of Open Access Journals (Sweden)

    Nathan eGould

    2014-10-01

    Full Text Available Advances in DNA synthesis have enabled the construction of artificial genes, gene circuits, and genomes of bacterial scale. Freedom in de novo design of synthetic constructs provides significant power in studying the impact of mutations in sequence features, and verifying hypotheses on the functional information that is encoded in nucleic acids and amino acids. To aid this goal, a large number of software tools of variable sophistication have been implemented, enabling the design of synthetic genes for sequence optimization based on rationally defined properties. The first generation of tools dealt predominantly with singular objectives such as codon usage optimization and unique restriction site incorporation. Recent years have seen the emergence of sequence design tools that aim to evolve sequences toward combinations of objectives. The design of optimal protein-coding sequences adhering to multiple objectives is computationally hard, and most tools rely on heuristics to sample the vast sequence design space. In this review, we study some of the algorithmic issues behind gene optimization and the approaches that different tools have adopted to redesign genes and optimize desired coding features. We utilize test cases to demonstrate the efficiency of each approach, as well as identify their strengths and limitations.

  20. A computational methodology for formulating gasoline surrogate fuels with accurate physical and chemical kinetic properties

    KAUST Repository

    Ahmed, Ahfaz

    2015-03-01

    Gasoline is the most widely used fuel for light duty automobile transportation, but its molecular complexity makes it intractable to experimentally and computationally study the fundamental combustion properties. Therefore, surrogate fuels with a simpler molecular composition that represent real fuel behavior in one or more aspects are needed to enable repeatable experimental and computational combustion investigations. This study presents a novel computational methodology for formulating surrogates for FACE (fuels for advanced combustion engines) gasolines A and C by combining regression modeling with physical and chemical kinetics simulations. The computational methodology integrates simulation tools executed across different software platforms. Initially, the palette of surrogate species and carbon types for the target fuels was determined from a detailed hydrocarbon analysis (DHA). A regression algorithm implemented in MATLAB was linked to REFPROP for simulation of distillation curves and calculation of physical properties of surrogate compositions. The MATLAB code generates a surrogate composition at each iteration, which is then used to automatically generate CHEMKIN input files that are submitted to homogeneous batch reactor simulations for prediction of research octane number (RON). The regression algorithm determines the optimal surrogate composition to match the fuel properties of FACE A and C gasoline, specifically hydrogen/carbon (H/C) ratio, density, distillation characteristics, carbon types, and RON. The optimal surrogate fuel compositions obtained using the present computational approach were compared to the real fuel properties, as well as with surrogate compositions available in the literature. Experiments were conducted within a Cooperative Fuels Research (CFR) engine operating under controlled autoignition (CAI) mode to compare the formulated surrogates against the real fuels.
Carbon monoxide measurements indicated that the proposed surrogates

  1. A hybrid computational method for the discovery of novel reproduction-related genes.

    Science.gov (United States)

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Guohua; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Uncovering the molecular mechanisms underlying reproduction is of great importance to infertility treatment and to the generation of healthy offspring. In this study, we discovered novel reproduction-related genes with a hybrid computational method that integrates three different types of methods, offering new clues for further reproduction research. The method was first executed on a weighted graph, constructed from known protein-protein interactions, to search for the shortest paths connecting any two known reproduction-related genes. Genes occurring on these paths were deemed to have a special relationship with reproduction. These newly discovered genes were filtered with a randomization test. Then, the remaining genes were further selected according to their associations with known reproduction-related genes, measured by protein-protein interaction score and by alignment score obtained with BLAST. In-depth analysis of the high-confidence novel reproduction genes revealed hidden mechanisms of reproduction and provided guidelines for further experimental validation.
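
    The first stage, collecting genes that lie on shortest paths between known reproduction-related genes, can be sketched with Dijkstra's algorithm on a weighted interaction graph. The graph structure, weights, and gene names below are hypothetical.

```python
import heapq
from itertools import combinations

def dijkstra(graph, src):
    """Shortest-path distances and predecessors in a weighted graph given
    as {gene: {neighbor: weight}}, using a binary heap."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def candidate_genes(graph, seeds):
    """Genes lying on a shortest path between any two seed genes (the seeds
    themselves excluded), as in the first stage of the hybrid method."""
    found = set()
    for a, b in combinations(seeds, 2):
        _, prev = dijkstra(graph, a)
        node = b
        while node in prev:  # walk the predecessor chain back toward a
            node = prev[node]
            if node != a:
                found.add(node)
    return found - set(seeds)
```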

  2. A hybrid solution using computational prediction and measured data to accurately determine process corrections with reduced overlay sampling

    Science.gov (United States)

    Noyes, Ben F.; Mokaberi, Babak; Mandoy, Ram; Pate, Alex; Huijgen, Ralph; McBurney, Mike; Chen, Owen

    2017-03-01

    Reducing overlay error via an accurate APC feedback system is one of the main challenges in high volume production of the current and future nodes in the semiconductor industry. The overlay feedback system directly affects the number of dies meeting overlay specification and the number of layers requiring dedicated exposure tools through the fabrication flow. Increasing the former number and reducing the latter number is beneficial for the overall efficiency and yield of the fabrication process. An overlay feedback system requires accurate determination of the overlay error, or fingerprint, on exposed wafers in order to determine corrections to be automatically and dynamically applied to the exposure of future wafers. Since current and future nodes require correction per exposure (CPE), the resolution of the overlay fingerprint must be high enough to accommodate CPE in the overlay feedback system, or overlay control module (OCM). Determining a high resolution fingerprint from measured data requires extremely dense overlay sampling that takes a significant amount of measurement time. For static corrections this is acceptable, but in an automated dynamic correction system this method creates extreme bottlenecks for the throughput of said system, as new lots have to wait until the previous lot is measured. One solution is to use a less dense overlay sampling scheme and employ computational up-sampling to obtain a dense fingerprint. That method uses a global fingerprint model over the entire wafer; measured localized overlay errors are therefore not always represented in its up-sampled output. This paper will discuss a hybrid system, shown in Fig. 1, that combines a computationally up-sampled fingerprint with the measured data to more accurately capture the actual fingerprint, including local overlay errors. Such a hybrid system is shown to result in reduced modelled residuals while determining the fingerprint, and better on-product overlay performance.

  3. Probability-based collaborative filtering model for predicting gene-disease associations.

    Science.gov (United States)

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expense. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than the other advanced approaches. The PCFM model can be leveraged for prediction of disease genes, especially for new human genes or diseases with no known relationships.
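
    The latent-factor backbone of such a collaborative filtering model can be sketched with plain SGD. PCFM's heterogeneous regularization and cross-species data are omitted here; a uniform L2 penalty is used instead, so this is a generic illustration rather than the authors' model.

```python
import random

def factorize(pairs, n_genes, n_diseases, rank=4, lr=0.05, reg=0.01, epochs=200):
    """Plain latent-factor model: score(g, d) = p_g . q_d, trained by SGD on
    observed gene-disease associations (value 1.0). PCFM replaces the uniform
    L2 penalty below with heterogeneous regularization terms."""
    rng = random.Random(0)
    p = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_genes)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_diseases)]
    for _ in range(epochs):
        for g, d, val in pairs:
            pred = sum(pg * qd for pg, qd in zip(p[g], q[d]))
            err = val - pred
            for k in range(rank):
                pg, qd = p[g][k], q[d][k]
                p[g][k] += lr * (err * qd - reg * pg)
                q[d][k] += lr * (err * pg - reg * qd)
    return p, q
```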

  4. Effect of computational grid on accurate prediction of a wind turbine rotor using delayed detached-eddy simulations

    Energy Technology Data Exchange (ETDEWEB)

    Bangga, Galih; Weihing, Pascal; Lutz, Thorsten; Krämer, Ewald [University of Stuttgart, Stuttgart (Germany)

    2017-05-15

    The present study focuses on the impact of the computational grid on accurate prediction of the MEXICO rotor under stalled conditions. Two different blade mesh topologies, O and C-H meshes, and two different grid resolutions are tested for several time step sizes. The simulations are carried out using delayed detached-eddy simulation (DDES) with two eddy viscosity RANS turbulence models, namely Spalart-Allmaras (SA) and Menter shear stress transport (SST) k-ω. A high order spatial discretization, the weighted essentially non-oscillatory (WENO) scheme, is used in these computations. The results are validated against measurement data with regard to the sectional loads and the chordwise pressure distributions. The C-H mesh topology is observed to give the best results employing the SST k-ω turbulence model, but the computational cost is higher as the grid contains a wake block that increases the number of cells.

  5. Accurate and efficient computation of synchrotron radiation functions

    International Nuclear Information System (INIS)

    MacLeod, Allan J.

    2000-01-01

    We consider the computation of three functions which appear in the theory of synchrotron radiation. These are F(x) = x ∫_x^∞ K_{5/3}(y) dy, F_p(x) = x K_{2/3}(x), and G_p(x) = x^{1/3} K_{1/3}(x), where K_ν denotes a modified Bessel function. Chebyshev series coefficients are given which enable the functions to be computed with an accuracy of up to 15 significant figures.
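
    Once such coefficients are available, a Chebyshev series is evaluated stably with the Clenshaw recurrence. The sketch below is the generic scheme and does not reproduce the paper's coefficients.

```python
def clenshaw(coeffs, x):
    """Evaluate sum_k c_k T_k(x), where T_k are Chebyshev polynomials of the
    first kind, via the Clenshaw recurrence (the standard stable scheme)."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + c, b1
    return x * b1 - b2 + coeffs[0]
```

    A quick check against the direct expansion T0 = 1, T1 = x, T2 = 2x² - 1 confirms the recurrence.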

  6. Computational design and application of endogenous promoters for transcriptionally targeted gene therapy for rheumatoid arthritis.

    NARCIS (Netherlands)

    Geurts, J.; Joosten, L.A.B.; Takahashi, N.; Arntz, O.J.; Gluck, A.; Bennink, M.B.; Berg, W.B. van den; Loo, F.A.J. van de

    2009-01-01

    The promoter regions of genes that are differentially regulated in the synovial membrane during the course of rheumatoid arthritis (RA) represent attractive candidates for application in transcriptionally targeted gene therapy. In this study, we applied an unbiased computational approach to define

  7. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    Science.gov (United States)

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
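
    Step-down adjusted p-values that respect correlation among tests are commonly obtained with a Westfall-Young maxT-style permutation scheme. The sketch below implements that generic scheme, which may differ in detail from the authors' computational strategy.

```python
def step_down_maxT(observed, perm_stats):
    """Westfall-Young step-down maxT adjusted p-values. `observed` holds one
    statistic per gene-set; `perm_stats` is a list of permutation replicates,
    each a list parallel to `observed`. Larger statistics are more extreme."""
    m = len(observed)
    order = sorted(range(m), key=lambda i: observed[i], reverse=True)
    adj = [0.0] * m
    for perm in perm_stats:
        # Running maximum over the gene-sets ranked at or below each one.
        running_max = float("-inf")
        for i in reversed(order):
            running_max = max(running_max, perm[i])
            if running_max >= observed[i]:
                adj[i] += 1
    adj = [a / len(perm_stats) for a in adj]
    # Enforce monotonicity down the ranked list.
    for a, b in zip(order, order[1:]):
        adj[b] = max(adj[b], adj[a])
    return adj
```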

  8. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    CERN Document Server

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  9. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    Science.gov (United States)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2015-05-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  10. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    International Nuclear Information System (INIS)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad; Knight, Robert

    2015-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  11. Accurate overlaying for mobile augmented reality

    NARCIS (Netherlands)

    Pasman, W; van der Schaaf, A; Lagendijk, RL; Jansen, F.W.

    1999-01-01

    Mobile augmented reality requires accurate alignment of virtual information with objects visible in the real world. We describe a system for mobile communications to be developed to meet these strict alignment criteria using a combination of computer vision, inertial tracking and low-latency

  12. Methods for Computing Accurate Atomic Spin Moments for Collinear and Noncollinear Magnetism in Periodic and Nonperiodic Materials.

    Science.gov (United States)

    Manz, Thomas A; Sholl, David S

    2011-12-13

    The partitioning of electron spin density among atoms in a material gives atomic spin moments (ASMs), which are important for understanding magnetic properties. We compare ASMs computed using different population analysis methods and introduce a method for computing density derived electrostatic and chemical (DDEC) ASMs. Bader and DDEC ASMs can be computed for periodic and nonperiodic materials with either collinear or noncollinear magnetism, while natural population analysis (NPA) ASMs can be computed for nonperiodic materials with collinear magnetism. Our results show Bader, DDEC, and (where applicable) NPA methods give similar ASMs, but different net atomic charges. Because they are optimized to reproduce both the magnetic field and the chemical states of atoms in a material, DDEC ASMs are especially suitable for constructing interaction potentials for atomistic simulations. We describe the computation of accurate ASMs for (a) a variety of systems using collinear and noncollinear spin DFT, (b) highly correlated materials (e.g., magnetite) using DFT+U, and (c) various spin states of ozone using coupled cluster expansions. The computed ASMs are in good agreement with available experimental results for a variety of periodic and nonperiodic materials. Examples considered include the antiferromagnetic metal organic framework Cu3(BTC)2, several ozone spin states, mono- and binuclear transition metal complexes, ferri- and ferro-magnetic solids (e.g., Fe3O4, Fe3Si), and simple molecular systems. We briefly discuss the theory of exchange-correlation functionals for studying noncollinear magnetism. A method for finding the ground state of systems with highly noncollinear magnetism is introduced. We use these methods to study the spin-orbit coupling potential energy surface of the single molecule magnet Fe4C40H52N4O12, which has highly noncollinear magnetism, and find that it contains unusual features that give a new interpretation to experimental data.

  13. Computationally efficient and quantitatively accurate multiscale simulation of solid-solution strengthening by ab initio calculation

    International Nuclear Information System (INIS)

    Ma, Duancheng; Friák, Martin; Pezold, Johann von; Raabe, Dierk; Neugebauer, Jörg

    2015-01-01

    We propose an approach for the computationally efficient and quantitatively accurate prediction of solid-solution strengthening. It combines the 2-D Peierls–Nabarro model and a recently developed solid-solution strengthening model. Solid-solution strengthening is examined with Al–Mg and Al–Li as representative alloy systems, demonstrating a good agreement between theory and experiments within the temperature range in which the dislocation motion is overdamped. Through a parametric study, two guideline maps of the misfit parameters against (i) the critical resolved shear stress, τ0, at 0 K and (ii) the energy barrier, ΔEb, against dislocation motion in a solid solution with randomly distributed solute atoms are created. With these two guideline maps, τ0 at finite temperatures is predicted for other Al binary systems, and compared with available experiments, achieving good agreement.

  14. In pursuit of an accurate spatial and temporal model of biomolecules at the atomistic level: a perspective on computer simulation

    Energy Technology Data Exchange (ETDEWEB)

    Gray, Alan [The University of Edinburgh, Edinburgh EH9 3JZ, Scotland (United Kingdom); Harlen, Oliver G. [University of Leeds, Leeds LS2 9JT (United Kingdom); Harris, Sarah A., E-mail: s.a.harris@leeds.ac.uk [University of Leeds, Leeds LS2 9JT (United Kingdom); University of Leeds, Leeds LS2 9JT (United Kingdom); Khalid, Syma; Leung, Yuk Ming [University of Southampton, Southampton SO17 1BJ (United Kingdom); Lonsdale, Richard [Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr (Germany); Philipps-Universität Marburg, Hans-Meerwein Strasse, 35032 Marburg (Germany); Mulholland, Adrian J. [University of Bristol, Bristol BS8 1TS (United Kingdom); Pearson, Arwen R. [University of Leeds, Leeds LS2 9JT (United Kingdom); University of Hamburg, Hamburg (Germany); Read, Daniel J.; Richardson, Robin A. [University of Leeds, Leeds LS2 9JT (United Kingdom); The University of Edinburgh, Edinburgh EH9 3JZ, Scotland (United Kingdom)

    2015-01-01

    The current computational techniques available for biomolecular simulation are described, and the successes and limitations of each with reference to the experimental biophysical methods that they complement are presented. Despite huge advances in the computational techniques available for simulating biomolecules at the quantum-mechanical, atomistic and coarse-grained levels, there is still a widespread perception amongst the experimental community that these calculations are highly specialist and are not generally applicable by researchers outside the theoretical community. In this article, the successes and limitations of biomolecular simulation and the further developments that are likely in the near future are discussed. A brief overview is also provided of the experimental biophysical methods that are commonly used to probe biomolecular structure and dynamics, and the accuracy of the information that can be obtained from each is compared with that from modelling. It is concluded that progress towards an accurate spatial and temporal model of biomacromolecules requires a combination of all of these biophysical techniques, both experimental and computational.

  15. Computational fitness landscape for all gene-order permutations of an RNA virus.

    Directory of Open Access Journals (Sweden)

    Kwang-il Lim

    2009-02-01

    How does the growth of a virus depend on the linear arrangement of genes in its genome? Answering this question may enhance our basic understanding of virus evolution and advance applications of viruses as live attenuated vaccines, gene-therapy vectors, or anti-tumor therapeutics. We used a mathematical model for vesicular stomatitis virus (VSV), a prototype RNA virus that encodes five genes (N-P-M-G-L), to simulate the intracellular growth of all 120 possible gene-order variants. Simulated yields of virus infection varied by 6,000-fold and were found to be most sensitive to gene-order permutations that increased levels of the L gene transcript or reduced levels of the N gene transcript, the lowest and highest expressed genes of the wild-type virus, respectively. Effects of gene order on virus growth also depended upon the host-cell environment, reflecting different resources for protein synthesis and different cell susceptibilities to infection. Moreover, by computationally deleting intergenic attenuations, which define a key mechanism of transcriptional regulation in VSV, the variation in growth associated with the 120 gene-order variants was drastically narrowed from 6,000- to 20-fold, and many variants produced higher progeny yields than wild-type. These results suggest that regulation by intergenic attenuation preceded or co-evolved with the fixation of the wild-type gene order in the evolution of VSV. In summary, our models have begun to reveal how gene functions, gene regulation, and genomic organization of viruses interact with their host environments to define processes of viral growth and evolution.
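    The 120 variants simulated above are simply the permutations of the five genes; enumerating them is straightforward:

```python
from itertools import permutations

genes = ["N", "P", "M", "G", "L"]
variants = list(permutations(genes))
print(len(variants))  # 120 = 5! possible gene orders

# The wild-type order is one of the 120:
assert ("N", "P", "M", "G", "L") in variants
```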

  16. Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion

    Science.gov (United States)

    Luisier, Raphaëlle; Unterberger, Elif B.; Goodman, Jay I.; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik

    2014-01-01

    Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites, with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities was downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis. PMID:24464994

  17. Toward accurate tooth segmentation from computed tomography images using a hybrid level set model

    Energy Technology Data Exchange (ETDEWEB)

    Gan, Yangzhou; Zhao, Qunfei [Department of Automation, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240 (China); Xia, Zeyang, E-mail: zy.xia@siat.ac.cn, E-mail: jing.xiong@siat.ac.cn; Hu, Ying [Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, and The Chinese University of Hong Kong, Shenzhen 518055 (China); Xiong, Jing, E-mail: zy.xia@siat.ac.cn, E-mail: jing.xiong@siat.ac.cn [Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 510855 (China); Zhang, Jianwei [TAMS, Department of Informatics, University of Hamburg, Hamburg 22527 (Germany)

    2015-01-15

    Purpose: A three-dimensional (3D) model of the teeth provides important information for orthodontic diagnosis and treatment planning. Tooth segmentation is an essential step in generating the 3D digital model from computed tomography (CT) images. The aim of this study is to develop an accurate and efficient tooth segmentation method from CT images. Methods: The 3D dental CT volumetric images are segmented slice by slice in a two-dimensional (2D) transverse plane. The 2D segmentation is composed of a manual initialization step and an automatic slice by slice segmentation step. In the manual initialization step, the user manually picks a starting slice and selects a seed point for each tooth in this slice. In the automatic slice segmentation step, a developed hybrid level set model is applied to segment tooth contours from each slice. Tooth contour propagation strategy is employed to initialize the level set function automatically. Cone beam CT (CBCT) images of two subjects were used to tune the parameters. Images of 16 additional subjects were used to validate the performance of the method. Volume overlap metrics and surface distance metrics were adopted to assess the segmentation accuracy quantitatively. The volume overlap metrics were volume difference (VD, mm³) and Dice similarity coefficient (DSC, %). The surface distance metrics were average symmetric surface distance (ASSD, mm), RMS (root mean square) symmetric surface distance (RMSSSD, mm), and maximum symmetric surface distance (MSSD, mm). Computation time was recorded to assess the efficiency. The performance of the proposed method has been compared with two state-of-the-art methods. Results: For the tested CBCT images, the VD, DSC, ASSD, RMSSSD, and MSSD for the incisor were 38.16 ± 12.94 mm³, 88.82 ± 2.14%, 0.29 ± 0.03 mm, 0.32 ± 0.08 mm, and 1.25 ± 0.58 mm, respectively; the VD, DSC, ASSD, RMSSSD, and MSSD for the canine were 49.12 ± 9.33 mm³, 91.57 ± 0.82%, 0.27 ± 0.02 mm, 0

  18. Toward accurate tooth segmentation from computed tomography images using a hybrid level set model

    International Nuclear Information System (INIS)

    Gan, Yangzhou; Zhao, Qunfei; Xia, Zeyang; Hu, Ying; Xiong, Jing; Zhang, Jianwei

    2015-01-01

    Purpose: A three-dimensional (3D) model of the teeth provides important information for orthodontic diagnosis and treatment planning. Tooth segmentation is an essential step in generating the 3D digital model from computed tomography (CT) images. The aim of this study is to develop an accurate and efficient tooth segmentation method from CT images. Methods: The 3D dental CT volumetric images are segmented slice by slice in a two-dimensional (2D) transverse plane. The 2D segmentation is composed of a manual initialization step and an automatic slice by slice segmentation step. In the manual initialization step, the user manually picks a starting slice and selects a seed point for each tooth in this slice. In the automatic slice segmentation step, a developed hybrid level set model is applied to segment tooth contours from each slice. Tooth contour propagation strategy is employed to initialize the level set function automatically. Cone beam CT (CBCT) images of two subjects were used to tune the parameters. Images of 16 additional subjects were used to validate the performance of the method. Volume overlap metrics and surface distance metrics were adopted to assess the segmentation accuracy quantitatively. The volume overlap metrics were volume difference (VD, mm³) and Dice similarity coefficient (DSC, %). The surface distance metrics were average symmetric surface distance (ASSD, mm), RMS (root mean square) symmetric surface distance (RMSSSD, mm), and maximum symmetric surface distance (MSSD, mm). Computation time was recorded to assess the efficiency. The performance of the proposed method has been compared with two state-of-the-art methods. Results: For the tested CBCT images, the VD, DSC, ASSD, RMSSSD, and MSSD for the incisor were 38.16 ± 12.94 mm³, 88.82 ± 2.14%, 0.29 ± 0.03 mm, 0.32 ± 0.08 mm, and 1.25 ± 0.58 mm, respectively; the VD, DSC, ASSD, RMSSSD, and MSSD for the canine were 49.12 ± 9.33 mm³, 91.57 ± 0.82%, 0.27 ± 0.02 mm, 0.28 ± 0.03 mm

  19. Rapid and accurate synthesis of TALE genes from synthetic oligonucleotides.

    Science.gov (United States)

    Wang, Fenghua; Zhang, Hefei; Gao, Jingxia; Chen, Fengjiao; Chen, Sijie; Zhang, Cuizhen; Peng, Gang

    2016-01-01

    Custom synthesis of transcription activator-like effector (TALE) genes has relied upon plasmid libraries of pre-fabricated TALE-repeat monomers or oligomers. Here we describe a novel synthesis method that directly incorporates annealed synthetic oligonucleotides into the TALE-repeat units. Our approach utilizes iterative sets of oligonucleotides and a translational frame-check strategy to ensure the high efficiency and accuracy of TALE-gene synthesis. TALE arrays of more than 20 repeats can be constructed, and the majority of the synthesized constructs have perfect sequences. In addition, this novel oligonucleotide-based method can readily accommodate design changes to the TALE repeats. We demonstrated an increased gene-targeting efficiency against a genomic site containing a potentially methylated cytosine by incorporating non-conventional repeat variable di-residue (RVD) sequences.
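    One way to picture the translational frame check mentioned above: every assembled repeat unit must have a length divisible by three, or all downstream codons shift out of frame. This is a simplified sketch of the idea only; the paper's actual strategy is not detailed in the abstract, and the placeholder sequences are ours.

```python
def in_frame(repeat_seqs):
    # A unit whose length is not a multiple of 3 shifts every
    # downstream codon out of the reading frame.
    return all(len(s) % 3 == 0 for s in repeat_seqs)

# A standard 34-residue TALE repeat is 102 nt; placeholder strings
# of the right (and wrong) length stand in for real DNA sequences:
print(in_frame(["A" * 102, "A" * 102]))  # True
print(in_frame(["A" * 101]))             # False
```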

  20. The preliminary exploration of 64-slice volume computed tomography in the accurate measurement of pleural effusion.

    Science.gov (United States)

    Guo, Zhi-Jun; Lin, Qiang; Liu, Hai-Tao; Lu, Jun-Ying; Zeng, Yan-Hong; Meng, Fan-Jie; Cao, Bin; Zi, Xue-Rong; Han, Shu-Ming; Zhang, Yu-Huan

    2013-09-01

    Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, the precise volume of pleural effusions still involves many challenges, and there is currently no recognized, accurate method for measuring it. To explore the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume and to then analyze the correlation between the volume of the free pleural effusion and the different diameters of the pleural effusion. The 64-slice CT volume-rendering technique was used to measure and analyze three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extract. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameter of the effusion, which was then used to calculate the regression equation. After using the 64-slice CT volume-rendering technique to measure the fluid volume of the self-made thoracic model, the results were compared with the actual injection volume. No significant differences were found, P = 0.836. For the 25 patients with drained pleural effusions, the comparison of the reduction volume with the actual volume of the liquid extract revealed no significant differences, P = 0.989. The following linear regression equation was used to compare the pleural effusion volume (V) (measured by the CT volume-rendering technique) with the pleural effusion greatest depth (d): V = 158.16 × d - 116.01 (r = 0.91, P = 0.000). The following linear regression was used to compare the volume with the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0.000). The 64-slice CT volume-rendering technique can
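    The two reported fits are easy to apply directly. The coefficients are taken from the abstract; the units follow the study, d is the greatest depth of the effusion, and l, h, d are the three effusion diameters:

```python
def volume_from_depth(d):
    # Reported fit: V = 158.16 * d - 116.01  (r = 0.91)
    return 158.16 * d - 116.01

def volume_from_diameters(l, h, d):
    # Reported fit: V = 0.56 * (l * h * d) + 39.44  (r = 0.92)
    return 0.56 * (l * h * d) + 39.44

print(round(volume_from_depth(2.0), 2))  # 200.31
```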

  1. The preliminary exploration of 64-slice volume computed tomography in the accurate measurement of pleural effusion

    International Nuclear Information System (INIS)

    Guo, Zhi-Jun; Lin, Qiang; Liu, Hai-Tao

    2013-01-01

    Background: Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, the precise volume of pleural effusions still involves many challenges, and there is currently no recognized, accurate method for measuring it. Purpose: To explore the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume and to then analyze the correlation between the volume of the free pleural effusion and the different diameters of the pleural effusion. Material and Methods: The 64-slice CT volume-rendering technique was used to measure and analyze three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extract. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameter of the effusion, which was then used to calculate the regression equation. Results: After using the 64-slice CT volume-rendering technique to measure the fluid volume of the self-made thoracic model, the results were compared with the actual injection volume. No significant differences were found, P = 0.836. For the 25 patients with drained pleural effusions, the comparison of the reduction volume with the actual volume of the liquid extract revealed no significant differences, P = 0.989. The following linear regression equation was used to compare the pleural effusion volume (V) (measured by the CT volume-rendering technique) with the pleural effusion greatest depth (d): V = 158.16 × d - 116.01 (r = 0.91, P = 0.000). The following linear regression was used to compare the volume with the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0

  2. The preliminary exploration of 64-slice volume computed tomography in the accurate measurement of pleural effusion

    Energy Technology Data Exchange (ETDEWEB)

    Guo, Zhi-Jun [Dept. of Radiology, North China Petroleum Bureau General Hospital, Renqiu, Hebei (China)], e-mail: Gzj3@163.com; Lin, Qiang [Dept. of Oncology, North China Petroleum Bureau General Hospital, Renqiu, Hebei (China)]; Liu, Hai-Tao [Dept. of General Surgery, North China Petroleum Bureau General Hospital, Renqiu, Hebei (China)]; and others

    2013-09-15

    Background: Using computed tomography (CT) to rapidly and accurately quantify pleural effusion volume benefits medical and scientific research. However, the precise volume of pleural effusions still involves many challenges, and there is currently no recognized, accurate method for measuring it. Purpose: To explore the feasibility of using 64-slice CT volume-rendering technology to accurately measure pleural fluid volume and to then analyze the correlation between the volume of the free pleural effusion and the different diameters of the pleural effusion. Material and Methods: The 64-slice CT volume-rendering technique was used to measure and analyze three parts. First, the fluid volume of a self-made thoracic model was measured and compared with the actual injected volume. Second, the pleural effusion volume was measured before and after pleural fluid drainage in 25 patients, and the volume reduction was compared with the actual volume of the liquid extract. Finally, the free pleural effusion volume was measured in 26 patients to analyze the correlation between it and the diameter of the effusion, which was then used to calculate the regression equation. Results: After using the 64-slice CT volume-rendering technique to measure the fluid volume of the self-made thoracic model, the results were compared with the actual injection volume. No significant differences were found, P = 0.836. For the 25 patients with drained pleural effusions, the comparison of the reduction volume with the actual volume of the liquid extract revealed no significant differences, P = 0.989. The following linear regression equation was used to compare the pleural effusion volume (V) (measured by the CT volume-rendering technique) with the pleural effusion greatest depth (d): V = 158.16 × d - 116.01 (r = 0.91, P = 0.000). The following linear regression was used to compare the volume with the product of the pleural effusion diameters (l × h × d): V = 0.56 × (l × h × d) + 39.44 (r = 0.92, P = 0

  3. An efficient and accurate 3D displacements tracking strategy for digital volume correlation

    Science.gov (United States)

    Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles

    2014-07-01

    Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating the Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure that the 3D IC-GN algorithm converges accurately and rapidly and to avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer an accurate and complete initial guess of deformation to each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
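    The reliability-guided transfer step can be sketched with a priority queue: points are processed in decreasing order of correlation reliability, and each solved point seeds its unprocessed neighbors with its own displacement as their initial guess. Here `correlate` is a stand-in for the 3D IC-GN registration and is assumed to return a (reliability, displacement) pair; everything below is an illustrative skeleton, not the authors' code.

```python
import heapq

def reliability_guided(neighbors, seed, correlate):
    # neighbors: dict mapping each point to its adjacent points
    # correlate(point, guess) -> (reliability, displacement)
    done = {}
    heap = [(-1.0, seed, 0)]  # (negated reliability, point, initial guess)
    while heap:
        _, p, guess = heapq.heappop(heap)
        if p in done:
            continue
        rel, disp = correlate(p, guess)
        done[p] = disp
        # Pass this result to unsolved neighbors as their initial
        # guess, ranked by the reliability just obtained.
        for q in neighbors[p]:
            if q not in done:
                heapq.heappush(heap, (-rel, q, disp))
    return done
```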

  4. An efficient and accurate 3D displacements tracking strategy for digital volume correlation

    KAUST Repository

    Pan, Bing

    2014-07-01

    Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating the Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure that the 3D IC-GN algorithm converges accurately and rapidly and to avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer an accurate and complete initial guess of deformation to each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost. © 2014 Elsevier Ltd.

  5. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The high-dimension, low-sample-size nature of gene expression data makes the classification task challenging, so feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost but also improves classification performance. However, most existing feature selection methods suffer from problems such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods: We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions depending on thresholds and percentages; otherwise it is eliminated. Results: Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology we end up with 1117 specific genes discriminating the two classes of leukemia (Fig. 15b). Further applying the same method with a more stringent (higher positive and lower negative) threshold condition reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions: The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between the two classes of leukemia.
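    The cluster-thresholding rule described in the abstract can be sketched as follows. The rule as stated also involves a percentage condition; this minimal version keeps only the threshold part, and all names and values are illustrative:

```python
def select_clusters(cluster_expr, pos_thr, neg_thr):
    # Keep a gene cluster when the summed expression of its member
    # genes is strongly positive or strongly negative; otherwise
    # the cluster is eliminated.
    kept = {}
    for name, levels in cluster_expr.items():
        total = sum(levels)
        if total >= pos_thr or total <= neg_thr:
            kept[name] = total
    return kept

clusters = {"c1": [3.0, 4.0], "c2": [0.1, -0.2], "c3": [-5.0, -3.0]}
print(sorted(select_clusters(clusters, 5.0, -5.0)))  # ['c1', 'c3']
```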

  6. In pursuit of an accurate spatial and temporal model of biomolecules at the atomistic level: a perspective on computer simulation.

    Science.gov (United States)

    Gray, Alan; Harlen, Oliver G; Harris, Sarah A; Khalid, Syma; Leung, Yuk Ming; Lonsdale, Richard; Mulholland, Adrian J; Pearson, Arwen R; Read, Daniel J; Richardson, Robin A

    2015-01-01

    Despite huge advances in the computational techniques available for simulating biomolecules at the quantum-mechanical, atomistic and coarse-grained levels, there is still a widespread perception amongst the experimental community that these calculations are highly specialist and are not generally applicable by researchers outside the theoretical community. In this article, the successes and limitations of biomolecular simulation and the further developments that are likely in the near future are discussed. A brief overview is also provided of the experimental biophysical methods that are commonly used to probe biomolecular structure and dynamics, and the accuracy of the information that can be obtained from each is compared with that from modelling. It is concluded that progress towards an accurate spatial and temporal model of biomacromolecules requires a combination of all of these biophysical techniques, both experimental and computational.

  7. Spatially Uniform ReliefF (SURF) for computationally-efficient filtering of gene-gene interactions

    Directory of Open Access Journals (Sweden)

    Greene Casey S

    2009-09-01

    Full Text Available Abstract Background Genome-wide association studies are becoming the de facto standard in the genetic analysis of common human diseases. Given the complexity and robustness of biological networks, such diseases are unlikely to be the result of single points of failure but instead likely arise from the joint failure of two or more interacting components. The hope in genome-wide screens is that these points of failure can be linked to single nucleotide polymorphisms (SNPs) which confer disease susceptibility. Detecting interacting variants that lead to disease in the absence of single-gene effects is difficult, however, and methods to exhaustively analyze sets of these variants for interactions are combinatorial in nature, thus making them computationally infeasible. Efficient algorithms which can detect interacting SNPs are needed. ReliefF is one such promising algorithm, although it has a low success rate for noisy datasets when the interaction effect is small. ReliefF has been paired with an iterative approach, Tuned ReliefF (TuRF), which improves the estimation of weights in noisy data but does not fundamentally change the underlying ReliefF algorithm. To improve the sensitivity of studies using these methods to detect small effects, we introduce Spatially Uniform ReliefF (SURF). Results SURF's ability to detect interactions in this domain is significantly greater than that of ReliefF. Similarly, SURF in combination with the TuRF strategy significantly outperforms TuRF alone for SNP selection under an epistasis model. It is important to note that this increase in success rate does not require an increase in algorithmic complexity, and is achieved even with the removal of a nuisance parameter from the algorithm. Conclusion Researchers performing genetic association studies and aiming to discover gene-gene interactions associated with increased disease susceptibility should use SURF in place of ReliefF.
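
    The key difference between ReliefF and SURF is the neighborhood definition: SURF counts every instance closer than the mean pairwise distance as a neighbor, instead of using a fixed k nearest neighbors. The following pure-Python sketch of that weight update for binary SNP-like data is illustrative only, with a simplified scoring rule and hypothetical data, not the published implementation.

```python
# Minimal SURF-style weight update for binary attribute data.
# Hits (same-class neighbors) pull an attribute's weight down when the
# attribute differs; misses (different-class neighbors) push it up.

def surf_weights(X, y):
    n, m = len(X), len(X[0])
    dist = lambda a, b: sum(ai != bi for ai, bi in zip(a, b))
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    T = sum(dist(X[i], X[j]) for i, j in pairs) / len(pairs)  # mean pairwise distance
    w = [0.0] * m
    for i in range(n):
        for j in range(n):
            if i == j or dist(X[i], X[j]) >= T:
                continue  # SURF: only instances closer than T count
            for a in range(m):
                delta = 1.0 if X[i][a] != X[j][a] else 0.0
                w[a] += -delta if y[i] == y[j] else delta
    return w
```

A predictive attribute should accumulate a higher weight than an uninformative one.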

  8. Accurate computer simulation of a drift chamber

    International Nuclear Information System (INIS)

    Killian, T.J.

    1980-01-01

    A general purpose program for drift chamber studies is described. First the capacitance matrix is calculated using a Green's function technique. The matrix is used in a linear-least-squares fit to choose optimal operating voltages. Next the electric field is computed, and given knowledge of gas parameters and magnetic field environment, a family of electron trajectories is determined. These are finally used to make drift distance vs time curves which may be used directly by a track reconstruction program. Results are compared with data obtained from the cylindrical chamber in the Axial Field Magnet experiment at the CERN ISR

  9. Suitable Reference Genes for Accurate Gene Expression Analysis in Parsley (Petroselinum crispum) for Abiotic Stresses and Hormone Stimuli.

    Science.gov (United States)

    Li, Meng-Yao; Song, Xiong; Wang, Feng; Xiong, Ai-Sheng

    2016-01-01

    Parsley, one of the most important vegetables in the Apiaceae family, is widely used in the food, medicinal, and cosmetic industries. Recent studies on parsley mainly focus on its chemical composition, and further research involving the analysis of the plant's gene functions and expression is required. qPCR is a powerful method for detecting very low quantities of target transcript levels and is widely used to study gene expression. To ensure the accuracy of results, a suitable reference gene is necessary for expression normalization. In this study, four software packages, namely geNorm, NormFinder, BestKeeper, and RefFinder, were used to evaluate the expression stabilities of eight candidate reference genes of parsley (GAPDH, ACTIN, eIF-4α, SAND, UBC, TIP41, EF-1α, and TUB) under various conditions, including abiotic stresses (heat, cold, salt, and drought) and hormone stimuli treatments (GA, SA, MeJA, and ABA). Results showed that EF-1α and TUB were the most stable genes for abiotic stresses, whereas EF-1α, GAPDH, and TUB were the top three choices for hormone stimuli treatments. Moreover, EF-1α and TUB were the most stable reference genes among all tested samples, and UBC was the least stable one. Expression analysis of PcDREB1 and PcDREB2 further verified that the selected stable reference genes were suitable for gene expression normalization. This study can guide the selection of suitable reference genes for gene expression studies in parsley.
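
    The geNorm stability measure used above can be sketched compactly: for each candidate gene, compute the average standard deviation of its pairwise log2 expression ratios against the other candidates; a lower M indicates a more stable reference gene. This is a simplified illustration of the published algorithm, with hypothetical expression values.

```python
# Hedged sketch of the geNorm M value for reference-gene selection.
import math

def genorm_m(expr):
    """expr: {gene: [relative expression per sample]} -> {gene: M}."""
    def sd(v):
        mu = sum(v) / len(v)
        return math.sqrt(sum((x - mu) ** 2 for x in v) / (len(v) - 1))
    genes = list(expr)
    M = {}
    for g in genes:
        sds = []
        for h in genes:
            if h == g:
                continue
            ratios = [math.log2(a / b) for a, b in zip(expr[g], expr[h])]
            sds.append(sd(ratios))  # stdev of log ratios vs. each other gene
        M[g] = sum(sds) / len(sds)
    return M
```

A gene whose ratio to the others is constant across samples gets a low M; an erratically expressed gene gets a high M.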

  10. Suitable reference genes for accurate gene expression analysis in parsley (Petroselinum crispum) for abiotic stresses and hormone stimuli

    Directory of Open Access Journals (Sweden)

    Meng-Yao Li

    2016-09-01

    Full Text Available Parsley is one of the most important vegetables in the Apiaceae family and is widely used in the food, medicinal, and cosmetic industries. Recent studies on parsley mainly focus on chemical composition; further research involving the analysis of gene functions and expression will be required. qPCR is a powerful method for detecting very low quantities of target transcript levels and is widely used for gene expression studies. To ensure the accuracy of results, a suitable reference gene is necessary for expression normalization. In this study, three software packages (geNorm, NormFinder, and BestKeeper) were used to evaluate the expression stabilities of eight candidate reference genes (GAPDH, ACTIN, eIF-4α, SAND, UBC, TIP41, EF-1α, and TUB) under various conditions, including abiotic stresses (heat, cold, salt, and drought) and hormone stimuli treatments (GA, SA, MeJA, and ABA). The results showed that EF-1α and TUB were identified as the most stable genes for abiotic stresses, while EF-1α, GAPDH, and TUB were the top three choices for hormone stimuli treatments. Moreover, EF-1α and TUB were the most stable reference genes across all the tested samples, while UBC was the least stable one. The expression analysis of PcDREB1 and PcDREB2 further verified that the selected stable reference genes were suitable for gene expression normalization. This study provides a guideline for selecting suitable reference genes for gene expression studies in parsley.

  11. Gene Ranking of RNA-Seq Data via Discriminant Non-Negative Matrix Factorization.

    Science.gov (United States)

    Jia, Zhilong; Zhang, Xiang; Guan, Naiyang; Bo, Xiaochen; Barnes, Michael R; Luo, Zhigang

    2015-01-01

    RNA-sequencing is rapidly becoming the method of choice for studying the full complexity of transcriptomes; however, with increasing dimensionality, accurate gene ranking is becoming increasingly challenging. This paper proposes an accurate and sensitive gene ranking method that implements discriminant non-negative matrix factorization (DNMF) for RNA-seq data. To the best of our knowledge, this is the first work to explore the utility of DNMF for gene ranking. When incorporating Fisher's discriminant criteria and setting the reduced dimension as two, DNMF learns two factors to approximate the original gene expression data, abstracting the up-regulated or down-regulated metagene by using the sample label information. The first factor denotes all the genes' weights in the two metagenes as the additive combination of all genes, while the second learned factor represents the expression values of the two metagenes. In the gene ranking stage, all the genes are ranked in a descending sequence according to the differential values of the metagene weights. Leveraging the nature of NMF and Fisher's criterion, DNMF can robustly boost the gene ranking performance. The Area Under the Curve analysis of differential expression analysis on two benchmarking tests of four RNA-seq data sets with similar phenotypes showed that our proposed DNMF-based gene ranking method outperforms other widely used methods. Moreover, the Gene Set Enrichment Analysis also showed that DNMF outperforms the other methods. DNMF is also computationally efficient, substantially outperforming all other benchmarked methods. Consequently, we suggest DNMF is an effective method for the analysis of differential gene expression and gene ranking for RNA-seq data.
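
    The factorization-and-ranking idea can be illustrated with a plain rank-2 NMF. Note that this sketch uses the standard Lee-Seung multiplicative updates and omits the Fisher discriminant term that distinguishes DNMF, so it is a simplification under stated assumptions, not the published method; the toy matrix is hypothetical.

```python
# Rank-2 NMF via multiplicative updates, then rank genes by the
# difference of their two metagene weights (pure Python, illustrative).
import random

def nmf_rank2(V, iters=500, seed=0):
    """Factor V (genes x samples) ~ W (genes x 2) @ H (2 x samples)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(2)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(2)]
    prod = lambda: [[sum(W[i][k] * H[k][j] for k in range(2))
                     for j in range(m)] for i in range(n)]
    for _ in range(iters):
        WH = prod()
        for k in range(2):            # update H: H *= (W^T V) / (W^T WH)
            for j in range(m):
                num = sum(W[i][k] * V[i][j] for i in range(n))
                den = sum(W[i][k] * WH[i][j] for i in range(n)) + 1e-12
                H[k][j] *= num / den
        WH = prod()
        for i in range(n):            # update W: W *= (V H^T) / (WH H^T)
            for k in range(2):
                num = sum(V[i][j] * H[k][j] for j in range(m))
                den = sum(WH[i][j] * H[k][j] for j in range(m)) + 1e-12
                W[i][k] *= num / den
    return W, H

def rank_genes(W):
    """Descending rank by differential metagene weight."""
    diffs = [(abs(w[0] - w[1]), i) for i, w in enumerate(W)]
    return [i for _, i in sorted(diffs, reverse=True)]
```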

  12. An accurate and efficient method for large-scale SSR genotyping and applications.

    Science.gov (United States)

    Li, Lun; Fang, Zhiwei; Zhou, Junfei; Chen, Hong; Hu, Zhangfeng; Gao, Lifen; Chen, Lihong; Ren, Sheng; Ma, Hongyu; Lu, Long; Zhang, Weixiong; Peng, Hai

    2017-06-02

    Accurate and efficient genotyping of simple sequence repeats (SSRs) constitutes the basis of SSRs as an effective genetic marker with various applications. However, the existing methods for SSR genotyping suffer from low sensitivity, low accuracy, low efficiency and high cost. In order to fully exploit the potential of SSRs as genetic markers, we developed a novel method for SSR genotyping, named AmpSeq-SSR, which combines multiplexing polymerase chain reaction (PCR), targeted deep sequencing and comprehensive analysis. AmpSeq-SSR is able to genotype potentially more than a million SSRs at once using current sequencing techniques. In the current study, we simultaneously genotyped 3105 SSRs in eight rice varieties, which were further validated experimentally. The results showed that the accuracies of AmpSeq-SSR were nearly 100% and 94%, with single-base resolution, for homozygous and heterozygous samples, respectively. To demonstrate the power of AmpSeq-SSR, we adopted it in two applications. The first was to construct discriminative fingerprints of the rice varieties using 3105 SSRs, which offer much greater discriminative power than the 48 SSRs commonly used for rice. The second was to map Xa21, a gene that confers persistent resistance to rice bacterial blight. We demonstrated that genome-scale fingerprints of an organism can be efficiently constructed and candidate genes, such as Xa21 in rice, can be accurately and efficiently mapped using an innovative strategy consisting of multiplexing PCR, targeted sequencing and computational analysis. While the work we present focused on rice, AmpSeq-SSR can be readily extended to animals and micro-organisms. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
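
    The core genotyping step, calling an SSR allele as the repeat count supported by the most reads, can be caricatured as follows. The flanking sequences, motif, and reads are hypothetical, and a real pipeline such as the AmpSeq-SSR analysis additionally handles sequencing errors and PCR stutter.

```python
# Toy SSR allele caller: count repeat units between fixed flanks and
# return the most frequently observed repeat count across reads.
import re
from collections import Counter

def call_ssr(reads, left, right, motif):
    """Most frequent repeat count between the flanks; motif is a literal."""
    pattern = re.compile(re.escape(left) + f"((?:{motif})+)" + re.escape(right))
    counts = Counter()
    for r in reads:
        m = pattern.search(r)
        if m:
            counts[len(m.group(1)) // len(motif)] += 1
    return counts.most_common(1)[0][0] if counts else None
```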

  13. Constructing an integrated gene similarity network for the identification of disease genes.

    Science.gov (United States)

    Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

    2017-09-20

    Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We propose a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN, and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability compared with current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/.
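
    The random-walk-with-restart iteration at the core of RWRB can be sketched on a toy network. The graph below is hypothetical and single-layer, whereas RWRB runs the same iteration on a phenotype-gene bilayer network.

```python
# Minimal random walk with restart (RWR): p <- (1-r) W p + r p0,
# where W is the column-normalized adjacency and p0 seeds known genes.

def rwr(adj, seeds, restart=0.7, iters=100):
    """adj: square adjacency matrix (list of lists); seeds: set of node ids."""
    n = len(adj)
    colsum = [sum(adj[i][j] for i in range(n)) or 1.0 for j in range(n)]
    W = [[adj[i][j] / colsum[j] for j in range(n)] for i in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(iters):
        p = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
             for i in range(n)]
    return p  # steady-state proximity of every node to the seed set
```

Candidates are then ranked by their steady-state probability: nodes closer to the seed genes score higher.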

  14. Accurate computer simulation of a drift chamber

    CERN Document Server

    Killian, T J

    1980-01-01

    The author describes a general purpose program for drift chamber studies. First the capacitance matrix is calculated using a Green's function technique. The matrix is used in a linear-least-squares fit to choose optimal operating voltages. Next the electric field is computed, and given knowledge of gas parameters and magnetic field environment, a family of electron trajectories is determined. These are finally used to make drift distance vs time curves which may be used directly by a track reconstruction program. The results are compared with data obtained from the cylindrical chamber in the Axial Field Magnet experiment at the CERN ISR. (1 refs).

  15. GDdom: An Online Tool for Calculation of Dominant Marker Gene Diversity.

    Science.gov (United States)

    Abuzayed, Mazen; El-Dabba, Nourhan; Frary, Anne; Doganlar, Sami

    2017-04-01

    Gene diversity (GD), also called polymorphism information content, is a commonly used measure of molecular marker polymorphism. Calculation of GD for dominant markers such as AFLP, RAPD, and multilocus SSRs is valuable for researchers. To meet this need, we developed a free online computer program, GDdom, which provides easy, quick, and accurate calculation of dominant marker GD with a commonly used formula. Results are presented in tabular form for quick interpretation.
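
    The abstract cites "a commonly used formula"; one standard choice for a dominant (two-state) marker is GD = 1 - sum(p_i^2) over the band-present/band-absent phenotype frequencies. The sketch below uses that formula, which may differ in detail from what GDdom actually implements.

```python
# Hedged sketch: gene diversity for a dominant marker scored as
# band present / band absent across n_samples individuals.

def gene_diversity(band_present, n_samples):
    p = band_present / n_samples          # frequency of the band phenotype
    q = 1.0 - p                           # frequency of its absence
    return 1.0 - (p * p + q * q)          # equals 2pq for two states
```

GD peaks at 0.5 when both phenotypes are equally frequent and drops to 0 for a monomorphic marker.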

  16. An accurate determination of the flux within a slab

    International Nuclear Information System (INIS)

    Ganapol, B.D.; Lapenta, G.

    1993-01-01

    During the past decade, several articles have been written concerning accurate solutions to the monoenergetic neutron transport equation in infinite and semi-infinite geometries. The numerical formulations found in these articles were based primarily on the extensive theoretical investigations performed by the "transport greats" such as Chandrasekhar, Busbridge, Sobolev, and Ivanov, to name a few. The development of numerical solutions in infinite and semi-infinite geometries represents an example of how mathematical transport theory can be utilized to provide highly accurate and efficient numerical transport solutions. These solutions, or analytical benchmarks, are useful as "industry standards," which provide guidance to code developers and promote learning in the classroom. The high accuracy of these benchmarks is directly attributable to the rapid advancement of the state of computing and computational methods. Transport calculations that were beyond the capability of the "supercomputers" of just a few years ago are now possible at one's desk. In this paper, we again build upon the past to tackle the slab problem, which is of the next level of difficulty in comparison to infinite media problems. The formulation is based on the monoenergetic Green's function, which is the most fundamental transport solution. This method of solution requires a fast and accurate evaluation of the Green's function, which, with today's computational power, is now readily available.

  17. An Accurate Method for Inferring Relatedness in Large Datasets of Unphased Genotypes via an Embedded Likelihood-Ratio Test

    KAUST Repository

    Rodriguez, Jesse M.

    2013-01-01

    Studies that map disease genes rely on accurate annotations that indicate whether individuals in the studied cohorts are related to each other or not. For example, in genome-wide association studies, the cohort members are assumed to be unrelated to one another. Investigators can correct for individuals in a cohort with previously-unknown shared familial descent by detecting genomic segments that are shared between them, which are considered to be identical by descent (IBD). Alternatively, elevated frequencies of IBD segments near a particular locus among affected individuals can be indicative of a disease-associated gene. As genotyping studies grow to use increasingly large sample sizes and meta-analyses begin to include many data sets, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required. We present PARENTE, a novel method for detecting related pairs of individuals and shared haplotypic segments within these pairs. PARENTE is a computationally-efficient method based on an embedded likelihood ratio test. As demonstrated by the results of our simulations, our method exhibits better accuracy than the current state of the art, and can be used for the analysis of large genotyped cohorts. PARENTE's higher accuracy becomes even more significant in more challenging scenarios, such as detecting shorter IBD segments or when an extremely low false-positive rate is required. PARENTE is publicly and freely available at http://parente.stanford.edu/. © 2013 Springer-Verlag.
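
    The idea of a likelihood-ratio test for relatedness can be illustrated generically: sum per-marker log-likelihood ratios of an observed genotype pair under an IBD-sharing model versus an unrelated model. The probability table below gives the textbook joint genotype probabilities for a pair sharing one allele IBD under Hardy-Weinberg equilibrium; this is a generic sketch, not PARENTE's actual embedded model.

```python
# Per-marker LLR of "pair shares one allele IBD" vs. "unrelated".
# Genotypes are copies (0, 1, 2) of an allele with frequency p.
import math

def ibd1_prob(g1, g2, p):
    q = 1 - p
    table = {
        (2, 2): p ** 3, (2, 1): p * p * q, (1, 2): p * p * q,
        (1, 1): p * q,  (1, 0): p * q * q, (0, 1): p * q * q,
        (0, 0): q ** 3, (2, 0): 0.0,       (0, 2): 0.0,
    }
    return table[(g1, g2)]

def geno_prob(g, p):
    q = 1 - p
    return {2: p * p, 1: 2 * p * q, 0: q * q}[g]

def llr_related(pair, freqs):
    """pair: [(g1, g2) per marker]; freqs: allele frequencies per marker."""
    llr = 0.0
    for (g1, g2), p in zip(pair, freqs):
        num = ibd1_prob(g1, g2, p)
        den = geno_prob(g1, p) * geno_prob(g2, p)
        llr += math.log((num + 1e-12) / (den + 1e-12))
    return llr  # positive supports relatedness, negative supports unrelated
```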

  18. Selection of reference genes for quantitative gene expression normalization in flax (Linum usitatissimum L.)

    Directory of Open Access Journals (Sweden)

    Neutelings Godfrey

    2010-04-01

    Full Text Available Abstract Background Quantitative real-time PCR (qRT-PCR) is currently the most accurate method for detecting differential gene expression. Such an approach depends on the identification of uniformly expressed 'housekeeping genes' (HKGs). Extensive transcriptomic data mining and experimental validation in different model plants have shown that the reliability of these endogenous controls can be influenced by the plant species, growth conditions and organs/tissues examined. It is therefore important to identify the best reference genes to use in each biological system before using qRT-PCR to investigate differential gene expression. In this paper we evaluate different candidate HKGs for developmental transcriptomic studies in the economically-important flax fiber- and oil-crop (Linum usitatissimum L.). Results Specific primers were designed in order to quantify the expression levels of 20 different potential housekeeping genes in flax roots, internal- and external-stem tissues, leaves and flowers at different developmental stages. After calculations of PCR efficiencies, 13 HKGs were retained and their expression stabilities evaluated by the computer algorithms geNorm and NormFinder. According to geNorm, 2 Transcriptional Elongation Factors (TEFs) and 1 Ubiquitin gene are necessary for normalizing gene expression when all studied samples are considered. However, only 2 TEFs are required for normalizing expression in stem tissues. In contrast, NormFinder identified glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as the most stably expressed gene when all samples were grouped together, as well as when samples were classed into different sub-groups. qRT-PCR was then used to investigate the relative expression levels of two splice variants of the flax LuMYB1 gene (homologue of AtMYB59). LuMYB1-1 and LuMYB1-2 were highly expressed in the internal stem tissues as compared to outer stem tissues and other samples. This result was confirmed with both geNorm- and NormFinder-designated reference genes.

  19. Selection of reference genes for quantitative gene expression normalization in flax (Linum usitatissimum L.).

    Science.gov (United States)

    Huis, Rudy; Hawkins, Simon; Neutelings, Godfrey

    2010-04-19

    Quantitative real-time PCR (qRT-PCR) is currently the most accurate method for detecting differential gene expression. Such an approach depends on the identification of uniformly expressed 'housekeeping genes' (HKGs). Extensive transcriptomic data mining and experimental validation in different model plants have shown that the reliability of these endogenous controls can be influenced by the plant species, growth conditions and organs/tissues examined. It is therefore important to identify the best reference genes to use in each biological system before using qRT-PCR to investigate differential gene expression. In this paper we evaluate different candidate HKGs for developmental transcriptomic studies in the economically-important flax fiber- and oil-crop (Linum usitatissimum L.). Specific primers were designed in order to quantify the expression levels of 20 different potential housekeeping genes in flax roots, internal- and external-stem tissues, leaves and flowers at different developmental stages. After calculations of PCR efficiencies, 13 HKGs were retained and their expression stabilities evaluated by the computer algorithms geNorm and NormFinder. According to geNorm, 2 Transcriptional Elongation Factors (TEFs) and 1 Ubiquitin gene are necessary for normalizing gene expression when all studied samples are considered. However, only 2 TEFs are required for normalizing expression in stem tissues. In contrast, NormFinder identified glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as the most stably expressed gene when all samples were grouped together, as well as when samples were classed into different sub-groups. qRT-PCR was then used to investigate the relative expression levels of two splice variants of the flax LuMYB1 gene (homologue of AtMYB59). LuMYB1-1 and LuMYB1-2 were highly expressed in the internal stem tissues as compared to outer stem tissues and other samples. This result was confirmed with both geNorm-designated and NormFinder-designated reference genes.

  20. Accurate quantum chemical calculations

    Science.gov (United States)

    Bauschlicher, Charles W., Jr.; Langhoff, Stephen R.; Taylor, Peter R.

    1989-01-01

    An important goal of quantum chemical calculations is to provide an understanding of chemical bonding and molecular electronic structure. A second goal, the prediction of energy differences to chemical accuracy, has been much harder to attain. First, the computational resources required to achieve such accuracy are very large, and second, it is not straightforward to demonstrate that an apparently accurate result, in terms of agreement with experiment, does not result from a cancellation of errors. Recent advances in electronic structure methodology, coupled with the power of vector supercomputers, have made it possible to solve a number of electronic structure problems exactly using the full configuration interaction (FCI) method within a subspace of the complete Hilbert space. These exact results can be used to benchmark approximate techniques that are applicable to a wider range of chemical and physical problems. The methodology of many-electron quantum chemistry is reviewed. Methods are considered in detail for performing FCI calculations. The application of FCI methods to several three-electron problems in molecular physics are discussed. A number of benchmark applications of FCI wave functions are described. Atomic basis sets and the development of improved methods for handling very large basis sets are discussed: these are then applied to a number of chemical and spectroscopic problems; to transition metals; and to problems involving potential energy surfaces. Although the experiences described give considerable grounds for optimism about the general ability to perform accurate calculations, there are several problems that have proved less tractable, at least with current computer resources, and these and possible solutions are discussed.

  1. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    Science.gov (United States)

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will make it possible to learn more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions thus improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
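
    The set-level transformation itself is simple to sketch: each gene set collapses to a single feature summarizing its member genes. The mean aggregation and the set names below are hypothetical illustrations; the paper's contribution lies in how the sets are defined, not in the aggregation.

```python
# Convert a gene-level expression profile into set-level features
# (one feature per gene set, here the mean of member-gene expression).

def set_level_features(sample, gene_sets):
    """sample: {gene: expression}; gene_sets: {set_name: [genes]}."""
    feats = {}
    for name, genes in gene_sets.items():
        members = [sample[g] for g in genes if g in sample]
        feats[name] = sum(members) / len(members) if members else 0.0
    return feats
```

The resulting low-dimensional vectors feed directly into any standard classifier.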

  2. Crowdsourcing RNA structural alignments with an online computer game.

    Science.gov (United States)

    Waldispühl, Jérôme; Kam, Arthur; Gardner, Paul P

    2015-01-01

    The annotation and classification of ncRNAs is essential to decipher molecular mechanisms of gene regulation in normal and disease states. A database such as Rfam maintains alignments, consensus secondary structures, and corresponding annotations for RNA families. Its primary purpose is the automated, accurate annotation of non-coding RNAs in genomic sequences. However, the alignment of RNAs is computationally challenging, and the data stored in this database are often subject to improvements. Here, we design and evaluate Ribo, a human-computing game that aims to improve the accuracy of RNA alignments already stored in Rfam. We demonstrate the potential of our techniques and discuss the feasibility of large scale collaborative annotation and classification of RNA families.

  3. A flexible and accurate digital volume correlation method applicable to high-resolution volumetric images

    Science.gov (United States)

    Pan, Bing; Wang, Bo

    2017-10-01

    Digital volume correlation (DVC) is a powerful technique for quantifying interior deformation within solid opaque materials and biological tissues. In the last two decades, great efforts have been made to improve the accuracy and efficiency of the DVC algorithm. However, there is still a lack of a flexible, robust and accurate version that can be efficiently implemented in personal computers with limited RAM. This paper proposes an advanced DVC method that can realize accurate full-field internal deformation measurement applicable to high-resolution volume images with up to billions of voxels. Specifically, a novel layer-wise reliability-guided displacement tracking strategy combined with dynamic data management is presented to guide the DVC computation from slice to slice. The displacements at specified calculation points in each layer are computed using the advanced 3D inverse-compositional Gauss-Newton algorithm with the complete initial guess of the deformation vector accurately predicted from the computed calculation points. Since only limited slices of interest in the reference and deformed volume images rather than the whole volume images are required, the DVC calculation can thus be efficiently implemented on personal computers. The flexibility, accuracy and efficiency of the presented DVC approach are demonstrated by analyzing computer-simulated and experimentally obtained high-resolution volume images.
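
    The subvolume matching that underlies DVC can be illustrated in one dimension with the zero-normalized cross-correlation (ZNCC) criterion; a real DVC code correlates 3-D subvolumes and refines the integer estimate with the Gauss-Newton optimization mentioned above. The signals here are hypothetical.

```python
# 1-D ZNCC template matching: find the integer displacement of a
# reference window inside a deformed signal.
import math

def zncc(f, g):
    mf, mg = sum(f) / len(f), sum(g) / len(g)
    num = sum((a - mf) * (b - mg) for a, b in zip(f, g))
    den = math.sqrt(sum((a - mf) ** 2 for a in f) * sum((b - mg) ** 2 for b in g))
    return num / den if den else 0.0

def best_shift(ref, deformed, size):
    """Displacement whose window in `deformed` best matches ref[:size]."""
    target = ref[:size]
    scores = [(zncc(target, deformed[s:s + size]), s)
              for s in range(len(deformed) - size + 1)]
    return max(scores)[1]
```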

  4. An Accurate Computational Tool for Performance Estimation of FSO Communication Links over Weak to Strong Atmospheric Turbulent Channels

    Directory of Open Access Journals (Sweden)

    Theodore D. Katsilieris

    2017-03-01

    Full Text Available The terrestrial optical wireless communication links have attracted significant research and commercial worldwide interest over the last few years due to the fact that they offer very high and secure data rate transmission with relatively low installation and operational costs, and without need of licensing. However, since the propagation path of the information signal, i.e., the laser beam, is the atmosphere, their effectiveness depends strongly on the atmospheric conditions in the specific area. Thus, system performance depends significantly on rain, fog, hail, atmospheric turbulence, etc. Due to the influence of these effects, it is necessary to study such a communication system very carefully, both theoretically and numerically, before its installation. In this work, we present exact and accurately approximate mathematical expressions for the estimation of the average capacity and the outage probability performance metrics, as functions of the link's parameters, the transmitted power, the attenuation due to fog, the ambient noise and the atmospheric turbulence phenomenon. The latter causes the scintillation effect, which results in random and fast fluctuations of the irradiance at the receiver's end. These fluctuations can be studied accurately with statistical methods. Thus, in this work, we use either the lognormal or the gamma–gamma distribution for weak or moderate to strong turbulence conditions, respectively. Moreover, using the derived mathematical expressions, we design, implement and present a computational tool for the estimation of these systems' performance, taking into account the parameters of the link and the atmospheric conditions. Furthermore, in order to increase the accuracy of the presented tool, for the cases where the obtained analytical mathematical expressions are complex, the performance results are verified with the numerical estimation of the appropriate integrals.
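
    For the weak-turbulence regime, the lognormal model mentioned above yields a closed-form outage probability. A hedged numerical sketch, assuming unit mean normalized irradiance (ln I ~ N(-sigma2/2, sigma2)) and ignoring pointing errors and fog attenuation:

```python
# Outage probability P(I < I_th) for normalized lognormal irradiance
# with scintillation (log-intensity) variance sigma2.
import math

def lognormal_outage(i_th, sigma2):
    sigma = math.sqrt(sigma2)
    z = (math.log(i_th) + sigma2 / 2.0) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z
```

Lower irradiance thresholds give lower outage probability, as expected.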

  5. Genome-wide identification of the regulatory targets of a transcription factor using biochemical characterization and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Jolly Emmitt R

    2005-11-01

    Full Text Available Abstract Background A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.
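
    Scoring candidate sites against an experimentally characterized binding preference is typically done with a position weight matrix (PWM). The matrix and promoter sequence below are hypothetical stand-ins, not the actual Ndt80 binding model.

```python
# Scan a sequence with a PWM and return the best log-odds window.
import math

def pwm_scan(seq, pwm, background=0.25):
    """pwm: list of {base: probability} per position; returns (score, pos)
    of the best-scoring window as a log-odds sum vs. uniform background."""
    w = len(pwm)
    best = (float("-inf"), -1)
    for s in range(len(seq) - w + 1):
        score = sum(math.log2(pwm[k].get(seq[s + k], 1e-6) / background)
                    for k in range(w))
        best = max(best, (score, s))
    return best
```

Genome-wide target prediction then combines such scores with positional and cross-species conservation filters, as the abstract describes.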

  6. Efficient and Accurate Computational Framework for Injector Design and Analysis, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — CFD codes used to simulate upper stage expander cycle engines are not adequately mature to support design efforts. Rapid and accurate simulations require more...

  7. Sentinel nodes identified by computed tomography-lymphography accurately stage the axilla in patients with breast cancer

    International Nuclear Information System (INIS)

    Motomura, Kazuyoshi; Sumino, Hiroshi; Noguchi, Atsushi; Horinouchi, Takashi; Nakanishi, Katsuyuki

    2013-01-01

    Sentinel node biopsy often results in the identification and removal of multiple nodes as sentinel nodes, although most of these nodes could be non-sentinel nodes. This study investigated whether computed tomography-lymphography (CT-LG) can distinguish sentinel nodes from non-sentinel nodes and whether sentinel nodes identified by CT-LG can accurately stage the axilla in patients with breast cancer. This study included 184 patients with breast cancer and clinically negative nodes. Contrast agent was injected interstitially. The location of sentinel nodes was marked on the skin surface using a CT laser light navigator system. Lymph nodes located just under the marks were first removed as sentinel nodes. Then, all dyed nodes or all hot nodes were removed. The mean number of sentinel nodes identified by CT-LG was significantly lower than that of dyed and/or hot nodes removed (1.1 vs 1.8, p < 0.0001). Twenty-three (12.5%) patients had ≥2 sentinel nodes identified by CT-LG removed, whereas 94 (51.1%) had ≥2 dyed and/or hot nodes removed (p < 0.0001). Pathological evaluation demonstrated that 47 (25.5%) of 184 patients had metastasis to at least one node. All 47 patients demonstrated metastases to at least one of the sentinel nodes identified by CT-LG. CT-LG can distinguish sentinel nodes from non-sentinel nodes, and sentinel nodes identified by CT-LG can accurately stage the axilla in patients with breast cancer. Successful identification of sentinel nodes using CT-LG may facilitate image-based diagnosis of metastasis, possibly leading to the omission of sentinel node biopsy.

  8. High accurate time system of the Low Latitude Meridian Circle.

    Science.gov (United States)

    Yang, Jing; Wang, Feng; Li, Zhiming

    In order to obtain a highly accurate time signal for the Low Latitude Meridian Circle (LLMC), a new GPS accurate time system was developed, which includes a GPS receiver, a 1 MHz frequency source and a self-made clock system. The second pulse of the GPS is used to synchronize the clock system, and the information can be collected by a computer automatically. The difficulty of dispensing with the time keeper can be overcome by using this system.

  9. Late enhanced computed tomography in Hypertrophic Cardiomyopathy enables accurate left-ventricular volumetry

    Energy Technology Data Exchange (ETDEWEB)

    Langer, Christoph; Lutz, M.; Kuehl, C.; Frey, N. [Christian-Albrechts-Universitaet Kiel, Department of Cardiology, Angiology and Critical Care Medicine, University Medical Center Schleswig-Holstein (Germany); Partner Site Hamburg/Kiel/Luebeck, DZHK (German Centre for Cardiovascular Research), Kiel (Germany); Both, M.; Sattler, B.; Jansen, O; Schaefer, P. [Christian-Albrechts-Universitaet Kiel, Department of Diagnostic Radiology, University Medical Center Schleswig-Holstein (Germany); Harders, H.; Eden, M. [Christian-Albrechts-Universitaet Kiel, Department of Cardiology, Angiology and Critical Care Medicine, University Medical Center Schleswig-Holstein (Germany)

    2014-10-15

    Late enhancement (LE) multi-slice computed tomography (leMDCT) was introduced for the visualization of (intra-) myocardial fibrosis in Hypertrophic Cardiomyopathy (HCM). LE is associated with adverse cardiac events. This analysis focuses on leMDCT derived LV muscle mass (LV-MM) which may be related to LE resulting in LE proportion for potential risk stratification in HCM. N = 26 HCM patients underwent leMDCT (64-slice-CT) and cardiovascular magnetic resonance (CMR). In leMDCT iodine contrast (Iopromid, 350 mg/mL; 150 mL) was injected 7 minutes before imaging. Reconstructed short cardiac axis views served for planimetry. The study group was divided into three groups of varying LV-contrast. LeMDCT was correlated with CMR. The mean age was 64.2 ± 14 years. The groups of varying contrast differed in weight and body mass index (p < 0.05). In the group with good LV-contrast assessment of LV-MM resulted in 147.4 ± 64.8 g in leMDCT vs. 147.1 ± 65.9 in CMR (p > 0.05). In the group with sufficient contrast LV-MM appeared with 172 ± 30.8 g in leMDCT vs. 165.9 ± 37.8 in CMR (p > 0.05). Overall intra-/inter-observer variability of semiautomatic assessment of LV-MM showed an accuracy of 0.9 ± 8.6 g and 0.8 ± 9.2 g in leMDCT. All leMDCT-measures correlated well with CMR (r > 0.9). LeMDCT primarily performed for LE-visualization in HCM allows for accurate LV-volumetry including LV-MM in > 90 % of the cases. (orig.)

  10. Gene prediction in metagenomic fragments: A large scale machine learning approach

    Directory of Open Access Journals (Sweden)

    Morgenstern Burkhard

    2008-04-01

    Full Text Available Abstract Background Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of the fragments (Sanger sequencing, for example, yields 700 bp fragments on average) and the unknown phylogenetic origin of most fragments require approaches to gene prediction that differ from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. 
Conclusion Large scale machine learning methods are well-suited for gene
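
The two-stage architecture described in this record can be sketched in miniature: a stage-1 linear discriminant over codon usage feeds, together with ORF length and GC content, into a single logistic unit standing in for the neural network. The weights and codon scores below are hypothetical placeholders, not trained values from the paper.

```python
import math

def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the fragment."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def codon_score(seq: str, weights: dict[str, float]) -> float:
    """Stage 1: a linear discriminant over monocodon usage
    (the weights here are hypothetical, not trained)."""
    s = 0.0
    for i in range(0, len(seq) - 2, 3):
        s += weights.get(seq[i:i + 3].upper(), 0.0)
    return s / max(1, len(seq) // 3)

def coding_probability(seq: str, weights: dict[str, float],
                       w_codon=2.0, w_len=0.01, w_gc=1.0, bias=-2.0) -> float:
    """Stage 2: combine the stage-1 feature with ORF length and GC content
    through a logistic unit (a one-neuron stand-in for the ANN)."""
    x = (w_codon * codon_score(seq, weights)
         + w_len * len(seq) + w_gc * gc_content(seq) + bias)
    return 1.0 / (1.0 + math.exp(-x))
```

A candidate ORF whose codons match the discriminant weights receives a higher coding probability than the same sequence scored with no codon signal.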

  11. Accurate computation of transfer maps from magnetic field data

    International Nuclear Information System (INIS)

    Venturini, Marco; Dragt, Alex J.

    1999-01-01

    Consider an arbitrary beamline magnet. Suppose one component (for example, the radial component) of the magnetic field is known on the surface of some imaginary cylinder coaxial to and contained within the magnet aperture. This information can be obtained either by direct measurement or by computation with the aid of some 3D electromagnetic code. Alternatively, suppose that the field harmonics have been measured by using a spinning coil. We describe how this information can be used to compute the exact transfer map for the beamline element. This transfer map takes into account all effects of real beamline elements including fringe-field, pseudo-multipole, and real multipole error effects. The method we describe automatically takes into account the smoothing properties of the Laplace-Green function. Consequently, it is robust against both measurement and electromagnetic code errors. As an illustration we apply the method to the field analysis of high-gradient interaction region quadrupoles in the Large Hadron Collider (LHC)
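
A first step toward the transfer-map computation described above is extracting the field harmonics from the surface data. A minimal sketch (not the authors' map-generation code), assuming the radial field is sampled at uniformly spaced azimuthal points on the reference cylinder, is a discrete Fourier decomposition:

```python
import cmath
import math

def field_harmonics(br_samples: list[float], max_n: int) -> dict[int, complex]:
    """Discrete Fourier decomposition of B_r(phi) sampled uniformly on the
    imaginary cylinder; harmonic n corresponds to the 2n-pole content."""
    m = len(br_samples)
    out = {}
    for n in range(1, max_n + 1):
        c = sum(b * cmath.exp(-1j * n * 2 * math.pi * k / m)
                for k, b in enumerate(br_samples)) * (2.0 / m)
        out[n] = c
    return out
```

A pure cos(2φ) field, for instance, shows up entirely in the n = 2 (quadrupole) coefficient.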

  12. An efficient and accurate method for calculating nonlinear diffraction beam fields

    Energy Technology Data Exchange (ETDEWEB)

    Jeong, Hyun Jo; Cho, Sung Jong; Nam, Ki Woong; Lee, Jang Hyun [Division of Mechanical and Automotive Engineering, Wonkwang University, Iksan (Korea, Republic of)

    2016-04-15

    This study develops an efficient and accurate method for calculating nonlinear diffraction beam fields propagating in fluids or solids. The Westervelt equation and quasilinear theory, from which the integral solutions for the fundamental and second harmonics can be obtained, are first considered. A computationally efficient method is then developed using a multi-Gaussian beam (MGB) model that easily separates the diffraction effects from the plane wave solution. The MGB models provide accurate beam fields when compared with the integral solutions for a number of transmitter-receiver geometries. These models can also serve as fast, powerful modeling tools for many nonlinear acoustics applications, especially in making diffraction corrections for the nonlinearity parameter determination, because of their computational efficiency and accuracy.

  13. Computational inference of replication and transcription activator regulator activity in herpesvirus from gene expression data

    NARCIS (Netherlands)

    Recchia, A.; Wit, E.; Vinciotti, V.; Kellam, P.

    One of the main aims of systems biology is to understand the structure and dynamics of genomic systems. A computational approach, facilitated by new technologies for high-throughput quantitative experimental data, is put forward to investigate the regulatory system of dynamic interaction among genes.

  14. Exploring the relationship between fractal features and bacterial essential genes

    International Nuclear Information System (INIS)

    Yu Yong-Ming; Yang Li-Cai; Zhao Lu-Lu; Liu Zhi-Ping; Zhou Qian

    2016-01-01

    Essential genes are indispensable for the survival of an organism in optimal conditions. Rapid and accurate identifications of new essential genes are of great theoretical and practical significance. Exploring features with predictive power is fundamental for this. Here, we calculate six fractal features from primary gene and protein sequences and then explore their relationship with gene essentiality by statistical analysis and machine learning-based methods. The models are applied to all the currently available identified genes in 27 bacteria from the database of essential genes (DEG). It is found that the fractal features of essential genes generally differ from those of non-essential genes. The fractal features are used to ascertain the parameters of two machine learning classifiers: Naïve Bayes and Random Forest. The area under the curve (AUC) of both classifiers shows that each fractal feature is satisfactorily discriminative between essential genes and non-essential genes individually. Although significant correlations exist among fractal features, gene essentiality can also be reliably predicted by various combinations of them. Thus, the fractal features analyzed in our study can be used not only to construct a good essentiality classifier alone, but also to be significant contributors for computational tools identifying essential genes. (paper)
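
Fractal features of DNA sequences are commonly derived from a chaos-game representation (CGR) of the sequence. The sketch below computes one such feature, a box-counting dimension estimate over the CGR point set; it is illustrative only and not necessarily one of the six features used in the paper.

```python
import math

# Standard CGR corner assignment for the four bases in the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    """Chaos-game representation: each base moves the point halfway
    toward its corner, starting from the centre of the unit square."""
    x, y = 0.5, 0.5
    pts = []
    for base in seq.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        pts.append((x, y))
    return pts

def box_counting_dimension(pts, scales=(2, 4, 8, 16)):
    """Least-squares slope of log N(s) versus log s over dyadic grids,
    where N(s) is the number of occupied s-by-s boxes."""
    xs, ys = [], []
    for s in scales:
        boxes = {(int(x * s), int(y * s)) for x, y in pts}
        xs.append(math.log(s))
        ys.append(math.log(len(boxes)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))
```

A sequence whose k-mers cover the alphabet fills the square (dimension near 2), while a homopolymer collapses onto a curve-like set with a much lower estimate.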

  15. Computational Prediction of MicroRNAs from Toxoplasma gondii Potentially Regulating the Hosts’ Gene Expression

    Directory of Open Access Journals (Sweden)

    Müşerref Duygu Saçar

    2014-10-01

    Full Text Available MicroRNAs (miRNAs) were discovered two decades ago, yet there is still a great need for further studies elucidating their genesis and targeting in different phyla. Since experimental discovery and validation of miRNAs is difficult, computational predictions are indispensable and today most computational approaches employ machine learning. Toxoplasma gondii, a parasite residing within the cells of its hosts, such as humans, uses miRNAs for its post-transcriptional gene regulation. It may also regulate its hosts’ gene expression, which has been shown in brain cancer. Since previous studies have shown that overexpressed miRNAs within the host are causal for disease onset, we hypothesized that T. gondii could export miRNAs into its host cell. We computationally predicted all hairpins from the genome of T. gondii and used mouse and human models to filter possible candidates. These were then further compared to known miRNAs in human and rodents and their expression was examined for T. gondii grown in mouse and human hosts, respectively. We found that among the millions of potential hairpins in T. gondii, only a few thousand pass filtering using a human or mouse model and that even fewer of those are expressed. Since they are expressed and differentially expressed in rodents and human, we suggest that there is a chance that T. gondii may export miRNAs into its hosts for direct regulation.

  16. Computational analysis of candidate disease genes and variants for Salt-sensitive hypertension in indigenous Southern Africans

    KAUST Repository

    Tiffin, Nicki

    2010-09-27

    Multiple factors underlie susceptibility to essential hypertension, including a significant genetic and ethnic component, and environmental effects. Blood pressure response of hypertensive individuals to salt is heterogeneous, but salt sensitivity appears more prevalent in people of indigenous African origin. The underlying genetics of salt-sensitive hypertension, however, are poorly understood. In this study, computational methods including text- and data-mining have been used to select and prioritize candidate aetiological genes for salt-sensitive hypertension. Additionally, we have compared allele frequencies and copy number variation for single nucleotide polymorphisms in candidate genes between indigenous Southern African and Caucasian populations, with the aim of identifying candidate genes with significant variability between the population groups: identifying genetic variability between population groups can exploit ethnic differences in disease prevalence to aid with prioritisation of good candidate genes. Our top-ranking candidate genes include parathyroid hormone precursor (PTH) and type-1 angiotensin II receptor (AGTR1). We propose that the candidate genes identified in this study warrant further investigation as potential aetiological genes for salt-sensitive hypertension. © 2010 Tiffin et al.
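
Comparing allele frequencies between two populations, as done in this study, reduces to a test on a 2×2 table of allele counts. A minimal sketch using the Pearson chi-square statistic (illustrative only, not the authors' pipeline; counts are hypothetical):

```python
def allele_chi_square(ref1: int, alt1: int, ref2: int, alt2: int) -> float:
    """Pearson chi-square for a 2x2 table of allele counts.

    Rows: population 1 / population 2; columns: reference / alternative allele.
    A large statistic (e.g. above 3.84, the 5% critical value with 1 df)
    flags a SNP whose frequency differs between the populations.
    """
    table = [[ref1, alt1], [ref2, alt2]]
    total = ref1 + alt1 + ref2 + alt2
    row = [ref1 + alt1, ref2 + alt2]
    col = [ref1 + ref2, alt1 + alt2]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2
```

Identical allele proportions give a statistic of zero; a strongly divergent SNP gives a value far above the 1-df critical value.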

  17. Computational neurogenetic modeling

    CERN Document Server

    Benuskova, Lubica

    2010-01-01

    Computational Neurogenetic Modeling is a student text, introducing the scope and problems of a new scientific discipline - Computational Neurogenetic Modeling (CNGM). CNGM is concerned with the study and development of dynamic neuronal models for modeling brain functions with respect to genes and dynamic interactions between genes. These include neural network models and their integration with gene network models. This new area brings together knowledge from various scientific disciplines, such as computer and information science, neuroscience and cognitive science, genetics and molecular biol

  18. Experiment and theory at the convergence limit: accurate equilibrium structure of picolinic acid by gas-phase electron diffraction and coupled-cluster computations.

    Science.gov (United States)

    Vogt, Natalja; Marochkin, Ilya I; Rykov, Anatolii N

    2018-04-18

    The accurate molecular structure of picolinic acid has been determined from experimental data and computed at the coupled cluster level of theory. Only one conformer with the O=C-C-N and H-O-C=O fragments in antiperiplanar (ap) positions, ap-ap, has been detected under conditions of the gas-phase electron diffraction (GED) experiment (Tnozzle = 375(3) K). The semiexperimental equilibrium structure, rsee, of this conformer has been derived from the GED data taking into account the anharmonic vibrational effects estimated from the ab initio force field. The equilibrium structures of the two lowest-energy conformers, ap-ap and ap-sp (with the synperiplanar H-O-C=O fragment), have been fully optimized at the CCSD(T)_ae level of theory in conjunction with the triple-ζ basis set (cc-pwCVTZ). The quality of the optimized structures has been improved due to extrapolation to the quadruple-ζ basis set. The high accuracy of both GED determination and CCSD(T) computations has been disclosed by a correct comparison of structures having the same physical meaning. The ap-ap conformer has been found to be stabilized by the relatively strong NH-O hydrogen bond of 1.973(27) Å (GED) and predicted to be lower in energy by 16 kJ mol⁻¹ with respect to the ap-sp conformer without a hydrogen bond. The influence of this bond on the structure of picolinic acid has been analyzed within the Natural Bond Orbital model. The possibility of the decarboxylation of picolinic acid has been considered in the GED analysis, but no significant amounts of pyridine and carbon dioxide could be detected. To reveal the structural changes reflecting the mesomeric and inductive effects due to the carboxylic substituent, the accurate structure of pyridine has been also computed at the CCSD(T)_ae level with basis sets from triple- to 5-ζ quality. The comprehensive structure computations for pyridine as well as for

  19. Funnel metadynamics as accurate binding free-energy method

    Science.gov (United States)

    Limongelli, Vittorio; Bonomi, Massimiliano; Parrinello, Michele

    2013-01-01

    A detailed description of the events ruling ligand/protein interaction and an accurate estimation of the drug affinity to its target is of great help in speeding drug discovery strategies. We have developed a metadynamics-based approach, named funnel metadynamics, that allows the ligand to enhance the sampling of the target binding sites and its solvated states. This method leads to an efficient characterization of the binding free-energy surface and an accurate calculation of the absolute protein–ligand binding free energy. We illustrate our protocol in two systems, benzamidine/trypsin and SC-558/cyclooxygenase 2. In both cases, the X-ray conformation has been found as the lowest free-energy pose, and the computed protein–ligand binding free energy in good agreement with experiments. Furthermore, funnel metadynamics unveils important information about the binding process, such as the presence of alternative binding modes and the role of waters. The results achieved at an affordable computational cost make funnel metadynamics a valuable method for drug discovery and for dealing with a variety of problems in chemistry, physics, and material science. PMID:23553839

  20. bc-GenExMiner 3.0: new mining module computes breast cancer gene expression correlation analyses.

    Science.gov (United States)

    Jézéquel, Pascal; Frénel, Jean-Sébastien; Campion, Loïc; Guérin-Charbonnel, Catherine; Gouraud, Wilfried; Ricolleau, Gabriel; Campone, Mario

    2013-01-01

    We recently developed a user-friendly web-based application called bc-GenExMiner (http://bcgenex.centregauducheau.fr), which offered the possibility to evaluate prognostic informativity of genes in breast cancer by means of a 'prognostic module'. In this study, we develop a new module called 'correlation module', which includes three kinds of gene expression correlation analyses. The first one computes correlation coefficient between 2 or more (up to 10) chosen genes. The second one produces two lists of genes that are most correlated (positively and negatively) to a 'tested' gene. A gene ontology (GO) mining function is also proposed to explore GO 'biological process', 'molecular function' and 'cellular component' terms enrichment for the output lists of most correlated genes. The third one explores gene expression correlation between the 15 telomeric and 15 centromeric genes surrounding a 'tested' gene. These correlation analyses can be performed in different groups of patients: all patients (without any subtyping), in molecular subtypes (basal-like, HER2+, luminal A and luminal B) and according to oestrogen receptor status. Validation tests based on published data showed that these automatized analyses lead to results consistent with studies' conclusions. In brief, this new module has been developed to help basic researchers explore molecular mechanisms of breast cancer. DATABASE URL: http://bcgenex.centregauducheau.fr
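
The first analysis in the correlation module, correlation between chosen genes, amounts to Pearson correlation over tumour samples; the second ranks all other genes by that correlation. A minimal sketch of both (an illustration, not bc-GenExMiner's implementation; the toy expression matrix is hypothetical):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def most_correlated(tested_gene, expression, top=5):
    """Rank all other genes by correlation with `tested_gene`, given a
    dict of gene -> expression vector (one value per patient sample)."""
    scores = [(g, pearson(expression[tested_gene], v))
              for g, v in expression.items() if g != tested_gene]
    scores.sort(key=lambda kv: kv[1], reverse=True)
    return scores[:top]
```

Restricting the input dict to samples of one molecular subtype reproduces the subtype-specific analyses described in the abstract.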

  1. Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis.

    Science.gov (United States)

    Masso, Majid; Vaisman, Iosif I

    2008-09-15

    Accurate predictive models for the impact of single amino acid substitutions on protein stability provide insight into protein structure and function. Such models are also valuable for the design and engineering of new proteins. Previously described methods have utilized properties of protein sequence or structure to predict the free energy change of mutants due to thermal (ΔΔG) and denaturant (ΔΔG(H2O)) denaturations, as well as mutant thermal stability (ΔT(m)), through the application of either computational energy-based approaches or machine learning techniques. However, accuracy associated with applying these methods separately is frequently far from optimal. We detail a computational mutagenesis technique based on a four-body, knowledge-based, statistical contact potential. For any mutation due to a single amino acid replacement in a protein, the method provides an empirical normalized measure of the ensuing environmental perturbation occurring at every residue position. A feature vector is generated for the mutant by considering perturbations at the mutated position and its ordered six nearest neighbors in the 3-dimensional (3D) protein structure. These predictors of stability change are evaluated by applying machine learning tools to large training sets of mutants derived from diverse proteins that have been experimentally studied and described. Predictive models based on our combined approach are either comparable to, or in many cases significantly outperform, previously published results. A web server with supporting documentation is available at http://proteins.gmu.edu/automute.
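
The feature-vector construction described above, perturbation scores at the mutated position followed by its six nearest neighbours in the 3D structure, can be sketched as follows. This is a simplified illustration: the perturbation scores in the actual method come from the four-body statistical potential, whereas here they are supplied as an arbitrary dict.

```python
import math

def mutant_feature_vector(coords, perturbations, mutated_pos):
    """Feature vector for a point mutation: the perturbation score at the
    mutated residue followed by the scores at its six nearest neighbours
    in the 3D structure, ordered by increasing distance.

    coords: residue index -> (x, y, z); perturbations: residue index -> score.
    """
    centre = coords[mutated_pos]
    others = sorted(
        (p for p in coords if p != mutated_pos),
        key=lambda p: math.dist(coords[p], centre),
    )
    return [perturbations[mutated_pos]] + [perturbations[p] for p in others[:6]]
```

The resulting 7-component vector is what a downstream classifier or regressor would consume for each mutant.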

  2. A robust and accurate approach to computing compressible multiphase flow: Stratified flow model and AUSM+-up scheme

    International Nuclear Information System (INIS)

    Chang, Chih-Hao; Liou, Meng-Sing

    2007-01-01

    In this paper, we propose a new approach to compute compressible multifluid equations. Firstly, a single-pressure compressible multifluid model based on the stratified flow model is proposed. The stratified flow model, which defines different fluids in separated regions, is shown to be amenable to the finite volume method. We can apply the conservation law to each subregion and obtain a set of balance equations. Secondly, the AUSM+ scheme, which is originally designed for the compressible gas flow, is extended to solve compressible liquid flows. By introducing additional dissipation terms into the numerical flux, the new scheme, called AUSM+-up, can be applied to both liquid and gas flows. Thirdly, the contribution to the numerical flux due to interactions between different phases is taken into account and solved by the exact Riemann solver. We will show that the proposed approach yields an accurate and robust method for computing compressible multiphase flows involving discontinuities, such as shock waves and fluid interfaces. Several one-dimensional test problems are used to demonstrate the capability of our method, including the Ransom's water faucet problem and the air-water shock tube problem. Finally, several two dimensional problems will show the capability to capture enormous details and complicated wave patterns in flows having large disparities in the fluid density and velocities, such as interactions between water shock wave and air bubble, between air shock wave and water column(s), and underwater explosion.

  3. Prostate cancer nodal oligometastasis accurately assessed using prostate-specific membrane antigen positron emission tomography-computed tomography and confirmed histologically following robotic-assisted lymph node dissection.

    Science.gov (United States)

    O'Kane, Dermot B; Lawrentschuk, Nathan; Bolton, Damien M

    2016-01-01

    We herein present a case of a 76-year-old gentleman in whom prostate-specific membrane antigen positron emission tomography-computed tomography (PSMA PET-CT) was used to accurately detect prostate cancer (PCa) pelvic lymph node (LN) metastasis in the setting of biochemical recurrence following definitive treatment for PCa. The positive PSMA PET-CT result was confirmed with histological examination of the involved pelvic LNs following pelvic LN dissection.

  4. Prostate cancer nodal oligometastasis accurately assessed using prostate-specific membrane antigen positron emission tomography-computed tomography and confirmed histologically following robotic-assisted lymph node dissection

    Directory of Open Access Journals (Sweden)

    Dermot B O′Kane

    2016-01-01

    Full Text Available We herein present a case of a 76-year-old gentleman in whom prostate-specific membrane antigen positron emission tomography-computed tomography (PSMA PET-CT) was used to accurately detect prostate cancer (PCa) pelvic lymph node (LN) metastasis in the setting of biochemical recurrence following definitive treatment for PCa. The positive PSMA PET-CT result was confirmed with histological examination of the involved pelvic LNs following pelvic LN dissection.

  5. Integrating Crop Growth Models with Whole Genome Prediction through Approximate Bayesian Computation.

    Directory of Open Access Journals (Sweden)

    Frank Technow

    Full Text Available Genomic selection, enabled by whole genome prediction (WGP methods, is revolutionizing plant breeding. Existing WGP methods have been shown to deliver accurate predictions in the most common settings, such as prediction of across environment performance for traits with additive gene effects. However, prediction of traits with non-additive gene effects and prediction of genotype by environment interaction (G×E, continues to be challenging. Previous attempts to increase prediction accuracy for these particularly difficult tasks employed prediction methods that are purely statistical in nature. Augmenting the statistical methods with biological knowledge has been largely overlooked thus far. Crop growth models (CGMs attempt to represent the impact of functional relationships between plant physiology and the environment in the formation of yield and similar output traits of interest. Thus, they can explain the impact of G×E and certain types of non-additive gene effects on the expressed phenotype. Approximate Bayesian computation (ABC, a novel and powerful computational procedure, allows the incorporation of CGMs directly into the estimation of whole genome marker effects in WGP. Here we provide a proof of concept study for this novel approach and demonstrate its use with synthetic data sets. We show that this novel approach can be considerably more accurate than the benchmark WGP method GBLUP in predicting performance in environments represented in the estimation set as well as in previously unobserved environments for traits determined by non-additive gene effects. We conclude that this proof of concept demonstrates that using ABC for incorporating biological knowledge in the form of CGMs into WGP is a very promising and novel approach to improving prediction accuracy for some of the most challenging scenarios in plant breeding and applied genetics.
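
Approximate Bayesian computation in its simplest rejection-sampling form can be sketched with a toy simulator standing in for the crop growth model. Everything below (the simulator, the prior, the tolerance) is hypothetical and only illustrates the ABC mechanism: draw parameters from the prior, simulate phenotypes, and keep draws whose simulations land close to the observations.

```python
import random

def simulate_yield(effect: float, environment: float) -> float:
    """Toy stand-in for a crop growth model: yield responds nonlinearly
    to a single 'marker effect' and an environment covariate."""
    return environment * effect / (1.0 + abs(effect))

def abc_rejection(observed, environments, n_draws=20000, tol=0.05, seed=1):
    """ABC rejection sampling: accept prior draws whose simulated
    phenotypes are within `tol` (mean absolute error) of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        effect = rng.uniform(-2.0, 2.0)  # prior on the marker effect
        sim = [simulate_yield(effect, e) for e in environments]
        err = sum(abs(s - o) for s, o in zip(sim, observed)) / len(observed)
        if err < tol:
            accepted.append(effect)
    return accepted
```

The accepted draws approximate the posterior over the marker effect; with data generated at a true effect of 1.0, their mean recovers a value near 1.0.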

  6. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing.

    Directory of Open Access Journals (Sweden)

    Hansaim Lim

    2016-10-01

    Full Text Available Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing
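
The core of a one-class collaborative filtering approach like the one described here can be sketched as weighted low-rank matrix factorization: observed drug-target pairs are fit toward 1, unobserved pairs weakly toward 0. This simplified sketch omits the dual (chemical- and protein-similarity) graph regularization that gives REMAP its name; all hyperparameters are hypothetical.

```python
import random

def one_class_mf(R, k=2, steps=200, lr=0.05, reg=0.1, w_neg=0.1, seed=0):
    """Simplified one-class weighted matrix factorization.

    R: binary drug-by-target interaction matrix (lists of 0/1).
    Observed pairs get weight 1, unobserved pairs the small weight w_neg,
    so missing entries are treated as weak negatives rather than ignored.
    Returns latent factor matrices U (drugs) and V (targets).
    """
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(m)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                w = 1.0 if R[i][j] else w_neg
                err = sum(U[i][f] * V[j][f] for f in range(k)) - R[i][j]
                for f in range(k):
                    gu = w * err * V[j][f] + reg * U[i][f]
                    gv = w * err * U[i][f] + reg * V[j][f]
                    U[i][f] -= lr * gu
                    V[j][f] -= lr * gv
    return U, V
```

After fitting, the inner product of a drug's and a target's latent factors serves as the predicted interaction score, so an unobserved pair sharing structure with observed ones scores higher than an unrelated pair.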

  7. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing.

    Science.gov (United States)

    Lim, Hansaim; Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; He, Di; Zhuang, Luke; Meng, Patrick; Xie, Lei

    2016-10-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. 
Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and

  8. Accurate Numerical Simulations Of Chemical Phenomena Involved in Energy Production and Storage with MADNESS and MPQC: ALCF-2 Early Science Program Technical Report

    Energy Technology Data Exchange (ETDEWEB)

Vázquez-Mayagoitia, Alvaro [Argonne National Lab. (ANL), Argonne, IL (United States); Hammond, Jeff R. [Argonne National Lab. (ANL), Argonne, IL (United States)

    2013-09-16

In order to solve the electronic structure of large molecular systems on petascale computers using MADNESS, a numerical toolkit, fast and accurate linear algebra implementations are required. MADNESS uses multiresolution analysis and low separation rank, which represent high-dimensional functions as tensor products of Legendre polynomials. The many tensor products make singular value decomposition and matrix multiplication the most computationally intensive operations in MADNESS. This work discusses the interfacing of Eigen3 as a C++ substitute for LAPACK and introduces Elemental for the diagonalization of large matrices. Furthermore, the present paper shows the performance of these libraries on Blue Gene/Q.

  9. Computer face-matching technology using two-dimensional photographs accurately matches the facial gestalt of unrelated individuals with the same syndromic form of intellectual disability.

    Science.gov (United States)

    Dudding-Byth, Tracy; Baxter, Anne; Holliday, Elizabeth G; Hackett, Anna; O'Donnell, Sheridan; White, Susan M; Attia, John; Brunner, Han; de Vries, Bert; Koolen, David; Kleefstra, Tjitske; Ratwatte, Seshika; Riveros, Carlos; Brain, Steve; Lovell, Brian C

    2017-12-19

Massively parallel genetic sequencing allows rapid testing of known intellectual disability (ID) genes. However, the discovery of novel syndromic ID genes requires molecular confirmation in at least a second individual or a cluster of individuals with an overlapping phenotype or similar facial gestalt. Using computer face-matching technology, we report an automated approach to matching the faces of non-identical individuals with the same genetic syndrome within a database of 3681 images [1600 images of one of 10 genetic syndrome subgroups together with 2081 control images]. Using the leave-one-out method, two research questions were specified: 1) Using two-dimensional (2D) photographs of individuals with one of 10 genetic syndromes within a database of images, did the technology correctly identify more often than expected by chance: i) a top match, ii) at least one match within the top five matches, or iii) at least one within the top 10, with an individual from the same syndrome subgroup? 2) Was there concordance between correct technology-based matches and whether two out of three clinical geneticists would have considered the diagnosis based on the image alone? The computer face-matching technology correctly identified a top match, at least one correct match in the top five, and at least one in the top 10 more often than expected by chance for all syndromes except Kabuki syndrome. Although the accuracy of the computer face-matching technology was tested on images of individuals with known syndromic forms of intellectual disability, the results of this pilot study illustrate the potential utility of face-matching technology within deep phenotyping platforms to facilitate the interpretation of DNA sequencing data for individuals who remain undiagnosed despite testing of the known developmental disorder genes.

  10. Where we stand, where we are moving: Surveying computational techniques for identifying miRNA genes and uncovering their regulatory role

    KAUST Repository

    Kleftogiannis, Dimitrios A.; Korfiati, Aigli; Theofilatos, Konstantinos A.; Likothanassis, Spiridon D.; Tsakalidis, Athanasios K.; Mavroudi, Seferina P.

    2013-01-01

Traditional biology was forced to restate some of its principles when microRNA (miRNA) genes and their regulatory role were first discovered. Typically, miRNAs are small non-coding RNA molecules which have the ability to bind to the 3' untranslated region (UTR) of their mRNA target genes for cleavage or translational repression. Existing experimental techniques for their identification and for the prediction of their target genes share some important limitations such as low coverage, time-consuming experiments and high-cost reagents. Hence, many computational methods have been proposed for these tasks to overcome these limitations. Recently, many researchers have emphasized the development of computational approaches to predict the participation of miRNA genes in regulatory networks and to analyze their transcription mechanisms. All these approaches have certain advantages and disadvantages which are described in the present survey. Our work is differentiated from existing review papers by updating the list of methodologies and emphasizing the computational issues that arise from miRNA data analysis. Furthermore, in the present survey, the various miRNA data analysis steps are treated as an integrated procedure whose aim is to uncover the regulatory role and mechanisms of the miRNA genes. This integrated view of the miRNA data analysis steps may be extremely useful for all researchers, even those who work on just a single step. © 2013 Elsevier Inc.

  11. Where we stand, where we are moving: Surveying computational techniques for identifying miRNA genes and uncovering their regulatory role

    KAUST Repository

    Kleftogiannis, Dimitrios A.

    2013-06-01

Traditional biology was forced to restate some of its principles when microRNA (miRNA) genes and their regulatory role were first discovered. Typically, miRNAs are small non-coding RNA molecules which have the ability to bind to the 3' untranslated region (UTR) of their mRNA target genes for cleavage or translational repression. Existing experimental techniques for their identification and for the prediction of their target genes share some important limitations such as low coverage, time-consuming experiments and high-cost reagents. Hence, many computational methods have been proposed for these tasks to overcome these limitations. Recently, many researchers have emphasized the development of computational approaches to predict the participation of miRNA genes in regulatory networks and to analyze their transcription mechanisms. All these approaches have certain advantages and disadvantages which are described in the present survey. Our work is differentiated from existing review papers by updating the list of methodologies and emphasizing the computational issues that arise from miRNA data analysis. Furthermore, in the present survey, the various miRNA data analysis steps are treated as an integrated procedure whose aim is to uncover the regulatory role and mechanisms of the miRNA genes. This integrated view of the miRNA data analysis steps may be extremely useful for all researchers, even those who work on just a single step. © 2013 Elsevier Inc.

  12. The Rice Enhancer of Zeste [E(z)] Genes SDG711 and SDG718 are respectively involved in Long Day and Short Day Signaling to Mediate the Accurate Photoperiod Control of Flowering Time

    Directory of Open Access Journals (Sweden)

Xiaoyun Liu

    2014-10-01

Recent advances in rice flowering studies have shown that the accurate control of flowering by photoperiod is regulated by key mechanisms that involve the regulation of flowering genes including Hd1, Ehd1, Hd3a, and RFT1. The chromatin mechanisms involved in the regulation of rice flowering genes are presently not well known. Here we show that the rice E(z) genes SDG711 and SDG718, which encode the Polycomb Repressive Complex 2 (PRC2) key subunit required for trimethylation of histone H3 lysine 27 (H3K27me3), are respectively involved in long day (LD) and short day (SD) regulation of key flowering genes. The expression of SDG711 and SDG718 is induced by LD and SD, respectively. Over-expression and down-regulation of SDG711 respectively repressed and promoted flowering in LD, but had no effect in SD. By contrast, down-regulation of SDG718 had no effect in LD but delayed flowering in SD. SDG711 and SDG718 repressed OsLF (a repressor of Hd1) in LD and SD respectively, leading to higher expression of Hd1 and thus late flowering in LD and early flowering in SD. SDG711 was also found to be involved in the repression of Ehd1 in LD. SDG711 was shown to directly target the OsLF and Ehd1 loci to mediate H3K27me3 and gene repression. The function of the rice E(z) genes in LD repression and SD promotion of flowering suggests that PRC2-mediated epigenetic repression of gene expression is involved in the accurate photoperiod control of rice flowering.

  13. Multidetector row computed tomography may accurately estimate plaque vulnerability. Does MDCT accurately estimate plaque vulnerability? (Pro)

    International Nuclear Information System (INIS)

    Komatsu, Sei; Imai, Atsuko; Kodama, Kazuhisa

    2011-01-01

Over the past decade, multidetector row computed tomography (MDCT) has become the most reliable and established of the noninvasive examination techniques for detecting coronary heart disease, and it is now approaching intravascular ultrasound (IVUS) in terms of spatial resolution. Among the components of vulnerable plaque, MDCT may detect lipid-rich plaque, the lipid pool, and calcified spots using the computed tomography number. Plaque components are detected by MDCT with high accuracy compared with IVUS and angioscopy when assessing vulnerable plaque. The TWINS study and TOGETHAR trial demonstrated that angioscopic loss of yellow color occurred independently of volumetric plaque change under statin therapy. These two studies showed that plaque stabilization and regression reflect independent processes mediated by different mechanisms and time courses. Noncalcified plaque and/or low-density plaque was found to be the strongest predictor of cardiac events, regardless of lesion severity, and acts as a potential marker of plaque vulnerability. MDCT may be an effective tool for early triage of patients with chest pain who have a normal electrocardiogram (ECG) and cardiac enzymes in the emergency department. MDCT has the potential to analyze coronary plaque quantitatively and qualitatively if some remaining problems are resolved, and it may become an essential tool for detecting and preventing coronary artery disease in the future. (author)

  14. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    Directory of Open Access Journals (Sweden)

    Kevin R Ramkissoon

The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  15. Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

    Science.gov (United States)

    Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

    2013-01-01

The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392

  16. Accurate Bit Error Rate Calculation for Asynchronous Chaos-Based DS-CDMA over Multipath Channel

    Science.gov (United States)

    Kaddoum, Georges; Roviras, Daniel; Chargé, Pascal; Fournier-Prunaret, Daniele

    2009-12-01

An accurate approach to computing the bit error rate expression for a multiuser chaos-based DS-CDMA system is presented in this paper. For a more realistic communication system, a slow-fading multipath channel is considered, together with a simple RAKE receiver structure. Based on the bit energy distribution, this approach gives accurate results at low computational cost compared with other computation methods in the literature. Perfect estimation of the channel coefficients with the associated delays, and chaos synchronization, are assumed. The bit error rate is derived in terms of the bit energy distribution, the number of paths, the noise variance, and the number of users. Results are illustrated by theoretical calculations and numerical simulations, which point out the accuracy of our approach.

  17. Fast sweeping algorithm for accurate solution of the TTI eikonal equation using factorization

    KAUST Repository

    bin Waheed, Umair

    2017-06-10

Traveltime computation is essential for many seismic data processing applications and velocity analysis tools. High-resolution seismic imaging requires eikonal solvers to account for anisotropy whenever it significantly affects the seismic wave kinematics. Moreover, the computation of auxiliary quantities, such as amplitude and take-off angle, relies on highly accurate traveltime solutions. However, the finite-difference based eikonal solution for a point-source initial condition has an upwind source-singularity at the source position, since the wavefront curvature is large near the source point. Therefore, all finite-difference solvers, even high-order ones, show inaccuracies, because the errors due to the source-singularity spread from the source point to the whole computational domain. We address the source-singularity problem for tilted transversely isotropic (TTI) eikonal solvers using factorization. We solve a sequence of factored tilted elliptically anisotropic (TEA) eikonal equations iteratively, each time updating the right-hand-side function. At each iteration, we factor the unknown TEA traveltime into two factors. One of the factors is specified analytically, such that the other factor is smooth in the source neighborhood. Through this iterative procedure we obtain an accurate solution to the TTI eikonal equation. Numerical tests show significant improvement in accuracy due to factorization. The idea can be easily extended to compute accurate traveltimes for models with lower anisotropic symmetries, such as orthorhombic, monoclinic or even triclinic media.
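
The abstract does not reproduce the factorization equations. As an illustrative sketch for the simplest (isotropic) eikonal case — the symbols $T_0$, $\tau$, $v$ below are assumptions for illustration, not taken from the paper — the multiplicative factorization that removes the point-source singularity reads:

\[
|\nabla T|^2 = \frac{1}{v^2(\mathbf{x})}, \qquad
T(\mathbf{x}) = T_0(\mathbf{x})\,\tau(\mathbf{x}), \qquad
T_0(\mathbf{x}) = \frac{|\mathbf{x}-\mathbf{x}_s|}{v(\mathbf{x}_s)},
\]
\[
\nabla T = \tau\,\nabla T_0 + T_0\,\nabla\tau
\;\;\Longrightarrow\;\;
\tau^2\,|\nabla T_0|^2 + 2\,\tau\,T_0\,(\nabla T_0 \cdot \nabla\tau) + T_0^2\,|\nabla\tau|^2 = \frac{1}{v^2(\mathbf{x})}.
\]

Since the analytic factor $T_0$ captures the known singular behavior at the source $\mathbf{x}_s$, the remaining factor $\tau$ is smooth there and can be computed accurately by a fast sweeping solver; the TTI scheme in the paper solves a sequence of such factored (elliptically anisotropic) problems with an iteratively updated right-hand side.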

  18. Absolute Hounsfield unit measurement on noncontrast computed tomography cannot accurately predict struvite stone composition.

    Science.gov (United States)

    Marchini, Giovanni Scala; Gebreselassie, Surafel; Liu, Xiaobo; Pynadath, Cindy; Snyder, Grace; Monga, Manoj

    2013-02-01

The purpose of our study was to determine, in vivo, whether single-energy noncontrast computed tomography (NCCT) can accurately predict the presence/percentage of struvite stone composition. We retrospectively searched for all patients with struvite components on stone composition analysis between January 2008 and March 2012. Inclusion criteria were NCCT prior to stone analysis and stone size ≥4 mm. A single urologist, blinded to stone composition, reviewed all NCCT to acquire stone location, dimensions, and Hounsfield unit (HU). HU density (HUD) was calculated by dividing mean HU by the stone's largest transverse diameter. Stone analysis was performed via Fourier transform infrared spectrometry. Independent sample Student's t-test and analysis of variance (ANOVA) were used to compare HU/HUD among groups. Spearman's correlation test was used to determine the correlation between HU and stone size and also HU/HUD to % of each component within the stone. Significance was considered if p<0.05. Stone size showed no significant correlation with HU (R=0.017; p=0.912) or with HUD (R=-0.20; p=0.898). Comparing pure struvite stones (n=5) with other miscellaneous stones (n=39), no difference was found for HU (p=0.09) but HUD was significantly lower for pure stones (27.9±23.6 v 72.5±55.9, respectively; p=0.006). Again, significant overlaps were seen. Pure struvite stones have significantly lower HUD than mixed struvite stones, but overlap exists. A low HUD may increase the suspicion for a pure struvite calculus.
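
The HUD definition above (mean attenuation divided by the stone's largest transverse diameter) is simple enough to state as code. A minimal sketch — the unit HU/mm is an assumption, since the abstract does not state the units:

```python
def hounsfield_unit_density(mean_hu, largest_transverse_diameter_mm):
    """HU density (HUD) as defined in the study: mean Hounsfield units
    divided by the stone's largest transverse diameter (assumed in mm)."""
    if largest_transverse_diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    return mean_hu / largest_transverse_diameter_mm
```

For example, a 10 mm stone with a mean attenuation of 510 HU has a HUD of 51 HU/mm, which by the numbers above would fall between the pure-struvite and mixed-stone means.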

  19. A molecular computational model improves the preoperative diagnosis of thyroid nodules

    International Nuclear Information System (INIS)

    Tomei, Sara; Marchetti, Ivo; Zavaglia, Katia; Lessi, Francesca; Apollo, Alessandro; Aretini, Paolo; Di Coscio, Giancarlo; Bevilacqua, Generoso; Mazzanti, Chiara

    2012-01-01

Thyroid nodules with indeterminate cytological features on fine needle aspiration (FNA) cytology have a 20% risk of thyroid cancer. The aim of the current study was to determine the diagnostic utility of an 8-gene assay to distinguish benign from malignant thyroid neoplasms. The mRNA expression level of 9 genes (KIT, SYNGR2, C21orf4, Hs.296031, DDI2, CDH1, LSM7, TC1, NATH) was analysed by quantitative PCR (q-PCR) in 93 FNA cytological samples. To evaluate the diagnostic utility of all the genes analysed, we assessed the area under the curve (AUC) for each gene individually and in combination. BRAF exon 15 status was determined by pyrosequencing. An 8-gene computational model (Neural Network Bayesian Classifier) was built and a multiple-variable analysis was then performed to assess the correlation between the markers. The AUC for each significant marker ranged between 0.625 and 0.900, thus all the significant markers, alone and in combination, can be used to distinguish between malignant and benign FNA samples. The classifier made up of KIT, CDH1, LSM7, C21orf4, DDI2, TC1, Hs.296031 and BRAF had a predictive power of 88.8%. It proved to be useful for risk stratification of the most critical cytological group of indeterminate lesions, for which there is the greatest need for accurate diagnostic markers. The genetic classification obtained with this model is highly accurate at differentiating malignant from benign thyroid lesions and might be a useful adjunct in the preoperative management of patients with thyroid nodules.

  20. A Computational Gene Expression Score for Predicting Immune Injury in Renal Allografts.

    Directory of Open Access Journals (Sweden)

    Tara K Sigdel

Whole genome microarray meta-analyses of 1030 kidney, heart, lung and liver allograft biopsies identified a common immune response module (CRM) of 11 genes that define acute rejection (AR) across different engrafted tissues. We evaluated whether the CRM genes can provide a molecular microscope to quantify graft injury in AR and predict the risk of progressive interstitial fibrosis and tubular atrophy (IFTA) in histologically normal kidney biopsies. Computational modeling was done on tissue qPCR-based gene expression measurements for the 11 CRM genes in 146 independent renal allografts from 122 unique patients with AR (n = 54) and no-AR (n = 92). 24 demographically matched patients with no-AR had 6 and 24 month paired protocol biopsies; all had histologically normal 6 month biopsies, and 12 had evidence of progressive IFTA (pIFTA) on their 24 month biopsies. Results were correlated with demographic, clinical and pathology variables. The 11-gene qPCR-based tissue CRM score (tCRM) was significantly increased in AR (5.68 ± 0.91) when compared to STA (1.29 ± 0.28; p < 0.001) and in pIFTA (7.94 ± 2.278 versus 2.28 ± 0.66; p = 0.04), with greatest significance for CXCL9 and CXCL10 in AR (p < 0.001) and CD6 (p < 0.01), CXCL9 (p < 0.05), and LCK (p < 0.01) in pIFTA. tCRM was a significant independent correlate of biopsy-confirmed AR (p < 0.001; AUC of 0.900; 95% CI = 0.705-0.903). Gene expression modeling of 6 month biopsies across 7/11 genes (CD6, INPP5D, ISG20, NKG7, PSMB9, RUNX3, and TAP1) significantly (p = 0.037) predicted the development of pIFTA at 24 months. Genome-wide tissue gene expression data mining has supported the development of a tCRM-qPCR based assay for evaluating graft immune inflammation. The tCRM score quantifies injury in AR and stratifies patients at increased risk of future pIFTA prior to any perturbation of graft function or histology.
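
The abstract reports discrimination as an AUC. As a minimal sketch of how such a figure is computed from a continuous score — the reduction of the 11-gene panel to a single per-sample number via a plain mean is a hypothetical stand-in, since the tCRM formula itself is not given here — the AUC follows directly from the Mann-Whitney rank-sum identity:

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC of a continuous score against binary labels, via the
    Mann-Whitney rank-sum identity (assumes no tied scores)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = scores.argsort()
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)   # ranks 1..n, ascending
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    rank_sum = ranks[labels == 1].sum()
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hypothetical composite score (illustration only, not the tCRM formula):
# expr: (n_samples, 11) qPCR matrix, ar_label: 1 for biopsy-confirmed AR
# tcrm = expr.mean(axis=1); auc = auc_from_scores(tcrm, ar_label)
```

The returned value equals the probability that a randomly chosen AR sample scores higher than a randomly chosen non-AR sample.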

  1. A computational method based on the integration of heterogeneous networks for predicting disease-gene associations.

    Directory of Open Access Journals (Sweden)

    Xingli Guo

The identification of disease-causing genes is a fundamental challenge in human health, is of great importance for improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and on disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases with at least one known causal gene, and it ranked the correct gene among the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.
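
The score function described above — the weighted sum of association scores between similar diseases and neighbouring genes, solved iteratively — can be sketched as a simple propagation scheme. The damping factor `alpha`, the normalization of the inputs, and the update rule below are illustrative assumptions chosen so the toy converges, not the paper's exact algorithm:

```python
import numpy as np

def prioritize(S, A, Y, alpha=0.6, tol=1e-9, max_iter=1000):
    """Iteratively propagate known disease-gene associations Y over a
    row-normalized disease-similarity matrix S and a row-normalized
    gene-network matrix A. The score of a (disease, gene) pair becomes a
    weighted sum of the scores of similar diseases and neighbouring genes,
    anchored to the known associations by the (1 - alpha) term."""
    F = Y.astype(float).copy()
    for _ in range(max_iter):
        F_new = alpha * S @ F @ A + (1 - alpha) * Y
        if np.abs(F_new - F).max() < tol:   # fixed point reached
            return F_new
        F = F_new
    return F
```

On a toy instance, a gene interacting with a known causal gene of a similar disease is ranked above an unconnected gene, which is exactly the "guilt by proximity" behaviour the method relies on.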

  2. Uniform approximation is more appropriate for Wilcoxon Rank-Sum Test in gene set analysis.

    Directory of Open Access Journals (Sweden)

    Zhide Fang

Gene set analysis is widely used to facilitate biological interpretation in analyses of differential expression from high-throughput profiling data. The Wilcoxon Rank-Sum (WRS) test is one of the commonly used methods in gene set enrichment analysis. It compares the ranks of genes in a gene set against those of genes outside the gene set. This method is easy to implement and it eliminates the dichotomization of genes into significant and non-significant in competitive hypothesis testing. Due to the large number of genes being examined, it is impractical to calculate the exact null distribution for the WRS test, so the normal distribution is commonly used as an approximation. However, as we demonstrate in this paper, the normal approximation is problematic when a gene set with a relatively small number of genes is tested against the large number of genes in the complementary set. In this situation, a uniform approximation is substantially more powerful, more accurate, and computationally cheaper. We demonstrate the advantage of the uniform approximation in Gene Ontology (GO) term analysis using simulations and real data sets.
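
One plausible reading of the uniform approximation (an assumption about the paper's construction, not a restatement of it): under the null, the rank-sum of a gene set of size n among N genes, scaled by N, behaves approximately like a sum of n iid Uniform(0,1) variables when n is much smaller than N, and the exact distribution of that sum (Irwin-Hall) is cheap to evaluate. A sketch under that assumption, compared against the usual normal approximation:

```python
from math import comb, erf, factorial, floor, sqrt

def irwin_hall_cdf(x, n):
    """Exact CDF of the sum of n iid Uniform(0,1) random variables."""
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    return sum((-1) ** k * comb(n, k) * (x - k) ** n
               for k in range(floor(x) + 1)) / factorial(n)

def wrs_pvalues(rank_sum, n, N):
    """Upper-tail p-value of the WRS statistic for a gene set of size n
    among N genes: uniform (Irwin-Hall) vs. normal approximation."""
    # uniform approximation: W/N ~ sum of n Uniform(0,1) under the null
    p_unif = 1.0 - irwin_hall_cdf(rank_sum / N, n)
    # classical normal approximation of the rank-sum
    mean = n * (N + 1) / 2.0
    var = n * (N - n) * (N + 1) / 12.0
    z = (rank_sum - mean) / sqrt(var)
    p_norm = 0.5 * (1.0 - erf(z / sqrt(2.0)))
    return p_unif, p_norm
```

For small n the Irwin-Hall tail captures the bounded, skew-free support of the scaled rank-sum exactly, which is where the normal tail misbehaves.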

  3. ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information.

    Science.gov (United States)

    Lachmann, Alexander; Giorgi, Federico M; Lopez, Gonzalo; Califano, Andrea

    2016-07-15

    The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space. Here, we present a completely new implementation of the algorithm, based on an Adaptive Partitioning strategy (AP) for estimating the Mutual Information. The new AP implementation (ARACNe-AP) achieves a dramatic improvement in computational performance (200× on average) over the previous methodology, while preserving the Mutual Information estimator and the Network inference accuracy of the original algorithm. Given that the previous version of ARACNe is extremely demanding, the new version of the algorithm will allow even researchers with modest computational resources to build complex regulatory networks from hundreds of gene expression profiles. A JAVA cross-platform command line executable of ARACNe, together with all source code and a detailed usage guide are freely available on Sourceforge (http://sourceforge.net/projects/aracne-ap). JAVA version 8 or higher is required. califano@c2b2.columbia.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
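
ARACNe-AP's adaptive partitioning estimator is specified in the paper itself, not in this abstract. The sketch below implements the general idea — recursively splitting the rank-transformed sample into quadrants only while cell occupancy deviates from uniformity — with the chi-square threshold (95th percentile, 3 d.o.f.) and minimum cell size chosen as illustrative assumptions:

```python
import numpy as np

def mi_adaptive(x, y, chi2_thresh=7.815, min_pts=8):
    """Mutual information (nats) via recursive adaptive partitioning of the
    rank-transformed sample, in the spirit of ARACNe-AP. Because ranks make
    both marginals uniform, each cell's MI contribution is p*log(p/area)."""
    n = len(x)
    rx = np.argsort(np.argsort(x)) / n   # ranks mapped into [0, 1)
    ry = np.argsort(np.argsort(y)) / n

    def recurse(idx, x0, x1, y0, y1):
        k = len(idx)
        area = (x1 - x0) * (y1 - y0)
        if k == 0 or area == 0.0:
            return 0.0
        leaf = (k / n) * np.log((k / n) / area)   # plug-in cell contribution
        if k < min_pts:
            return leaf
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        right, top = rx[idx] >= xm, ry[idx] >= ym
        quads = [idx[~right & ~top], idx[right & ~top],
                 idx[~right & top], idx[right & top]]
        counts = np.array([len(q) for q in quads], dtype=float)
        chi2 = ((counts - k / 4) ** 2 / (k / 4)).sum()
        if chi2 < chi2_thresh:        # occupancy looks uniform: stop here
            return leaf
        bounds = [(x0, xm, y0, ym), (xm, x1, y0, ym),
                  (x0, xm, ym, y1), (xm, x1, ym, y1)]
        return sum(recurse(q, *b) for q, b in zip(quads, bounds))

    return max(recurse(np.arange(n), 0.0, 1.0, 0.0, 1.0), 0.0)
```

Strongly dependent samples keep triggering splits along the dependence structure and accumulate a large MI, while independent samples stop early and score near zero.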

  4. FASTSIM2: a second-order accurate frictional rolling contact algorithm

    Science.gov (United States)

    Vollebregt, E. A. H.; Wilders, P.

    2011-01-01

In this paper we consider the frictional (tangential) steady rolling contact problem. We confine ourselves to the simplified theory, instead of using full elastostatic theory, in order to be able to compute results fast, as needed for on-line application in vehicle system dynamics simulation packages. The FASTSIM algorithm is the leading technology in this field and is employed in all dominant railway vehicle system dynamics packages (VSD) in the world. The main contribution of this paper is a new version "FASTSIM2" of the FASTSIM algorithm, which is second-order accurate. This is relevant for VSD, because with the new algorithm 16 times fewer grid points are required for sufficiently accurate computations of the contact forces. The approach is based on new insights into the characteristics of the rolling contact problem when using the simplified theory, and on taking precise care of the contact conditions in the numerical integration scheme employed.

  5. Accurate calculation of Green functions on the d-dimensional hypercubic lattice

    International Nuclear Information System (INIS)

    Loh, Yen Lee

    2011-01-01

    We write the Green function of the d-dimensional hypercubic lattice in a piecewise form covering the entire real frequency axis. Each piece is a single integral involving modified Bessel functions of the first and second kinds. The smoothness of the integrand allows both real and imaginary parts of the Green function to be computed quickly and accurately for any dimension d and any real frequency, and the computational time scales only linearly with d.
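
The paper's piecewise Bessel-function representation is not reproduced in the abstract. As a related single-integral illustration (under the standard assumptions: dispersion eps(k) = sum_i cos(k_i) and a real frequency w above the band edge d), the lattice Green function at the origin admits the classical Laplace-transform form G(w) = integral_0^inf exp(-w*t) * I0(t)^d dt, which for d = 1 reduces to 1/sqrt(w^2 - 1):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ive

def green_function_origin(w, d):
    """G(w) = <1/(w - eps(k))> over the Brillouin zone, eps(k) = sum_i cos(k_i),
    valid for real w above the band edge (w > d). Uses the identity
    <exp(t*eps(k))> = I0(t)**d; ive(0, t) = exp(-t)*I0(t) keeps the
    integrand numerically stable for large t."""
    if w <= d:
        raise ValueError("requires w > d (frequency outside the band)")
    # exp(-w*t) * I0(t)**d  ==  exp(-(w-d)*t) * (exp(-t)*I0(t))**d
    integrand = lambda t: np.exp(-(w - d) * t) * ive(0, t) ** d
    val, _ = quad(integrand, 0.0, np.inf)
    return val
```

The integrand is smooth and rapidly decaying for any d, which mirrors the abstract's point that a single well-behaved integral gives fast, accurate evaluation with cost growing only linearly in the dimension.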

  6. DeepBound: accurate identification of transcript boundaries via deep convolutional neural fields

    KAUST Repository

    Shao, Mingfu; Ma, Jianzhu; Wang, Sheng

    2017-01-01

Motivation: Reconstructing the full-length expressed transcripts (a.k.a. the transcript assembly problem) from the short sequencing reads produced by the RNA-seq protocol plays a central role in identifying novel genes and transcripts as well as in studying gene expression and gene function. A crucial step in transcript assembly is to accurately determine the splicing junctions and boundaries of the expressed transcripts from the read alignments. In contrast to the splicing junctions, which can be efficiently detected from spliced reads, the problem of identifying boundaries remains open and challenging, because the signal related to boundaries is noisy and weak.

  7. DeepBound: accurate identification of transcript boundaries via deep convolutional neural fields

    KAUST Repository

    Shao, Mingfu

    2017-04-20

Motivation: Reconstructing the full-length expressed transcripts (a.k.a. the transcript assembly problem) from the short sequencing reads produced by the RNA-seq protocol plays a central role in identifying novel genes and transcripts as well as in studying gene expression and gene function. A crucial step in transcript assembly is to accurately determine the splicing junctions and boundaries of the expressed transcripts from the read alignments. In contrast to the splicing junctions, which can be efficiently detected from spliced reads, the problem of identifying boundaries remains open and challenging, because the signal related to boundaries is noisy and weak.

  8. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

    Science.gov (United States)

    O'Connor, Timothy R.; Bailey, Timothy L.

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088
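
The cross-tissue correlation step described above can be sketched directly. The data layout, the single-chromosome simplification, and the "best correlated gene wins" rule are illustrative assumptions, not the paper's full map-building procedure:

```python
import numpy as np

def link_crms_to_genes(h3k27ac, expr, crm_pos, gene_pos, window=1_000_000):
    """Associate each CRM with the nearby gene whose expression best tracks
    the CRM's H3K27ac signal across tissues.

    h3k27ac : (n_crms, n_tissues) histone-mark signal at each CRM
    expr    : (n_genes, n_tissues) gene expression in the same tissues
    crm_pos, gene_pos : genomic coordinates (same chromosome assumed here)
    Returns {crm_index: (gene_index, correlation)}.
    """
    links = {}
    for i, cpos in enumerate(crm_pos):
        best_gene, best_r = None, -np.inf
        for j, gpos in enumerate(gene_pos):
            if abs(gpos - cpos) > window:   # only genes within 1 Mbp
                continue
            r = np.corrcoef(h3k27ac[i], expr[j])[0, 1]
            if r > best_r:
                best_gene, best_r = j, r
        if best_gene is not None:
            links[i] = (best_gene, best_r)
    return links
```

Because the correlation is taken across tissues rather than across positions, a distal CRM can out-correlate the nearest gene, which is how such maps capture long-range regulatory interactions.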

  9. Accurate evaluation of exchange fields in finite element micromagnetic solvers

    Science.gov (United States)

    Chang, R.; Escobar, M. A.; Li, S.; Lubarda, M. V.; Lomakin, V.

    2012-04-01

    Quadratic basis functions (QBFs) are implemented for solving the Landau-Lifshitz-Gilbert equation via the finite element method. This involves the introduction of a set of special testing functions compatible with the QBFs for evaluating the Laplacian operator. The QBF approach leads to significantly more accurate results than conventionally used approaches based on linear basis functions. Importantly, QBFs allow reducing the error of computing the exchange field without increasing the mesh density, for both structured and unstructured meshes. Numerical examples demonstrate the feasibility of the method.
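    As a minimal 1-D illustration of why higher-order bases help (this is plain nodal interpolation, not the authors' micromagnetic FEM solver), one can compare the error of piecewise-linear and piecewise-quadratic interpolation at the same mesh density:

```python
import numpy as np

def interp_error(f, n_elem, order):
    """Max |f - interpolant| on [0, 1] for piecewise-linear (order=1) or
    piecewise-quadratic (order=2) nodal interpolation on n_elem elements."""
    edges = np.linspace(0.0, 1.0, n_elem + 1)
    err = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        xs = np.linspace(a, b, 51)                   # sample points in element
        if order == 1:
            nodes = np.array([a, b])                 # element endpoints only
        else:
            nodes = np.array([a, 0.5 * (a + b), b])  # endpoints plus midpoint
        coeffs = np.polyfit(nodes, f(nodes), order)  # Lagrange interpolant
        err = max(err, np.max(np.abs(np.polyval(coeffs, xs) - f(xs))))
    return err

e_lin = interp_error(np.sin, 8, order=1)   # linear basis, 8 elements
e_quad = interp_error(np.sin, 8, order=2)  # quadratic basis, same mesh
```

    On the same mesh, the quadratic error is an order of magnitude or more below the linear error, mirroring the accuracy gain the abstract reports for QBFs.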

  10. Spatial reconstruction of single-cell gene expression data.

    Science.gov (United States)

    Satija, Rahul; Farrell, Jeffrey A; Gennert, David; Schier, Alexander F; Regev, Aviv

    2015-05-01

    Spatial localization is a key determinant of cellular fate and behavior, but methods for spatially resolved, transcriptome-wide gene expression profiling across complex tissues are lacking. RNA staining methods assay only a small number of transcripts, whereas single-cell RNA-seq, which measures global gene expression, separates cells from their native spatial context. Here we present Seurat, a computational strategy to infer cellular localization by integrating single-cell RNA-seq data with in situ RNA patterns. We applied Seurat to spatially map 851 single cells from dissociated zebrafish (Danio rerio) embryos and generated a transcriptome-wide map of spatial patterning. We confirmed Seurat's accuracy using several experimental approaches, then used the strategy to identify a set of archetypal expression patterns and spatial markers. Seurat correctly localizes rare subpopulations, accurately mapping both spatially restricted and scattered groups. Seurat will be applicable to mapping cellular localization within complex patterned tissues in diverse systems.

  11. Accurate predictions for the LHC made easy

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    The data recorded by the LHC experiments are of very high quality. To get the most out of the data, precise theory predictions, including uncertainty estimates, are needed to reduce as much as possible the theoretical bias in the experimental analyses. Recently, significant progress has been made in Next-to-Leading Order (NLO) computations, including matching to the parton shower, allowing for accurate, hadron-level predictions. I shall discuss one of these efforts, the MadGraph5_aMC@NLO program, which aims at the complete automation of predictions at NLO accuracy within the SM as well as New Physics theories. I will illustrate some of the theoretical ideas behind this program, show selected applications to LHC physics, and describe future plans.

  12. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

    Directory of Open Access Journals (Sweden)

    Dewey Colin N

    2011-08-01

    Full Text Available Abstract Background RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. Results We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. Conclusions RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost

  13. Impacts of Nonsynonymous Single Nucleotide Polymorphisms of Adiponectin Receptor 1 Gene on Corresponding Protein Stability: A Computational Approach

    Directory of Open Access Journals (Sweden)

    Md. Abu Saleh

    2016-01-01

    Full Text Available Despite the reported association of adiponectin receptor 1 (ADIPOR1) gene mutations with vulnerability to several human metabolic diseases, there is a lack of computational analysis of the functional and structural impacts of single nucleotide polymorphisms (SNPs) of the human ADIPOR1 at the protein level. Therefore, sequence- and structure-based computational tools were employed in this study to functionally and structurally characterize the coding nsSNPs of the ADIPOR1 gene listed in the dbSNP database. Our in silico analysis by SIFT, nsSNPAnalyzer, PolyPhen-2, Fathmm, I-Mutant 2.0, SNPs&GO, PhD-SNP, PANTHER, and SNPeffect tools identified the nsSNPs with distorting functional impacts, namely, rs765425383 (A348G), rs752071352 (H341Y), rs759555652 (R324L), rs200326086 (L224F), and rs766267373 (L143P), from 74 nsSNPs of the ADIPOR1 gene. Finally, the aforementioned five deleterious nsSNPs were introduced using the Swiss-PDB Viewer package within the X-ray crystal structure of the ADIPOR1 protein, and changes in free energy for these mutations were computed. Although increased free energy was observed for all the mutants, the nsSNP H341Y caused the highest energy increase amongst all. RMSD and TM scores predicted that the mutants were structurally similar to the wild-type protein. Our analyses suggested that the aforementioned variants, especially H341Y, could directly or indirectly destabilize the amino acid interactions and hydrogen bonding networks of ADIPOR1.

  14. An approach for reduction of false predictions in reverse engineering of gene regulatory networks.

    Science.gov (United States)

    Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar

    2018-05-14

    A gene regulatory network discloses the regulatory interactions amongst genes at a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools offers a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, the computational methods produce many false predictions along with the correct predictions, which is unwanted. Investigations in the domain focus on identifying as many correct regulations as possible in the reverse engineering of gene regulatory networks, to make them more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have also implemented the same using a dataset ensemble approach (i.e. combining multiple datasets). We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IRMA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, the DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The obtained results are encouraging, as the proposed methodology can reduce the number of false predictions significantly, without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated, the results improve further w.r.t. the prediction of true positives.

  15. The advantage of three-dimensional computed tomography (3D-CT) for ensuring accurate bone incision in sagittal split ramus osteotomy

    Directory of Open Access Journals (Sweden)

    Coen Pramono D

    2005-03-01

    Full Text Available Functional and aesthetic dysgnathia surgery requires accurate pre-surgical planning, including the choice of surgical technique in relation to the differences in anatomical structures amongst individuals. Programs that simulate the surgery are becoming increasingly important. This can be mediated by using a surgical model and conventional x-rays, such as panoramic and cephalometric projections, or a more sophisticated method such as three-dimensional computed tomography (3D-CT). A patient who had undergone double jaw surgery with difficult anatomical landmarks is presented. In this case the mandibular foramina were located relatively high with respect to the sigmoid notches, so ensuring accurate bone incisions in the sagittal split was presumed to be difficult. A 3D-CT was made and was considered very helpful in supporting the pre-operative diagnosis.

  16. An integrated tool to study MHC region: accurate SNV detection and HLA genes typing in human MHC region using targeted high-throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Hongzhi Cao

    Full Text Available The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost-effective and high-coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed based upon the 8 annotated human MHC haplotypes, encompassing the 5 megabases (Mb) of the extended MHC region. We deployed our probes upon three genetically diverse human samples for probe set evaluation, and the sequencing data show that ∼97% of the MHC region, and over 99% of the genes in the MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into immune-mediated disease studies, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community.

  17. Reducing dose calculation time for accurate iterative IMRT planning

    International Nuclear Information System (INIS)

    Siebers, Jeffrey V.; Lauterbach, Marc; Tong, Shidong; Wu Qiuwen; Mohan, Radhe

    2002-01-01

    A time-consuming component of IMRT optimization is the dose computation required in each iteration for the evaluation of the objective function. Accurate superposition/convolution (SC) and Monte Carlo (MC) dose calculations are currently considered too time-consuming for iterative IMRT dose calculation. Thus, fast but less accurate algorithms such as pencil beam (PB) algorithms are typically used in most current IMRT systems. This paper describes two hybrid methods that utilize the speed of fast PB algorithms yet achieve the accuracy of optimizing based upon SC algorithms via the application of dose correction matrices. In one method, the ratio method, an infrequently computed voxel-by-voxel dose ratio matrix (R = D_SC/D_PB) is applied for each beam to the dose distributions calculated with the PB method during the optimization. That is, D_PB × R is used for the dose calculation during the optimization. The optimization proceeds until both the IMRT beam intensities and the dose correction ratio matrix converge. In the second method, the correction method, a periodically computed voxel-by-voxel correction matrix for each beam, defined to be the difference between the SC and PB dose computations, is used to correct PB dose distributions. To validate the methods, IMRT treatment plans developed with the hybrid methods are compared with those obtained when the SC algorithm is used for all optimization iterations and with those obtained when PB-based optimization is followed by SC-based optimization. In the 12 patient cases studied, no clinically significant differences exist in the final treatment plans developed with each of the dose computation methodologies. However, the number of time-consuming SC iterations is reduced from 6-32 for pure SC optimization to four or less for the ratio matrix method and five or less for the correction method. Because the PB algorithm is faster at computing dose, this reduces the inverse planning optimization time for our implementation.
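    The ratio method can be sketched with stand-in dose operators; the linear "kernels" and every name below are invented for illustration and are not the paper's dose engines:

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels, n_beamlets = 50, 10

# Stand-in linear "dose engines" (illustrative only): a fast but biased
# pencil-beam (PB) kernel, and an accurate superposition/convolution (SC)
# kernel that differs from PB by a voxel-dependent factor.
kernel_pb = rng.random((n_voxels, n_beamlets))
kernel_sc = kernel_pb * rng.uniform(0.9, 1.1, (n_voxels, 1))

def dose_pb(x):  # cheap dose computation
    return kernel_pb @ x

def dose_sc(x):  # expensive dose computation
    return kernel_sc @ x

intensity = np.ones(n_beamlets)
ratio = np.ones(n_voxels)                  # R = D_SC / D_PB
for it in range(20):
    if it % 10 == 0:                       # recompute R only occasionally
        d_pb, d_sc = dose_pb(intensity), dose_sc(intensity)
        ratio = np.divide(d_sc, d_pb, out=np.ones_like(d_sc), where=d_pb > 0)
    dose = dose_pb(intensity) * ratio      # D_PB x R: SC-corrected PB dose
    # ...a gradient step updating `intensity` against the plan objective
    # would go here; omitted to keep the sketch focused on the correction...
```

    Each optimization iteration pays only for the cheap PB evaluation, while the occasional SC call keeps the corrected dose tracking the accurate computation.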

  18. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes

    Science.gov (United States)

    Hua, Zhi-Gang; Lin, Yan; Yuan, Ya-Zhou; Yang, De-Chang; Wei, Wen; Guo, Feng-Biao

    2015-01-01

    In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions. PMID:25977299

  19. A molecular computational model improves the preoperative diagnosis of thyroid nodules

    Directory of Open Access Journals (Sweden)

    Tomei Sara

    2012-09-01

    Full Text Available Abstract Background Thyroid nodules with indeterminate cytological features on fine needle aspiration (FNA) cytology have a 20% risk of thyroid cancer. The aim of the current study was to determine the diagnostic utility of an 8-gene assay to distinguish benign from malignant thyroid neoplasms. Methods The mRNA expression level of 9 genes (KIT, SYNGR2, C21orf4, Hs.296031, DDI2, CDH1, LSM7, TC1, NATH) was analysed by quantitative PCR (q-PCR) in 93 FNA cytological samples. To evaluate the diagnostic utility of all the genes analysed, we assessed the area under the curve (AUC) for each gene individually and in combination. BRAF exon 15 status was determined by pyrosequencing. An 8-gene computational model (Neural Network Bayesian Classifier) was built and a multiple-variable analysis was then performed to assess the correlation between the markers. Results The AUC for each significant marker ranged between 0.625 and 0.900; thus all the significant markers, alone and in combination, can be used to distinguish between malignant and benign FNA samples. The classifier made up of KIT, CDH1, LSM7, C21orf4, DDI2, TC1, Hs.296031 and BRAF had a predictive power of 88.8%. It proved to be useful for risk stratification of the most critical cytological group of indeterminate lesions, for which there is the greatest need of accurate diagnostic markers. Conclusion The genetic classification obtained with this model is highly accurate at differentiating malignant from benign thyroid lesions and might be a useful adjunct in the preoperative management of patients with thyroid nodules.

  20. A BAYESIAN NONPARAMETRIC MIXTURE MODEL FOR SELECTING GENES AND GENE SUBNETWORKS.

    Science.gov (United States)

    Zhao, Yize; Kang, Jian; Yu, Tianwei

    2014-06-01

    It is very challenging to select informative features from tens of thousands of measured features in high-throughput data analysis. Recently, several parametric/regression models have been developed utilizing gene network information to select genes or pathways strongly associated with a clinical/biological outcome. Alternatively, in this paper, we propose a nonparametric Bayesian model for gene selection incorporating network information. In addition to identifying genes that have a strong association with a clinical outcome, our model can select genes with particular expressional behavior, in which case the regression models are not directly applicable. We show that our proposed model is equivalent to an infinite mixture model, for which we develop a posterior computation algorithm based on Markov chain Monte Carlo (MCMC) methods. We also propose two fast computing algorithms that approximate the posterior simulation with good accuracy but relatively low computational cost. We illustrate our methods on simulation studies and the analysis of the Spellman yeast cell cycle microarray data.

  1. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes

    Directory of Open Access Journals (Sweden)

    Yang Yi-Fan

    2007-03-01

    Full Text Available Abstract Background Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. Results This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein-coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequences and the latter comprises several relevant features related to translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Conclusion Results of extensive tests show that MED 2.0 achieves a competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  2. Refining discordant gene trees.

    Science.gov (United States)

    Górecki, Pawel; Eulenstein, Oliver

    2014-01-01

    Evolutionary studies are complicated by discordance between gene trees and the species tree in which they evolved. Dealing with discordant trees often relies on comparison costs between gene and species trees, including the well-established Robinson-Foulds, gene duplication, and deep coalescence costs. While these costs have provided credible results for binary rooted gene trees, corresponding cost definitions for non-binary unrooted gene trees, which frequently occur in practice, are challenged by biological realism. We propose a natural extension of the well-established costs for comparing unrooted and non-binary gene trees with rooted binary species trees using a binary refinement model. For the duplication cost we describe an efficient algorithm that is based on a linear time reduction and also computes an optimal rooted binary refinement of the given gene tree. Finally, we show that similar reductions lead to solutions for computing the deep coalescence and the Robinson-Foulds costs. Our binary refinement of Robinson-Foulds, gene duplication, and deep coalescence costs for unrooted and non-binary gene trees, together with the linear time reductions provided here for computing these costs, significantly extends the range of trees that can be incorporated into approaches dealing with discordance.
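    For reference, the classic Robinson-Foulds cost on rooted trees compares the sets of leaf clusters induced by each tree. A minimal sketch using nested-tuple trees (illustrative only; the paper's contribution concerns unrooted, non-binary gene trees under a refinement model):

```python
def clusters(tree):
    """Non-trivial leaf clusters of a rooted tree given as nested tuples."""
    found = set()
    def walk(node):
        if isinstance(node, str):            # a leaf: trivial cluster, skip it
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves
    all_leaves = walk(tree)
    found.discard(all_leaves)                # the full leaf set is trivial too
    return found

def rf_distance(tree1, tree2):
    """Robinson-Foulds distance: clusters found in exactly one of the trees."""
    return len(clusters(tree1) ^ clusters(tree2))
```

    Because non-binary nodes simply induce fewer clusters, the same code accepts multifurcating trees, which hints at why refinement of such trees changes the cost.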

  3. Learning Gene Regulatory Networks Computationally from Gene Expression Data Using Weighted Consensus

    KAUST Repository

    Fujii, Chisato

    2015-04-16

    Gene regulatory networks analyze the relationships between genes, allowing us to understand the gene regulatory interactions in systems biology. Gene expression data from microarray experiments are used to obtain gene regulatory networks. However, microarray data are discrete, noisy and non-linear, which makes learning the networks a challenging problem, and existing gene network inference methods do not give consistent results. The current state-of-the-art study uses the average-ranking-based consensus method to combine and average the ranked predictions from individual methods. However, each individual method has an equal contribution to the consensus prediction. We have developed a linear programming-based consensus approach which uses weights learned by linear programming among individual methods, such that the methods have different weights depending on their performance. Our result reveals that assigning different weights to individual methods, rather than giving them equal weights, improves the performance of the consensus. The linear programming-based consensus method was evaluated and had the best performance on the in silico and Saccharomyces cerevisiae networks, and the second best on the Escherichia coli network, outperformed only by the Inferelator Pipeline method, which gives inconsistent results across a wide range of microarray data sets.
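    The difference between equal-weight and learned-weight consensus can be sketched as weighted rank aggregation. The weights below are fixed by hand purely for illustration, whereas the work above learns them by linear programming:

```python
import numpy as np

def weighted_consensus(rankings, weights):
    """Combine per-method rankings of candidate regulatory links.

    rankings : (M, E) array; rankings[m, e] is the rank method m gives link e
               (1 = most confident).
    weights  : (M,) non-negative method weights; equal weights reproduce the
               average-ranking consensus, unequal weights favour the methods
               that perform better.
    Returns link indices ordered from best to worst consensus rank.
    """
    weights = np.asarray(weights, dtype=float)
    score = (weights / weights.sum()) @ np.asarray(rankings, dtype=float)
    return list(np.argsort(score))           # lowest weighted rank first
```

    With equal weights, two methods that rank the links in opposite orders cancel out; shifting weight toward the more trusted method breaks the tie in its favour.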

  4. Stochastic Boolean networks: An efficient approach to modeling gene regulatory networks

    Directory of Open Access Journals (Sweden)

    Liang Jinghang

    2012-08-01

    Full Text Available Abstract Background Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs). As a logical model, probabilistic Boolean networks (PBNs) consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is O(nN·2^(2n)), or O(nN·2^n) for a sparse matrix. Results This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN). An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of O(nL·2^n), where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for a given evaluation accuracy increases approximately polynomially with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN. These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a
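    A toy example of the underlying PBN model (the model being simulated, not the paper's SBN encoding): each gene is updated by one of several predictor functions chosen with fixed probabilities, and the steady-state distribution can be estimated by long-run simulation:

```python
import random

# A toy two-gene probabilistic Boolean network (illustrative only).
# Each gene has candidate predictor functions with selection probabilities.
PREDICTORS = {
    0: [(lambda s: s[1], 0.7),            # gene 0 copies gene 1 ...
        (lambda s: 1 - s[1], 0.3)],       # ... or inverts it
    1: [(lambda s: s[0] | s[1], 1.0)],    # gene 1 is the OR of both genes
}

def step(state, rng):
    """One synchronous transition: sample a predictor per gene, apply all."""
    nxt = []
    for gene in sorted(PREDICTORS):
        funcs = PREDICTORS[gene]
        r, acc = rng.random(), 0.0
        chosen = funcs[-1][0]             # fallback guards float round-off
        for f, p in funcs:
            acc += p
            if r < acc:
                chosen = f
                break
        nxt.append(chosen(state))
    return tuple(nxt)

def steady_state(n_steps=50_000, burn_in=1_000, seed=0):
    """Estimate the steady-state distribution by long-run simulation."""
    rng = random.Random(seed)
    state, counts = (0, 1), {}
    for t in range(n_steps):
        state = step(state, rng)
        if t >= burn_in:
            counts[state] = counts.get(state, 0) + 1
    total = n_steps - burn_in
    return {s: c / total for s, c in counts.items()}
```

    For this network the chain settles into the states (0,1) and (1,1) with probabilities 0.3 and 0.7; exact methods instead build the 2^n x 2^n transition matrix, which is precisely the cost the SBN construction attacks.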

  5. Possible Computer Vision Systems and Automated or Computer-Aided Edging and Trimming

    Science.gov (United States)

    Philip A. Araman

    1990-01-01

    This paper discusses research which is underway to help our industry reduce costs, increase product volume and value recovery, and market more accurately graded and described products. The research is part of a team effort to help the hardwood sawmill industry automate with computer vision systems, and computer-aided or computer controlled processing. This paper...

  6. A semi-supervised learning approach to predict synthetic genetic interactions by combining functional and topological properties of functional gene network

    Directory of Open Access Journals (Sweden)

    Han Kyungsook

    2010-06-01

    Full Text Available Abstract Background Genetic interaction profiles are highly informative and helpful for understanding the functional linkages between genes, and therefore have been extensively exploited for annotating gene functions and dissecting specific pathway structures. However, our understanding is rather limited regarding the relationship between double concurrent perturbation and various higher-level phenotypic changes, e.g. those in cells, tissues or organs. Modifier screens, such as synthetic genetic arrays (SGA), can help us to understand the phenotype caused by combined gene mutations. Unfortunately, exhaustive tests on all possible combined mutations in any genome are vulnerable to combinatorial explosion and are infeasible either technically or financially. Therefore, an accurate computational approach to predict genetic interaction is highly desirable, and such methods have the potential of alleviating the bottleneck on experiment design. Results In this work, we introduce a computational systems biology approach for the accurate prediction of pairwise synthetic genetic interactions (SGI). First, a high-coverage and high-precision functional gene network (FGN) is constructed by integrating protein-protein interaction (PPI), protein complex and gene expression data; then, a graph-based semi-supervised learning (SSL) classifier is utilized to identify SGIs, where the topological properties of protein pairs in the weighted FGN are used as input features of the classifier. We compare the proposed SSL method with the state-of-the-art supervised classifier, the support vector machine (SVM), on a benchmark dataset in S. cerevisiae to validate our method's ability to distinguish synthetic genetic interactions from non-interacting gene pairs. Experimental results show that the proposed method can accurately predict genetic interactions in S. cerevisiae (with a sensitivity of 92% and specificity of 91%). Noticeably, the SSL method is more efficient than SVM, especially for

  7. Simple, accurate equations for human blood O2 dissociation computations.

    Science.gov (United States)

    Severinghaus, J W

    1979-03-01

    Hill's equation can be slightly modified to fit the standard human blood O2 dissociation curve to within ±0.0055 fractional saturation (S) over 0 < S < 1. Other modifications of Hill's equation may be used to compute Po2 (Torr) from S (Eq. 2) and the temperature coefficient of Po2 (Eq. 3). Variation of the Bohr coefficient with Po2 is given by Eq. 4.

    S = (((Po2^3 + 150·Po2)^-1 × 23,400) + 1)^-1                       (1)
    ln Po2 = 0.385 · ln((S^-1 − 1)^-1) + 3.32 − (72·S)^-1 − 0.17·S^6   (2)
    Δln Po2 / ΔT = 0.058 · ((0.243 × Po2/100)^3.88 + 1)^-1 + 0.013     (3)
    Δln Po2 / ΔpH = (Po2/26.6)^0.184 − 2.2                             (4)

    Procedures are described to determine the Po2 and S of blood iteratively after extraction or addition of a defined amount of O2, and to compute the P50 of blood from a single sample after measuring Po2, pH, and S.
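    Equations (1) and (2) translate directly into code; a minimal sketch (Po2 in Torr, S as a fraction):

```python
from math import exp, log

def o2_saturation(po2):
    """Eq. 1: fractional hemoglobin saturation S from Po2 (Torr)."""
    return 1.0 / (23_400.0 / (po2 ** 3 + 150.0 * po2) + 1.0)

def po2_from_saturation(s):
    """Eq. 2: Po2 (Torr) from fractional saturation S, 0 < S < 1."""
    return exp(0.385 * log(1.0 / (1.0 / s - 1.0))
               + 3.32 - 1.0 / (72.0 * s) - 0.17 * s ** 6)
```

    As a sanity check, Eq. 1 gives S = 0.5 near Po2 = 26.9 Torr (the standard P50), and the two equations invert each other to within a couple of percent over the physiological range.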

  8. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes.

    Science.gov (United States)

    Hua, Zhi-Gang; Lin, Yan; Yuan, Ya-Zhou; Yang, De-Chang; Wei, Wen; Guo, Feng-Biao

    2015-07-01

    In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Accurate prediction of secondary metabolite gene clusters in filamentous fungi

    DEFF Research Database (Denmark)

    Andersen, Mikael Rørdam; Nielsen, Jakob Blæsbjerg; Klitgaard, Andreas

    2013-01-01

    Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify...... used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom....

  10. Time Accurate Unsteady Pressure Loads Simulated for the Space Launch System at a Wind Tunnel Condition

    Science.gov (United States)

    Alter, Stephen J.; Brauckmann, Gregory J.; Kleb, Bil; Streett, Craig L; Glass, Christopher E.; Schuster, David M.

    2015-01-01

    Using the Fully Unstructured Three-Dimensional (FUN3D) computational fluid dynamics code, an unsteady, time-accurate flow field about a Space Launch System configuration was simulated at a transonic wind tunnel condition (Mach = 0.9). Delayed detached eddy simulation combined with Reynolds-averaged Navier-Stokes and a Spalart-Allmaras turbulence model was employed for the simulation. A second-order accurate time-evolution scheme was used to simulate the flow field, with a minimum of 0.2 seconds of simulated time and as much as 1.4 seconds. Data were collected at 480 pressure tap locations, 139 of which matched those on a 3% wind tunnel model tested in the Transonic Dynamics Tunnel (TDT) facility at NASA Langley Research Center. Comparisons between computation and experiment showed agreement within 5% in terms of location for peak RMS levels, and 20% for frequency and magnitude of power spectral densities. Grid resolution and time step sensitivity studies were performed to identify methods for improved accuracy comparisons to wind tunnel data. With limited computational resources, accurate trends for reduced vibratory loads on the vehicle were observed. Exploratory methods such as determining minimized computed errors based on CFL number and sub-iterations, as well as evaluating frequency content of the unsteady pressures and evaluation of oscillatory shock structures were used in this study to enhance computational efficiency and solution accuracy. These techniques enabled development of a set of best practices for the evaluation of future flight vehicle designs in terms of vibratory loads.

  11. Approximation and Computation

    CERN Document Server

    Gautschi, Walter; Rassias, Themistocles M

    2011-01-01

    Approximation theory and numerical analysis are central to the creation of accurate computer simulations and mathematical models. Research in these areas can influence the computational techniques used in a variety of mathematical and computational sciences. This collection of contributed chapters, dedicated to renowned mathematician Gradimir V. Milovanović, represents the recent work of experts in the fields of approximation theory and numerical analysis. These invited contributions describe new trends in these important areas of research including theoretic developments, new computational alg

  12. Time-Accurate Simulations of Synthetic Jet-Based Flow Control for An Axisymmetric Spinning Body

    National Research Council Canada - National Science Library

    Sahu, Jubaraj

    2004-01-01

    .... A time-accurate Navier-Stokes computational technique has been used to obtain numerical solutions for the unsteady jet-interaction flow field for a spinning projectile at a subsonic speed, Mach...

  13. Matrix-vector multiplication using digital partitioning for more accurate optical computing

    Science.gov (United States)

    Gary, C. K.

    1992-01-01

    Digital partitioning offers a flexible means of increasing the accuracy of an optical matrix-vector processor. This algorithm can be implemented with the same architecture required for a purely analog processor, which gives optical matrix-vector processors the ability to perform high-accuracy calculations at speeds comparable with or greater than electronic computers as well as the ability to perform analog operations at a much greater speed. Digital partitioning is compared with digital multiplication by analog convolution, residue number systems, and redundant number representation in terms of the size and the speed required for an equivalent throughput as well as in terms of the hardware requirements. Digital partitioning and digital multiplication by analog convolution are found to be the most efficient algorithms if coding time and hardware are considered, and the architecture for digital partitioning permits the use of analog computations to provide the greatest throughput for a single processor.
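The digital-partitioning idea can be illustrated numerically: split each operand into low-precision digits, form the small digit-by-digit partial products (the part an analog processor would perform), and recombine them digitally with power-of-the-base weights. A toy Python sketch under those assumptions (function names and parameters are illustrative, not from the paper):

```python
import numpy as np

def partition_digits(x, base=4, n_digits=4):
    """Split non-negative integers into base-`base` digits, least significant first."""
    digits = []
    for _ in range(n_digits):
        digits.append(x % base)
        x = x // base
    return digits

def matvec_partitioned(A, v, base=4, n_digits=4):
    """Matrix-vector product recombined from low-precision partial products.

    Each partial product Ai @ vj involves only small digit values (0..base-1),
    as an analog processor would compute; the digital step recombines them
    exactly with the correct power-of-base weights.
    """
    A_digits = partition_digits(A, base, n_digits)
    v_digits = partition_digits(v, base, n_digits)
    result = np.zeros(A.shape[0], dtype=np.int64)
    for i, Ai in enumerate(A_digits):
        for j, vj in enumerate(v_digits):
            result += (base ** (i + j)) * (Ai @ vj)  # low-precision "analog" op
    return result
```

Because the digit recombination is exact, the result matches a full-precision product whenever the operands fit in `n_digits` base-`base` digits.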

  14. Covariance approximation for fast and accurate computation of channelized Hotelling observer statistics

    International Nuclear Information System (INIS)

    Bonetto, Paola; Qi, Jinyi; Leahy, Richard M.

    1999-01-01

    We describe a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, we derive here a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. We show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow us to analyze the observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm
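Once channel-space means and covariances are available (whether from the theoretical approximation or from Monte Carlo samples), the CHO detectability itself is a small linear-algebra computation. A minimal sketch (function and variable names are illustrative assumptions, not the authors' code):

```python
import numpy as np

def cho_snr2(U, mean_sp, mean_sa, cov_sp, cov_sa):
    """Squared detectability SNR^2 of a channelized Hotelling observer.

    U               : (n_pixels, n_channels) channel matrix
    mean_sp/mean_sa : mean reconstructed images, signal present / absent
    cov_sp/cov_sa   : image covariance matrices for the two classes
    """
    dv = U.T @ (mean_sp - mean_sa)            # mean channel-output difference
    S = U.T @ ((cov_sp + cov_sa) / 2.0) @ U   # average channel covariance
    return float(dv @ np.linalg.solve(S, dv))
```

The channel matrix projects images into a low-dimensional space, so the matrix solve involves only `n_channels` unknowns regardless of image size.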

  15. UniGene Tabulator: a full parser for the UniGene format.

    Science.gov (United States)

    Lenzi, Luca; Frabetti, Flavia; Facchin, Federica; Casadei, Raffaella; Vitale, Lorenza; Canaider, Silvia; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

    2006-10-15

    UniGene Tabulator 1.0 provides a solution for full parsing of UniGene flat file format; it implements a structured graphical representation of each data field present in UniGene following import into a common database managing system usable in a personal computer. This database includes related tables for sequence, protein similarity, sequence-tagged site (STS) and transcript map interval (TXMAP) data, plus a summary table where each record represents a UniGene cluster. UniGene Tabulator enables full local management of UniGene data, allowing parsing, querying, indexing, retrieving, exporting and analysis of UniGene data in a relational database form, usable on Macintosh (OS X 10.3.9 or later) and Windows (2000, with service pack 4, XP, with service pack 2 or later) operating systems-based computers. The current release, including both the FileMaker runtime applications, is freely available at http://apollo11.isto.unibo.it/software/
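Parsing a UniGene-style flat file is mostly a matter of splitting on record terminators and collecting repeated fields into lists. A simplified Python sketch (the field handling is an illustrative assumption, not the full UniGene format specification):

```python
def parse_unigene(text):
    """Parse UniGene-style flat-file records into dictionaries.

    Records are separated by '//' lines; each line is 'FIELD value'.
    Repeated fields (e.g. SEQUENCE, STS) are collected into lists.
    """
    records = []
    current = {}
    for line in text.splitlines():
        line = line.rstrip()
        if line == "//":           # record terminator
            if current:
                records.append(current)
            current = {}
            continue
        if not line:
            continue
        field, _, value = line.partition(" ")
        current.setdefault(field, []).append(value.strip())
    if current:                    # tolerate a missing final terminator
        records.append(current)
    return records
```

Each resulting dictionary corresponds to one cluster record and could be loaded directly into the summary and detail tables of a relational database.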

  16. Multidisciplinary Computational Research

    National Research Council Canada - National Science Library

    Visbal, Miguel R

    2006-01-01

    The purpose of this work is to develop advanced multidisciplinary numerical simulation capabilities for aerospace vehicles with emphasis on highly accurate, massively parallel computational methods...

  17. Reconfigurable computing the theory and practice of FPGA-based computation

    CERN Document Server

    Hauck, Scott

    2010-01-01

    Reconfigurable Computing marks a revolutionary and hot topic that bridges the gap between the separate worlds of hardware and software design: the key feature of reconfigurable computing is its groundbreaking ability to perform computations in hardware to increase performance while retaining the flexibility of a software solution. Reconfigurable computers serve as affordable, fast, and accurate tools for developing designs ranging from single chip architectures to multi-chip and embedded systems. Scott Hauck and Andre DeHon have assembled a group of the key experts in the fields of both hardwa

  18. Google goes cancer: improving outcome prediction for cancer patients by network-based ranking of marker genes.

    Directory of Open Access Journals (Sweden)

    Christof Winter

    Full Text Available Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice.

  19. Solving computationally expensive engineering problems

    CERN Document Server

    Leifsson, Leifur; Yang, Xin-She

    2014-01-01

    Computational complexity is a serious bottleneck for the design process in virtually any engineering area. While migration from prototyping and experimental-based design validation to verification using computer simulation models is inevitable and has a number of advantages, high computational costs of accurate, high-fidelity simulations can be a major issue that slows down the development of computer-aided design methodologies, particularly those exploiting automated design improvement procedures, e.g., numerical optimization. The continuous increase of available computational resources does not always translate into shortening of the design cycle because of the growing demand for higher accuracy and necessity to simulate larger and more complex systems. Accurate simulation of a single design of a given system may be as long as several hours, days or even weeks, which often makes design automation using conventional methods impractical or even prohibitive. Additional problems include numerical noise often pr...

  20. Accurate and efficient calculation of response times for groundwater flow

    Science.gov (United States)

    Carr, Elliot J.; Simpson, Matthew J.

    2018-03-01

    We study measures of the amount of time required for transient flow in heterogeneous porous media to effectively reach steady state, also known as the response time. Here, we develop a new approach that extends the concept of mean action time. Previous applications of the theory of mean action time to estimate the response time use the first two central moments of the probability density function associated with the transition from the initial condition, at t = 0, to the steady state condition that arises in the long time limit, as t → ∞. This previous approach leads to a computationally convenient estimation of the response time, but the accuracy can be poor. Here, we outline a powerful extension using the first k raw moments, showing how to produce an extremely accurate estimate by making use of asymptotic properties of the cumulative distribution function. Results are validated using an existing laboratory-scale data set describing flow in a homogeneous porous medium. In addition, we demonstrate how the results also apply to flow in heterogeneous porous media. Overall, the new method is: (i) extremely accurate; and (ii) computationally inexpensive. In fact, the computational cost of the new method is orders of magnitude less than the computational effort required to study the response time by solving the transient flow equation. Furthermore, the approach provides a rigorous mathematical connection with the heuristic argument that the response time for flow in a homogeneous porous medium is proportional to L²/D, where L is a relevant length scale, and D is the aquifer diffusivity. Here, we extend such heuristic arguments by providing a clear mathematical definition of the proportionality constant.
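The connection between the first raw moment and the response time is easy to compute numerically: the mean action time equals the integral of the remaining-transition fraction over time. A one-moment illustrative sketch (a simplification under stated assumptions, not the authors' k-moment asymptotic method):

```python
import numpy as np

def response_time_from_decay(ts, h, h_inf, tol=0.01):
    """One-moment estimate of the response time of a transient h(t) -> h_inf.

    F(t) = 1 - (h(t) - h_inf)/(h(0) - h_inf) is treated as the CDF of the
    transition; its first raw moment (the mean action time) equals the
    integral of the survival function g(t) = 1 - F(t). The response time is
    taken as the first t where the remaining transition drops below `tol`.
    """
    g = (h - h_inf) / (h[0] - h_inf)   # remaining fraction of the transition
    # trapezoidal integral of g over ts = mean action time
    mean_action_time = float(np.sum((g[1:] + g[:-1]) * np.diff(ts)) / 2.0)
    t_resp = ts[np.argmax(g < tol)]
    return mean_action_time, t_resp
```

For an exponential relaxation with rate λ this returns a mean action time of 1/λ, consistent with the heuristic that the response time scales with L²/D for a homogeneous aquifer.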

  1. Genometa--a fast and accurate classifier for short metagenomic shotgun reads.

    Science.gov (United States)

    Davenport, Colin F; Neugebauer, Jens; Beckmann, Nils; Friedrich, Benedikt; Kameri, Burim; Kokott, Svea; Paetow, Malte; Siekmann, Björn; Wieding-Drewes, Matthias; Wienhöfer, Markus; Wolf, Stefan; Tümmler, Burkhard; Ahlers, Volker; Sprengel, Frauke

    2012-01-01

    Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer. The Genometa program, a step by step tutorial and Java source code are freely available from http://genomics1.mh-hannover.de/genometa/ and on http://code.google.com/p/genometa/. This program has been tested on Ubuntu Linux and Windows XP/7.

  2. Genometa--a fast and accurate classifier for short metagenomic shotgun reads.

    Directory of Open Access Journals (Sweden)

    Colin F Davenport

    Full Text Available Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer. The Genometa program, a step by step tutorial and Java source code are freely available from http://genomics1.mh-hannover.de/genometa/ and on http://code.google.com/p/genometa/. This program has been tested on Ubuntu Linux and Windows XP/7.

  3. Accurate technique for complete geometric calibration of cone-beam computed tomography systems

    International Nuclear Information System (INIS)

    Cho Youngbin; Moseley, Douglas J.; Siewerdsen, Jeffrey H.; Jaffray, David A.

    2005-01-01

    Cone-beam computed tomography systems have been developed to provide in situ imaging for the purpose of guiding radiation therapy. Clinical systems based on this approach have been constructed, including a clinical linear accelerator (Elekta Synergy RP) and an iso-centric C-arm. Geometric calibration involves the estimation of a set of parameters that describes the geometry of such systems, and is essential for accurate image reconstruction. We have developed a general analytic algorithm and corresponding calibration phantom for estimating these geometric parameters in cone-beam computed tomography (CT) systems. The performance of the calibration algorithm is evaluated and its application is discussed. The algorithm makes use of a calibration phantom to estimate the geometric parameters of the system. The phantom consists of 24 steel ball bearings (BBs) in a known geometry. Twelve BBs are spaced evenly at 30 deg in two plane-parallel circles separated by a given distance along the tube axis. The detector (e.g., a flat panel detector) is assumed to have no spatial distortion. The method estimates geometric parameters including the position of the x-ray source, position and rotation of the detector, and the gantry angle, and can describe complex source-detector trajectories. The accuracy and sensitivity of the calibration algorithm were analyzed. The calibration algorithm estimates geometric parameters with a high level of accuracy, such that the quality of the CT reconstruction is not degraded by estimation error. Sensitivity analysis shows uncertainty of 0.01 deg. (around beam direction) to 0.3 deg. (normal to the beam direction) in rotation, and 0.2 mm (orthogonal to the beam direction) to 4.9 mm (beam direction) in position for the medical linear accelerator geometry.
Experimental measurements using a laboratory bench Cone-beam CT system of known geometry demonstrate the sensitivity of the method in detecting small changes in the imaging geometry with an uncertainty of 0.1 mm in

  4. QuartetS: A Fast and Accurate Algorithm for Large-Scale Orthology Detection

    Science.gov (United States)

    2011-01-01

    of these two genes with all other genes of the other one species. In addition, to be considered orthologs, the BBH pairs had to satisfy two conditions ...BBH pair computations employed as part of the outgroup and QuartetS methods, we used the same two conditions as the ones described above. In our...versus proteins. Genetica , 118, 209–216. 4. Serres,M.H., Kerr,A.R., McCormack,T.J. and Riley,M. (2009) Evolution by leaps: gene duplication in bacteria

  5. Inferring species trees from incongruent multi-copy gene trees using the Robinson-Foulds distance

    Science.gov (United States)

    2013-01-01

    Background Constructing species trees from multi-copy gene trees remains a challenging problem in phylogenetics. One difficulty is that the underlying genes can be incongruent due to evolutionary processes such as gene duplication and loss, deep coalescence, or lateral gene transfer. Gene tree estimation errors may further exacerbate the difficulties of species tree estimation. Results We present a new approach for inferring species trees from incongruent multi-copy gene trees that is based on a generalization of the Robinson-Foulds (RF) distance measure to multi-labeled trees (mul-trees). We prove that it is NP-hard to compute the RF distance between two mul-trees; however, it is easy to calculate this distance between a mul-tree and a singly-labeled species tree. Motivated by this, we formulate the RF problem for mul-trees (MulRF) as follows: Given a collection of multi-copy gene trees, find a singly-labeled species tree that minimizes the total RF distance from the input mul-trees. We develop and implement a fast SPR-based heuristic algorithm for the NP-hard MulRF problem. We compare the performance of the MulRF method (available at http://genome.cs.iastate.edu/CBL/MulRF/) with several gene tree parsimony approaches using gene tree simulations that incorporate gene tree error, gene duplications and losses, and/or lateral transfer. The MulRF method produces more accurate species trees than gene tree parsimony approaches. We also demonstrate that the MulRF method infers in minutes a credible plant species tree from a collection of nearly 2,000 gene trees. Conclusions Our new phylogenetic inference method, based on a generalized RF distance, makes it possible to quickly estimate species trees from large genomic data sets. Since the MulRF method, unlike gene tree parsimony, is based on a generic tree distance measure, it is appealing for analyses of genomic data sets, in which many processes such as deep coalescence, recombination, gene duplication and losses as
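Against a singly-labeled tree, RF-style comparison reduces to set operations on clades or bipartitions, which is why that direction stays cheap. A toy Python sketch for rooted, singly-labeled trees encoded as nested tuples (illustrative only; the paper's contribution concerns the harder mul-tree generalization):

```python
def clades(tree):
    """Return the set of internal-node leaf sets (clades) of a rooted tree
    given as nested tuples with string leaf labels."""
    out = set()
    def walk(node):
        if isinstance(node, str):          # leaf
            return frozenset([node])
        leaves = frozenset().union(*(walk(c) for c in node))
        out.add(leaves)
        return leaves
    walk(tree)
    return out

def rf_distance(t1, t2):
    """Robinson-Foulds distance: size of the symmetric difference of clade sets."""
    return len(clades(t1) ^ clades(t2))
```

Each clade set is computed in one traversal, so the distance costs time roughly linear in tree size, in contrast to the NP-hard mul-tree-to-mul-tree case proved in the paper.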

  6. Exploring the Optimal Strategy to Predict Essential Genes in Microbes

    Directory of Open Access Journals (Sweden)

    Yao Lu

    2011-12-01

    Full Text Available Accurately predicting essential genes is important in many aspects of biology, medicine and bioengineering. In previous research, we have developed a machine learning based integrative algorithm to predict essential genes in bacterial species. This algorithm lends itself to two approaches for predicting essential genes: learning the traits from known essential genes in the target organism, or transferring essential gene annotations from a closely related model organism. However, for an understudied microbe, each approach has its potential limitations. The first is constricted by the often small number of known essential genes. The second is limited by the availability of model organisms and by evolutionary distance. In this study, we aim to determine the optimal strategy for predicting essential genes by examining four microbes with well-characterized essential genes. Our results suggest that, unless the known essential genes are few, learning from the known essential genes in the target organism usually outperforms transferring essential gene annotations from a related model organism. In fact, the required number of known essential genes is surprisingly small to make accurate predictions. In prokaryotes, when the number of known essential genes is greater than 2% of total genes, this approach already comes close to its optimal performance. In eukaryotes, achieving the same best performance requires over 4% of total genes, reflecting the increased complexity of eukaryotic organisms. Combining the two approaches resulted in an increased performance when the known essential genes are few. Our investigation thus provides key information on accurately predicting essential genes and will greatly facilitate annotations of microbial genomes.

  7. Explorations in quantum computing

    CERN Document Server

    Williams, Colin P

    2011-01-01

    By the year 2020, the basic memory components of a computer will be the size of individual atoms. At such scales, the current theory of computation will become invalid. "Quantum computing" is reinventing the foundations of computer science and information theory in a way that is consistent with quantum physics - the most accurate model of reality currently known. Remarkably, this theory predicts that quantum computers can perform certain tasks breathtakingly faster than classical computers -- and, better yet, can accomplish mind-boggling feats such as teleporting information, breaking suppos

  8. In silico method for modelling metabolism and gene product expression at genome scale

    Energy Technology Data Exchange (ETDEWEB)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem; Portnoy, Vasiliy A.; Lewis, Nathan E.; Orth, Jeffrey D.; Rutledge, Alexandra C.; Smith, Richard D.; Adkins, Joshua N.; Zengler, Karsten; Palsson, Bernard O.

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and improvement of the genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  9. The percentage of bacterial genes on leading versus lagging strands is influenced by multiple balancing forces

    Science.gov (United States)

    Mao, Xizeng; Zhang, Han; Yin, Yanbin; Xu, Ying

    2012-01-01

    The majority of bacterial genes are located on the leading strand, and the percentage of such genes has a large variation across different bacteria. Although some explanations have been proposed, these are at most partial explanations as they cover only small percentages of the genes and do not even consider the ones biased toward the lagging strand. We have carried out a computational study on 725 bacterial genomes, aiming to elucidate other factors that may have influenced the strand location of genes in a bacterium. Our analyses suggest that (i) genes of some functional categories such as ribosome have higher preferences to be on the leading strands; (ii) genes of some functional categories such as transcription factor have higher preferences on the lagging strands; (iii) there is a balancing force that tends to keep genes from all moving to the leading and more efficient strand and (iv) the percentage of leading-strand genes in a bacterium can be accurately explained based on the numbers of genes in the functional categories outlined in (i) and (ii), genome size and gene density, indicating that these numbers implicitly contain the information about the percentage of genes on the leading versus lagging strand in a genome. PMID:22735706
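Computing the leading-strand percentage itself is straightforward once gene positions, strands, and the replication origin/terminus are known: a gene is leading-strand if its orientation matches the fork direction of its replichore. A simplified sketch assuming bidirectional replication on a circular chromosome (function name and inputs are illustrative assumptions):

```python
def leading_strand_fraction(genes, genome_len, ori=0, ter=None):
    """Fraction of genes co-directional with the replication fork.

    genes: list of (position, strand) with strand '+' or '-'.
    On the replichore running from ori to ter, '+' genes are leading;
    on the other replichore, '-' genes are leading.
    """
    if ter is None:
        ter = genome_len // 2          # assume ter opposite ori
    leading = 0
    for pos, strand in genes:
        # is the gene on the replichore between ori and ter (going clockwise)?
        on_right_replichore = (pos - ori) % genome_len < (ter - ori) % genome_len
        if (strand == '+') == on_right_replichore:
            leading += 1
    return leading / len(genes)
```

Real analyses would take ori/ter coordinates from annotation or GC-skew inflection points rather than assuming they sit diametrically opposite.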

  10. EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

    Science.gov (United States)

    Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

    2014-11-01

    The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. © 2014 John Wiley & Sons Ltd.
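The per-gene selection rules described above (length bounds, a per-gene bp cap, 5-prime-first preference) can be sketched as follows; this is an illustrative simplification, not EXONSAMPLER's actual code:

```python
def sample_exons(genes, min_len=120, max_len=2000, max_bp_per_gene=1500):
    """Select exons per gene under length and total-bp constraints.

    genes: dict of gene name -> list of (start, end) exon coordinates
    in 5'-to-3' order. Exons outside [min_len, max_len] are skipped;
    remaining exons are taken 5'-first until max_bp_per_gene is reached.
    """
    selected = {}
    for name, exons in genes.items():
        picked, total = [], 0
        for start, end in exons:            # 5'-first preference
            length = end - start
            if not (min_len <= length <= max_len):
                continue                    # outside acceptable exon length
            if total + length > max_bp_per_gene:
                break                       # per-gene bp budget exhausted
            picked.append((start, end))
            total += length
        selected[name] = picked
    return selected
```

Favoring 5-prime exons in the iteration order mirrors the program's option to enrich for SNPs near upstream regulatory sequences.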

  11. Covariance approximation for fast and accurate computation of channelized Hotelling observer statistics

    Science.gov (United States)

    Bonetto, P.; Qi, Jinyi; Leahy, R. M.

    2000-08-01

    Describes a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, the authors derive here a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. The theoretical analysis models both the Poisson statistics of PET data and the inhomogeneity of tracer uptake. The authors show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow the authors to analyze the observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm.

  12. POLYAR, a new computer program for prediction of poly(A) sites in human sequences

    Directory of Open Access Journals (Sweden)

    Qamar Raheel

    2010-11-01

    Full Text Available Abstract Background mRNA polyadenylation is an essential step of pre-mRNA processing in eukaryotes. Accurate prediction of the pre-mRNA 3'-end cleavage/polyadenylation sites is important for defining the gene boundaries and understanding gene expression mechanisms. Results 28761 human mapped poly(A) sites have been classified into three classes containing different known forms of polyadenylation signal (PAS) or none of them (PAS-strong, PAS-weak and PAS-less, respectively), and a new computer program POLYAR for the prediction of poly(A) sites of each class was developed. In comparison with polya_svm (to date the most accurate computer program for prediction of poly(A) sites) while searching for PAS-strong poly(A) sites in human sequences, POLYAR had a significantly higher prediction sensitivity (80.8% versus 65.7%) and specificity (66.4% versus 51.7%). However, when a similar sort of search was conducted for PAS-weak and PAS-less poly(A) sites, both programs had a very low prediction accuracy, which indicates that our knowledge about factors involved in the determination of the poly(A) sites is not sufficient to identify such polyadenylation regions. Conclusions We present a new classification of polyadenylation sites into three classes and a novel computer program POLYAR for prediction of poly(A) sites/regions of each of the classes. In tests, POLYAR shows high accuracy of prediction of the PAS-strong poly(A) sites, though this program's efficiency in searching for PAS-weak and PAS-less poly(A) sites is not very high but is comparable to other available programs. These findings suggest that additional characteristics of such poly(A) sites remain to be elucidated. The POLYAR program, with a stand-alone version for downloading, is available at http://cub.comsats.edu.pk/polyapredict.htm.
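The PAS-based classification can be illustrated with a simple upstream hexamer scan; the hexamer-to-class mapping and window size below are illustrative assumptions, not POLYAR's trained model:

```python
# Illustrative mapping: canonical hexamer vs. a common variant.
PAS_HEXAMERS = {"AATAAA": "PAS-strong", "ATTAAA": "PAS-weak"}

def classify_pas(seq, cleavage_pos, window=40):
    """Classify a candidate poly(A) site by the PAS hexamer (if any) found
    in the `window` nt upstream of the cleavage position.

    Returns one of 'PAS-strong', 'PAS-weak', or 'PAS-less'.
    """
    upstream = seq[max(0, cleavage_pos - window):cleavage_pos]
    for i in range(len(upstream) - 5):
        hexamer = upstream[i:i + 6]
        if hexamer in PAS_HEXAMERS:
            return PAS_HEXAMERS[hexamer]
    return "PAS-less"
```

A real predictor would combine such signal matches with additional sequence features, which is precisely the gap the abstract notes for PAS-weak and PAS-less sites.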

  13. Tentacle: distributed quantification of genes in metagenomes.

    Science.gov (United States)

    Boulund, Fredrik; Sjögren, Anders; Kristiansson, Erik

    2015-01-01

    In metagenomics, microbial communities are sequenced at increasingly high resolution, generating datasets with billions of DNA fragments. Novel methods that can efficiently process the growing volumes of sequence data are necessary for the accurate analysis and interpretation of existing and upcoming metagenomes. Here we present Tentacle, a novel framework that uses distributed computational resources for gene quantification in metagenomes. Tentacle is implemented using a dynamic master-worker approach in which DNA fragments are streamed via a network and processed in parallel on worker nodes. Tentacle is modular, extensible, and comes with support for six commonly used sequence aligners. It is easy to adapt Tentacle to different applications in metagenomics and easy to integrate into existing workflows. Evaluations show that Tentacle scales very well with increasing computing resources. We illustrate the versatility of Tentacle on three different use cases. Tentacle is written for Linux in Python 2.7 and is published as open source under the GNU General Public License (v3). Documentation, tutorials, installation instructions, and the source code are freely available online at: http://bioinformatics.math.chalmers.se/tentacle.
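    The dynamic master-worker scheme described above can be sketched in a few lines. The real Tentacle streams fragments over a network to real aligners; here a toy exact-substring "aligner", thread workers, and invented reference genes stand in, purely to show the batch-map-and-merge structure.

```python
# Minimal master-worker sketch of distributed gene quantification:
# the master splits DNA fragments into batches, workers map each batch
# to reference genes, and the master merges per-gene counts.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

REFERENCE = {"geneA": "ACGTACGTGGCC", "geneB": "TTGGCCAATTGG"}  # toy genes

def map_batch(fragments):
    """Worker task: count fragments that align (exact substring) to each gene."""
    counts = Counter()
    for frag in fragments:
        for gene, seq in REFERENCE.items():
            if frag in seq:
                counts[gene] += 1
    return counts

def quantify(fragments, n_workers=4, batch_size=2):
    batches = [fragments[i:i + batch_size]
               for i in range(0, len(fragments), batch_size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(map_batch, batches):
            total.update(partial)  # master merges worker results
    return dict(total)

print(quantify(["ACGTAC", "GGCCAA", "TTGGCC", "ACGT"]))  # {'geneA': 2, 'geneB': 2}
```

    Replacing the toy aligner with a call to an external aligner, and the thread pool with networked worker nodes, gives the general shape of such a framework.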

  14. Accurate Computed Enthalpies of Spin Crossover in Iron and Cobalt Complexes

    DEFF Research Database (Denmark)

    Kepp, Kasper Planeta; Cirera, J

    2009-01-01

    Despite their importance in many chemical processes, the relative energies of spin states of transition metal complexes have so far been haunted by large computational errors. By the use of six functionals, B3LYP, BP86, TPSS, TPSSh, M06L, and M06L, this work studies nine complexes (seven with iron...

  15. Accurate phylogenetic tree reconstruction from quartets: a heuristic approach.

    Science.gov (United States)

    Reaz, Rezwana; Bayzid, Md Shamsuzzoha; Rahman, M Sohel

    2014-01-01

    Supertree methods construct trees on a set of taxa (species) by combining many smaller trees on overlapping subsets of the entire set of taxa. A 'quartet' is an unrooted tree over 4 taxa, hence quartet-based supertree methods combine many 4-taxon unrooted trees into a single coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have received considerable attention in recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets.

  16. Accurate first-principles structures and energies of diversely bonded systems from an efficient density functional.

    Science.gov (United States)

    Sun, Jianwei; Remsing, Richard C; Zhang, Yubo; Sun, Zhaoru; Ruzsinszky, Adrienn; Peng, Haowei; Yang, Zenghui; Paul, Arpita; Waghmare, Umesh; Wu, Xifan; Klein, Michael L; Perdew, John P

    2016-09-01

    One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.

  17. Improved fingercode alignment for accurate and compact fingerprint recognition

    CSIR Research Space (South Africa)

    Brown, Dane

    2016-05-01

    Full Text Available Improved FingerCode Alignment for Accurate and Compact Fingerprint Recognition. Dane Brown and Karen Bradshaw, Department of Computer Science, Rhodes University, Grahamstown, South Africa; Council for Scientific and Industrial Research, Modelling and Digital Sciences, Pretoria.... The experimental analysis and results are discussed in Section IV; Section V concludes the paper. II. RELATED STUDIES. FingerCode [1] uses circular tessellation of filtered fingerprint images centered at the reference point, which results in a circular ROI...

  18. A New Multiscale Technique for Time-Accurate Geophysics Simulations

    Science.gov (United States)

    Omelchenko, Y. A.; Karimabadi, H.

    2006-12-01

    Large-scale geophysics systems are frequently described by multiscale reactive flow models (e.g., wildfire and climate models, multiphase flows in porous rocks, etc.). Accurate and robust simulations of such systems by traditional time-stepping techniques face a formidable computational challenge. Explicit time integration suffers from global (CFL and accuracy) timestep restrictions due to inhomogeneous convective and diffusion processes, as well as closely coupled physical and chemical reactions. Application of adaptive mesh refinement (AMR) to such systems may not always be sufficient since its success critically depends on a careful choice of domain refinement strategy. On the other hand, implicit and timestep-splitting integrations may result in a considerable loss of accuracy when fast transients in the solution become important. To address this issue, we developed an alternative explicit approach to time-accurate integration of such systems: Discrete-Event Simulation (DES). DES enables asynchronous computation by automatically adjusting the CPU resources in accordance with local timescales. This is done by encapsulating flux-conservative updates of numerical variables in the form of events, whose execution and synchronization is explicitly controlled by imposing accuracy and causality constraints. As a result, at each time step DES self-adaptively updates only a fraction of the global system state, which eliminates unnecessary computation of inactive elements. DES can be naturally combined with various mesh generation techniques. The event-driven paradigm results in robust and fast simulation codes, which can be efficiently parallelized via a new preemptive event processing (PEP) technique. We discuss applications of this novel technology to time-dependent diffusion-advection-reaction and CFD models representative of various geophysics applications.

  19. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    Science.gov (United States)

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
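    The dynamic-time-warping core of the proposed DTW-GO measure can be sketched as follows (the GO-based term and the KNN imputation step are omitted). This is the textbook DTW recurrence, not the authors' exact code.

```python
# Compact dynamic-time-warping distance between two expression time series.

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    # dp[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            dp[i][j] = cost + min(dp[i - 1][j],      # insertion
                                  dp[i][j - 1],      # deletion
                                  dp[i - 1][j - 1])  # match
    return dp[n][m]

# Series that are time-shifted copies warp onto each other at zero cost:
print(dtw_distance([0, 1, 2, 3], [0, 1, 1, 2, 3]))  # 0.0
```

    Because DTW tolerates local time shifts, it suits regulatory-pair detection, where a regulator's expression profile typically precedes its target's.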

  20. An Accurate Method for Computing the Absorption of Solar Radiation by Water Vapor

    Science.gov (United States)

    Chou, M. D.

    1980-01-01

    The method is based upon molecular line parameters and makes use of a far wing scaling approximation and k distribution approach previously applied to the computation of the infrared cooling rate due to water vapor. Taking into account the wave number dependence of the incident solar flux, the solar heating rate is computed for the entire water vapor spectrum and for individual absorption bands. The accuracy of the method is tested against line by line calculations. The method introduces a maximum error of 0.06 C/day. The method has the additional advantage over previous methods in that it can be applied to any portion of the spectral region containing the water vapor bands. The integrated absorptances and line intensities computed from the molecular line parameters were compared with laboratory measurements. The comparison reveals that, among the three different sources, absorptance is the largest for the laboratory measurements.

  1. A Comprehensive Strategy for Accurate Mutation Detection of the Highly Homologous PMS2.

    Science.gov (United States)

    Li, Jianli; Dai, Hongzheng; Feng, Yanming; Tang, Jia; Chen, Stella; Tian, Xia; Gorman, Elizabeth; Schmitt, Eric S; Hansen, Terah A A; Wang, Jing; Plon, Sharon E; Zhang, Victor Wei; Wong, Lee-Jun C

    2015-09-01

    Germline mutations in the DNA mismatch repair gene PMS2 underlie the cancer susceptibility syndrome, Lynch syndrome. However, accurate molecular testing of PMS2 is complicated by a large number of highly homologous sequences. To establish a comprehensive approach for mutation detection of PMS2, we have designed a strategy combining targeted capture next-generation sequencing (NGS), multiplex ligation-dependent probe amplification, and long-range PCR followed by NGS to simultaneously detect point mutations and copy number changes of PMS2. Exonic deletions (E2 to E9, E5 to E9, E8, E10, E14, and E1 to E15), duplications (E11 to E12), and a nonsense mutation, p.S22*, were identified. Traditional multiplex ligation-dependent probe amplification and Sanger sequencing approaches cannot differentiate the origin of the exonic deletions in the 3' region when PMS2 and PMS2CL share identical sequences as a result of gene conversion. Our approach allows unambiguous identification of mutations in the active gene with a straightforward long-range-PCR/NGS method. Breakpoint analysis of multiple samples revealed that recurrent exon 14 deletions are mediated by homologous Alu sequences. Our comprehensive approach provides a reliable tool for accurate molecular analysis of genes containing multiple copies of highly homologous sequences and should improve PMS2 molecular analysis for patients with Lynch syndrome. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  2. Fast and accurate calculation of the properties of water and steam for simulation

    International Nuclear Information System (INIS)

    Szegi, Zs.; Gacs, A.

    1990-01-01

    A basic principle simulator was developed at the CRIP, Budapest, for real-time simulation of the transients of WWER-440 type nuclear power plants. An integral part of it is the fast and accurate calculation of the thermodynamic properties of water and steam. To eliminate successive approximations, the model system of the secondary coolant circuit requires binary forms which are known as inverse functions, continuous when crossing the saturation line, and accurate and coherent for all argument combinations. A solution which reduces the computer memory and execution time demand is reported. (author) 36 refs.; 5 figs.; 3 tabs

  3. Accurately bi-orthogonal direct and adjoint lambda modes via two-sided Eigen-solvers

    International Nuclear Information System (INIS)

    Roman, J.E.; Vidal, V.; Verdu, G.

    2005-01-01

    This work is concerned with the accurate computation of the dominant l-modes (Lambda modes) of the reactor core in order to approximate the solution of the neutron diffusion equation in different situations, such as transient modal analysis. In a previous work, the problem was addressed by implementing a parallel program based on SLEPc (Scalable Library for Eigenvalue Problem Computations), a public-domain software library for the solution of eigenvalue problems. Now, the proposed solution is extended by also incorporating the computation of the adjoint l-modes in such a way that the bi-orthogonality condition is enforced very accurately. This feature is very desirable in some types of analyses, and in the proposed scheme it is achieved by making use of two-sided eigenvalue-solving software. Current implementations of some of these packages, while still open to improvement, show that they can be competitive in terms of response time and accuracy with other types of eigenvalue-solving software. The code developed by the authors has parallel capabilities in order to be able to analyze reactors at a great level of detail in a short time. (authors)

  4. Accurately bi-orthogonal direct and adjoint lambda modes via two-sided Eigen-solvers

    Energy Technology Data Exchange (ETDEWEB)

    Roman, J.E.; Vidal, V. [Valencia Univ. Politecnica, D. Sistemas Informaticos y Computacion (Spain); Verdu, G. [Valencia Univ. Politecnica, D. Ingenieria Quimica y Nuclear (Spain)

    2005-07-01

    This work is concerned with the accurate computation of the dominant l-modes (Lambda modes) of the reactor core in order to approximate the solution of the neutron diffusion equation in different situations, such as transient modal analysis. In a previous work, the problem was addressed by implementing a parallel program based on SLEPc (Scalable Library for Eigenvalue Problem Computations), a public-domain software library for the solution of eigenvalue problems. Now, the proposed solution is extended by also incorporating the computation of the adjoint l-modes in such a way that the bi-orthogonality condition is enforced very accurately. This feature is very desirable in some types of analyses, and in the proposed scheme it is achieved by making use of two-sided eigenvalue-solving software. Current implementations of some of these packages, while still open to improvement, show that they can be competitive in terms of response time and accuracy with other types of eigenvalue-solving software. The code developed by the authors has parallel capabilities in order to be able to analyze reactors at a great level of detail in a short time. (authors)

  5. Accurate microRNA target prediction correlates with protein repression levels

    Directory of Open Access Journals (Sweden)

    Simossis Victor A

    2009-09-01

    Full Text Available Abstract Background MicroRNAs are small, endogenously expressed non-coding RNA molecules that regulate target gene expression through translational repression or messenger RNA degradation. MicroRNA regulation is performed through pairing of the microRNA to sites in the messenger RNA of protein-coding genes. Since experimental identification of miRNA target genes poses difficulties, computational microRNA target prediction is one of the key means of deciphering the role of microRNAs in development and disease. Results DIANA-microT 3.0 is an algorithm for microRNA target prediction which is based on several parameters calculated individually for each microRNA and combines conserved and non-conserved microRNA recognition elements into a final prediction score, which correlates with protein production fold change. Specifically, for each predicted interaction the program reports a signal-to-noise ratio and a precision score which can be used as an indication of the false positive rate of the prediction. Conclusion Recently, several computational target prediction programs were benchmarked based on a set of microRNA target genes identified by the pSILAC method. In this assessment DIANA-microT 3.0 was found to achieve the highest precision among the most widely used microRNA target prediction programs, reaching approximately 66%. The DIANA-microT 3.0 prediction results are available online in a user-friendly web server at http://www.microrna.gr/microT

  6. Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

    Directory of Open Access Journals (Sweden)

    Lemke Ney

    2009-09-01

    Full Text Available Abstract Background The identification of essential genes is important for the understanding of the minimal requirements for cellular life and for practical purposes, such as drug design. However, the experimental techniques for essential gene discovery are labor-intensive and time-consuming. Considering these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present here a machine learning-based computational approach relying on network topological features, cellular localization and biological process information for prediction of essential genes. Results We constructed a decision tree-based meta-classifier and trained it on datasets with individual and grouped attributes (network topological features, cellular compartments and biological processes) to generate various predictors of essential genes. We showed that the predictors with better performances are those generated by datasets with integrated attributes. Using the predictor with all attributes, i.e., network topological features, cellular compartments and biological processes, we obtained the best predictor of essential genes, which was then used to classify yeast genes with unknown essentiality status. Finally, we generated decision trees by training the J48 algorithm on datasets with all network topological features, cellular localization and biological process information to discover cellular rules for essentiality. We found that the number of protein physical interactions, the nuclear localization of proteins and the number of regulating transcription factors are the most important factors determining gene essentiality. Conclusion We were able to demonstrate that network topological features, cellular localization and biological process information are reliable predictors of essential genes. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing

  7. Enzymic colorimetry-based DNA chip: a rapid and accurate assay for detecting mutations for clarithromycin resistance in the 23S rRNA gene of Helicobacter pylori.

    Science.gov (United States)

    Xuan, Shi-Hai; Zhou, Yu-Gui; Shao, Bo; Cui, Ya-Lin; Li, Jian; Yin, Hong-Bo; Song, Xiao-Ping; Cong, Hui; Jing, Feng-Xiang; Jin, Qing-Hui; Wang, Hui-Min; Zhou, Jie

    2009-11-01

    Macrolide drugs, such as clarithromycin (CAM), are a key component of many combination therapies used to eradicate Helicobacter pylori. However, resistance to CAM is increasing in H. pylori and is becoming a serious problem in H. pylori eradication therapy. CAM resistance in H. pylori is mostly due to point mutations (A2142G/C, A2143G) in the peptidyltransferase-encoding region of the 23S rRNA gene. In this study an enzymic colorimetry-based DNA chip was developed to analyse single-nucleotide polymorphisms of the 23S rRNA gene to determine the prevalence of mutations in CAM-related resistance in H. pylori-positive patients. The results of the colorimetric DNA chip were confirmed by direct DNA sequencing. In 63 samples, the incidence of the A2143G mutation was 17.46% (11/63). The results of the colorimetric DNA chip were concordant with DNA sequencing for 96.83% (61/63) of results. The colorimetric DNA chip could detect wild-type and mutant signals at every site, even at a DNA concentration of 1.53 × 10² copies μl⁻¹. Thus, the colorimetric DNA chip is a reliable assay for rapid and accurate detection of mutations in the 23S rRNA gene of H. pylori that lead to CAM-related resistance, directly from gastric tissues.
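    The genotyping logic behind such a chip readout can be sketched as follows. The position numbering (2142, 2143) and resistance alleles follow the abstract; the data structures themselves are invented for illustration.

```python
# Sketch of calling CAM-resistance genotypes from per-position base calls
# on the 23S rRNA gene (A2142G/C and A2143G are the resistance mutations).

RESISTANCE_ALLELES = {2142: {"G", "C"}, 2143: {"G"}}
WILD_TYPE = "A"

def genotype(observed):
    """`observed` maps 23S rRNA position -> base called at that position."""
    mutations = []
    for pos, resistant in sorted(RESISTANCE_ALLELES.items()):
        base = observed.get(pos, WILD_TYPE)
        if base in resistant:
            mutations.append(f"A{pos}{base}")
    return mutations or ["wild-type"]

print(genotype({2142: "A", 2143: "G"}))  # ['A2143G']
```

    The chip itself produces the per-position signals; the value of the assay lies in reading both sites reliably even at low template concentrations.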

  8. Robust and accurate vectorization of line drawings.

    Science.gov (United States)

    Hilaire, Xavier; Tombre, Karl

    2006-06-01

    This paper presents a method for vectorizing the graphical parts of paper-based line drawings. The method consists of separating the input binary image into layers of homogeneous thickness, skeletonizing each layer, segmenting the skeleton by a method based on random sampling, and simplifying the result. The segmentation method is robust with a best bound of 50 percent noise reached for indefinitely long primitives. Accurate estimation of the recognized vector's parameters is enabled by explicitly computing their feasibility domains. Theoretical performance analysis and expression of the complexity of the segmentation method are derived. Experimental results and comparisons with other vectorization systems are also provided.
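    The random-sampling segmentation step described above is in the spirit of RANSAC. A minimal illustration of that idea, fitting a single line to noisy 2-D points, might look like this (the paper's actual primitive models and feasibility-domain computation are not reproduced):

```python
# RANSAC-style line fit: repeatedly pick two points, hypothesize the line
# through them, and keep the hypothesis with the most points within `tol`.
import math
import random

def ransac_line(points, tol=1.0, iters=200, seed=0):
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # line through the two sample points: a*x + b*y + c = 0
        a, b, c = y2 - y1, x1 - x2, x2 * y1 - x1 * y2
        norm = math.hypot(a, b)
        if norm == 0:
            continue
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) / norm <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

pts = [(x, 2 * x) for x in range(10)] + [(3, 17), (8, -4)]  # line + 2 outliers
print(len(ransac_line(pts)))  # 10 collinear inliers recovered
```

    The robustness claim in the abstract (up to 50 percent noise for long primitives) reflects the same mechanism: a correct hypothesis only needs one uncontaminated minimal sample.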

  9. The importance of accurate meteorological input fields and accurate planetary boundary layer parameterizations, tested against ETEX-1

    International Nuclear Information System (INIS)

    Brandt, J.; Ebel, A.; Elbern, H.; Jakobs, H.; Memmesheimer, M.; Mikkelsen, T.; Thykier-Nielsen, S.; Zlatev, Z.

    1997-01-01

    Atmospheric transport of air pollutants is, in principle, a well understood process. If information about the state of the atmosphere were given in full detail (infinitely accurate information about wind speed, etc.) and infinitely fast computers were available, then the advection equation could in principle be solved exactly. This is, however, not the case: discretization of the equations and input data introduces some uncertainties and errors in the results. Therefore many different issues have to be carefully studied in order to diminish these uncertainties and to develop an accurate transport model. Some of these are e.g. the numerical treatment of the transport equation, accuracy of the mean meteorological input fields and parameterizations of sub-grid scale phenomena (e.g. parameterizations of the 2nd- and higher-order turbulence terms in order to reach closure in the perturbation equation). A tracer model for studying transport and dispersion of air pollution caused by a single but strong source is under development. The model simulations from the first ETEX release illustrate the differences caused by using various analyzed fields directly in the tracer model or using a meteorological driver. Also different parameterizations of the mixing height and the vertical exchange are compared. (author)
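    As a concrete illustration of how discretizing the transport equation introduces error, a one-dimensional first-order upwind advection step (a deliberately simple scheme, not the model used in the study) can be written as:

```python
# One-dimensional first-order upwind step for the advection equation
# dc/dt + u*dc/dx = 0 on a periodic grid, constant wind u > 0.

def upwind_step(c, u, dx, dt):
    """Advance concentrations c one time step."""
    cfl = u * dt / dx
    assert cfl <= 1.0, "CFL condition violated: reduce dt"
    # c_i^{n+1} = c_i - CFL * (c_i - c_{i-1}); periodic boundary via c[-1]
    return [c[i] - cfl * (c[i] - c[i - 1]) for i in range(len(c))]

c = [0.0, 1.0, 0.0, 0.0]                   # a unit puff on a 4-cell grid
c = upwind_step(c, u=1.0, dx=1.0, dt=1.0)  # CFL = 1: exact shift by one cell
print(c)  # [0.0, 0.0, 1.0, 0.0]
```

    At CFL below 1 the same scheme smears the puff (numerical diffusion), which is exactly the kind of discretization error the abstract refers to; total mass is nonetheless conserved.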

  10. Application of CT-PSF-based computer-simulated lung nodules for evaluating the accuracy of computer-aided volumetry.

    Science.gov (United States)

    Funaki, Ayumu; Ohkubo, Masaki; Wada, Shinichi; Murao, Kohei; Matsumoto, Toru; Niizuma, Shinji

    2012-07-01

    With the wide dissemination of computed tomography (CT) screening for lung cancer, measuring nodule volume accurately with computer-aided volumetry software is increasingly important. Many studies for determining the accuracy of volumetry software have been performed using a phantom with artificial nodules. These phantom studies are limited, however, in their ability to reproduce the nodules both accurately and in the variety of sizes and densities required. Therefore, we propose a new approach of using computer-simulated nodules based on the point spread function measured in a CT system. The validity of the proposed method was confirmed by the excellent agreement obtained between computer-simulated nodules and phantom nodules regarding the volume measurements. A practical clinical evaluation of the accuracy of volumetry software was achieved by adding simulated nodules onto clinical lung images, including noise and artifacts. The tested volumetry software proved accurate to within an error of 20% for nodules >5 mm when the difference in CT value between nodule density and background (lung) was 400-600 HU. Such a detailed analysis can provide clinically useful information on the use of volumetry software in CT screening for lung cancer. We concluded that the proposed method is effective for evaluating the performance of computer-aided volumetry software.

  11. An accurate method for computer-generating tungsten anode x-ray spectra from 30 to 140 kV.

    Science.gov (United States)

    Boone, J M; Seibert, J A

    1997-11-01

    A tungsten anode spectral model using interpolating polynomials (TASMIP) was used to compute x-ray spectra at 1 keV intervals over the range from 30 kV to 140 kV. TASMIP is not semi-empirical and uses no physical assumptions regarding x-ray production, but rather interpolates measured constant-potential x-ray spectra published by Fewell et al. [Handbook of Computed Tomography X-ray Spectra (U.S. Government Printing Office, Washington, D.C., 1981)]. X-ray output measurements (mR/mAs measured at 1 m) were made on a calibrated constant-potential generator in our laboratory from 50 kV to 124 kV, and with 0-5 mm added aluminum filtration. The Fewell spectra were slightly modified (numerically hardened) and normalized based on the attenuation and output characteristics of a constant-potential generator and metal-insert x-ray tube in our laboratory. Then, using the modified Fewell spectra of different kVs, the photon fluence φ at each 1 keV energy bin (E) over energies from 10 keV to 140 keV was characterized using polynomial functions of the form φ(E) = a₀[E] + a₁[E]·kV + a₂[E]·kV² + … + aₙ[E]·kVⁿ. A total of 131 polynomial functions were used to calculate accurate x-ray spectra, each function requiring between two and four terms. The resulting TASMIP algorithm produced x-ray spectra that match both the quality and quantity characteristics of the x-ray system in our laboratory. For photon fluences above 10% of the peak fluence in the spectrum, the average percent difference (and standard deviation) between the modified Fewell spectra and the TASMIP photon fluence was -1.43% (3.8%) for the 50 kV spectrum, -0.89% (1.37%) for the 70 kV spectrum, and for the 80, 90, 100, 110, 120, 130 and 140 kV spectra, the mean differences between spectra were all less than 0.20% and the standard deviations were less than approximately 1.1%. The model was also extended to include the effects of generator-induced kV ripple. Finally, the x-ray photon fluence in the units of
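    The per-energy-bin polynomial idea is easy to sketch. The coefficients below are invented placeholders, not the published TASMIP coefficients:

```python
# Sketch of the TASMIP scheme: for each 1 keV energy bin, photon fluence
# is a low-order polynomial in tube voltage (kV), so a spectrum at any kV
# is reconstructed by evaluating one polynomial per bin.

def fluence(coeffs, kv):
    """Evaluate phi(E) = a0 + a1*kV + a2*kV^2 + ... for one energy bin."""
    total, power = 0.0, 1.0
    for a in coeffs:
        total += a * power
        power *= kv
    return max(total, 0.0)  # fluence cannot be negative

def spectrum(coeff_table, kv):
    """coeff_table maps energy bin (keV) -> polynomial coefficients."""
    return {e: fluence(c, kv) for e, c in sorted(coeff_table.items())}

# Toy two-term polynomials; note the 90 keV bin correctly yields zero
# photons at 80 kV, since photon energy cannot exceed the tube voltage.
toy_table = {30: (0.0, 1.5), 60: (-40.0, 0.8), 90: (-100.0, 1.0)}
print(spectrum(toy_table, kv=80))  # {30: 120.0, 60: 24.0, 90: 0.0}
```

    The published model uses 131 such polynomials of two to four terms each, fitted to the measured Fewell spectra rather than to invented values.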

  12. Gene analogue finder: a GRID solution for finding functionally analogous gene products

    Directory of Open Access Journals (Sweden)

    Licciulli Flavio

    2007-09-01

    Full Text Available Abstract Background To date, more than 2.1 million gene products from more than 100,000 different species have been described, specifying their function, the processes they are involved in and their cellular localization using a very well defined and structured vocabulary, the Gene Ontology (GO). Such vast, well-defined knowledge opens the possibility of comparing gene products at the level of functionality, finding gene products which have a similar function or are involved in similar biological processes without relying on the conventional sequence-similarity approach. Comparisons within such a large space of knowledge are highly data- and computing-intensive. For this reason, this project was based upon the use of the computational GRID, a technology offering large computing and storage resources. Results We have developed a tool, GENe AnaloGue FINdEr (ENGINE), that parallelizes the search process and distributes the calculation and data over the computational GRID, splitting the process into many sub-processes and joining the calculation and the data on the same machine, thereby completing the whole search in about 3 days instead of occupying one single machine for more than 5 CPU-years. The results of the functional comparison contain potential functional analogues for more than 79,000 gene products from the most important species. 46% of the analyzed gene products are described well enough for such an analysis to identify functional analogues, such as well-known members of the same gene family, or gene products with similar functions which would never have been associated by standard methods. Conclusion ENGINE has produced a list of potential functionally analogous relations between gene products within and between species using, in place of the sequence, the gene description of the GO, thus demonstrating the potential of the GO. However, the current limiting factor is the quality of the associations of many gene products from non

  13. Exploring the relationship between sequence similarity and accurate phylogenetic trees.

    Science.gov (United States)

    Cantarel, Brandi L; Morrison, Hilary G; Pearson, William

    2006-11-01

    significantly decrease phylogenetic accuracy. In general, although less-divergent sequence families produce more accurate trees, the likelihood of estimating an accurate tree is most dependent on whether radiation in the family was ancient or recent. Accuracy can be improved by combining genes from the same organism when creating species trees or by selecting protein families with the best bootstrap values in comprehensive studies.

  14. Computer applications in nuclear medicine

    International Nuclear Information System (INIS)

    Lancaster, J.L.; Lasher, J.C.; Blumhardt, R.

    1987-01-01

    Digital computers were introduced to nuclear medicine research as an imaging modality in the mid-1960s. Widespread use of imaging computers (scintigraphic computers) was not seen in nuclear medicine clinics until the mid-1970s. For the user, the ability to acquire scintigraphic images into the computer for quantitative purposes, with accurate selection of regions of interest (ROIs), promised almost endless computational capabilities. Investigators quickly developed many new methods for quantitating the distribution patterns of radiopharmaceuticals within the body, both spatially and temporally. The computer was used to acquire data on practically every organ that could be imaged by means of gamma cameras or rectilinear scanners. Methods of image processing borrowed from other disciplines were applied to scintigraphic computer images in an attempt to improve image quality. Image processing in nuclear medicine has evolved into a relatively extensive set of tasks that can be called on by the user to provide additional clinical information rather than to improve image quality. Digital computers are utilized in nuclear medicine departments for nonimaging applications as well. Patient scheduling, archiving, radiopharmaceutical inventory, radioimmunoassay (RIA), and health physics are just a few of the areas in which the digital computer has proven helpful. The computer is useful in any area in which a large quantity of data needs to be accurately managed, especially over a long period of time.

  15. Computational method for discovery of estrogen responsive genes

    DEFF Research Database (Denmark)

    Tang, Suisheng; Tan, Sin Lam; Ramadoss, Suresh Kumar

    2004-01-01

    Estrogen has a profound impact on human physiology and affects numerous genes. The classical estrogen reaction is mediated by its receptors (ERs), which bind to the estrogen response elements (EREs) in target gene's promoter region. Due to tedious and expensive experiments, a limited number of hu...

  16. Global discriminative learning for higher-accuracy computational gene prediction.

    Directory of Open Access Journals (Sweden)

    Axel Bernal

    2007-03-01

    Full Text Available Most ab initio gene predictors use a probabilistic sequence model, typically a hidden Markov model, to combine separately trained models of genomic signals and content. By combining separate models of relevant genomic features, such gene predictors can exploit small training sets and incomplete annotations, and can be trained fairly efficiently. However, that type of piecewise training does not optimize prediction accuracy and has difficulty in accounting for statistical dependencies among different parts of the gene model. With genomic information being created at an ever-increasing rate, it is worth investigating alternative approaches in which many different types of genomic evidence, with complex statistical dependencies, can be integrated by discriminative learning to maximize annotation accuracy. Among discriminative learning methods, large-margin classifiers have become prominent because of the success of support vector machines (SVM in many classification tasks. We describe CRAIG, a new program for ab initio gene prediction based on a conditional random field model with semi-Markov structure that is trained with an online large-margin algorithm related to multiclass SVMs. Our experiments on benchmark vertebrate datasets and on regions from the ENCODE project show significant improvements in prediction accuracy over published gene predictors that use intrinsic features only, particularly at the gene level and on genes with long introns.
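The online large-margin training that the abstract mentions can be illustrated, in a heavily simplified form, by a multiclass perceptron that demands a margin before accepting a prediction. This is a toy sketch with made-up data, not CRAIG's actual semi-Markov CRF; it only conveys the flavor of the margin-violation update rule.

```python
import numpy as np

def train_margin_perceptron(X, y, n_classes, margin=1.0, epochs=20):
    """Online large-margin updates: if the true class does not beat all
    other classes by at least `margin`, nudge the weights toward the truth."""
    W = np.zeros((n_classes, X.shape[1]))
    for _ in range(epochs):
        for x, t in zip(X, y):
            scores = W @ x
            scores[t] -= margin              # require a margin for the true class
            pred = int(np.argmax(scores))
            if pred != t:                    # margin violated: update
                W[t] += x
                W[pred] -= x
    return W

# Three linearly separable toy "signal" classes
X = np.array([[2.0, 0.0], [1.9, 0.1], [0.0, 2.0],
              [0.1, 1.9], [-2.0, -2.0], [-1.9, -2.1]])
y = np.array([0, 0, 1, 1, 2, 2])
W = train_margin_perceptron(X, y, n_classes=3)
assert (np.argmax(X @ W.T, axis=1) == y).all()
```

In the actual paper the "classes" are segmentations of a DNA sequence into gene structures rather than single labels, but the update has the same shape: score the truth against the best competing structure and correct margin violations online.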

  17. The development of accurate data for the design of fast reactors

    International Nuclear Information System (INIS)

    Rossouw, P.A.

    1976-04-01

    The proposed use of nuclear power in the generation of electricity in South Africa, and the use of fast reactors in the country's nuclear program, requires a method for fast reactor evaluation. The availability of accurate neutron data and neutronics computation techniques for fast reactors is required for such an evaluation. The reactor physics and reactor parameters of importance in the evaluation of fast reactors are discussed, and computer programs for the computation of reactor spectra and reactor parameters from differential nuclear data are presented in this treatise. In endeavouring to increase the accuracy in fast reactor design, two methods for the improvement of differential nuclear data were developed and are discussed in detail. The computer programs which were developed for this purpose are also given. The neutron data of the most important fissionable and breeding nuclei (U-235, U-238, Pu-239 and Pu-240) are adjusted using both methods, and the improved neutron data are tested by computation with an advanced neutronics computer program. The improved and original neutron data are compared, and the use of the improved data in fast reactor design is discussed.

  18. Towards a scalable and accurate quantum approach for describing vibrations of molecule–metal interfaces

    Directory of Open Access Journals (Sweden)

    David M. Benoit

    2011-08-01

    Full Text Available We present a theoretical framework for the computation of anharmonic vibrational frequencies for large systems, with a particular focus on determining adsorbate frequencies from first principles. We give a detailed account of our local implementation of the vibrational self-consistent field approach and its correlation corrections. We show that our approach is robust and accurate, and can be easily deployed on computational grids to provide an efficient computational tool. We also present results on the vibrational spectrum of hydrogen fluoride on pyrene, on the thiophene molecule in the gas phase, and on small neutral gold clusters.

  19. Multiplex-PCR-Based Screening and Computational Modeling of Virulence Factors and T-Cell Mediated Immunity in Helicobacter pylori Infections for Accurate Clinical Diagnosis.

    Science.gov (United States)

    Oktem-Okullu, Sinem; Tiftikci, Arzu; Saruc, Murat; Cicek, Bahattin; Vardareli, Eser; Tozun, Nurdan; Kocagoz, Tanil; Sezerman, Ugur; Yavuz, Ahmet Sinan; Sayi-Yazgan, Ayca

    2015-01-01

    The outcome of H. pylori infection is closely related to the bacterium's virulence factors and the host immune response. The association between T cells and H. pylori infection has been identified, but the effects of the nine major H. pylori specific virulence factors (cagA, vacA, oipA, babA, hpaA, napA, dupA, ureA, ureB) on the T cell response in H. pylori infected patients have not been fully elucidated. We developed a multiplex-PCR assay to detect nine H. pylori virulence genes within three PCR reactions. Also, the expression levels of Th1, Th17 and Treg cell specific cytokines and transcription factors were detected by using qRT-PCR assays. Furthermore, a novel expert-derived model was developed to identify the set of factors and rules that can distinguish ulcer patients from gastritis patients. Among all the virulence factors that we tested, we identified, for the first time, a correlation between the presence of the napA virulence gene and ulcer disease. Additionally, a positive correlation between the H. pylori dupA virulence factor and IFN-γ, and between the H. pylori babA virulence factor and IL-17, was detected in gastritis and ulcer patients respectively. By using computer-based models, the clinical outcome of a patient infected with H. pylori can be predicted by screening the patient's H. pylori vacA m1/m2, ureA and cagA status and IFN-γ (Th1), IL-17 (Th17), and FOXP3 (Treg) expression levels. Herein, we report, for the first time, the relationship between H. pylori virulence factors and host immune responses for the diagnostic prediction of gastric diseases using computer-based models.

  20. Research progress in machine learning methods for gene-gene interaction detection.

    Science.gov (United States)

    Peng, Zhe-Ye; Tang, Zi-Jun; Xie, Min-Zhu

    2018-03-20

    Complex diseases result from gene-gene and gene-environment interactions. However, the detection of high-dimensional gene-gene interactions is computationally challenging. In the last two decades, machine-learning approaches have been developed to detect gene-gene interactions with some success. In this review, we summarize the progress in research on machine learning methods as applied to gene-gene interaction detection. We systematically examine the principles and limitations of the machine learning methods currently used in genome-wide association studies (GWAS) to detect gene-gene interactions, such as neural networks (NN), random forests (RF), support vector machines (SVM) and multifactor dimensionality reduction (MDR), and provide some insights into future research directions in the field.
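As a concrete illustration, multifactor dimensionality reduction (MDR), one of the methods such reviews cover, can be sketched in a few lines: pool the nine two-locus genotype cells into high- and low-risk groups by their case/control ratio, then score how well that pooling classifies samples. The data below are simulated and the scoring is deliberately minimal (no cross-validation), so treat this as a sketch of the idea rather than a faithful MDR implementation.

```python
import numpy as np

def mdr_pair_score(g1, g2, y):
    """Minimal MDR-style score for a SNP pair: label each two-locus
    genotype cell high-risk if its case fraction exceeds the overall
    case fraction, then count correctly classified samples."""
    overall = y.mean()
    correct = 0
    for a in range(3):
        for b in range(3):
            cell = (g1 == a) & (g2 == b)
            if not cell.any():
                continue
            high_risk = y[cell].mean() > overall
            # cases in high-risk cells and controls in low-risk cells are "correct"
            correct += np.sum(y[cell] == int(high_risk))
    return correct / len(y)

rng = np.random.default_rng(0)
n = 2000
snp1 = rng.integers(0, 3, n)
snp2 = rng.integers(0, 3, n)
noise = rng.integers(0, 3, n)
# simulated disease status depends only on the snp1 x snp2 interaction
y = ((snp1 * snp2) % 3 == 1).astype(int)
assert mdr_pair_score(snp1, snp2, y) > mdr_pair_score(snp1, noise, y)
```

Because the simulated phenotype is a pure interaction effect, the interacting pair scores higher than a pair involving an unrelated SNP, which is exactly the signal an exhaustive pairwise MDR scan looks for.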

  1. A copula method for modeling directional dependence of genes

    Directory of Open Access Journals (Sweden)

    Park Changyi

    2008-05-01

    Full Text Available Abstract Background Genes interact with each other as basic building blocks of life, forming a complicated network. The relationships between groups of genes with different functions can be represented as gene networks. With the deposition of huge microarray data sets in public domains, studies on gene networking are now possible. In recent years, there has been an increasing interest in the reconstruction of gene networks from gene expression data. Recent work includes linear models, Boolean network models, and Bayesian networks. Among them, Bayesian networks seem to be the most effective in constructing gene networks. A major problem with the Bayesian network approach is the excessive computational time. This problem is due to the iterative nature of the method, which requires a large search space. Since fitting a model by using copulas does not require iterations, elicitation of priors, or complicated calculations of posterior distributions, the need for reference to extensive search spaces can be eliminated, leading to manageable computational effort. The Bayesian network approach produces a discrete representation of conditional probabilities. Such discreteness is not required in the copula approach, which uses a uniform representation of the continuous random variables. Our method is thus able to overcome a limitation of the Bayesian network method for gene-gene interaction, i.e. the information loss due to binary transformation. Results We analyzed the gene interactions for two gene data sets (one group is eight histone genes and the other group is 19 genes which include DNA polymerases, DNA helicase, type B cyclin genes, DNA primases, radiation sensitive genes, repair-related genes, a replication protein A encoding gene, a DNA replication initiation factor, a securin gene, a nucleosome assembly factor, and a subunit of the cohesin complex) by adopting a measure of directional dependence based on a copula function. We have compared
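The directional-dependence idea can be sketched numerically: rank-transform each expression profile to uniform margins (the empirical copula scale), then compare how much of V's variance is explained by E[V|U] versus the reverse. This is a generic illustration on simulated data, using a simple binning estimator rather than the paper's specific copula model.

```python
import numpy as np

def pseudo_obs(x):
    """Rank-transform a sample to approximately Uniform(0,1) margins,
    i.e. map the data onto the empirical copula scale."""
    return (np.argsort(np.argsort(x)) + 1) / (len(x) + 1)

def directional_r2(u, v, bins=10):
    """Proxy for rho^2_{U->V}: Var(E[V|U]) / Var(V), estimated by binning U."""
    idx = np.minimum((u * bins).astype(int), bins - 1)
    cond_mean = np.array([v[idx == b].mean() for b in range(bins)])
    counts = np.array([(idx == b).sum() for b in range(bins)])
    between = np.sum(counts * (cond_mean - v.mean()) ** 2) / len(v)
    return between / v.var()

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = x ** 2 + 0.1 * rng.normal(size=2000)   # y is driven by x, not vice versa
u, v = pseudo_obs(x), pseudo_obs(y)
assert directional_r2(u, v) > 0.5 > directional_r2(v, u)
```

The asymmetry is the point: E[V|U] recovers the parabolic relationship, while E[U|V] is nearly constant because each value of y is compatible with both signs of x.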

  2. Can a numerically stable subgrid-scale model for turbulent flow computation be ideally accurate?: a preliminary theoretical study for the Gaussian filtered Navier-Stokes equations.

    Science.gov (United States)

    Ida, Masato; Taniguchi, Nobuyuki

    2003-09-01

    This paper introduces a candidate for the origin of the numerical instabilities in large eddy simulation repeatedly observed in academic and practical industrial flow computations. Without resorting to any subgrid-scale modeling, but based on a simple assumption regarding the streamwise component of flow velocity, it is shown theoretically that in a channel-flow computation, the application of Gaussian filtering to the incompressible Navier-Stokes equations yields a numerically unstable term, a cross-derivative term, which is similar to one appearing in the Gaussian filtered Vlasov equation derived by Klimas [J. Comput. Phys. 68, 202 (1987)] and also to one derived recently by Kobayashi and Shimomura [Phys. Fluids 15, L29 (2003)] from the tensor-diffusivity subgrid-scale term in a dynamic mixed model. The present result predicts that not only the numerical methods and the subgrid-scale models employed, but the applied filtering process alone, can be a seed of this numerical instability. An investigation of the relationship between turbulent energy scattering and the unstable term shows that the instability of the term does not necessarily represent the backscatter of kinetic energy, which has been considered a possible origin of numerical instabilities in large eddy simulation. The present findings raise the question of whether a numerically stable subgrid-scale model can be ideally accurate.

  3. GBOOST: a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies.

    Science.gov (United States)

    Yung, Ling Sing; Yang, Can; Wan, Xiang; Yu, Weichuan

    2011-05-01

    Collecting millions of genetic variations is feasible with advanced genotyping technology. With a huge amount of genetic variation data in hand, developing efficient algorithms to carry out gene-gene interaction analysis in a timely manner has become one of the key problems in genome-wide association studies (GWAS). Boolean operation-based screening and testing (BOOST), a recent work in GWAS, completes gene-gene interaction analysis in 2.5 days on a desktop computer. Compared with central processing units (CPUs), graphics processing units (GPUs) are highly parallel hardware and provide massive computing resources. We were, therefore, motivated to use GPUs to further speed up the analysis of gene-gene interactions. We implement the BOOST method on a GPU framework and name it GBOOST. GBOOST achieves a 40-fold speedup compared with BOOST. It completes the analysis of the Wellcome Trust Case Control Consortium Type 2 Diabetes (WTCCC T2D) genome data within 1.34 h on a desktop computer equipped with an Nvidia GeForce GTX 285 graphics card. GBOOST code is available at http://bioinformatics.ust.hk/BOOST.html#GBOOST.
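BOOST's speed (and GBOOST's mapping onto GPUs) comes from encoding each SNP's genotype column as Boolean bit-vectors, so that every cell of a pairwise contingency table reduces to a bitwise AND followed by a popcount. A minimal CPU-side sketch of that encoding, with hypothetical data and numpy standing in for GPU kernels:

```python
import numpy as np

def encode(g):
    """One packed bit-vector per genotype value (0/1/2), so that cell
    counts become bitwise AND followed by a popcount."""
    return [np.packbits(g == k) for k in range(3)]

def pair_table(e1, e2):
    """3x3 genotype contingency table for a SNP pair via AND + popcount."""
    return np.array([[np.unpackbits(e1[a] & e2[b]).sum() for b in range(3)]
                     for a in range(3)])

rng = np.random.default_rng(0)
g1, g2 = rng.integers(0, 3, 200), rng.integers(0, 3, 200)
tbl = pair_table(encode(g1), encode(g2))
assert tbl.sum() == 200                              # every sample in one cell
assert tbl[1, 2] == np.sum((g1 == 1) & (g2 == 2))    # matches direct counting
```

The payoff is that the bit-vectors are built once per SNP and reused across all pairs, turning the innermost loop of an all-pairs scan into a handful of word-level operations.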

  4. A graphical user interface for RAId, a knowledge integrated proteomics analysis suite with accurate statistics

    OpenAIRE

    Joyce, Brendan; Lee, Danny; Rubio, Alex; Ogurtsov, Aleksey; Alves, Gelio; Yu, Yi-Kuo

    2018-01-01

    Abstract Objective RAId is a software package that has been actively developed for the past 10 years for computationally and visually analyzing MS/MS data. Founded on rigorous statistical methods, RAId’s core program computes accurate E-values for peptides and proteins identified during database searches. Making this robust tool readily accessible for the proteomics community by developing a graphical user interface (GUI) is our main goa...

  5. Progress and challenges in the computational prediction of gene function using networks [v1; ref status: indexed, http://f1000r.es/SqmJUM]

    Directory of Open Access Journals (Sweden)

    Paul Pavlidis

    2012-09-01

    Full Text Available In this opinion piece, we attempt to unify recent arguments we have made that serious confounds affect the use of network data to predict and characterize gene function. The development of computational approaches to determine gene function is a major strand of computational genomics research. However, progress beyond using BLAST to transfer annotations has been surprisingly slow. We have previously argued that a large part of the reported success in using "guilt by association" in network data is due to the tendency of methods to simply assign new functions to already well-annotated genes. While such predictions will tend to be correct, they are generic; it is true, but not very helpful, that a gene with many functions is more likely to have any function. We have also presented evidence that much of the remaining performance in cross-validation cannot be usefully generalized to new predictions, making progressive improvement in analysis difficult to engineer. Here we summarize our findings about how these problems will affect network analysis, discuss some ongoing responses within the field to these issues, and consolidate some recommendations and speculation, which we hope will modestly increase the reliability and specificity of gene function prediction.
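The "guilt by association" baseline the authors critique is simple to state: score a gene for a function by the fraction of its network neighbors already annotated with that function. A minimal sketch with a hypothetical network (all gene names and edges invented for illustration):

```python
def neighbor_vote(adj, annotated, gene):
    """Guilt by association: fraction of `gene`'s neighbors that already
    carry the function's annotation (the set `annotated`)."""
    nbrs = adj[gene]
    return sum(n in annotated for n in nbrs) / len(nbrs) if nbrs else 0.0

# Hypothetical co-expression network (adjacency lists)
adj = {
    "geneA": ["geneB", "geneC"],
    "geneB": ["geneA"],
    "geneC": ["geneA", "geneD"],
    "geneD": ["geneC"],
}
annotated = {"geneB", "geneD"}            # genes known to carry the function
assert neighbor_vote(adj, annotated, "geneA") == 0.5
```

The confound the authors describe follows directly from this scoring: a well-connected, heavily annotated hub tends to receive high scores for almost any function, so apparent cross-validation success can reflect annotation density rather than genuine functional signal.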

  6. Integrated Translatome and Proteome: Approach for Accurate Portraying of Widespread Multifunctional Aspects of Trichoderma

    Science.gov (United States)

    Sharma, Vivek; Salwan, Richa; Sharma, P. N.; Gulati, Arvind

    2017-01-01

    Genome-wide studies of transcript expression help in the systematic monitoring of genes and allow targeting of candidate genes for future research. In contrast to relatively stable genomic data, the expression of genes is dynamic and regulated at different levels in both time and space. The variation in the rate of translation is specific for each protein. Both the inherent nature of an mRNA molecule to be translated and external environmental stimuli can affect the efficiency of the translation process. In biocontrol agents (BCAs), the molecular response at the translational level may represent a noise-like response of absolute transcript levels and an adaptive response to physiological and pathological situations, representing the subset of the mRNA population actively translated in a cell. The molecular responses of biocontrol are complex and involve multistage regulation of a number of genes. The use of high-throughput techniques has led to a rapid increase in the volume of transcriptomics data of Trichoderma. In general, almost half of the variation between transcriptome and protein levels is due to translational control. Thus, studies are required to integrate raw information from different “omics” approaches for an accurate depiction of the translational response of BCAs in interaction with plants and plant pathogens. Studies of the translational status of the actively translated mRNAs, bridged with proteome data, will help in the accurate characterization of the subset of mRNAs actively engaged in translation. This review highlights the associated bottlenecks and the use of state-of-the-art procedures in addressing the gap to accelerate future accomplishment of biocontrol mechanisms. PMID:28900417

  7. A Highly Accurate Approach for Aeroelastic System with Hysteresis Nonlinearity

    Directory of Open Access Journals (Sweden)

    C. C. Cui

    2017-01-01

    Full Text Available We propose an accurate approach, based on the precise integration method, to solve the aeroelastic system of an airfoil with a pitch hysteresis. A major procedure for achieving high precision is to design a predictor-corrector algorithm. This algorithm enables accurate determination of switching points resulting from the hysteresis. Numerical examples show that the results obtained by the presented method are in excellent agreement with exact solutions. In addition, the high accuracy can be maintained as the time step increases in a reasonable range. It is also found that the Runge-Kutta method may sometimes provide quite different and even fallacious results, though the step length is much less than that adopted in the presented method. With such high computational accuracy, the presented method could be applicable in dynamical systems with hysteresis nonlinearities.

  8. A computational methodology for formulating gasoline surrogate fuels with accurate physical and chemical kinetic properties

    KAUST Repository

    Ahmed, Ahfaz; Goteng, Gokop; Shankar, Vijai; Al-Qurashi, Khalid; Roberts, William L.; Sarathy, Mani

    2015-01-01

    simpler molecular composition that represent real fuel behavior in one or more aspects are needed to enable repeatable experimental and computational combustion investigations. This study presents a novel computational methodology for formulating

  9. Cone beam computed tomography: An accurate imaging technique in comparison with orthogonal portal imaging in intensity-modulated radiotherapy for prostate cancer

    Directory of Open Access Journals (Sweden)

    Om Prakash Gurjar

    2016-03-01

    Full Text Available Purpose: Various factors cause geometric uncertainties during prostate radiotherapy, including interfractional and intrafractional patient motion, organ motion, and daily setup errors. This may lead to increased normal tissue complications when a high dose to the prostate is administered. More accurate treatment delivery is possible with daily imaging and localization of the prostate. This study aims to measure the shift of the prostate by using kilovoltage (kV) cone beam computed tomography (CBCT) after position verification by kV orthogonal portal imaging (OPI). Methods: Position verification in 10 patients with prostate cancer was performed by using OPI followed by CBCT before treatment delivery in 25 sessions per patient. In each session, OPI was performed by using an on-board imaging (OBI) system and pelvic bone-to-pelvic bone matching was performed. After applying the shift noted by using OPI, CBCT was performed by using the OBI system and prostate-to-prostate matching was performed. The isocenter shifts along all three translational directions in both techniques were combined into a three-dimensional (3-D) iso-displacement vector (IDV). Results: The mean (SD) IDV (in centimeters) calculated during the 250 imaging sessions was 0.931 (0.598), median 0.825, for OPI and 0.515 (336), median 0.43, for CBCT; the p-value was less than 0.0001, an extremely statistically significant difference. Conclusion: Even after bone-to-bone matching by using OPI, a significant shift in the prostate was observed on CBCT. This study concludes that imaging with CBCT provides more accurate prostate localization than the OPI technique. Hence, CBCT should be chosen as the preferred imaging technique.
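The 3-D iso-displacement vector used to summarize the shifts is the Euclidean norm of the three translational components; a quick sketch (the shift values below are made up for illustration):

```python
import math

def idv(dx, dy, dz):
    """3-D iso-displacement vector magnitude (cm) from the lateral,
    longitudinal and vertical isocenter shifts."""
    return math.sqrt(dx ** 2 + dy ** 2 + dz ** 2)

# e.g. a 0.3 cm lateral and 0.4 cm longitudinal shift with no vertical shift
assert abs(idv(0.3, 0.4, 0.0) - 0.5) < 1e-12
```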

  10. Proteogenomics of rare taxonomic phyla: A prospective treasure trove of protein coding genes.

    Science.gov (United States)

    Kumar, Dhirendra; Mondal, Anupam Kumar; Kutum, Rintu; Dash, Debasis

    2016-01-01

    Sustainable innovations in sequencing technologies have resulted in a torrent of microbial genome sequencing projects. However, the prokaryotic genomes sequenced so far are unequally distributed along their phylogenetic tree; a few phyla contain the majority, and the rest have only a few representatives. Accurate genome annotation lags far behind genome sequencing. While automated computational prediction, aided by comparative genomics, remains a popular choice for genome annotation, a substantial fraction of these annotations is erroneous. Proteogenomics utilizes protein-level experimental observations to annotate protein coding genes on a genome-wide scale. Benefits of proteogenomics include the discovery and correction of gene annotations regardless of their phylogenetic conservation. This not only allows detection of common, conserved proteins but also the discovery of protein products of rare genes that may be horizontally transferred or taxonomy specific. The chances of encountering such genes are greater in rare phyla, which comprise a small number of complete genome sequences. We collated all bacterial and archaeal proteogenomic studies carried out to date and reviewed them in the context of genome sequencing projects. Here, we present a comprehensive list of microbial proteogenomic studies and their taxonomic distribution, and also urge for targeted proteogenomics of underexplored taxa to build an extensive reference of protein coding genes. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Exploiting Locality in Quantum Computation for Quantum Chemistry.

    Science.gov (United States)

    McClean, Jarrod R; Babbush, Ryan; Love, Peter J; Aspuru-Guzik, Alán

    2014-12-18

    Accurate prediction of chemical and material properties from first-principles quantum chemistry is a challenging task on traditional computers. Recent developments in quantum computation offer a route toward highly accurate solutions with polynomial cost; however, this solution still carries a large overhead. In this Perspective, we aim to bring together known results about the locality of physical interactions from quantum chemistry with ideas from quantum computation. We show that the utilization of spatial locality combined with the Bravyi-Kitaev transformation offers an improvement in the scaling of known quantum algorithms for quantum chemistry and provides numerical examples to help illustrate this point. We combine these developments to improve the outlook for the future of quantum chemistry on quantum computers.

  12. 40 CFR 194.23 - Models and computer codes.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...

  13. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    Science.gov (United States)

    Fauteux, François; Strömvik, Martina V

    2009-01-01

    Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses) using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination of conserved motifs.
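Scoring whole-genome promoter sets against a discovered motif, as the authors do, typically uses a position weight matrix (PWM) log-odds scan. The matrix below is a hypothetical, artificially sharp PWM for an RY-like core (consensus CATGCA), not one taken from the paper:

```python
import math

# Hypothetical PWM for an RY-like core (consensus CATGCA); each row gives
# base probabilities at one motif position. Values are illustrative only.
PWM = [
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # uniform base composition assumed for the background

def best_hit(promoter):
    """Slide the PWM along the promoter and return the best log-odds score."""
    w = len(PWM)
    return max(
        sum(math.log2(PWM[i][promoter[j + i]] / BACKGROUND) for i in range(w))
        for j in range(len(promoter) - w + 1)
    )

assert best_hit("TTTCATGCATTT") > 0 > best_hit("TTTTTTTTTTTT")
```

A promoter containing the motif scores well above the background odds, while one lacking it scores below; ranking whole-genome promoter sets by this score is how the highest-scoring promoters end up enriched for seed storage protein genes.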

  14. Seed storage protein gene promoters contain conserved DNA motifs in Brassicaceae, Fabaceae and Poaceae

    Directory of Open Access Journals (Sweden)

    Fauteux François

    2009-10-01

    Full Text Available Abstract Background Accurate computational identification of cis-regulatory motifs is difficult, particularly in eukaryotic promoters, which typically contain multiple short and degenerate DNA sequences bound by several interacting factors. Enrichment in combinations of rare motifs in the promoter sequence of functionally or evolutionarily related genes among several species is an indicator of conserved transcriptional regulatory mechanisms. This provides a basis for the computational identification of cis-regulatory motifs. Results We have used a discriminative seeding DNA motif discovery algorithm for an in-depth analysis of 54 seed storage protein (SSP) gene promoters from three plant families, namely Brassicaceae (mustards), Fabaceae (legumes) and Poaceae (grasses), using backgrounds based on complete sets of promoters from a representative species in each family, namely Arabidopsis (Arabidopsis thaliana (L.) Heynh.), soybean (Glycine max (L.) Merr.) and rice (Oryza sativa L.) respectively. We have identified three conserved motifs (two RY-like and one ACGT-like) in Brassicaceae and Fabaceae SSP gene promoters that are similar to experimentally characterized seed-specific cis-regulatory elements. Fabaceae SSP gene promoter sequences are also enriched in a novel, seed-specific E2Fb-like motif. Conserved motifs identified in Poaceae SSP gene promoters include a GCN4-like motif, two prolamin-box-like motifs and an Skn-1-like motif. Evidence of the presence of a variant of the TATA-box is found in the SSP gene promoters from the three plant families. Motifs discovered in SSP gene promoters were used to score whole-genome sets of promoters from Arabidopsis, soybean and rice. The highest-scoring promoters are associated with genes coding for different subunits or precursors of seed storage proteins. Conclusion Seed storage protein gene promoter motifs are conserved in diverse species, and different plant families are characterized by a distinct combination

  15. Species-independent MicroRNA Gene Discovery

    KAUST Repository

    Kamanu, Timothy K.

    2012-12-01

    MicroRNAs (miRNA) are a class of small endogenous non-coding RNA that act mainly as negative transcriptional and post-transcriptional regulators in both plants and animals. Recent studies have shown that miRNA are involved in different types of cancer and other incurable diseases such as autism and Alzheimer’s. Functional miRNAs are excised from hairpin-like sequences that are known as miRNA genes. There are about 21,000 known miRNA genes, most of which have been determined using experimental methods. miRNA genes are classified into different groups (miRNA families). This study reports about 19,000 previously unknown miRNA genes in nine species, of which approximately 15,300 predictions were computationally validated to contain at least one experimentally verified functional miRNA product. The predictions are based on a novel computational strategy which relies on miRNA family groupings and exploits the physics and geometry of miRNA genes to unveil the hidden palindromic signals and symmetries in miRNA gene sequences. Unlike conventional computational miRNA gene discovery methods, the algorithm developed here is species-independent: it allows prediction at higher accuracy and resolution from arbitrary RNA/DNA sequences in any species, and thus enables examination of repeat-prone genomic regions which are thought to be non-informative or ’junk’ sequences. The information non-redundancy of uni-directional RNA sequences compared to the information redundancy of bi-directional DNA is demonstrated, a fact that is overlooked by most pattern discovery algorithms. A novel method for computing upstream and downstream miRNA gene boundaries based on mathematical/statistical functions is suggested, as well as cutoffs for the annotation of miRNA genes in different miRNA families. Another tool is proposed to allow hypothesis generation and visualization of data matrices and the intra- and inter-species chromosomal distribution of miRNA genes or miRNA families. Our results indicate that: miRNA and mi
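The palindromic signal such hairpin-based approaches exploit is, at its core, a pair of reverse-complementary arms that lets a transcript fold back on itself. A brute-force sketch on a made-up sequence (real miRNA gene finders additionally score loop size, pairing energy, and tolerated mismatches):

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[b] for b in reversed(seq))

def hairpin_arms(seq, min_arm=4):
    """Return (start_5p, start_3p, arm_length) for the longest pair of
    reverse-complementary arms in `seq`, or None if no arm reaches
    `min_arm` -- the palindromic signature of a hairpin fold."""
    best = None
    for L in range(min_arm, len(seq) // 2 + 1):
        for i in range(len(seq) - 2 * L + 1):
            j = seq.find(revcomp(seq[i:i + L]), i + L)
            if j != -1:
                best = (i, j, L)
    return best

# 8-nt arms: AAGGCGTC ... GACGCCTT (its reverse complement), with a TTTT loop
assert hairpin_arms("AAGGCGTCTTTTGACGCCTT") == (0, 12, 8)
```

The quadratic scan above is only meant to make the geometric idea concrete; scaling it to genomic sequence would require the kind of family-informed, symmetry-aware strategy the thesis describes.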

  16. Computational Fluid Dynamics of Whole-Body Aircraft

    Science.gov (United States)

    Agarwal, Ramesh

    1999-01-01

    The current state of the art in computational aerodynamics for whole-body aircraft flowfield simulations is described. Recent advances in geometry modeling, surface and volume grid generation, and flow simulation algorithms have led to accurate flowfield predictions for increasingly complex and realistic configurations. As a result, computational aerodynamics has emerged as a crucial enabling technology for the design and development of flight vehicles. Examples illustrating the current capability for the prediction of transport and fighter aircraft flowfields are presented. Unfortunately, accurate modeling of turbulence remains a major difficulty in the analysis of viscosity-dominated flows. In the future, inverse design methods, multidisciplinary design optimization methods, artificial intelligence technology, and massively parallel computer technology will be incorporated into computational aerodynamics, opening up greater opportunities for improved product design at substantially reduced costs.

  17. On canonical cylinder sections for accurate determination of contact angle in microgravity

    Science.gov (United States)

    Concus, Paul; Finn, Robert; Zabihi, Farhad

    1992-01-01

    Large shifts of liquid arising from small changes in certain container shapes in zero gravity can be used as a basis for accurately determining contact angle. Canonical geometries for this purpose, recently developed mathematically, are investigated here computationally. It is found that the desired nearly-discontinuous behavior can be obtained and that the shifts of liquid have sufficient volume to be readily observed.

  18. Automated and Accurate Estimation of Gene Family Abundance from Shotgun Metagenomes.

    Directory of Open Access Journals (Sweden)

    Stephen Nayfach

    2015-11-01

    Full Text Available Shotgun metagenomic DNA sequencing is a widely applicable tool for characterizing the functions that are encoded by microbial communities. Several bioinformatic tools can be used to functionally annotate metagenomes, allowing researchers to draw inferences about the functional potential of the community and to identify putative functional biomarkers. However, little is known about how decisions made during annotation affect the reliability of the results. Here, we use statistical simulations to rigorously assess how to optimize annotation accuracy and speed, given parameters of the input data like read length and library size. We identify best practices in metagenome annotation and use them to guide the development of the Shotgun Metagenome Annotation Pipeline (ShotMAP). ShotMAP is an analytically flexible, end-to-end annotation pipeline that can be implemented either on a local computer or a cloud compute cluster. We use ShotMAP to assess how different annotation databases impact the interpretation of how marine metagenome and metatranscriptome functional capacity changes across seasons. We also apply ShotMAP to data obtained from a clinical microbiome investigation of inflammatory bowel disease. This analysis finds that gut microbiota collected from Crohn's disease patients are functionally distinct from gut microbiota collected from either ulcerative colitis patients or healthy controls, with differential abundance of metabolic pathways related to host-microbiome interactions that may serve as putative biomarkers of disease.

  19. Multiplex-PCR-Based Screening and Computational Modeling of Virulence Factors and T-Cell Mediated Immunity in Helicobacter pylori Infections for Accurate Clinical Diagnosis.

    Directory of Open Access Journals (Sweden)

    Sinem Oktem-Okullu

    Full Text Available The outcome of H. pylori infection is closely related to the bacterium's virulence factors and the host immune response. The association between T cells and H. pylori infection has been established, but the effects of the nine major H. pylori-specific virulence factors (cagA, vacA, oipA, babA, hpaA, napA, dupA, ureA, ureB) on the T-cell response in H. pylori-infected patients have not been fully elucidated. We developed a multiplex-PCR assay to detect the nine H. pylori virulence genes within three PCR reactions. In addition, the expression levels of Th1, Th17 and Treg cell-specific cytokines and transcription factors were measured by qRT-PCR. Furthermore, a novel expert-derived model was developed to identify a set of factors and rules that can distinguish ulcer patients from gastritis patients. Among the virulence factors tested, we identified, for the first time, a correlation between the presence of the napA virulence gene and ulcer disease. Additionally, positive correlations between the H. pylori dupA virulence factor and IFN-γ, and between the babA virulence factor and IL-17, were detected in gastritis and ulcer patients, respectively. Using computer-based models, the clinical outcome of a patient infected with H. pylori can be predicted by screening the patient's H. pylori vacA m1/m2, ureA and cagA status and IFN-γ (Th1), IL-17 (Th17), and FOXP3 (Treg) expression levels. Herein we report, for the first time, the relationship between H. pylori virulence factors and host immune responses for diagnostic prediction of gastric diseases using computer-based models.
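    The expert-derived rule model described in this abstract can be illustrated with a minimal sketch. The rules, thresholds, and scores below are hypothetical stand-ins for illustration only, not the fitted model from the study:

```python
# Illustrative sketch of an expert-derived rule model for predicting
# ulcer vs. gastritis from virulence-gene status and cytokine expression.
# All rules, weights, and thresholds here are HYPOTHETICAL.
def predict_outcome(patient):
    """Classify a patient dict as 'ulcer' or 'gastritis'.

    `patient` holds virulence-gene presence flags (e.g. 'napA', 'dupA')
    and relative expression levels for IFN-g (Th1) and FOXP3 (Treg)."""
    score = 0
    if patient.get("napA"):                                # napA correlated with ulcer
        score += 2
    if patient.get("dupA") and patient.get("IFNg", 0.0) > 1.5:
        score += 1                                         # dupA / IFN-g association
    if patient.get("FOXP3", 0.0) > 2.0:
        score -= 1                                         # hypothetical Treg effect
    return "ulcer" if score >= 2 else "gastritis"

print(predict_outcome({"napA": True, "dupA": False}))      # -> ulcer
```

A real model of this kind would be derived from, and validated against, the patient cohort rather than hand-set weights.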

  20. Prognostic breast cancer signature identified from 3D culture model accurately predicts clinical outcome across independent datasets

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Katherine J.; Patrick, Denis R.; Bissell, Mina J.; Fournier, Marcia V.

    2008-10-20

    One of the major tenets in breast cancer research is that early detection is vital for patient survival by increasing treatment options. To that end, we have previously used a novel unsupervised approach to identify a set of genes whose expression predicts prognosis of breast cancer patients. The predictive genes were selected in a well-defined three dimensional (3D) cell culture model of non-malignant human mammary epithelial cell morphogenesis as down-regulated during breast epithelial cell acinar formation and cell cycle arrest. Here we examine the ability of this gene signature (3D-signature) to predict prognosis in three independent breast cancer microarray datasets having 295, 286, and 118 samples, respectively. Our results show that the 3D-signature accurately predicts prognosis in three unrelated patient datasets. At 10 years, the probability of positive outcome was 52, 51, and 47 percent in the group with a poor-prognosis signature and 91, 75, and 71 percent in the group with a good-prognosis signature for the three datasets, respectively (Kaplan-Meier survival analysis, p<0.05). Hazard ratios for poor outcome were 5.5 (95% CI 3.0 to 12.2, p<0.0001), 2.4 (95% CI 1.6 to 3.6, p<0.0001) and 1.9 (95% CI 1.1 to 3.2, p = 0.016) and remained significant for the two larger datasets when corrected for estrogen receptor (ER) status. Hence the 3D-signature accurately predicts breast cancer outcome in both ER-positive and ER-negative tumors, though individual genes differed in their prognostic ability in the two subtypes. Genes that were prognostic in ER-positive patients are AURKA, CEP55, RRM2, EPHA2, FGFBP1, and VRK1, while genes prognostic in ER-negative patients include ACTB, FOXM1 and SERPINE2 (Kaplan-Meier p<0.05). Multivariable Cox regression analysis in the largest dataset showed that the 3D-signature was a strong independent factor in predicting breast cancer outcome. The 3D-signature accurately predicts breast cancer outcome across multiple datasets and holds prognostic value.
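    The 10-year outcome probabilities quoted above come from Kaplan-Meier survival analysis; a minimal product-limit estimator, run here on made-up data rather than the study's cohorts, looks like:

```python
# Minimal Kaplan-Meier (product-limit) estimator, pure Python.
# The follow-up times and event flags below are illustrative, not study data.
def kaplan_meier(times, events):
    """Return (time, survival) pairs; events[i]=1 for an observed event,
    0 for a censored observation."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    surv = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = 0
        n = at_risk                       # number at risk just before time t
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            at_risk -= 1
            i += 1
        if deaths:                        # censored-only times don't drop the curve
            surv *= 1.0 - deaths / n
            curve.append((t, surv))
    return curve

# Toy cohort: events at t=1, 2, 4; one censored subject at t=3.
print(kaplan_meier([1, 2, 3, 4], [1, 1, 0, 1]))
```

Reading the survival probability off the curve at t=10 years gives exactly the kind of figure reported in the abstract.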

  1. GeneNotes – A novel information management software for biologists

    Directory of Open Access Journals (Sweden)

    Wong Wing H

    2005-02-01

    Full Text Available Abstract Background Collecting and managing information is a challenging task in a genome-wide profiling research project. Most databases and online computational tools require direct human involvement. Information and computational results are presented in various multimedia formats (e.g., text, image, PDF, word files, etc.), many of which cannot be automatically processed by computers in biologically meaningful ways. In addition, the quality of computational results is far from perfect and requires nontrivial manual examination. The timely selection, integration and interpretation of heterogeneous biological information still heavily rely on the sensibility of biologists. Biologists often feel overwhelmed by the sheer volume and diversity of distributed heterogeneous biological information. Description We developed an information management application called GeneNotes. GeneNotes is the first application that allows users to collect and manage multimedia biological information about genes/ESTs. GeneNotes provides an integrated environment for users to surf the Internet, collect notes for genes/ESTs, and retrieve notes. GeneNotes is supported by a server that integrates gene annotations from many major databases (e.g., HGNC, MGI, etc.). GeneNotes uses the integrated gene annotations to (a) identify genes given various types of gene IDs (e.g., RefSeq ID, GenBank ID, etc.), and (b) provide quick views of genes. GeneNotes is free for academic usage. The program and the tutorials are available at: http://bayes.fas.harvard.edu/genenotes/. Conclusions GeneNotes provides a novel human-computer interface to assist researchers to collect and manage biological information. It also provides a platform for studying how users behave when they manipulate biological information. The results of such study can lead to innovation of more intelligent human-computer interfaces that greatly shorten the cycle of biology research.

  2. Thermal Conductivities in Solids from First Principles: Accurate Computations and Rapid Estimates

    Science.gov (United States)

    Carbogno, Christian; Scheffler, Matthias

    In spite of significant research efforts, a first-principles determination of the thermal conductivity κ at high temperatures has remained elusive. Boltzmann transport techniques that account for anharmonicity perturbatively become inaccurate under such conditions. Ab initio molecular dynamics (MD) techniques using the Green-Kubo (GK) formalism capture the full anharmonicity, but can become prohibitively costly to converge in time and size. We developed a formalism that accelerates such GK simulations by several orders of magnitude and thus enables their application within the limited time and length scales accessible in ab initio MD. For this purpose, we determine the effective harmonic potential occurring during the MD, along with the associated temperature-dependent phonon properties and lifetimes. Interpolation in reciprocal and frequency space then allows extrapolation to the macroscopic scale. For both force-field and ab initio MD, we validate this approach by computing κ for Si and ZrO2, two materials known for their particularly harmonic and anharmonic character, respectively. Eventually, we demonstrate how these techniques facilitate reasonable estimates of κ from existing MD calculations at virtually no additional computational cost.
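    The Green-Kubo formalism mentioned above obtains κ from the time integral of the heat-flux autocorrelation function. A minimal numerical sketch of that relation, with a synthetic flux series and a placeholder prefactor standing in for V/(kB·T²), is:

```python
# Sketch of the Green-Kubo relation: kappa is proportional to the time
# integral of the heat-flux autocorrelation function <J(0)J(t)>.
# The flux series here is synthetic, not an ab initio MD signal.
def autocorrelation(j, max_lag):
    """Unnormalized autocorrelation of series j for lags 0..max_lag-1."""
    n = len(j)
    mean = sum(j) / n
    jc = [x - mean for x in j]
    return [sum(jc[i] * jc[i + lag] for i in range(n - lag)) / (n - lag)
            for lag in range(max_lag)]

def green_kubo_kappa(j, dt, max_lag, prefactor=1.0):
    """kappa = prefactor * integral of <J(0)J(t)> dt (trapezoidal rule);
    `prefactor` stands in for V / (kB * T**2)."""
    acf = autocorrelation(j, max_lag)
    integral = dt * (sum(acf) - 0.5 * (acf[0] + acf[-1]))
    return prefactor * integral
```

The convergence problem the abstract addresses shows up here as the need for a very long, well-sampled flux series before the integral stabilizes.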

  3. A fast and accurate dihedral interpolation loop subdivision scheme

    Science.gov (United States)

    Shi, Zhuo; An, Yalei; Wang, Zhongshuai; Yu, Ke; Zhong, Si; Lan, Rushi; Luo, Xiaonan

    2018-04-01

    In this paper, we propose a fast and accurate dihedral interpolation Loop subdivision scheme for subdivision surfaces based on triangular meshes. To avoid surface shrinkage, we keep the limit condition unchanged. Extraordinary vertices are handled using modified Butterfly rules. Subdivision schemes are computationally costly because the number of faces grows exponentially at higher levels of subdivision. To address this problem, our approach uses local surface information to adaptively refine the model. This is achieved simply by changing the threshold value of the dihedral angle parameter, i.e., the angle between the normals of a triangular face and its adjacent faces. We then demonstrate the effectiveness of the proposed method on various 3D triangular meshes, and extensive experimental results show that it matches or exceeds the expected results at lower computational cost.

  4. Parente2: a fast and accurate method for detecting identity by descent

    KAUST Repository

    Rodriguez, Jesse M.; Bercovici, Sivan; Huang, Lin; Frostig, Roy; Batzoglou, Serafim

    2014-01-01

    Identity-by-descent (IBD) inference is the problem of establishing a genetic connection between two individuals through a genomic segment that is inherited by both individuals from a recent common ancestor. IBD inference is an important preceding step in a variety of population genomic studies, ranging from demographic studies to linking genomic variation with phenotype and disease. The problem of accurate IBD detection has become increasingly challenging with the availability of large collections of human genotypes and genomes: Given a cohort's size, a quadratic number of pairwise genome comparisons must be performed. Therefore, computation time and the false discovery rate can also scale quadratically. To enable accurate and efficient large-scale IBD detection, we present Parente2, a novel method for detecting IBD segments. Parente2 is based on an embedded log-likelihood ratio and uses a model that accounts for linkage disequilibrium by explicitly modeling haplotype frequencies. Parente2 operates directly on genotype data without the need to phase data prior to IBD inference. We evaluate Parente2's performance through extensive simulations using real data, and we show that it provides substantially higher accuracy compared to previous state-of-the-art methods while maintaining high computational efficiency.
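    A simplified illustration of a log-likelihood-ratio score for IBD detection follows. This sketch treats sites as independent with known allele frequencies, deliberately omitting the linkage-disequilibrium modeling and embedding that distinguish Parente2:

```python
import math

# Sketch of a log-likelihood-ratio (LLR) score for IBD detection.
# Simplification (NOT Parente2's model): sites are treated as independent
# with known alternate-allele frequency p, ignoring linkage disequilibrium.
def site_ll_ratio(g1, g2, p):
    """LLR contribution of one biallelic site; g1, g2 are 0/1/2 alt-allele counts."""
    q = 1.0 - p
    hw = {0: q * q, 1: 2 * p * q, 2: p * p}
    p_null = hw[g1] * hw[g2]          # independent Hardy-Weinberg draws

    def cond(shared, g):
        # P(genotype g | one allele equals `shared`); shared=1 means alt.
        other = g - shared
        if other not in (0, 1):
            return 0.0
        return p if other == 1 else q

    # P(g1, g2 | IBD): both individuals inherit one shared allele.
    p_ibd = sum((p if s else q) * cond(s, g1) * cond(s, g2) for s in (0, 1))
    if p_ibd == 0.0:
        return float("-inf")          # genotypes incompatible with sharing
    return math.log(p_ibd / p_null)

def window_llr(geno1, geno2, freqs):
    """Sum per-site LLRs over a genomic window; large positive values favor IBD."""
    return sum(site_ll_ratio(a, b, p) for a, b, p in zip(geno1, geno2, freqs))
```

Shared rare homozygotes push the score up sharply, which is the intuition behind why rare variants are so informative for IBD.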

  5. Efficient strategy for detecting gene × gene joint action and its application in schizophrenia

    NARCIS (Netherlands)

    Won, Sungho; Kwon, Min-Seok; Mattheisen, Manuel; Park, Suyeon; Park, Changsoon; Kihara, Daisuke; Cichon, Sven; Ophoff, Roel; Nöthen, Markus M.; Rietschel, Marcella; Baur, Max; Uitterlinden, Andre G.; Hofmann, A.; Lange, Christoph; Kahn, René S.; Linszen, Don H.; van Os, Jim; Wiersma, Durk; Bruggeman, Richard; Cahn, Wiepke; de Haan, Lieuwe; Krabbendam, Lydia; Myin-Germeys, Inez

    2014-01-01

    We propose a new approach to detect gene × gene joint action in genome-wide association studies (GWASs) for case-control designs. This approach offers an exhaustive search for all two-way joint action (including, as a special case, single gene action) that is computationally feasible at the genome-wide scale.

  6. Interrogating the topological robustness of gene regulatory circuits by randomization.

    Directory of Open Access Journals (Sweden)

    Bin Huang

    2017-03-01

    Full Text Available One of the most important roles of cells is performing their cellular tasks properly for survival. Cells usually achieve robust functionality, for example, cell-fate decision-making and signal transduction, through multiple layers of regulation involving many genes. Despite the combinatorial complexity of gene regulation, its quantitative behavior has been typically studied on the basis of experimentally verified core gene regulatory circuitry, composed of a small set of important elements. It is still unclear how such a core circuit operates in the presence of many other regulatory molecules and in a crowded and noisy cellular environment. Here we report a new computational method, named random circuit perturbation (RACIPE), for interrogating the robust dynamical behavior of a gene regulatory circuit even without accurate measurements of circuit kinetic parameters. RACIPE generates an ensemble of random kinetic models corresponding to a fixed circuit topology, and utilizes statistical tools to identify generic properties of the circuit. By applying RACIPE to simple toggle-switch-like motifs, we observed that the stable states of all models converge to experimentally observed gene state clusters even when the parameters are strongly perturbed. RACIPE was further applied to a proposed 22-gene network of the Epithelial-to-Mesenchymal Transition (EMT), from which we identified four experimentally observed gene states, including the states that are associated with two different types of hybrid Epithelial/Mesenchymal phenotypes. Our results suggest that the dynamics of a gene circuit are mainly determined by its topology, not by detailed circuit parameters. Our work provides a theoretical foundation for circuit-based systems biology modeling. We anticipate RACIPE to be a powerful tool to predict and decode circuit design principles in an unbiased manner, and to quantitatively evaluate the robustness and heterogeneity of gene expression.
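    The core idea of randomizing kinetic parameters over a fixed topology can be sketched for a two-gene toggle switch (mutual repression via Hill functions). The parameter ranges, integration settings, and state labels below are illustrative, not those of the published method:

```python
import random

# Toy version of the RACIPE idea: draw random kinetic parameters for a
# two-gene toggle switch, find the stable state each model settles into
# from different initial conditions, and tally the qualitative states.
def hill_repress(x, k, n):
    """Repressive Hill function: 1 at x=0, falling toward 0 for x >> k."""
    return 1.0 / (1.0 + (x / k) ** n)

def steady_state(params, x0, y0, dt=0.05, steps=4000):
    """Forward-Euler integration of dx/dt = gx*H(y) - dx*x (and symmetrically y)."""
    gx, gy, kx, ky, n, dx, dy = params
    x, y = x0, y0
    for _ in range(steps):
        x += dt * (gx * hill_repress(y, ky, n) - dx * x)
        y += dt * (gy * hill_repress(x, kx, n) - dy * y)
    return x, y

def racipe_sample(n_models=200, seed=0):
    """Count how often each qualitative state (A-high vs. B-high) appears
    across an ensemble of randomly parameterized models."""
    rng = random.Random(seed)
    counts = {"A-high": 0, "B-high": 0}
    for _ in range(n_models):
        params = (rng.uniform(1, 10), rng.uniform(1, 10),    # production rates
                  rng.uniform(0.5, 5), rng.uniform(0.5, 5),  # Hill thresholds
                  rng.choice([2, 3, 4]),                     # Hill coefficient
                  1.0, 1.0)                                  # degradation rates
        for x0, y0 in [(10.0, 0.0), (0.0, 10.0)]:            # two initial conditions
            x, y = steady_state(params, x0, y0)
            counts["A-high" if x > y else "B-high"] += 1
    return counts
```

As the abstract argues, the two mutually exclusive states keep appearing across the whole random ensemble: the topology, not any particular parameter set, pins down the qualitative behavior.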

  7. Accurate Calculations of Rotationally Inelastic Scattering Cross Sections Using Mixed Quantum/Classical Theory.

    Science.gov (United States)

    Semenov, Alexander; Babikov, Dmitri

    2014-01-16

    For computational treatment of rotationally inelastic scattering of molecules, we propose to use the mixed quantum/classical theory, MQCT. The old idea of treating translational motion classically, while quantum mechanics is used for rotational degrees of freedom, is developed to the new level and is applied to Na + N2 collisions in a broad range of energies. Comparison with full-quantum calculations shows that MQCT accurately reproduces all, even minor, features of energy dependence of cross sections, except scattering resonances at very low energies. The remarkable success of MQCT opens up wide opportunities for computational predictions of inelastic scattering cross sections at higher temperatures and/or for polyatomic molecules and heavier quenchers, which is computationally close to impossible within the full-quantum framework.

  8. Accurate and computationally efficient prediction of thermochemical properties of biomolecules using the generalized connectivity-based hierarchy.

    Science.gov (United States)

    Sengupta, Arkajyoti; Ramabhadran, Raghunath O; Raghavachari, Krishnan

    2014-08-14

    In this study we have used the connectivity-based hierarchy (CBH) method to derive accurate heats of formation of a range of biomolecules, 18 amino acids and 10 barbituric acid/uracil derivatives. The hierarchy is based on the connectivity of the different atoms in a large molecule. It results in error-cancellation reaction schemes that are automated, general, and can be readily used for a broad range of organic molecules and biomolecules. Herein, we first locate stable conformational and tautomeric forms of these biomolecules using an accurate level of theory (viz. CCSD(T)/6-311++G(3df,2p)). Subsequently, the heats of formation of the amino acids are evaluated using the CBH-1 and CBH-2 schemes and routinely employed density functionals or wave function-based methods. The heats of formation obtained herein using modest levels of theory are in very good agreement with those obtained using the more expensive W1-F12 and W2-F12 methods for amino acids and with G3 results for barbituric acid derivatives. Overall, the present study (a) highlights the small effect of including multiple conformers in determining the heats of formation of biomolecules and (b) in concurrence with previous CBH studies, proves that use of the more effective error-cancelling isoatomic scheme (CBH-2) results in more accurate heats of formation with modestly sized basis sets along with common density functionals or wave function-based methods.

  9. Accurate calculations of bound rovibrational states for argon trimer

    Energy Technology Data Exchange (ETDEWEB)

    Brandon, Drew; Poirier, Bill [Department of Chemistry and Biochemistry, and Department of Physics, Texas Tech University, Box 41061, Lubbock, Texas 79409-1061 (United States)

    2014-07-21

    This work presents a comprehensive quantum dynamics calculation of the bound rovibrational eigenstates of argon trimer (Ar₃), using the ScalIT suite of parallel codes. The Ar₃ rovibrational energy levels are computed to a very high level of accuracy (10⁻³ cm⁻¹ or better), and up to the highest rotational and vibrational excitations for which bound states exist. For many of these rovibrational states, wavefunctions are also computed. Rare gas clusters such as Ar₃ are interesting because the interatomic interactions manifest through long-range van der Waals forces, rather than through covalent chemical bonding. As a consequence, they exhibit strong Coriolis coupling between the rotational and vibrational degrees of freedom, as well as highly delocalized states, all of which renders accurate quantum dynamical calculation difficult. Moreover, with its (comparatively) deep potential well and heavy masses, Ar₃ is an especially challenging rare gas trimer case. There are a great many rovibrational eigenstates to compute, and a very high density of states. Consequently, very few previous rovibrational state calculations for Ar₃ may be found in the current literature, and only for the lowest-lying rotational excitations.

  10. On the accurate fast evaluation of finite Fourier integrals using cubic splines

    International Nuclear Information System (INIS)

    Morishima, N.

    1993-01-01

    Finite Fourier integrals based on a cubic-splines fit to equidistant data are shown to be evaluated fast and accurately. Good performance, especially on computational speed, is achieved by the optimization of the spline fit and the internal use of the fast Fourier transform (FFT) algorithm for complex data. The present procedure provides high accuracy with much shorter CPU time than a trapezoidal FFT. (author)
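    The underlying idea, integrating a smooth interpolant exactly against the oscillatory kernel, can be sketched with a piecewise-linear interpolant in place of the cubic spline, and without the FFT acceleration the abstract describes:

```python
import cmath

# Filon-type evaluation of a finite Fourier integral: on each subinterval
# the interpolant is integrated against exp(i*w*x) in closed form, so the
# accuracy does not collapse for large w the way plain trapezoidal
# quadrature of the oscillatory integrand does. For brevity this sketch
# uses a piecewise-LINEAR interpolant rather than the paper's cubic spline.
def fourier_integral_linear(xs, fs, w):
    """Approximate the integral of f(x)*exp(i*w*x) over [xs[0], xs[-1]]
    from samples fs = f(xs); requires w != 0 (xs need not be equidistant)."""
    iw = 1j * w
    total = 0.0 + 0.0j
    for x0, x1, f0, f1 in zip(xs, xs[1:], fs, fs[1:]):
        h = x1 - x0
        m = (f1 - f0) / h                # slope of the linear interpolant
        e0, e1 = cmath.exp(iw * x0), cmath.exp(iw * x1)
        i0 = (e1 - e0) / iw              # closed form: integral of exp(i*w*x)
        i1 = h * e1 / iw - i0 / iw       # closed form: integral of (x-x0)*exp(i*w*x)
        total += f0 * i0 + m * i1
    return total
```

Because the interpolant is integrated exactly, the result is exact whenever f itself is piecewise linear on the grid; the cubic-spline version of the abstract extends the same construction to a smoother interpolant.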

  11. Spatial reconstruction of single-cell gene expression

    Science.gov (United States)

    Satija, Rahul; Farrell, Jeffrey A.; Gennert, David; Schier, Alexander F.; Regev, Aviv

    2015-01-01

    Spatial localization is a key determinant of cellular fate and behavior, but spatial RNA assays traditionally rely on staining for a limited number of RNA species. In contrast, single-cell RNA-seq allows for deep profiling of cellular gene expression, but established methods separate cells from their native spatial context. Here we present Seurat, a computational strategy to infer cellular localization by integrating single-cell RNA-seq data with in situ RNA patterns. We applied Seurat to spatially map 851 single cells from dissociated zebrafish (Danio rerio) embryos, inferring a transcriptome-wide map of spatial patterning. We confirmed Seurat’s accuracy using several experimental approaches, and used it to identify a set of archetypal expression patterns and spatial markers. Additionally, Seurat correctly localizes rare subpopulations, accurately mapping both spatially restricted and scattered groups. Seurat will be applicable to mapping cellular localization within complex patterned tissues in diverse systems. PMID:25867923

  12. Computer tomography in otolaryngology

    International Nuclear Information System (INIS)

    Gradzki, J.

    1981-01-01

    The principles of design and operation of computer tomography, which has also been applied to the diagnosis of nose, ear and throat diseases, are discussed. Computer tomography makes possible visualization of the structures of the nose, nasal sinuses and facial skeleton in transverse and coronal planes. The method enables an accurate evaluation of the position and size of neoplasms in these regions and differentiation of inflammatory exudates from malignant masses. In otology, computer tomography is used particularly in the diagnosis of pontocerebellar angle tumours and otogenic brain abscesses. Computer tomography of the larynx and pharynx provides new diagnostic data owing to the possibility of obtaining transverse sections and visualizing cartilage. Computer tomograms of some cases are presented. (author)

  13. Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

    Science.gov (United States)

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

    2013-01-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
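    The pooling-and-deconvolution idea can be sketched in miniature: give each clone a unique subset of pools, then attribute a read to the clone whose pool signature matches the set of pools the read appears in. The layout below is a toy illustration, not the protocol's actual design:

```python
from itertools import combinations

# Toy illustration of combinatorial pooling and deconvolution: each BAC
# clone is assigned a distinct subset ("signature") of pools; a read is
# attributed to the clone whose signature equals the set of pools in
# which the read was observed.
def make_signatures(n_clones, n_pools, pools_per_clone):
    """Assign each clone a distinct combination of pools."""
    combos = combinations(range(n_pools), pools_per_clone)
    return {clone: frozenset(next(combos)) for clone in range(n_clones)}

def deconvolve(read_pools, signatures):
    """Return the clone whose signature equals the read's pool set, else None."""
    read_set = frozenset(read_pools)
    for clone, sig in signatures.items():
        if sig == read_set:
            return clone
    return None                      # ambiguous or chimeric observation

# 6 clones fit into 4 pools with 2 pools per clone (C(4,2) = 6 signatures).
sigs = make_signatures(6, 4, 2)
print(deconvolve([0, 2], sigs))
```

The real computational challenge the abstract highlights is doing this matching, tolerant of sequencing error and repeats, for hundreds of millions of reads at once.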

  14. Combinatorial pooling enables selective sequencing of the barley gene space.

    Directory of Open Access Journals (Sweden)

    Stefano Lonardi

    2013-04-01

    Full Text Available For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  15. Combinatorial pooling enables selective sequencing of the barley gene space.

    Science.gov (United States)

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J

    2013-04-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered due to the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  16. Fast and accurate CMB computations in non-flat FLRW universes

    Science.gov (United States)

    Lesgourgues, Julien; Tram, Thomas

    2014-09-01

    We present a new method for calculating CMB anisotropies in a non-flat Friedmann universe, relying on a very stable algorithm for the calculation of hyperspherical Bessel functions, that can be pushed to arbitrary precision levels. We also introduce a new approximation scheme which gradually takes over in the flat space limit and leads to significant reductions of the computation time. Our method is implemented in the Boltzmann code class. It can be used to benchmark the accuracy of the camb code in curved space, which is found to match expectations. For default precision settings, corresponding to 0.1% for scalar temperature spectra and 0.2% for scalar polarisation spectra, our code is two to three times faster, depending on curvature. We also simplify the temperature and polarisation source terms significantly, so the different contributions to the Cℓ's are easy to identify inside the code.

  17. Fast and accurate CMB computations in non-flat FLRW universes

    International Nuclear Information System (INIS)

    Lesgourgues, Julien; Tram, Thomas

    2014-01-01

    We present a new method for calculating CMB anisotropies in a non-flat Friedmann universe, relying on a very stable algorithm for the calculation of hyperspherical Bessel functions, that can be pushed to arbitrary precision levels. We also introduce a new approximation scheme which gradually takes over in the flat space limit and leads to significant reductions of the computation time. Our method is implemented in the Boltzmann code class. It can be used to benchmark the accuracy of the camb code in curved space, which is found to match expectations. For default precision settings, corresponding to 0.1% for scalar temperature spectra and 0.2% for scalar polarisation spectra, our code is two to three times faster, depending on curvature. We also simplify the temperature and polarisation source terms significantly, so the different contributions to the Cℓ's are easy to identify inside the code.

  18. A literature search tool for intelligent extraction of disease-associated genes.

    Science.gov (United States)

    Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2014-01-01

    To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.
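    A rule-based keyword-matching extraction step of the kind described can be sketched as follows; the gene list and association patterns below are illustrative stand-ins for the tool's curated dictionaries:

```python
import re

# Minimal sketch of rule-based gene-disorder extraction via keyword
# matching. The gene list and association patterns are ILLUSTRATIVE,
# not the dictionaries used by the published tool.
GENE_LIST = {"BRCA1", "MECP2", "SHANK3"}
ASSOCIATION_PATTERNS = [
    r"associated with",
    r"linked to",
    r"mutations? in",
]

def extract_gene_mentions(abstract, disorder):
    """Return genes co-mentioned with `disorder` in sentences that also
    contain an association keyword."""
    hits = set()
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        low = sentence.lower()
        if disorder.lower() in low and any(re.search(p, low) for p in ASSOCIATION_PATTERNS):
            for gene in GENE_LIST:
                if re.search(r"\b" + gene + r"\b", sentence):
                    hits.add(gene)
    return hits

text = "Mutations in MECP2 are associated with Rett syndrome. SHANK3 was not studied."
print(extract_gene_mentions(text, "Rett syndrome"))
```

The production system described in the abstract layers study-type detection and significance filtering on top of this kind of sentence-level matching.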

  19. Allele-sharing models: LOD scores and accurate linkage tests.

    Science.gov (United States)

    Kong, A; Cox, N J

    1997-11-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.
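    A one-parameter allele-sharing model in this spirit can be sketched as follows: each family contributes a normalized sharing score Z_i, the likelihood ratio is the product of (1 + δ·Z_i), and the LOD score maximizes the log10 ratio over δ. The grid search below is a simplification of the exact calculation the abstract describes:

```python
import math

# Sketch of a one-parameter allele-sharing LOD score: LR(delta) is the
# product over families of (1 + delta * Z_i), maximized over delta >= 0.
# The grid search and score range here are illustrative simplifications.
def lod_score(z_scores, delta_grid=None):
    """Return max over delta of sum_i log10(1 + delta * Z_i), floored at 0."""
    if delta_grid is None:
        delta_grid = [i / 1000.0 for i in range(0, 501)]
    best = 0.0                                   # delta = 0 gives LOD 0
    for delta in delta_grid:
        if any(1 + delta * z <= 0 for z in z_scores):
            continue                             # delta outside the allowed range
        lod = sum(math.log10(1 + delta * z) for z in z_scores)
        best = max(best, lod)
    return best
```

Positive sharing scores across families drive the LOD up; with no excess sharing the maximizing δ is 0 and the LOD stays at 0, matching the one-sided nature of the test.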

  20. Accurate phylogenetic classification of DNA fragments based onsequence composition

    Energy Technology Data Exchange (ETDEWEB)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
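    PhyloPythia's trained SVM classifier is not reproduced here, but the sequence-composition features such classifiers operate on can be sketched as normalized k-mer frequency vectors (the function name and k choice are illustrative):

```python
from collections import Counter
from itertools import product

def kmer_profile(seq, k=4):
    """Normalized k-mer frequency vector over a fixed ACGT key order.
    Composition classifiers such as PhyloPythia feed vectors like this
    to a trained classifier; the classifier itself is not reproduced."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # ignore k-mers containing ambiguity codes (N, etc.) when normalizing
    valid = sum(v for kmer, v in counts.items() if set(kmer) <= set("ACGT"))
    keys = ["".join(p) for p in product("ACGT", repeat=k)]
    if not valid:
        return [0.0] * len(keys)
    return [counts[kmer] / valid for kmer in keys]

profile = kmer_profile("ACGTACGTACGT", k=2)
print(len(profile), round(sum(profile), 6))  # 16 dimensions summing to 1.0
```

    Fragments from the same genome tend to have similar profiles, which is what makes composition-based binning of even 1 kb fragments feasible.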

  1. Early gene regulation of osteogenesis in embryonic stem cells

    KAUST Repository

    Kirkham, Glen R.; Lovrics, Anna; Byrne, Helen M.; Jensen, Oliver E.; King, John R.; Shakesheff, Kevin M.; Buttery, Lee D. K.

    2012-01-01

    The early gene regulatory networks (GRNs) that mediate stem cell differentiation are complex, and the underlying regulatory associations can be difficult to map accurately. In this study, the expression profiles of the genes Dlx5, Msx2 and Runx2

  2. Genome-wide prediction and analysis of human tissue-selective genes using microarray expression data

    Directory of Open Access Journals (Sweden)

    Teng Shaolei

    2013-01-01

    Full Text Available Abstract Background Understanding how genes are expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time consuming and difficult. Accurate predictions of tissue-specific gene targets could provide useful information for biomarker development and drug target identification. Results In this study, we have developed a machine learning approach for predicting human tissue-specific genes using microarray expression data. The lists of known tissue-specific genes for different tissues were collected from the UniProt database, and the expression data retrieved from the previously compiled dataset according to the lists were used for input vector encoding. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct accurate classifiers. The RF classifiers were found to outperform SVM models for tissue-specific gene prediction. The results suggest that the candidate genes for brain- or liver-specific expression can provide valuable information for further experimental studies. Our approach was also applied for identifying tissue-selective gene targets for different types of tissues. Conclusions A machine learning approach has been developed for accurately identifying the candidate genes for tissue specific/selective expression. The approach provides an efficient way to select some interesting genes for developing new biomedical markers and improve our knowledge of tissue-specific expression.
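    The study trains Random Forest and SVM classifiers on expression-based input vectors; scikit-learn's RandomForestClassifier or SVC would be the natural tools for that. As a dependency-free stand-in, a nearest-centroid classifier illustrates the same encode-then-classify flow (the data and labels below are toy values, not the UniProt-derived sets):

```python
def fit_centroids(X, y):
    """Per-class mean expression vectors (centroids)."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in by_class.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], row))

# Toy encoding: each row is one gene's expression across four tissues
X = [[9.1, 8.7, 1.2, 1.0],   # tissue-specific profiles
     [8.8, 9.3, 0.9, 1.1],
     [1.0, 1.2, 1.1, 0.9],   # non-specific profiles
     [0.8, 1.1, 1.0, 1.2]]
y = ["specific", "specific", "other", "other"]
model = fit_centroids(X, y)
print(predict(model, [9.0, 9.0, 1.0, 1.0]))  # → specific
```

    Swapping the centroid model for an RF or SVM changes only the fit/predict calls; the expression-vector encoding step stays the same.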

  3. The joint effects of background selection and genetic recombination on local gene genealogies.

    Science.gov (United States)

    Zeng, Kai; Charlesworth, Brian

    2011-09-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.
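    The paper's structured coalescent tracks deleterious mutation classes explicitly; the crude classical summary it improves upon is that background selection acts like a reduction of the effective population size to B·Ne. That summary can be sketched for a pair of lineages (the B and Ne values are illustrative):

```python
import random

def mean_pairwise_coalescent_time(n_e, b, n_reps=20000, seed=1):
    """Mean coalescence time (in generations) for two lineages when
    background selection is summarized as a reduction of the effective
    population size to B * Ne (0 < B <= 1). The paper's structured model
    tracks deleterious mutation classes explicitly; this is only the
    classical effective-size summary."""
    rng = random.Random(seed)
    scale = 2 * n_e * b  # expected pairwise coalescent time, 2*B*Ne
    return sum(rng.expovariate(1 / scale) for _ in range(n_reps)) / n_reps

t_neutral = mean_pairwise_coalescent_time(10000, b=1.0)
t_bgs = mean_pairwise_coalescent_time(10000, b=0.8)
print(t_bgs < t_neutral)  # background selection shortens coalescence → True
```

    The abstract's point is precisely that a single B factor is not enough for whole genealogies (e.g. linkage disequilibrium and interior-versus-boundary effects), which is why the structured model is needed.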

  4. Identification and Evaluation of Reliable Reference Genes for Quantitative Real-Time PCR Analysis in Tea Plant (Camellia sinensis (L.) O. Kuntze)

    Science.gov (United States)

    Hao, Xinyuan; Horvath, David P.; Chao, Wun S.; Yang, Yajun; Wang, Xinchao; Xiao, Bin

    2014-01-01

    Reliable reference gene selection for the accurate quantification of gene expression under various experimental conditions is a crucial step in qRT-PCR normalization. To date, only a few housekeeping genes have been identified and used as reference genes in tea plant. The validity of those reference genes is not clear, since their expression stabilities have not been rigorously examined. To identify more appropriate reference genes for qRT-PCR studies on tea plant, we examined the expression stability of 11 candidate reference genes from three different sources: the orthologs of traditional Arabidopsis reference genes, stably expressed genes identified from whole-genome GeneChip studies, and three housekeeping genes commonly used in tea plant research. We evaluated the transcript levels of these genes in 94 experimental samples. The expression stabilities of these 11 genes were ranked using four different computational programs: geNorm, NormFinder, BestKeeper, and the comparative ∆CT method. Results showed that the three commonly used housekeeping genes CsTUBULIN1, CsACTIN1 and Cs18S rRNA1, together with CsUBQ1, were the most unstable genes across all sample ranking orders, whereas CsPTB1, CsEF1, CsSAND1, CsCLATHRIN1 and CsUBC1 were the top five appropriate reference genes for qRT-PCR analysis under complex experimental conditions. PMID:25474086
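    Of the four ranking methods mentioned, the comparative ΔCT method is simple enough to sketch: a candidate is stable if its Ct difference to every other candidate varies little across samples. The Ct values below are hypothetical, chosen only to show the ranking behaviour:

```python
import statistics

def delta_ct_stability(ct):
    """Comparative delta-Ct stability: for each gene, the mean standard
    deviation of its pairwise delta-Ct against every other candidate
    across samples. Lower scores mean more stable expression; the
    returned list is ordered most stable first."""
    genes = list(ct)
    scores = {}
    for g in genes:
        sds = [statistics.stdev([a - b for a, b in zip(ct[g], ct[h])])
               for h in genes if h != g]
        scores[g] = sum(sds) / len(sds)
    return sorted(genes, key=scores.get)

# Hypothetical Ct values for three candidates across four samples
ct = {"CsPTB1": [20.1, 20.2, 20.0, 20.1],
      "CsEF1":  [22.0, 22.1, 21.9, 22.0],
      "CsUBQ1": [18.0, 19.5, 17.2, 20.3]}
print(delta_ct_stability(ct)[-1])  # least stable candidate → CsUBQ1
```

    geNorm, NormFinder and BestKeeper use different stability measures, which is why the study cross-checks all four before settling on a ranking.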

  5. Organic Computing

    CERN Document Server

    Würtz, Rolf P

    2008-01-01

    Organic Computing is a research field emerging around the conviction that problems of organization in complex systems in computer science, telecommunications, neurobiology, molecular biology, ethology, and possibly even sociology can be tackled scientifically in a unified way. From the computer science point of view, the apparent ease with which living systems solve computationally difficult problems makes it inevitable to adopt strategies observed in nature for creating information processing machinery. In this book, the major ideas behind Organic Computing are delineated, together with a sparse sample of computational projects undertaken in this new field. Biological metaphors include evolution, neural networks, gene-regulatory networks, networks of brain modules, hormone system, insect swarms, and ant colonies. Applications are as diverse as system design, optimization, artificial growth, task allocation, clustering, routing, face recognition, and sign language understanding.

  6. Computed Tomography (CT) -- Sinuses

    Medline Plus

    Full Text Available ... When the image slices are reassembled by computer software, the result is a very detailed multidimensional view ... accurate. A major advantage of CT is its ability to image bone, soft tissue and blood vessels ...

  7. Computed Tomography (CT) -- Head

    Medline Plus

    Full Text Available ... When the image slices are reassembled by computer software, the result is a very detailed multidimensional view ... accurate. A major advantage of CT is its ability to image bone, soft tissue and blood vessels ...

  8. ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

    Science.gov (United States)

    Khan, Haseeb Ahmad

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-group comparison of microarray data is still lacking, and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-group comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas ArraySolver appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, a convenient report format, accurate statistics and the familiar Excel platform.
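    ArraySolver's Excel implementation is not reproduced here, but the core Wilcoxon signed-rank statistic it applies to paired expression values can be sketched directly: rank the absolute paired differences (dropping zeros, averaging tied ranks) and sum the ranks of the positive differences. The expression values below are hypothetical:

```python
def signed_rank_statistic(x, y):
    """Wilcoxon signed-rank W+: rank the absolute paired differences
    (zeros dropped, ties given average ranks) and sum the ranks of the
    positive differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1                           # extend the block of tied values
        avg_rank = (i + j) / 2 + 1           # average of 1-based ranks i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg_rank
        i = j + 1
    return sum(r for d, r in zip(diffs, ranks) if d > 0)

# Paired expression values for one gene on six arrays (hypothetical)
control = [2.1, 3.5, 4.0, 1.8, 2.9, 3.3]
treated = [2.9, 4.1, 4.6, 2.5, 3.4, 3.2]
print(signed_rank_statistic(treated, control))  # → 20.0
```

    The p-value is then obtained from the exact null distribution of W+ for small n (the n ≤ 25 regime the abstract highlights) or a normal approximation for large n.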

  9. A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing

    KAUST Repository

    Zhang, Runxuan

    2017-04-05

    Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis and the lack of high quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint in accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.

  10. A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing

    KAUST Repository

    Zhang, Runxuan; Calixto, Cristiane  P.  G.; Marquez, Yamile; Venhuizen, Peter; Tzioutziou, Nikoleta A.; Guo, Wenbin; Spensley, Mark; Entizne, Juan Carlos; Lewandowska, Dominika; ten  Have, Sara; Frei  dit  Frey, Nicolas; Hirt, Heribert; James, Allan B.; Nimmo, Hugh G.; Barta, Andrea; Kalyna, Maria; Brown, John  W.  S.

    2017-01-01

    Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis and the lack of high quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint in accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.

  11. Automatic temperature computation for realistic IR simulation

    Science.gov (United States)

    Le Goff, Alain; Kersaudy, Philippe; Latger, Jean; Cathala, Thierry; Stolte, Nilo; Barillot, Philippe

    2000-07-01

    Polygon temperature computation in 3D virtual scenes is fundamental for IR image simulation. This article describes in detail the temperature calculation software and its current extensions, briefly presented in [1]. This software, called MURET, is used by the simulation workshop CHORALE of the French DGA. MURET is a one-dimensional thermal software which accurately takes into account the material thermal attributes of the three-dimensional scene and the variation of the environment characteristics (atmosphere) as a function of time. Concerning the environment, absorbed incident fluxes are computed wavelength by wavelength, every half hour, during the 24 hours preceding the time of the simulation. For each polygon, incident fluxes are composed of direct solar fluxes and sky illumination (including diffuse solar fluxes). Concerning the materials, classical thermal attributes are associated with several layers; conductivity, absorption, spectral emissivity, density, specific heat, thickness and convection coefficients are taken into account. In the future, MURET will be able to simulate permeable natural materials (water influence) and vegetation (woods). This model of thermal attributes yields a very accurate polygon temperature computation for the complex 3D databases often found in CHORALE simulations. The kernel of MURET consists of an efficient ray tracer, which computes the history (over 24 hours) of the shadowed parts of the 3D scene, and a library responsible for the thermal computations. The main originality concerns the way the heating fluxes are computed: using ray tracing, the flux received at each 3D point of the scene accurately takes into account the masking (hidden surfaces) between objects. This library also supplies other thermal modules, such as a thermal shadow computation tool.
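    MURET itself is not publicly specified, but the kind of one-dimensional thermal computation described can be sketched as an explicit finite-difference update of a wall heated by an absorbed flux. The material values and node counts below are illustrative, not MURET's:

```python
def step_temperature(T, dt, dx, alpha, flux_in, k):
    """One explicit finite-difference step of 1-D heat conduction through a
    wall. The surface node absorbs an incident flux (W/m^2); the back-face
    node is left at its initial temperature. Stability requires
    dt <= dx**2 / (2 * alpha)."""
    new = T[:]
    for i in range(1, len(T) - 1):          # interior nodes: FTCS update
        new[i] = T[i] + alpha * dt / dx ** 2 * (T[i + 1] - 2 * T[i] + T[i - 1])
    # surface node: half-cell energy balance, conduction plus absorbed flux
    new[0] = (T[0] + alpha * dt / dx ** 2 * 2 * (T[1] - T[0])
              + 2 * alpha * dt / (k * dx) * flux_in)
    return new

# Concrete-like slab, 10 nodes, ~500 W/m^2 of absorbed solar flux for 1 hour
T = [293.15] * 10                                  # K
alpha, k, dx, dt = 7e-7, 1.4, 0.01, 30.0           # m^2/s, W/(m K), m, s
for _ in range(120):
    T = step_temperature(T, dt, dx, alpha, flux_in=500.0, k=k)
print(T[0] > T[-1])  # the heated surface ends warmer than the back face → True
```

    MURET layers this kind of update with per-wavelength absorbed fluxes and the 24-hour ray-traced shadowing history; the sketch shows only the core conduction step.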

  12. Massively parallel quantum computer simulator

    NARCIS (Netherlands)

    De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.

    2007-01-01

    We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray

  13. Identification of valid reference genes for the normalization of RT qPCR gene expression data in human brain tissue

    Directory of Open Access Journals (Sweden)

    Ravid Rivka

    2008-05-01

    Full Text Available Abstract Background Studies of gene expression in post mortem human brain can contribute to understanding of the pathophysiology of neurodegenerative diseases, including Alzheimer's disease (AD), Parkinson's disease (PD) and dementia with Lewy bodies (DLB). Quantitative real-time PCR (RT qPCR) is often used to analyse gene expression. The validity of results obtained using RT qPCR is reliant on accurate data normalization. Reference genes are generally used to normalize RT qPCR data. Given that expression of some commonly used reference genes is altered in certain conditions, this study aimed to establish which reference genes were stably expressed in post mortem brain tissue from individuals with AD, PD or DLB. Results The present study investigated the expression stability of 8 candidate reference genes (ubiquitin C [UBC], tyrosine-3-monooxygenase [YWHAZ], RNA polymerase II polypeptide [RP II], hydroxymethylbilane synthase [HMBS], TATA box binding protein [TBP], β-2-microglobulin [B2M], glyceraldehyde-3-phosphate dehydrogenase [GAPDH], and succinate dehydrogenase complex subunit A [SDHA]) in cerebellum and medial temporal gyrus of 6 AD, 6 PD and 6 DLB subjects, along with 5 matched controls, using RT qPCR (TaqMan® Gene Expression Assays). Gene expression stability was analysed using geNorm to rank the candidate genes in order of decreasing stability in each disease group. The optimal number of genes recommended for accurate data normalization in each disease state was determined by pairwise variation analysis. Conclusion This study identified validated sets of mRNAs which would be appropriate for the normalization of RT qPCR data when studying gene expression in brain tissue of AD, PD, DLB and control subjects.
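    The pairwise variation analysis used to choose the optimal number of reference genes can be sketched as geNorm's V(n, n+1) statistic: the variation of normalization factors when an (n+1)th gene is added. The expression values below are hypothetical relative quantities, not the study's data:

```python
import math
import statistics

def pairwise_variation(expr_ranked, n):
    """geNorm-style pairwise variation V(n, n+1): the standard deviation
    across samples of log2(NF_n / NF_n+1), where NF_k is the geometric
    mean of the k most stable genes' relative quantities. A value below
    ~0.15 suggests that n reference genes are sufficient.

    expr_ranked: per-gene sample vectors, ordered most stable first."""
    def log_nf(k, s):
        # log2 of the geometric mean of the k top-ranked genes in sample s
        return sum(math.log2(expr_ranked[g][s]) for g in range(k)) / k
    n_samples = len(expr_ranked[0])
    ratios = [log_nf(n, s) - log_nf(n + 1, s) for s in range(n_samples)]
    return statistics.stdev(ratios)

# Hypothetical relative quantities for three stably expressed genes (5 samples)
expr = [[1.00, 1.05, 0.98, 1.02, 1.01],
        [0.97, 1.02, 1.00, 1.04, 0.99],
        [1.03, 0.99, 1.01, 0.97, 1.02]]
v23 = pairwise_variation(expr, 2)
print(v23 < 0.15)  # adding a third gene changes little → True
```

    The 0.15 cut-off is the heuristic proposed by the geNorm authors, not a hard statistical threshold.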

  14. Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

    OpenAIRE

    Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

    2009-01-01

    Abstract Background Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits, all of which simulate the observed patterns, without making any a priori assumptions about the interactions. It is important to analyze the properties of the circuits. Findings We have analyzed the simulated gene expression ...

  15. Gene Expression Signature in Endemic Osteoarthritis by Microarray Analysis

    Directory of Open Access Journals (Sweden)

    Xi Wang

    2015-05-01

    Full Text Available Kashin-Beck Disease (KBD) is an endemic osteochondropathy with an unknown pathogenesis. Diagnosis of KBD is effective only in advanced cases, which eliminates the possibility of early treatment and leads to an inevitable exacerbation of symptoms. Therefore, we aim to identify an accurate blood-based gene signature for the detection of KBD. Previously published gene expression profile data on cartilage and peripheral blood mononuclear cells (PBMCs) from adults with KBD were compared to select potential target genes. Microarray analysis was conducted to evaluate the expression of the target genes in a cohort of 100 KBD patients and 100 healthy controls. A gene expression signature was identified using a training set, which was subsequently validated using an independent test set with a minimum redundancy maximum relevance (mRMR) algorithm and a support vector machine (SVM) algorithm. Fifty unique genes were differentially expressed between KBD patients and healthy controls. A 20-gene signature was identified that distinguished between KBD patients and controls with 90% accuracy, 85% sensitivity, and 95% specificity. This study identified a 20-gene signature that accurately distinguishes between patients with KBD and controls using peripheral blood samples. These results promote the further development of blood-based genetic biomarkers for detection of KBD.
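    mRMR is normally defined with mutual information; a greedy sketch using absolute Pearson correlation as a stand-in still shows how the criterion trades relevance against redundancy when building a signature (the data are toy values, not the KBD cohort):

```python
import statistics

def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 for a constant vector."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def mrmr_select(expr, labels, k):
    """Greedy mRMR: repeatedly add the gene maximizing relevance
    |corr(gene, label)| minus mean redundancy |corr(gene, chosen)|."""
    y = [1.0 if label else 0.0 for label in labels]
    chosen, candidates = [], list(expr)
    while len(chosen) < k and candidates:
        def score(g):
            rel = abs(pearson(expr[g], y))
            red = (sum(abs(pearson(expr[g], expr[c])) for c in chosen) / len(chosen)
                   if chosen else 0.0)
            return rel - red
        best = max(candidates, key=score)
        chosen.append(best)
        candidates.remove(best)
    return chosen

# Toy data: G1 tracks disease status, G2 duplicates G1, G3 is uncorrelated
expr = {"G1": [5.1, 5.0, 4.9, 1.0, 1.1, 0.9],
        "G2": [5.2, 5.1, 5.0, 1.1, 1.2, 1.0],
        "G3": [3.0, 1.0, 2.0, 3.0, 1.0, 2.0]}
labels = [True, True, True, False, False, False]
print(sorted(mrmr_select(expr, labels, 2)))
```

    Because G2 merely duplicates G1, its redundancy penalty cancels its relevance, so the second pick goes elsewhere; this is the behaviour that keeps a 20-gene signature from being 20 copies of the same expression pattern.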

  16. An accurate and computationally efficient small-scale nonlinear FEA of flexible risers

    OpenAIRE

    Rahmati, MT; Bahai, H; Alfano, G

    2016-01-01

    This paper presents a highly efficient small-scale, detailed finite-element modelling method for flexible risers which can be effectively implemented in a fully-nested (FE2) multiscale analysis based on computational homogenisation. By exploiting cyclic symmetry and applying periodic boundary conditions, only a small fraction of a flexible pipe is used for a detailed nonlinear finite-element analysis at the small scale. In this model, using three-dimensional elements, all layer components are...

  17. Validating Internal Control Genes for the Accurate Normalization of qPCR Expression Analysis of the Novel Model Plant Setaria viridis.

    Directory of Open Access Journals (Sweden)

    Julia Lambret-Frotté

    Full Text Available Employing reference genes to normalize the data generated with quantitative PCR (qPCR) can increase the accuracy and reliability of this method. Previous results have shown that no single housekeeping gene can be universally applied to all experiments. Thus, the identification of a suitable reference gene represents a critical step of any qPCR analysis. Setaria viridis has recently been proposed as a model system for the study of Panicoid grasses, a crop family of major agronomic importance. Therefore, this paper aims to identify suitable S. viridis reference genes that can enhance the analysis of gene expression in this novel model plant. The first aim of this study was the identification of a suitable RNA extraction method that could retrieve RNA of high quality and yield. After this, two distinct algorithms were used to assess the gene expression of fifteen different candidate genes in eighteen different samples, which were divided into two major datasets, the developmental and the leaf gradient. The best-ranked pair of reference genes from the developmental dataset included genes that encoded a phosphoglucomutase and a folylpolyglutamate synthase; genes that encoded a cullin and the same phosphoglucomutase as above were the most stable genes in the leaf gradient dataset. Additionally, the expression pattern of two target genes, an SvAP3/PI MADS-box transcription factor and the carbon-fixation enzyme PEPC, were assessed to illustrate the reliability of the chosen reference genes. This study has shown that novel reference genes may perform better than traditional housekeeping genes, a phenomenon which has been previously reported. These results illustrate the importance of carefully validating reference gene candidates for each experimental set before employing them as universal standards. Additionally, the robustness of the expression of the target genes may increase the utility of S. viridis as a model for Panicoid grasses.

  18. An algorithm to discover gene signatures with predictive potential

    Directory of Open Access Journals (Sweden)

    Hallett Robin M

    2010-09-01

    Full Text Available Abstract Background The advent of global gene expression profiling has generated unprecedented insight into our molecular understanding of cancer, including breast cancer. For example, human breast cancer patients display significant diversity in terms of their survival, recurrence, metastasis as well as response to treatment. These patient outcomes can be predicted by the transcriptional programs of their individual breast tumors. Predictive gene signatures allow us to correctly classify human breast tumors into various risk groups as well as to more accurately target therapy to ensure more durable cancer treatment. Results Here we present a novel algorithm to generate gene signatures with predictive potential. The method first classifies the expression intensity of each gene, as determined by global gene expression profiling, as low, average or high. The matrix containing the classified data for each gene is then used to score the expression of each gene based on its individual ability to predict the patient characteristic of interest. Finally, all examined genes are ranked based on their predictive ability and the most highly ranked genes are included in the master gene signature, which is then ready for use as a predictor. This method was used to accurately predict the survival outcomes in a cohort of human breast cancer patients. Conclusions We confirmed the capacity of our algorithm to generate gene signatures with bona fide predictive ability. The simplicity of our algorithm will enable biological researchers to quickly generate valuable gene signatures without specialized software or extensive bioinformatics training.
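    The algorithm described above (discretize each gene's expression as low/average/high, score each gene by its individual predictive ability, then rank) can be sketched as follows; the thresholding and scoring rules here are one plausible reading, not the authors' exact choices:

```python
import statistics
from collections import Counter

def discretize(values, z=0.5):
    """Label each expression value low/average/high relative to the
    gene's own mean +/- z standard deviations across patients."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return ["high" if v > mu + z * sd else "low" if v < mu - z * sd else "average"
            for v in values]

def score_gene(labels, outcomes):
    """Predictive score: fraction of patients whose outcome matches the
    majority outcome of their expression class for this gene."""
    per_class = {}
    for lab, out in zip(labels, outcomes):
        per_class.setdefault(lab, []).append(out)
    majority = {lab: Counter(outs).most_common(1)[0][0]
                for lab, outs in per_class.items()}
    return sum(majority[lab] == out for lab, out in zip(labels, outcomes)) / len(labels)

# Hypothetical cohort: expression of two genes across six patients
expr = {"GENE_A": [9.0, 8.8, 9.2, 2.0, 2.2, 1.9],   # tracks outcome
        "GENE_B": [5.0, 4.9, 5.2, 5.1, 4.8, 5.0]}   # uninformative
outcomes = ["poor", "poor", "poor", "good", "good", "good"]
ranked = sorted(expr, key=lambda g: score_gene(discretize(expr[g]), outcomes),
                reverse=True)
print(ranked[0])  # → GENE_A
```

    The most highly ranked genes would then form the master signature; the simplicity of each step is what makes the method feasible without specialized software.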

  19. Selection of housekeeping genes for normalization by real-time RT-PCR: analysis of Or-MYB1 gene expression in Orobanche ramosa development.

    Science.gov (United States)

    González-Verdejo, C I; Die, J V; Nadal, S; Jiménez-Marín, A; Moreno, M T; Román, B

    2008-08-15

    Real-time PCR has become the method of choice for accurate and in-depth expression studies of candidate genes. To avoid bias, real-time PCR data are normalized to one or several internal control genes whose expression should not fluctuate among treatments. A need for reference genes in the parasitic plant Orobanche ramosa has emerged, and candidates for this species had not yet been evaluated. In this study, the genes 18S rRNA, Or-act1, Or-tub1, and Or-ubq1 were compared in terms of expression stability using the BestKeeper software program. Among the four common endogenous control genes, Or-act1 and Or-ubq1 were the most stable in O. ramosa samples. In parallel, a study was carried out on the expression of the transcription factor Or-MYB1, which appears to be implicated during preinfection stages. The normalization strategy presented here is a prerequisite to accurate real-time PCR expression profiling that, among other things, opens up the possibility of studying messenger RNA levels of low-copy-number transcripts such as transcription factors.

  20. Fast and accurate Bayesian model criticism and conflict diagnostics using R-INLA

    KAUST Repository

    Ferkingstad, Egil

    2017-10-16

    Bayesian hierarchical models are increasingly popular for realistic modelling and analysis of complex data. This trend is accompanied by the need for flexible, general and computationally efficient methods for model criticism and conflict detection. Usually, a Bayesian hierarchical model incorporates a grouping of the individual data points, as, for example, with individuals in repeated measurement data. In such cases, the following question arises: Are any of the groups “outliers,” or in conflict with the remaining groups? Existing general approaches aiming to answer such questions tend to be extremely computationally demanding when model fitting is based on Markov chain Monte Carlo. We show how group-level model criticism and conflict detection can be carried out quickly and accurately through integrated nested Laplace approximations (INLA). The new method is implemented as a part of the open-source R-INLA package for Bayesian computing (http://r-inla.org).

  1. Computational analysis of TRAPPC9: candidate gene for autosomal recessive non-syndromic mental retardation.

    Science.gov (United States)

    Khattak, Naureen Aslam; Mir, Asif

    2014-01-01

    Mental retardation (MR)/intellectual disability (ID) is a neuro-developmental disorder characterized by a low intellectual quotient (IQ) and deficits in adaptive behavior related to everyday life tasks, such as delayed language acquisition, social skills or self-help skills, with onset before age 18. To date, a few genes (PRSS12, CRBN, CC2D1A, GRIK2, TUSC3, TRAPPC9, TECR, ST3GAL3, MED23, MAN1B1, NSUN1) for autosomal-recessive forms of non-syndromic MR (NS-ARMR) have been identified and established in various families with ID. The recently reported candidate gene TRAPPC9 was selected for computational analysis to explore its potentially important role in pathology, as it is the only ID gene reported in more than five different familial cases worldwide. YASARA (12.4.1) was utilized to generate three-dimensional structures of TRAPPC9, employing hybrid structure prediction. The Crystal Structure of a Conserved Metalloprotein from Bacillus cereus (3D19-C) was selected as the most suitable template using position-specific iterated BLAST. Template (3D19-C) selection was based on an E-value, Z-score, resolution and quality score of 0.32, -1.152, 2.30 Å and 0.684, respectively. Model reliability showed 93.1% of residues placed in the most favored region, with a quality factor of 96.684 and an overall G-factor of 0.20 (dihedrals 0.06 and covalent 0.39, respectively). Protein-protein docking analysis demonstrated that TRAPPC9 shows strong interactions of the amino acid residues S(253), S(251), Y(256), G(243), D(131) with R(105), Q(425), W(226), N(255), S(233) of its functional partner IKBKB. These protein-protein interacting residues could facilitate the exploration of structural and functional outcomes of wild-type and mutated TRAPPC9 protein, can be used to elucidate the binding properties of the protein, and may inform the development of drug therapy for NS-ARMR patients.

  2. RNA-seq reveals more consistent reference genes for gene expression studies in human non-melanoma skin cancers

    Directory of Open Access Journals (Sweden)

    Van L.T. Hoang

    2017-08-01

    Full Text Available Identification of appropriate reference genes (RGs) is critical to accurate data interpretation in quantitative real-time PCR (qPCR) experiments. In this study, we have utilised next generation RNA sequencing (RNA-seq) to analyse the transcriptome of a panel of non-melanoma skin cancer lesions, identifying genes that are consistently expressed across all samples. Genes encoding ribosomal proteins were amongst the most stable in this dataset. Validation of this RNA-seq data was examined using qPCR to confirm the suitability of a set of highly stable genes for use as qPCR RGs. These genes will provide a valuable resource for the normalisation of qPCR data for the analysis of non-melanoma skin cancer.

  3. Toward an ultra-high resolution community climate system model for the BlueGene platform

    International Nuclear Information System (INIS)

    Dennis, John M; Jacob, Robert; Vertenstein, Mariana; Craig, Tony; Loy, Raymond

    2007-01-01

    Global climate models need to simulate several small, regional-scale processes which affect the global circulation in order to accurately simulate the climate. This is particularly important in the ocean, where small-scale features such as oceanic eddies are currently represented with ad hoc parameterizations. There is also a need for higher resolution to provide climate predictions at small, regional scales. New high-performance computing platforms such as the IBM BlueGene can provide the necessary computational power to perform ultra-high resolution climate model integrations. We have begun to investigate the scaling of the individual components of the Community Climate System Model to prepare it for integrations on BlueGene and similar platforms. Our investigations show that it is possible to successfully utilize O(32K) processors. We describe the scalability of five models: the Parallel Ocean Program (POP), the Community Ice CodE (CICE), the Community Land Model (CLM), and the new CCSM sequential coupler (CPL7), which are components of the next-generation Community Climate System Model (CCSM); as well as the High-Order Method Modeling Environment (HOMME), which is a dynamical core currently being evaluated within the Community Atmospheric Model. For our studies we concentrate on 1/10° resolution for the CICE, POP, and CLM models and 1/4° resolution for HOMME. The ability to simulate high resolutions on the massively parallel petascale systems that will dominate high-performance computing for the foreseeable future is essential to the advancement of climate science.

  4. Virtual Reality Based Accurate Radioactive Source Representation and Dosimetry for Training Applications

    International Nuclear Information System (INIS)

    Molto-Caracena, T.; Vendrell Vidal, E.; Goncalves, J.G.M.; Peerani, P.; )

    2015-01-01

    Virtual Reality (VR) technologies have much potential for training applications. Success relies on the capacity to provide a real-time immersive effect to a trainee. For a training application to be an effective/meaningful tool, 3D realistic scenarios are not enough. Indeed, it is paramount to have sufficiently accurate models of the behaviour of the instruments to be used by a trainee. This will enable the required level of user interactivity. Specifically, when dealing with simulation of radioactive sources, a VR model based application must compute the dose rate with equivalent accuracy and in about the same time as a real instrument. A conflicting requirement is the need to provide a smooth visual rendering enabling spatial interactivity and interaction. This paper presents a VR based prototype which accurately computes the dose rate of radioactive and nuclear sources that can be selected from a wide library. Dose measurements reflect local conditions, i.e., the presence of (a) shielding materials of any shape and type and (b) sources of any shape and dimension. Due to a novel way of representing radiation sources, the system is fast enough to grant the necessary user interactivity. The paper discusses the application of this new method and its advantages in terms of set-up time, cost and logistics. (author)
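The abstract does not give the prototype's dose-rate algorithm, but the simplest version of the computation it describes, an unscattered point source with inverse-square fall-off and exponential shielding attenuation, can be sketched as follows. The function name, parameters, and the narrow-beam attenuation model are illustrative assumptions, not the actual implementation:

```python
import math

def dose_rate(activity_bq, gamma_factor, distance_m, mu_per_m=0.0, thickness_m=0.0):
    """Point-source dose rate: inverse-square law with optional exponential
    attenuation through a shield of linear attenuation coefficient mu_per_m
    (narrow-beam approximation, no build-up factor)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    unshielded = gamma_factor * activity_bq / distance_m ** 2
    return unshielded * math.exp(-mu_per_m * thickness_m)
```

A real-time VR tool would evaluate a sum of such contributions (one per source, with per-ray shield thicknesses) at the virtual instrument's position every frame.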

  5. Parente2: a fast and accurate method for detecting identity by descent

    KAUST Repository

    Rodriguez, Jesse M.

    2014-10-01

    Identity-by-descent (IBD) inference is the problem of establishing a genetic connection between two individuals through a genomic segment that is inherited by both individuals from a recent common ancestor. IBD inference is an important preceding step in a variety of population genomic studies, ranging from demographic studies to linking genomic variation with phenotype and disease. The problem of accurate IBD detection has become increasingly challenging with the availability of large collections of human genotypes and genomes: given a cohort's size, a quadratic number of pairwise genome comparisons must be performed. Therefore, computation time and the false discovery rate can also scale quadratically. To enable accurate and efficient large-scale IBD detection, we present Parente2, a novel method for detecting IBD segments. Parente2 is based on an embedded log-likelihood ratio and uses a model that accounts for linkage disequilibrium by explicitly modeling haplotype frequencies. Parente2 operates directly on genotype data without the need to phase data prior to IBD inference. We evaluate Parente2's performance through extensive simulations using real data, and we show that it provides substantially higher accuracy compared to previous state-of-the-art methods while maintaining high computational efficiency.
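The log-likelihood-ratio idea at the heart of such methods can be illustrated with a deliberately simplified, site-independent toy (Parente2 itself models haplotype frequencies to capture linkage disequilibrium, which this sketch omits): for each site, compare the probability of the observed genotype pair under a shared-allele (IBD) model against independent Hardy-Weinberg draws, and sum the log ratios:

```python
import math

def site_probs(p):
    """HWE genotype probabilities (alt-allele count 0/1/2) at alt frequency p,
    plus P(genotype | one allele is the shared allele, other drawn from pop)."""
    q = 1.0 - p
    hwe = {0: q * q, 1: 2 * p * q, 2: p * p}
    given = {"ref": {0: q, 1: p, 2: 0.0}, "alt": {0: 0.0, 1: q, 2: p}}
    return hwe, given

def ibd_llr(g1, g2, freqs):
    """Toy embedded log-likelihood ratio: log P(pair | IBD) - log P(pair | unrelated),
    summed over sites; positive values favour the IBD hypothesis."""
    llr = 0.0
    for a, b, p in zip(g1, g2, freqs):
        hwe, given = site_probs(p)
        null = hwe[a] * hwe[b]
        ibd = ((1 - p) * given["ref"][a] * given["ref"][b]
               + p * given["alt"][a] * given["alt"][b])
        llr += math.log(max(ibd, 1e-300)) - math.log(max(null, 1e-300))
    return llr
```

As expected, sharing a rare homozygous genotype yields far stronger IBD evidence than sharing a common one.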

  6. Reverse transcription-quantitative polymerase chain reaction: description of a RIN-based algorithm for accurate data normalization

    Directory of Open Access Journals (Sweden)

    Boissière-Michot Florence

    2009-04-01

    Background: Reverse transcription-quantitative polymerase chain reaction (RT-qPCR) is the gold standard technique for mRNA quantification, but appropriate normalization is required to obtain reliable data. Normalization to accurately quantitated RNA has been proposed as the most reliable method for in vivo biopsies. However, this approach does not correct for differences in RNA integrity. Results: In this study, we evaluated the effect of RNA degradation on the quantification of the relative expression of nine genes (18S, ACTB, ATUB, B2M, GAPDH, HPRT, POLR2L, PSMB6 and RPLP0) that cover a wide expression spectrum. Our results show that RNA degradation could introduce up to 100% error in gene expression measurements when RT-qPCR data were normalized to total RNA. To achieve greater resolution of small differences in transcript levels in degraded samples, we improved this normalization method by developing a corrective algorithm that compensates for the loss of RNA integrity. This approach allowed us to achieve higher accuracy, since the average error for quantitative measurements was reduced to 8%. Finally, we applied our normalization strategy to the quantification of EGFR, HER2 and HER3 in 104 rectal cancer biopsies. Taken together, our data show that normalization of gene expression measurements that also takes RNA degradation into account allows much more reliable sample comparison. Conclusion: We developed a new normalization method for RT-qPCR data that compensates for loss of RNA integrity and therefore allows accurate gene expression quantification in human biopsies.
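The abstract does not reproduce the corrective algorithm, but the general idea of compensating quantification cycles (Cq) for lost RNA integrity can be sketched as a hypothetical linear model in which every RIN unit below a reference value is assumed to inflate Cq by a fixed number of cycles. The function name, the slope of 0.25 cycles per RIN unit, and the ideal amplification efficiency of 2 are invented for illustration, not taken from the paper:

```python
def rin_corrected_quantity(cq, rin, rin_ref=10.0, slope=0.25, efficiency=2.0):
    """Hypothetical RIN-based correction: subtract the assumed degradation-induced
    Cq inflation, then convert the corrected Cq to a relative quantity."""
    cq_corrected = cq - slope * (rin_ref - rin)
    return efficiency ** (-cq_corrected)
```

Under this model, a sample degraded to RIN 8 whose Cq is inflated by 0.5 cycles maps back to the same relative quantity as an intact (RIN 10) sample.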

  7. Cell-specific prediction and application of drug-induced gene expression profiles.

    Science.gov (United States)

    Hodos, Rachel; Zhang, Ping; Lee, Hao-Chih; Duan, Qiaonan; Wang, Zichen; Clark, Neil R; Ma'ayan, Avi; Wang, Fei; Kidd, Brian; Hu, Jianying; Sontag, David; Dudley, Joel

    2018-01-01

    Gene expression profiling of in vitro drug perturbations is useful for many biomedical discovery applications, including drug repurposing and elucidation of drug mechanisms. However, limited data availability across cell types has hindered our capacity to leverage or explore the cell-specificity of these perturbations. While recent efforts have generated a large number of drug perturbation profiles across a variety of human cell types, many gaps remain in this combinatorial drug-cell space. Hence, we asked whether it is possible to fill these gaps by predicting cell-specific drug perturbation profiles using available expression data from related conditions, i.e., from other drugs and cell types. We developed a computational framework that first arranges existing profiles into a three-dimensional array (or tensor) indexed by drugs, genes, and cell types, and then uses either local (nearest-neighbors) or global (tensor completion) information to predict unmeasured profiles. We evaluate prediction accuracy using a variety of metrics, and find that the two methods have complementary performance, each superior in different regions of the drug-cell space. Predictions achieve correlations of 0.68 with true values, and maintain accurate differentially expressed genes (AUC 0.81). Finally, we demonstrate that the predicted profiles add value for making downstream associations with drug targets and therapeutic classes.
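The gap-filling idea can be sketched with a much simpler stand-in for the paper's nearest-neighbour and tensor-completion algorithms: predict an unmeasured (drug, cell) profile, per gene, from the drug's mean profile across cells and the cell's mean profile across drugs, using NaN to mark missing profiles in the three-dimensional array:

```python
import numpy as np

def impute_profile(T, d, c):
    """Simplified additive imputation of the (drug d, cell c) expression
    profile: drug_mean + cell_mean - grand_mean, computed per gene over
    the observed entries. T has shape (n_drugs, n_genes, n_cells),
    with NaN marking unmeasured (drug, cell) profiles."""
    drug_mean = np.nanmean(T[d, :, :], axis=1)   # mean over cells, per gene
    cell_mean = np.nanmean(T[:, :, c], axis=0)   # mean over drugs, per gene
    grand = np.nanmean(T, axis=(0, 2))           # per-gene grand mean
    return drug_mean + cell_mean - grand
```

On data that really is additive in drug and cell effects, this recovers the profile exactly; the paper's methods instead weight neighbours by similarity or exploit low-rank tensor structure.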

  8. Computational chemistry at the petascale: Are we there yet?

    International Nuclear Information System (INIS)

    Apra, E; Harrison, R J; Shelton, W A; Tipparaju, V; Vazquez-Mayagoitia, A

    2009-01-01

    We have run computational chemistry calculations approaching the Petascale level of performance (∼ 0.5 PFlops). We used the Coupled Cluster CCSD(T) module of the computational chemistry code NWChem to evaluate accurate energetics of water clusters on a 1.4 PFlops Cray XT5 computer.

  9. Superior Cross-Species Reference Genes: A Blueberry Case Study

    Science.gov (United States)

    Die, Jose V.; Rowland, Lisa J.

    2013-01-01

    The advent of affordable Next Generation Sequencing technologies has had a major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high-throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make it necessary to identify reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR, and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry was identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well. PMID:24058469

  10. Superior cross-species reference genes: a blueberry case study.

    Directory of Open Access Journals (Sweden)

    Jose V Die

    The advent of affordable Next Generation Sequencing technologies has had a major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high-throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make it necessary to identify reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR, and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry was identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well.

  11. MRI Reporter Genes for Noninvasive Molecular Imaging

    Directory of Open Access Journals (Sweden)

    Caixia Yang

    2016-05-01

    Magnetic resonance imaging (MRI) is one of the most important imaging technologies used in clinical diagnosis. Reporter genes for MRI can be applied to accurately track the delivery of cells in cell therapy, evaluate the therapeutic effect of gene delivery, and monitor tissue/cell-specific microenvironments. Commonly used reporter genes for MRI include genes encoding enzymes (e.g., tyrosinase and β-galactosidase), receptors on the cells (e.g., transferrin receptor), and endogenous reporter genes (e.g., the ferritin reporter gene). However, low sensitivity limits the application of MRI, so reporter gene-based multimodal imaging strategies that also incorporate optical imaging and radionuclide imaging are common. These can significantly improve diagnostic efficiency and accelerate the development of new therapies.

  12. A crossing programme with mutants in peas: Utilization of a gene bank and a computer system

    International Nuclear Information System (INIS)

    Blixt, S.

    1976-01-01

    A gene bank for peas was established at Weibullsholm in 1930, comprising about 1500 lines and including a great number of spontaneous and induced mutants. A Wang 2200 computer system is used in the maintenance and utilization of the gene bank information. The system at present provides programs for establishing and maintaining data files for lines and crosses, statistical programs for linkage calculations, and technical aids for growing and classifying the material. The system is very promising and can be handled without previous experience of computer work. This system is used in the planning of a specific plant breeding project, PMX, and in the search for material to be used in the project. The PMX project as a whole aims at adapting the pea better to monoculture and to a highly mechanized agriculture, as well as at improving its nutritional value. The material used in the project up to now comprises four modern varieties, two primitive varieties, two cross-recombinants, one ecotype, three spontaneous mutants and five induced mutants. (author)

  13. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages.

    Directory of Open Access Journals (Sweden)

    Anupama Reddy

    Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models, showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5-mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response.
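The appeal of ratio features, that a threshold of zero on a within-sample log-ratio transfers across platforms where absolute expression thresholds do not, can be illustrated with a toy exhaustive search over gene pairs. This is a sketch of the idea, not the authors' actual feature-selection procedure, and the gene labels are placeholders:

```python
import numpy as np

def best_ratio_feature(X, y, names):
    """Among all ordered gene pairs (i, j), pick the log-ratio feature
    X[:, i] - X[:, j] (X in log2 space) whose sign best separates
    responders (y=1) from non-responders (y=0). Returns (accuracy, pair)."""
    n = X.shape[1]
    best = (0.0, None)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            ratio = X[:, i] - X[:, j]
            acc = np.mean((ratio > 0) == (y == 1))
            if acc > best[0]:
                best = (acc, (names[i], names[j]))
    return best
```

Note that in the example below neither gene separates the classes on its own; only their ratio does, which is exactly the situation the abstract describes for low-dynamic-range markers.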

  14. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages.

    Science.gov (United States)

    Reddy, Anupama; Growney, Joseph D; Wilson, Nick S; Emery, Caroline M; Johnson, Jennifer A; Ward, Rebecca; Monaco, Kelli A; Korn, Joshua; Monahan, John E; Stump, Mark D; Mapa, Felipa A; Wilson, Christopher J; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J; Myer, Vic E; Ettenberg, Seth A; Schlegel, Robert; Sellers, William R; Huet, Heather A; Lehár, Joseph

    2015-01-01

    Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response.

  15. A gene signature to determine metastatic behavior in thymomas.

    Directory of Open Access Journals (Sweden)

    Yesim Gökmen-Polar

    PURPOSE: Thymoma represents one of the rarest of all malignancies. Stage and completeness of resection have been used to ascertain postoperative therapeutic strategies, albeit with limited prognostic accuracy. A molecular classifier would be useful to improve the assessment of metastatic behaviour and optimize patient management. METHODS: A qRT-PCR assay for 23 genes (19 test and four reference genes) was performed on multi-institutional archival primary thymomas (n = 36). Gene expression levels were used to compute a signature, classifying tumors into classes 1 and 2, corresponding to low or high likelihood for metastases. The signature was validated in an independent multi-institutional cohort of patients (n = 75). RESULTS: A nine-gene signature that can predict metastatic behavior of thymomas was developed and validated. Using radial basis machine modeling in the training set, 5-year and 10-year metastasis-free survival rates were 77% and 26% for predicted low (class 1) and high (class 2) risk of metastasis (P = 0.0047, log-rank), respectively. For the validation set, 5-year metastasis-free survival rates were 97% and 30% for predicted low- and high-risk patients (P = 0.0004, log-rank), respectively. The 5-year metastasis-free survival rates for the validation set were 49% and 41% for Masaoka stages I/II and III/IV (P = 0.0537, log-rank), respectively. In univariate and multivariate Cox models evaluating common prognostic factors for thymoma metastasis, the nine-gene signature was the only independent indicator of metastases (P = 0.036). CONCLUSION: A nine-gene signature was established and validated which predicts the likelihood of metastasis more accurately than traditional staging. This further underscores the biologic determinants of the clinical course of thymoma and may improve patient management.

  16. Radionuclide reporter gene imaging for cardiac gene therapy

    International Nuclear Information System (INIS)

    Inubushi, Masayuki; Tamaki, Nagara

    2007-01-01

    In the field of cardiac gene therapy, angiogenic gene therapy has been most extensively investigated. The first clinical trial of cardiac angiogenic gene therapy was reported in 1998, and at the peak, more than 20 clinical trial protocols were under evaluation. However, most trials have ceased owing to the lack of decisive proof of therapeutic effects and the potential risks of viral vectors. In order to further advance cardiac angiogenic gene therapy, remaining open issues need to be resolved: there needs to be improvement of gene transfer methods, regulation of gene expression, development of much safer vectors and optimisation of therapeutic genes. For these purposes, imaging of gene expression in living organisms is of great importance. In radionuclide reporter gene imaging, "reporter genes" transferred into cell nuclei encode for a protein that retains a complementary "reporter probe" of a positron or single-photon emitter; thus expression of the reporter genes can be imaged with positron emission tomography or single-photon emission computed tomography. Accordingly, in the setting of gene therapy, the location, magnitude and duration of the therapeutic gene co-expression with the reporter genes can be monitored non-invasively. In the near future, gene therapy may evolve into combination therapy with stem/progenitor cell transplantation, so-called cell-based gene therapy or gene-modified cell therapy. Radionuclide reporter gene imaging is now expected to contribute in providing evidence on the usefulness of this novel therapeutic approach, as well as in investigating the molecular mechanisms underlying neovascularisation and safety issues relevant to further progress in conventional gene therapy. (orig.)

  17. Identification and validation of reference genes for quantitative RT-PCR normalization in wheat

    Directory of Open Access Journals (Sweden)

    Porceddu Enrico

    2009-02-01

    Background: Usually the reference genes used in gene expression analysis have been chosen for their known or suspected housekeeping roles; however, the variation observed in most of them hinders their effective use. The assessed lack of validated reference genes emphasizes the importance of a systematic study for their identification. For selecting candidate reference genes we have developed a simple in silico method based on the data publicly available in the wheat databases Unigene and TIGR. Results: The expression stability of 32 genes was assessed by qRT-PCR using a set of cDNAs from 24 different plant samples, which included different tissues, developmental stages and temperature stresses. The selected sequences included 12 well-known HKGs representing different functional classes and 20 genes novel with respect to the normalization issue. The expression stability of the 32 candidate genes was tested by the computer programs geNorm and NormFinder using five different data sets. Some discrepancies were detected in the ranking of the candidate reference genes, but there was substantial agreement between the groups of genes with the most and least stable expression. Three newly identified reference genes appear more effective than the well-known and frequently used HKGs for normalizing gene expression in wheat. Finally, the expression study of a gene encoding a PDI-like protein showed that its correct evaluation relies on the adoption of suitable normalization genes and can be negatively affected by the use of traditional HKGs with unstable expression, such as actin and α-tubulin. Conclusion: The present research represents the first wide screening aimed at the identification of reference genes, and of the corresponding primer pairs, specifically designed for gene expression studies in wheat, in particular for qRT-PCR analyses. Several of the newly identified reference genes outperformed the traditional HKGs in terms of expression stability.
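The stability measure computed by geNorm, one of the two programs used above, is compact enough to sketch: each gene's M value is the average standard deviation of its pairwise log2 expression ratios with every other candidate gene across samples, so genes whose ratios to all the others stay constant score low. This is a minimal reimplementation of the published measure for illustration, not the geNorm program itself:

```python
import numpy as np

def genorm_m(expr):
    """geNorm-style stability value M per gene. expr: array of relative
    (non-log) expression quantities, shape (n_samples, n_genes).
    Lower M indicates a more stable reference gene."""
    log_expr = np.log2(expr)
    n_genes = log_expr.shape[1]
    m = np.empty(n_genes)
    for j in range(n_genes):
        ratios = log_expr[:, [j]] - log_expr    # log2 ratios of gene j vs all genes
        sds = ratios.std(axis=0, ddof=1)        # SD of each ratio across samples
        m[j] = np.delete(sds, j).mean()         # average, excluding self (SD = 0)
    return m
```

Two genes that co-vary perfectly (a stable pair) receive identical, low M values, while a gene fluctuating independently of the rest scores higher.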

  18. Belle computing system

    International Nuclear Information System (INIS)

    Adachi, Ichiro; Hibino, Taisuke; Hinz, Luc; Itoh, Ryosuke; Katayama, Nobu; Nishida, Shohei; Ronga, Frederic; Tsukamoto, Toshifumi; Yokoyama, Masahiko

    2004-01-01

    We describe the present status of the computing system in the Belle experiment at the KEKB e+e- asymmetric-energy collider. So far, we have logged more than 160 fb-1 of data, corresponding to the world's largest data sample of 170M BB-bar pairs at the Υ(4S) energy region. A large amount of event data has to be processed to produce analysis event samples in a timely fashion. In addition, Monte Carlo events have to be created to control systematic errors accurately. This requires stable and efficient usage of computing resources. Here, we review our computing model and then describe how we efficiently carry out DST/MC production in our system.

  19. Accurate spectroscopic characterization of oxirane: A valuable route to its identification in Titan's atmosphere and the assignment of unidentified infrared bands

    Energy Technology Data Exchange (ETDEWEB)

    Puzzarini, Cristina [Dipartimento di Chimica "Giacomo Ciamician", Università di Bologna, Via Selmi 2, I-40126 Bologna (Italy); Biczysko, Malgorzata; Bloino, Julien; Barone, Vincenzo, E-mail: cristina.puzzarini@unibo.it [Scuola Normale Superiore, Piazza dei Cavalieri 7, I-56126 Pisa (Italy)

    2014-04-20

    In an effort to provide an accurate spectroscopic characterization of oxirane, state-of-the-art computational methods and approaches have been employed to determine highly accurate fundamental vibrational frequencies and rotational parameters. Available experimental data were used to assess the reliability of our computations, and an average accuracy of 10 cm⁻¹ for fundamental transitions as well as overtones and combination bands has been established. Moving to rotational spectroscopy, relative discrepancies of 0.1%, 2%-3%, and 3%-4% were observed for rotational, quartic, and sextic centrifugal-distortion constants, respectively. We are therefore confident that the highly accurate spectroscopic data provided herein can be useful for the identification of oxirane in Titan's atmosphere and the assignment of unidentified infrared bands. Since oxirane has already been observed in the interstellar medium and some astronomical objects are characterized by very high D/H ratios, we also considered the accurate determination of the spectroscopic parameters for the mono-deuterated species, oxirane-d1. For the latter, an empirical scaling procedure allowed us to improve our computed data and to provide predictions for rotational transitions with a relative accuracy of about 0.02% (i.e., an uncertainty of about 40 MHz for a transition lying at 200 GHz).

  20. Computer Series, 99: Bits and Pieces, 39.

    Science.gov (United States)

    Moore, John W., Ed.

    1989-01-01

    Presents five computer programs: (1) Accurate Numerical Solutions of the One-Dimensional Schrodinger Equation; (2) NMR Simulation and Interactive Drill/Interpretation; (3) A Simple Computer Program for the Calculation of 13C-NMR Chemical Shifts; (4) Constants of 1:1 Complexes from NMR or Spectrophotometric Measurements; and (5) Saturation…

  1. Computational intelligence techniques for biological data mining: An overview

    Science.gov (United States)

    Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari

    2014-10-01

    Computational techniques have been successfully utilized for highly accurate analysis and modeling of the multifaceted, raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective at overcoming the limitations of traditional in-vitro experiments on the constantly increasing sequence data. The most critical problems that have caught researchers' attention include, but are not limited to: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, and analysis of microarray gene expression data. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, rough set classifiers, decision trees and HMM-based algorithms. Major difficulties in applying the above algorithms include the limitations of previous feature encoding and selection methods in extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. This research is potentially useful in drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.

  2. Fast magnetic field computation in fusion technology using GPU technology

    Energy Technology Data Exchange (ETDEWEB)

    Chiariello, Andrea Gaetano [Ass. EURATOM/ENEA/CREATE, Dipartimento di Ingegneria Industriale e dell’Informazione, Seconda Università di Napoli, Via Roma 29, Aversa (CE) (Italy); Formisano, Alessandro, E-mail: Alessandro.Formisano@unina2.it [Ass. EURATOM/ENEA/CREATE, Dipartimento di Ingegneria Industriale e dell’Informazione, Seconda Università di Napoli, Via Roma 29, Aversa (CE) (Italy); Martone, Raffaele [Ass. EURATOM/ENEA/CREATE, Dipartimento di Ingegneria Industriale e dell’Informazione, Seconda Università di Napoli, Via Roma 29, Aversa (CE) (Italy)

    2013-10-15

    Highlights: ► The paper deals with high accuracy numerical simulations of high field magnets. ► Porting existing codes to High Performance Computing architectures yielded a relevant speedup without reducing computational accuracy. ► Some examples of applications, referred to ITER-like magnets, are reported. -- Abstract: One of the main issues in the simulation of Tokamak functioning is the reliable and accurate computation of actual field maps in the plasma chamber. In this paper a tool able to accurately compute magnetic field maps produced by active coils of any 3D shape, wound with a high number of conductors, is presented. Under the linearity assumption, the coil winding is modeled by means of “sticks”, following each conductor's shape, and the contribution of each stick is computed using high speed Graphics Processing Units (GPUs). Relevant speed enhancements with respect to a standard parallel computing environment are achieved in this way.
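The per-stick contribution is a closed-form Biot-Savart integral for a straight current filament. A CPU sketch of that building block (in NumPy rather than the authors' GPU implementation, using the standard finite-segment formula) is:

```python
import numpy as np

MU0_OVER_4PI = 1e-7  # T*m/A

def stick_field(p, a, b, current):
    """Magnetic field (tesla) at point p due to a straight current 'stick'
    from a to b, in the thin-filament approximation. ri and rf are the
    vectors from the segment endpoints to the field point."""
    ri, rf = p - a, p - b
    ri_n, rf_n = np.linalg.norm(ri), np.linalg.norm(rf)
    denom = ri_n * rf_n * (ri_n * rf_n + ri @ rf)
    return MU0_OVER_4PI * current * (ri_n + rf_n) / denom * np.cross(ri, rf)
```

Summing many sticks approximates an arbitrarily shaped coil; for a 1000-sided polygon inscribed in a unit circle carrying 1 A, the field at the centre reproduces the analytic loop value μ0I/(2R) = 2π x 10⁻⁷ T to better than 0.1%. The expression is singular on the filament itself, so real codes treat points on or very near a stick separately.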

  3. Accurate and efficient spin integration for particle accelerators

    International Nuclear Information System (INIS)

    Abell, Dan T.; Meiser, Dominic; Ranjbar, Vahid H.; Barber, Desmond P.

    2015-01-01

    Accurate spin tracking is a valuable tool for understanding spin dynamics in particle accelerators and can help improve the performance of an accelerator. In this paper, we present a detailed discussion of the integrators in the spin tracking code GPUSPINTRACK. We have implemented orbital integrators based on drift-kick, bend-kick, and matrix-kick splits. On top of the orbital integrators, we have implemented various integrators for the spin motion. These integrators use quaternions and Romberg quadratures to accelerate both the computation and the convergence of spin rotations. We evaluate their performance and accuracy in quantitative detail for individual elements as well as for the entire RHIC lattice. We exploit the inherently data-parallel nature of spin tracking to accelerate our algorithms on graphics processing units.
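The quaternion machinery mentioned above is standard; a minimal sketch of representing and composing spin rotations with quaternions (illustrative only, not the GPUSPINTRACK code) looks like this:

```python
import numpy as np

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))

def quat_multiply(q1, q2):
    """Hamilton product: composes rotations (q1 applied after q2)."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def rotate_spin(q, s):
    """Apply rotation q to spin vector s via q * (0, s) * q_conjugate."""
    qs = np.concatenate(([0.0], s))
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    return quat_multiply(quat_multiply(q, qs), q_conj)[1:]
```

Accumulating the per-element rotations as a single quaternion product keeps the composed rotation exactly orthogonal (up to one cheap renormalization), which is one reason quaternions suit long spin-tracking runs better than repeated 3x3 matrix products.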

  4. Accurate and efficient spin integration for particle accelerators

    Energy Technology Data Exchange (ETDEWEB)

    Abell, Dan T.; Meiser, Dominic [Tech-X Corporation, Boulder, CO (United States); Ranjbar, Vahid H. [Brookhaven National Laboratory, Upton, NY (United States); Barber, Desmond P. [Deutsches Elektronen-Synchrotron (DESY), Hamburg (Germany)

    2015-01-15

    Accurate spin tracking is a valuable tool for understanding spin dynamics in particle accelerators and can help improve the performance of an accelerator. In this paper, we present a detailed discussion of the integrators in the spin tracking code GPUSPINTRACK. We have implemented orbital integrators based on drift-kick, bend-kick, and matrix-kick splits. On top of the orbital integrators, we have implemented various integrators for the spin motion. These integrators use quaternions and Romberg quadratures to accelerate both the computation and the convergence of spin rotations. We evaluate their performance and accuracy in quantitative detail for individual elements as well as for the entire RHIC lattice. We exploit the inherently data-parallel nature of spin tracking to accelerate our algorithms on graphics processing units.

  5. An accurate nonlinear Monte Carlo collision operator

    International Nuclear Information System (INIS)

    Wang, W.X.; Okamoto, M.; Nakajima, N.; Murakami, S.

    1995-03-01

    A three-dimensional nonlinear Monte Carlo collision model is developed based on Coulomb binary collisions, with emphasis on both accuracy and implementation efficiency. The operator, of simple form, fulfills the particle number, momentum, and energy conservation laws and is equivalent to the exact Fokker-Planck operator in that it correctly reproduces the friction coefficient and diffusion tensor; in addition, it effectively ensures small-angle collisions with a binary scattering angle distributed in a limited range near zero. Two highly vectorizable algorithms are designed for its fast implementation. Various test simulations regarding relaxation processes, electrical conductivity, etc. are carried out in velocity space. The test results, which are in good agreement with theory, and timing results on vector computers show that the operator is practically applicable. The operator may be used for accurately simulating collisional transport problems in magnetized and unmagnetized plasmas. (author)
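The key properties claimed for the operator, exact conservation of momentum and energy in each binary collision with scattering angles concentrated near zero, can be illustrated with a minimal equal-mass sketch. This is an illustration of the binary-collision idea only, not the authors' operator; the angle variance `nu_dt` is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_collision(v1, v2, nu_dt):
    """One small-angle binary collision between two equal-mass particles.
    The relative velocity is rotated by a small random angle while its
    magnitude is kept fixed, and both particles are updated symmetrically,
    so total momentum and total kinetic energy are conserved exactly.
    `nu_dt` (collision frequency times time step) sets the variance of the
    scattering angle and is purely illustrative here."""
    u = v1 - v2                         # relative velocity
    speed = np.linalg.norm(u)
    theta = rng.normal(0.0, np.sqrt(nu_dt))   # small angle near zero
    phi = rng.uniform(0.0, 2.0 * np.pi)
    # Orthonormal frame around the relative-velocity direction.
    e1 = u / speed
    a = np.array([1.0, 0.0, 0.0]) if abs(e1[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e2 = np.cross(e1, a); e2 /= np.linalg.norm(e2)
    e3 = np.cross(e1, e2)
    # Rotate u by theta; |u| is unchanged, which preserves kinetic energy.
    u_new = speed * (np.cos(theta) * e1 +
                     np.sin(theta) * (np.cos(phi) * e2 + np.sin(phi) * e3))
    dv = 0.5 * (u_new - u)              # equal masses: split the change evenly
    return v1 + dv, v2 - dv

v1, v2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.5, 0.0])
w1, w2 = binary_collision(v1, v2, nu_dt=1e-3)
```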

  6. Reranking candidate gene models with cross-species comparison for improved gene prediction

    Directory of Open Access Journals (Sweden)

    Pereira Fernando CN

    2008-10-01

    Full Text Available Abstract Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc.). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models.
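The reranking step amounts to rescoring the K best candidate models with additional comparative evidence. A minimal sketch, with invented feature names, weights, and scores:

```python
def rerank(candidates, ortholog_features, weights):
    """Rerank K-best gene models: combine the gene finder's score with
    weighted cross-species features (e.g. coding-sequence similarity,
    splice-site agreement). The model with the highest total score wins."""
    def total(model):
        base = model["finder_score"]
        feats = ortholog_features[model["id"]]
        return base + sum(weights[k] * feats[k] for k in weights)
    return max(candidates, key=total)

# Two hypothetical candidate models at one locus.
candidates = [
    {"id": "m1", "finder_score": 0.90},  # top model from the gene finder
    {"id": "m2", "finder_score": 0.85},  # runner-up
]
ortholog_features = {
    "m1": {"cds_similarity": 0.30, "splice_agreement": 0.0},
    "m2": {"cds_similarity": 0.95, "splice_agreement": 1.0},
}
weights = {"cds_similarity": 0.5, "splice_agreement": 0.2}
best = rerank(candidates, ortholog_features, weights)
# m2's strong ortholog support outweighs its lower finder score
```

Because only the K best models are rescored, the comparative evidence can change the winner without being integrated into the HMM itself, which is what makes the post-processing step easy to extend with new evidence sources.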

  7. Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

    KAUST Repository

    AlShahrani, Mona; Hoehndorf, Robert

    2018-01-01

    In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.

  8. Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

    KAUST Repository

    Alshahrani, Mona

    2018-04-30

    In the past years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.

  9. Confocal quantification of cis-regulatory reporter gene expression in living sea urchin.

    Science.gov (United States)

    Damle, Sagar; Hanser, Bridget; Davidson, Eric H; Fraser, Scott E

    2006-11-15

    Quantification of GFP reporter gene expression at single cell level in living sea urchin embryos can now be accomplished by a new method of confocal laser scanning microscopy (CLSM). Eggs injected with a tissue-specific GFP reporter DNA construct were grown to gastrula stage and their fluorescence recorded as a series of contiguous Z-section slices that spanned the entire embryo. To measure the depth-dependent signal decay seen in the successive slices of an image stack, the eggs were coinjected with a freely diffusible internal fluorescent standard, rhodamine dextran. The measured rhodamine fluorescence was used to generate a computational correction for the depth-dependent loss of GFP fluorescence per slice. The intensity of GFP fluorescence was converted to the number of GFP molecules using a conversion constant derived from CLSM imaging of eggs injected with a measured quantity of GFP protein. The outcome is a validated method for accurately counting GFP molecules in given cells in reporter gene transfer experiments, as we demonstrate by use of an expression construct expressed exclusively in skeletogenic cells.
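The correction scheme described here, a uniformly distributed rhodamine standard used to measure per-slice depth attenuation and a calibration constant to convert corrected fluorescence into molecule counts, can be sketched as follows. The toy stack values and the calibration constant are illustrative, not values from the paper.

```python
import numpy as np

def count_gfp(gfp_slices, rhodamine_slices, molecules_per_unit):
    """Correct depth-dependent signal loss in a confocal Z-stack and convert
    the result to a molecule count. Rhodamine is assumed to be uniformly
    distributed, so any fall-off in its per-slice mean signal reflects depth
    attenuation; dividing the GFP signal by that normalized profile
    compensates slice by slice. `molecules_per_unit` stands in for the
    calibration constant obtained by imaging a known quantity of GFP."""
    gfp = np.asarray(gfp_slices, dtype=float)
    rho = np.asarray(rhodamine_slices, dtype=float)
    # Per-slice attenuation profile, normalized to 1.0 at the top slice.
    attenuation = rho.mean(axis=tuple(range(1, rho.ndim))) if rho.ndim > 1 else rho
    attenuation = attenuation / attenuation[0]
    corrected = gfp / attenuation.reshape((-1,) + (1,) * (gfp.ndim - 1))
    return corrected.sum() * molecules_per_unit

# Toy stack: true signal constant at 100 units/slice, 10% loss per slice.
depth_loss = np.array([1.0, 0.9, 0.81])
gfp_stack = 100.0 * depth_loss
rho_stack = 50.0 * depth_loss
n = count_gfp(gfp_stack, rho_stack, molecules_per_unit=2.0)
# corrected signal is 100 per slice -> 300 units -> 600 molecules
```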

  10. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin

    2016-01-01

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.

  11. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato

    2016-08-25

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
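The contrast between unweighted and weighted consensus can be sketched with a toy rank aggregation. The paper derives the method weights with a linear program; here a fixed credibility vector stands in for that step, so this is illustrative only.

```python
import numpy as np

def weighted_consensus(rank_matrices, weights):
    """Combine edge rankings from several network-inference methods into one
    consensus ranking. Each method contributes a matrix of edge ranks
    (lower = more confident); methods with higher credibility weights
    contribute more. A uniform `weights` vector recovers the
    average-ranking-based consensus the abstract compares against."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()        # normalize credibilities
    stacked = np.stack(rank_matrices).astype(float)
    return np.tensordot(weights, stacked, axes=1)  # lower = stronger edge

# Three hypothetical methods ranking the edges of a 2-gene toy network.
m1 = np.array([[2, 1], [3, 4]])
m2 = np.array([[1, 2], [4, 3]])
m3 = np.array([[2, 1], [4, 3]])
scores = weighted_consensus([m1, m2, m3], weights=[0.5, 0.25, 0.25])
```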

  12. Uropathogenic Escherichia coli virulence genes: invaluable approaches for designing DNA microarray probes.

    Science.gov (United States)

    Jahandeh, Nadia; Ranjbar, Reza; Behzadi, Payam; Behzadi, Elham

    2015-01-01

    The pathotypes of uropathogenic Escherichia coli (UPEC) cause different types of urinary tract infections (UTIs). The presence of a wide range of virulence genes in UPEC enables us to design appropriate DNA microarray probes. These probes, which are used in DNA microarray technology, provide us with an accurate and rapid diagnosis and definitive treatment in association with UTIs caused by UPEC pathotypes. The main goal of this article is to introduce the UPEC virulence genes as invaluable approaches for designing DNA microarray probes. Main search engines such as Google Scholar and databases like NCBI were searched to find and study several original articles, review articles, and DNA gene sequences. In parallel with in silico studies, the experiences of the authors were helpful for selecting appropriate sources and writing this review article. There is a significant variety of virulence genes among UPEC strains. The DNA sequences of virulence genes are valuable templates for designing microarray probes. The location of virulence genes and their sequence lengths influence the quality of probes. The use of selected virulence genes for designing microarray probes gives us a wide range of choices from which the best probe candidates can be chosen. DNA microarray technology provides us with an accurate, rapid, cost-effective, sensitive, and specific molecular diagnostic method which is facilitated by designing microarray probes. Via these tools, we are able to achieve an accurate diagnosis and a definitive treatment regarding UTIs caused by UPEC pathotypes.

  13. Distributed Pedestrian Detection Alerts Based on Data Fusion with Accurate Localization

    Directory of Open Access Journals (Sweden)

    Arturo de la Escalera

    2013-09-01

    Full Text Available Among Advanced Driver Assistance Systems (ADAS) pedestrian detection is a common issue due to the vulnerability of pedestrians in the event of accidents. In the present work, a novel approach for pedestrian detection based on data fusion is presented. Data fusion helps to overcome the limitations inherent to each detection system (computer vision and laser scanner) and provides accurate and trustable tracking of any pedestrian movement. The application is complemented by an efficient communication protocol, able to alert vehicles in the surroundings by a fast and reliable communication. The combination of a powerful location, based on a GPS with inertial measurement, and accurate obstacle localization based on data fusion has allowed locating the detected pedestrians with high accuracy. Tests proved the viability of the detection system and the efficiency of the communication, even at long distances. By the use of the alert communication, dangerous situations such as occlusions or misdetections can be avoided.

  14. Distributed pedestrian detection alerts based on data fusion with accurate localization.

    Science.gov (United States)

    García, Fernando; Jiménez, Felipe; Anaya, José Javier; Armingol, José María; Naranjo, José Eugenio; de la Escalera, Arturo

    2013-09-04

    Among Advanced Driver Assistance Systems (ADAS) pedestrian detection is a common issue due to the vulnerability of pedestrians in the event of accidents. In the present work, a novel approach for pedestrian detection based on data fusion is presented. Data fusion helps to overcome the limitations inherent to each detection system (computer vision and laser scanner) and provides accurate and trustable tracking of any pedestrian movement. The application is complemented by an efficient communication protocol, able to alert vehicles in the surroundings by a fast and reliable communication. The combination of a powerful location, based on a GPS with inertial measurement, and accurate obstacle localization based on data fusion has allowed locating the detected pedestrians with high accuracy. Tests proved the viability of the detection system and the efficiency of the communication, even at long distances. By the use of the alert communication, dangerous situations such as occlusions or misdetections can be avoided.

  15. Accurate density-functional calculations on large systems: Fullerenes and magnetic clusters

    International Nuclear Information System (INIS)

    Dunlap, B.I.

    1996-01-01

    Efforts to accurately compute all-electron density-functional energies for large molecules and clusters using Gaussian basis sets will be reviewed. The foundation of this effort, variational fitting, will be described and followed by three applications of the method. The first application concerns fullerenes. When first discovered, C60 appeared quite unstable relative to the higher fullerenes. In addition to raising questions about the relative abundance of the various fullerenes, this work conflicted with the then state-of-the-art density-functional calculations on crystalline graphite. Now high-accuracy molecular and band-structure calculations are in fairly good agreement. Second, we have used these methods to design transition-metal clusters having the highest magnetic moment by maximizing the symmetry-required degeneracy of the one-electron orbitals. Most recently, we have developed accurate, variational generalized-gradient approximation (GGA) forces for use in geometry optimization of clusters and in molecular-dynamics simulations of friction. The GGA-optimized geometries of a number of large clusters will be given

  16. Simple, fast and accurate two-diode model for photovoltaic modules

    Energy Technology Data Exchange (ETDEWEB)

    Ishaque, Kashif; Salam, Zainal; Taheri, Hamed [Faculty of Electrical Engineering, Universiti Teknologi Malaysia, UTM 81310, Skudai, Johor Bahru (Malaysia)

    2011-02-15

    This paper proposes an improved modeling approach for the two-diode model of photovoltaic (PV) module. The main contribution of this work is the simplification of the current equation, in which only four parameters are required, compared to six or more in the previously developed two-diode models. Furthermore the values of the series and parallel resistances are computed using a simple and fast iterative method. To validate the accuracy of the proposed model, six PV modules of different types (multi-crystalline, mono-crystalline and thin-film) from various manufacturers are tested. The performance of the model is evaluated against the popular single diode models. It is found that the proposed model is superior when subjected to irradiance and temperature variations. In particular the model matches very accurately for all important points of the I-V curves, i.e. the peak power, short-circuit current and open circuit voltage. The modeling method is useful for PV power converter designers and circuit simulator developers who require simple, fast yet accurate model for the PV module. (author)
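The two-diode model's implicit I-V equation can be solved pointwise by simple iteration. In the sketch below, following the abstract's four-parameter simplification, both diodes share one saturation current (an assumption for this sketch), with ideality factors a1 = 1 and a2 = 2; all numeric parameter values are invented for illustration.

```python
import numpy as np

def pv_current(V, Iph, Io, Rs, Rp, a1=1.0, a2=2.0, Vt=0.026 * 54):
    """Output current of a two-diode PV module model at terminal voltage V,
    found by damped fixed-point iteration of the implicit equation
        I = Iph - Io*[exp(Vd/(a1*Vt)) - 1] - Io*[exp(Vd/(a2*Vt)) - 1] - Vd/Rp,
    with Vd = V + I*Rs. `Vt` is the module thermal voltage (here a
    hypothetical 54-cell module at room temperature)."""
    I = Iph                          # short-circuit current is a good start
    for _ in range(200):
        Vd = V + I * Rs
        I_new = (Iph
                 - Io * (np.exp(Vd / (a1 * Vt)) - 1.0)
                 - Io * (np.exp(Vd / (a2 * Vt)) - 1.0)
                 - Vd / Rp)
        if abs(I_new - I) < 1e-12:
            break
        I = 0.5 * (I + I_new)        # damped update for stability
    return I

# Current near short circuit for a hypothetical module.
Isc_approx = pv_current(0.0, Iph=8.0, Io=1e-10, Rs=0.2, Rp=300.0)
```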

  17. The accurate assessment of small-angle X-ray scattering data.

    Science.gov (United States)

    Grant, Thomas D; Luft, Joseph R; Carter, Lester G; Matsui, Tsutomu; Weiss, Thomas M; Martel, Anne; Snell, Edward H

    2015-01-01

    Small-angle X-ray scattering (SAXS) has grown in popularity in recent times with the advent of bright synchrotron X-ray sources, powerful computational resources and algorithms enabling the calculation of increasingly complex models. However, the lack of standardized data-quality metrics presents difficulties for the growing user community in accurately assessing the quality of experimental SAXS data. Here, a series of metrics to quantitatively describe SAXS data in an objective manner using statistical evaluations are defined. These metrics are applied to identify the effects of radiation damage, concentration dependence and interparticle interactions on SAXS data from a set of 27 previously described targets for which high-resolution structures have been determined via X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. The studies show that these metrics are sufficient to characterize SAXS data quality on a small sample set with statistical rigor and sensitivity similar to or better than manual analysis. The development of data-quality analysis strategies such as these initial efforts is needed to enable the accurate and unbiased assessment of SAXS data quality.

  18. BLESS 2: accurate, memory-efficient and fast error correction method.

    Science.gov (United States)

    Heo, Yun; Ramachandran, Anand; Hwu, Wen-Mei; Ma, Jian; Chen, Deming

    2016-08-01

    The most important features of error correction tools for sequencing data are accuracy, memory efficiency and fast runtime. The previous version of BLESS was highly memory-efficient and accurate, but it was too slow to handle reads from large genomes. We have developed a new version of BLESS to improve runtime and accuracy while maintaining a small memory usage. The new version, called BLESS 2, has an error correction algorithm that is more accurate than BLESS, and the algorithm has been parallelized using hybrid MPI and OpenMP programming. BLESS 2 was compared with five top-performing tools, and it was found to be the fastest when it was executed on two computing nodes using MPI, with each node containing twelve cores. Also, BLESS 2 showed at least 11% higher gain while retaining the memory efficiency of the previous version for large genomes. Freely available at https://sourceforge.net/projects/bless-ec dchen@illinois.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  19. Evaluation of Suitable Reference Genes for Normalization of qPCR Gene Expression Studies in Brinjal (Solanum melongena L.) During Fruit Developmental Stages.

    Science.gov (United States)

    Kanakachari, Mogilicherla; Solanke, Amolkumar U; Prabhakaran, Narayanasamy; Ahmad, Israr; Dhandapani, Gurusamy; Jayabalan, Narayanasamy; Kumar, Polumetla Ananda

    2016-02-01

    Brinjal/eggplant/aubergine is one of the major solanaceous vegetable crops. Recent availability of genome information greatly facilitates the fundamental research on brinjal. Gene expression patterns during different stages of fruit development can provide clues towards the understanding of its biological functions. Quantitative real-time PCR (qPCR) has become one of the most widely used methods for rapid and accurate quantification of gene expression. However, its success depends on the use of a suitable reference gene for data normalization. For qPCR analysis, a single reference gene is not universally suitable for all experiments. Therefore, reference gene validation is a crucial step. Suitable reference genes for qPCR analysis of brinjal fruit development have not been investigated so far. In this study, we have selected 21 candidate reference genes from the Brinjal (Solanum melongena) Plant Gene Indices database (compbio.dfci.harvard.edu/tgi/plant.html) and studied their expression profiles by qPCR during six different fruit developmental stages (0, 5, 10, 20, 30, and 50 days post anthesis) along with leaf samples of the Pusa Purple Long (PPL) variety. To evaluate the stability of gene expression, the geNorm and NormFinder analytical software tools were used. geNorm identified SAND (SAND family protein) and TBP (TATA binding protein) as the best pair of reference genes in brinjal fruit development. The results showed that for brinjal fruit development, individual or a combination of reference genes should be selected for data normalization. NormFinder identified Expressed gene (expressed sequence) as the best single reference gene in brinjal fruit development. In this study, we have identified and validated for the first time reference genes to provide accurate transcript normalization and quantification at various fruit developmental stages of brinjal, which can also be useful for gene expression studies in other Solanaceae plant species.

  20. Real-time Accurate Surface Reconstruction Pipeline for Vision Guided Planetary Exploration Using Unmanned Ground and Aerial Vehicles

    Science.gov (United States)

    Almeida, Eduardo DeBrito

    2012-01-01

    This report discusses work completed over the summer at the Jet Propulsion Laboratory (JPL), California Institute of Technology. A system is presented to guide ground or aerial unmanned robots using computer vision. The system performs accurate camera calibration, camera pose refinement and surface extraction from images collected by a camera mounted on the vehicle. The application motivating the research is planetary exploration and the vehicles are typically rovers or unmanned aerial vehicles. The information extracted from imagery is used primarily for navigation, as robot location is the same as the camera location and the surfaces represent the terrain that rovers traverse. The processed information must be very accurate and acquired very fast in order to be useful in practice. The main challenge being addressed by this project is to achieve high estimation accuracy and high computation speed simultaneously, a difficult task due to many technical reasons.

  1. The detection of the methylated Wif-1 gene is more accurate than a fecal occult blood test for colorectal cancer screening

    KAUST Repository

    Amiot, Aurelien

    2014-07-15

    Background: The clinical benefit of guaiac fecal occult blood tests (FOBT) is now well established for colorectal cancer screening. Growing evidence has demonstrated that epigenetic modifications and fecal microbiota changes, also known as dysbiosis, are associated with CRC pathogenesis and might be used as surrogate markers of CRC. Patients and Methods: We performed a cross-sectional study that included all consecutive subjects that were referred (from 2003 to 2007) for screening colonoscopies. Prior to colonoscopy, effluents (fresh stools, sera-S and urine-U) were harvested and FOBTs performed. Methylation levels were measured in stools, S and U for 3 genes (Wif1, ALX-4, and Vimentin) selected from a panel of 63 genes; Kras mutations and seven dominant and subdominant bacterial populations in stools were quantified. Calibration was assessed with the Hosmer-Lemeshow chi-square, and discrimination was determined by calculating the C-statistic (Area Under Curve) and Net Reclassification Improvement index. Results: There were 247 individuals (mean age 60.8±12.4 years, 52% of males) in the study group, and 90 (36%) of these individuals were patients with advanced polyps or invasive adenocarcinomas. A multivariate model adjusted for age and FOBT led to a C-statistic of 0.83 [0.77-0.88]. After supplementary sequential (one-by-one) adjustment, Wif-1 methylation (S or U) and fecal microbiota dysbiosis led to increases of the C-statistic to 0.90 [0.84-0.94] (p = 0.02) and 0.81 [0.74-0.86] (p = 0.49), respectively. When adjusted jointly for FOBT and Wif-1 methylation or fecal microbiota dysbiosis, the increase of the C-statistic was even more significant (0.91 and 0.85, p<0.001 and p = 0.10, respectively). Conclusion: The detection of methylated Wif-1 in either S or U has a higher performance accuracy compared to guaiac FOBT for advanced colorectal neoplasia screening. Conversely, fecal microbiota dysbiosis detection was not more accurate. Blood and urine testing could be
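The discrimination measure used throughout these records, the C-statistic (area under the ROC curve), is simple to compute directly: it is the probability that a randomly chosen positive case receives a higher marker score than a randomly chosen negative one. A self-contained sketch with toy data:

```python
def c_statistic(scores, labels):
    """C-statistic (AUC) via pairwise comparison: the fraction of
    positive/negative pairs in which the positive case scores higher.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: marker scores, 1 = advanced neoplasia, 0 = healthy.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
auc = c_statistic(scores, labels)
# -> 8/9: 8 of the 9 positive/negative pairs are ranked correctly
```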

  2. The detection of the methylated Wif-1 gene is more accurate than a fecal occult blood test for colorectal cancer screening

    KAUST Repository

    Amiot, Aurelien; Mansour, Hicham; Baumgaertner, Isabelle; Delchier, Jean-Charles; Tournigand, Christophe; Furet, Jean-Pierre; Carrau, Jean-Pierre; Canoui-Poitrine, Florence; Sobhani, Iradj

    2014-01-01

    Background: The clinical benefit of guaiac fecal occult blood tests (FOBT) is now well established for colorectal cancer screening. Growing evidence has demonstrated that epigenetic modifications and fecal microbiota changes, also known as dysbiosis, are associated with CRC pathogenesis and might be used as surrogate markers of CRC. Patients and Methods: We performed a cross-sectional study that included all consecutive subjects that were referred (from 2003 to 2007) for screening colonoscopies. Prior to colonoscopy, effluents (fresh stools, sera-S and urine-U) were harvested and FOBTs performed. Methylation levels were measured in stools, S and U for 3 genes (Wif1, ALX-4, and Vimentin) selected from a panel of 63 genes; Kras mutations and seven dominant and subdominant bacterial populations in stools were quantified. Calibration was assessed with the Hosmer-Lemeshow chi-square, and discrimination was determined by calculating the C-statistic (Area Under Curve) and Net Reclassification Improvement index. Results: There were 247 individuals (mean age 60.8±12.4 years, 52% of males) in the study group, and 90 (36%) of these individuals were patients with advanced polyps or invasive adenocarcinomas. A multivariate model adjusted for age and FOBT led to a C-statistic of 0.83 [0.77-0.88]. After supplementary sequential (one-by-one) adjustment, Wif-1 methylation (S or U) and fecal microbiota dysbiosis led to increases of the C-statistic to 0.90 [0.84-0.94] (p = 0.02) and 0.81 [0.74-0.86] (p = 0.49), respectively. When adjusted jointly for FOBT and Wif-1 methylation or fecal microbiota dysbiosis, the increase of the C-statistic was even more significant (0.91 and 0.85, p<0.001 and p = 0.10, respectively). Conclusion: The detection of methylated Wif-1 in either S or U has a higher performance accuracy compared to guaiac FOBT for advanced colorectal neoplasia screening. Conversely, fecal microbiota dysbiosis detection was not more accurate. Blood and urine testing could be

  3. The detection of the methylated Wif-1 gene is more accurate than a fecal occult blood test for colorectal cancer screening.

    Directory of Open Access Journals (Sweden)

    Aurelien Amiot

    Full Text Available The clinical benefit of guaiac fecal occult blood tests (FOBT) is now well established for colorectal cancer screening. Growing evidence has demonstrated that epigenetic modifications and fecal microbiota changes, also known as dysbiosis, are associated with CRC pathogenesis and might be used as surrogate markers of CRC. We performed a cross-sectional study that included all consecutive subjects that were referred (from 2003 to 2007) for screening colonoscopies. Prior to colonoscopy, effluents (fresh stools, sera-S and urine-U) were harvested and FOBTs performed. Methylation levels were measured in stools, S and U for 3 genes (Wif1, ALX-4, and Vimentin) selected from a panel of 63 genes; Kras mutations and seven dominant and subdominant bacterial populations in stools were quantified. Calibration was assessed with the Hosmer-Lemeshow chi-square, and discrimination was determined by calculating the C-statistic (Area Under Curve) and Net Reclassification Improvement index. There were 247 individuals (mean age 60.8±12.4 years, 52% of males) in the study group, and 90 (36%) of these individuals were patients with advanced polyps or invasive adenocarcinomas. A multivariate model adjusted for age and FOBT led to a C-statistic of 0.83 [0.77-0.88]. After supplementary sequential (one-by-one) adjustment, Wif-1 methylation (S or U) and fecal microbiota dysbiosis led to increases of the C-statistic to 0.90 [0.84-0.94] (p = 0.02) and 0.81 [0.74-0.86] (p = 0.49), respectively. When adjusted jointly for FOBT and Wif-1 methylation or fecal microbiota dysbiosis, the increase of the C-statistic was even more significant (0.91 and 0.85, p<0.001 and p = 0.10, respectively). The detection of methylated Wif-1 in either S or U has a higher performance accuracy compared to guaiac FOBT for advanced colorectal neoplasia screening. Conversely, fecal microbiota dysbiosis detection was not more accurate. Blood and urine testing could be used in those individuals reluctant to

  4. An Accurate Estimate of the Free Energy and Phase Diagram of All-DNA Bulk Fluids

    Directory of Open Access Journals (Sweden)

    Emanuele Locatelli

    2018-04-01

    Full Text Available We present a numerical study in which large-scale bulk simulations of self-assembled DNA constructs have been carried out with a realistic coarse-grained model. The investigation aims at obtaining a precise, albeit numerically demanding, estimate of the free energy for such systems. We then, in turn, use these accurate results to validate a recently proposed theoretical approach that builds on a liquid-state theory, the Wertheim theory, to compute the phase diagram of all-DNA fluids. This hybrid theoretical/numerical approach, based on the lowest-order virial expansion and on a nearest-neighbor DNA model, can provide, in an undemanding way, a parameter-free thermodynamic description of DNA associating fluids that is in semi-quantitative agreement with experiments. We show that the predictions of the scheme are as accurate as those obtained with more sophisticated methods. We also demonstrate the flexibility of the approach by incorporating non-trivial additional contributions that go beyond the nearest-neighbor model to compute the DNA hybridization free energy.

  5. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    Energy Technology Data Exchange (ETDEWEB)

    Santra, Tapesh, E-mail: tapesh.santra@ucd.ie [Systems Biology Ireland, University College Dublin, Dublin (Ireland)

    2014-05-20

Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.
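The core idea of Bayesian variable selection for GRN inference can be caricatured in a few lines: for one target gene, score every small subset of candidate regulators with a fit term plus a prior that rewards external evidence such as TFBS or PPI support. The sketch below is an assumption-laden toy, using BIC as a stand-in for the proper marginal likelihood of the paper's algorithm; the function and parameter names are ours:

```python
import itertools
import math
import numpy as np

def bvs_scores(X, y, prior_weights, max_regulators=2):
    """Posterior-style scores for every regulator subset of one target.
    X: (samples x candidate regulators) expression matrix; y: target
    expression; prior_weights[j] > 1 encodes external (TFBS/PPI)
    support for regulator j. BIC approximates the marginal likelihood
    (an assumption -- the paper uses a full Bayesian treatment)."""
    n, p = X.shape
    scores = {}
    for k in range(max_regulators + 1):
        for subset in itertools.combinations(range(p), k):
            cols = list(subset)
            if cols:
                beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
                rss = float(np.sum((y - X[:, cols] @ beta) ** 2))
            else:
                rss = float(np.sum(y ** 2))  # empty model: no regulators
            bic = n * math.log(rss / n + 1e-12) + len(cols) * math.log(n)
            log_prior = sum(math.log(prior_weights[j]) for j in cols)
            scores[subset] = -0.5 * bic + log_prior
    # normalise into pseudo-posterior probabilities over all subsets
    m = max(scores.values())
    z = sum(math.exp(s - m) for s in scores.values())
    return {s: math.exp(v - m) / z for s, v in scores.items()}
```

With a strong simulated signal from one regulator, the top-scoring subset should contain that regulator; raising a regulator's prior weight shifts posterior mass toward subsets that include it, which is how the external data enter.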

  6. A Bayesian Framework That Integrates Heterogeneous Data for Inferring Gene Regulatory Networks

    International Nuclear Information System (INIS)

    Santra, Tapesh

    2014-01-01

Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.

  7. Automatic procedure for realistic 3D finite element modelling of human brain for bioelectromagnetic computations

    International Nuclear Information System (INIS)

    Aristovich, K Y; Khan, S H

    2010-01-01

Realistic computer modelling of biological objects requires building very accurate and realistic computer models based on geometric and material data and on the type and accuracy of the numerical analyses. This paper presents some of the automatic tools and algorithms that were used to build an accurate and realistic 3D finite element (FE) model of the whole brain. These models were used to solve the forward problem in magnetic field tomography (MFT) based on magnetoencephalography (MEG). The forward problem involves modelling and computation of the magnetic fields produced by the human brain during cognitive processing. The geometric parameters of the model were obtained from accurate Magnetic Resonance Imaging (MRI) data, and the material properties from Diffusion Tensor MRI (DTMRI). The 3D FE models of the brain built using this approach have been shown to be very accurate in terms of both geometric and material properties. The model is stored on the computer in Computer-Aided Parametrical Design (CAD) format. This allows the model to be used with a wide range of analysis methods, such as the finite element method (FEM), the boundary element method (BEM), Monte Carlo simulations, etc. The generic model-building approach presented here could be used for accurate and realistic modelling of the human brain and many other biological objects.

  8. Inferring Drosophila gap gene regulatory network: Pattern analysis of simulated gene expression profiles and stability analysis

    NARCIS (Netherlands)

    Fomekong-Nanfack, Y.; Postma, M.; Kaandorp, J.A.

    2009-01-01

    Background: Inference of gene regulatory networks (GRNs) requires accurate data, a method to simulate the expression patterns and an efficient optimization algorithm to estimate the unknown parameters. Using this approach it is possible to obtain alternative circuits without making any a priori

  9. A genetic ensemble approach for gene-gene interaction identification

    Directory of Open Access Journals (Sweden)

    Ho Joshua WK

    2010-10-01

Full Text Available Abstract Background It has now become clear that gene-gene interactions and gene-environment interactions are ubiquitous and fundamental mechanisms for the development of complex diseases. Though considerable effort has been put into developing statistical models and algorithmic strategies for identifying such interactions, their accurate identification has proven to be very challenging. Methods In this paper, we propose a new approach for identifying such gene-gene and gene-environment interactions underlying complex diseases. This is a hybrid algorithm that combines a genetic algorithm (GA) and an ensemble of classifiers (called a genetic ensemble). Using this approach, the original problem of SNP interaction identification is converted into a data mining problem of combinatorial feature selection. By collecting the various single nucleotide polymorphism (SNP) subsets as well as environmental factors generated in multiple GA runs, patterns of gene-gene and gene-environment interactions can be extracted using a simple combinatorial ranking method. Also considered in this study is the idea of combining identification results obtained from multiple algorithms. A novel formula based on pairwise double fault is designed to quantify the degree of complementarity. Conclusions Our simulation study demonstrates that the proposed genetic ensemble algorithm has identification power comparable to that of Multifactor Dimensionality Reduction (MDR) and is slightly better than Polymorphism Interaction Analysis (PIA), which are the two most popular methods for gene-gene interaction identification. More importantly, the identification results generated by using our genetic ensemble algorithm are highly complementary to those obtained by PIA and MDR. Experimental results from our simulation studies and real world data application also confirm the effectiveness of the proposed genetic ensemble algorithm, as well as the potential benefits of

  10. HeurAA: accurate and fast detection of genetic variations with a novel heuristic amplicon aligner program for next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Lőrinc S Pongor

Full Text Available Next generation sequencing (NGS) of PCR amplicons is a standard approach to detecting genetic variations in personalized medicine, such as cancer diagnostics. Computer programs used in the NGS community often miss insertions and deletions (indels), which constitute a large part of known human mutations. We have developed HeurAA, an open source, heuristic amplicon aligner program. We tested the program on simulated datasets as well as experimental data from multiplex sequencing of 40 amplicons in 12 oncogenes collected on a 454 Genome Sequencer from lung cancer cell lines. We found that HeurAA can accurately detect all indels, and is more than an order of magnitude faster than previous programs. HeurAA can compare reads and reference sequences up to several thousand base pairs in length, and it can evaluate data from complex mixtures containing reads of different gene segments from different samples. HeurAA is written in C and Perl for Linux operating systems; the code and the documentation are available for research applications at http://sourceforge.net/projects/heuraa/
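Why indel-aware alignment matters can be seen with a classic dynamic-programming aligner: unlike mismatch-only matching, the traceback reports insertions and deletions explicitly. This toy global aligner (unit costs; a stand-in for illustration, not HeurAA's heuristic) shows the idea:

```python
def align_ops(read, ref):
    """Global alignment (Needleman-Wunsch style, unit costs).
    Returns edit operations relative to `ref`, so indels are reported
    explicitly instead of being silently dropped."""
    n, m = len(read), len(ref)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i
    for j in range(1, m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i-1][j-1] + (read[i-1] != ref[j-1]),  # (mis)match
                          D[i-1][j] + 1,   # insertion in read
                          D[i][j-1] + 1)   # deletion from read
    ops, i, j = [], n, m
    while i > 0 or j > 0:  # trace back through the matrix
        if i > 0 and j > 0 and D[i][j] == D[i-1][j-1] + (read[i-1] != ref[j-1]):
            if read[i-1] != ref[j-1]:
                ops.append(("sub", j - 1, ref[j-1], read[i-1]))
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i-1][j] + 1:
            ops.append(("ins", j, read[i-1]))
            i -= 1
        else:
            ops.append(("del", j - 1, ref[j-1]))
            j -= 1
    return ops[::-1]
```

The quadratic table is why aligning reads of several thousand base pairs calls for heuristics such as HeurAA's; the toy version is only practical for short amplicons.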

  11. Evaluation of endogenous control gene(s) for gene expression studies in human blood exposed to 60Co γ-rays ex vivo

    International Nuclear Information System (INIS)

    Vaiphei, S. Thangminlal; Keppen, Joshua; Nongrum, Saibadaiahun; Sharan, R.N.; Chaubey, R.C.; Kma, L.

    2015-01-01

In gene expression studies, it is critical to normalize data using a stably expressed endogenous control gene in order to obtain accurate and reliable results. However, we currently do not have a universally applied endogenous control gene for normalization of data for gene expression studies, particularly those involving 60Co γ-ray-exposed human blood samples. In this study, a comparative assessment of the gene expression of six widely used housekeeping endogenous control genes, namely 18S, ACTB, B2M, GAPDH, MT-ATP6 and CDKN1A, was undertaken for a range of 60Co γ-ray doses (0.5, 1.0, 2.0 and 4.0 Gy) at 8.4 Gy min⁻¹ at 0 and 24 h post-irradiation time intervals. Using the NormFinder algorithm, real-time PCR data obtained from six individuals (three males and three females) were analyzed with respect to the threshold cycle (Ct) value and abundance, ΔCt pair-wise comparison, and intra- and inter-group variability. GAPDH, either alone or in combination with 18S, was found to be the most suitable endogenous control gene and should be used in gene expression studies, especially those involving qPCR of γ-ray-exposed human blood samples. (author)

  12. Feasibility study for application of the compressed-sensing framework to interior computed tomography (ICT) for low-dose, high-accurate dental x-ray imaging

    Science.gov (United States)

    Je, U. K.; Cho, H. M.; Cho, H. S.; Park, Y. O.; Park, C. K.; Lim, H. W.; Kim, K. S.; Kim, G. A.; Park, S. Y.; Woo, T. H.; Choi, S. I.

    2016-02-01

In this paper, we propose a new/next-generation type of CT examination, the so-called Interior Computed Tomography (ICT), which may lead to a dose reduction to the patient outside the target region of interest (ROI) in dental x-ray imaging. Here, the x-ray beam from each projection position covers only a relatively small ROI containing the diagnostic target within the examined structure, which brings imaging benefits such as reduced scatter, lower system cost, and a lower imaging dose. We considered the compressed-sensing (CS) framework, rather than common filtered-backprojection (FBP)-based algorithms, for more accurate ICT reconstruction. We implemented a CS-based ICT algorithm and performed a systematic simulation to investigate its imaging characteristics. Simulation conditions of two ROI ratios of 0.28 and 0.14 between the target and the whole phantom sizes and four projection numbers of 360, 180, 90, and 45 were tested. We successfully reconstructed ICT images of substantially high image quality by using the CS framework even with few-view projection data, while still preserving sharp edges in the images.
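Iterative (non-FBP) reconstruction treats the scan as a linear system Ax = b, where rows of A are projection measurements, and refines the image estimate row by row; a compressed-sensing scheme like the paper's additionally enforces sparsity of the image. A minimal Kaczmarz row-action sketch, with the sparsity term omitted and all names ours:

```python
def kaczmarz(A, b, x, sweeps=200):
    """Kaczmarz iteration for a consistent system Ax = b: project the
    current estimate onto each measurement row's hyperplane in turn.
    Row-action methods of this kind are the backbone of iterative CT
    reconstruction (a CS method would add a sparsity step per sweep)."""
    for _ in range(sweeps):
        for ai, bi in zip(A, b):
            norm2 = sum(a * a for a in ai)
            resid = (bi - sum(a * xj for a, xj in zip(ai, x))) / norm2
            x = [xj + resid * a for xj, a in zip(x, ai)]
    return x
```

For few-view ICT the system is underdetermined, which is exactly where the CS prior does the work; this sketch only shows the projection step it is built on.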

  13. The accurate particle tracer code

    Science.gov (United States)

    Wang, Yulei; Liu, Jian; Qin, Hong; Yu, Zhi; Yao, Yicun

    2017-11-01

The Accurate Particle Tracer (APT) code is designed for systematic large-scale applications of geometric algorithms for particle dynamical simulations. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and nonlinear problems. To provide a flexible and convenient I/O interface, the Lua and HDF5 libraries are used. Following a three-step procedure, users can efficiently extend the libraries of electromagnetic configurations, external non-electromagnetic forces, particle pushers, and initialization approaches by use of the extendible module. APT has been used in simulations of key physical problems, such as runaway electrons in tokamaks and energetic particles in the Van Allen belt. As an important realization, the APT-SW version has been successfully deployed on the world's fastest computer, the Sunway TaihuLight supercomputer, by supporting the master-slave architecture of the Sunway many-core processors. Based on large-scale simulations of a runaway beam under ITER tokamak parameters, it is revealed that the magnetic ripple field can disperse the pitch-angle distribution significantly and, at the same time, improve the confinement of the energetic runaway beam.

  14. Selection and validation of a set of reliable reference genes for quantitative sod gene expression analysis in C. elegans

    Directory of Open Access Journals (Sweden)

    Vandesompele Jo

    2008-01-01

Full Text Available Abstract Background In the nematode Caenorhabditis elegans the conserved Ins/IGF-1 signaling pathway regulates many biological processes including life span, stress response, dauer diapause and metabolism. Detection of differentially expressed genes may contribute to a better understanding of the mechanism by which the Ins/IGF-1 signaling pathway regulates these processes. Appropriate normalization is an essential prerequisite for obtaining accurate and reproducible quantification of gene expression levels. The aim of this study was to establish a reliable set of reference genes for gene expression analysis in C. elegans. Results Real-time quantitative PCR was used to evaluate the expression stability of 12 candidate reference genes (act-1, ama-1, cdc-42, csq-1, eif-3.C, mdh-1, gpd-2, pmp-3, tba-1, Y45F10D.4, rgs-6 and unc-16) in wild-type, three Ins/IGF-1 pathway mutants, dauers and L3 stage larvae. After geNorm analysis, cdc-42, pmp-3 and Y45F10D.4 showed the most stable expression pattern and were used to normalize the expression levels of five sod genes. Significant differences in mRNA levels were observed for sod-1 and sod-3 in daf-2 relative to wild-type animals, whereas in dauers sod-1, sod-3, sod-4 and sod-5 are differentially expressed relative to third stage larvae. Conclusion Our findings emphasize the importance of accurate normalization using stably expressed reference genes. The methodology used in this study is generally applicable to reliably quantify gene expression levels in the nematode C. elegans using quantitative PCR.
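The geNorm stability measure used above is easy to state: for each candidate gene, average the standard deviation of its pairwise log2 expression ratios against every other candidate; the lowest-M genes (here cdc-42, pmp-3 and Y45F10D.4) are the most stable. A compact sketch, assuming linear-scale relative expression values as input (function name ours):

```python
import math
import statistics

def genorm_m(expr):
    """geNorm expression-stability measure M per candidate gene.
    `expr` maps gene -> list of relative expression values (linear
    scale) across the same samples; lower M means more stable."""
    genes = list(expr)
    m = {}
    for g in genes:
        sds = []
        for h in genes:
            if h == g:
                continue
            # log2 ratio of g against h in every sample
            ratios = [math.log2(a / b) for a, b in zip(expr[g], expr[h])]
            sds.append(statistics.stdev(ratios))
        m[g] = sum(sds) / len(sds)
    return m
```

Two genes whose ratio is constant across samples (e.g. both doubling together) both get a zero contribution from each other, which is why geNorm always reports stability relative to the other candidates rather than in absolute terms.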

  15. Evaluation of endogenous control genes for gene expression studies across multiple tissues and in the specific sets of fat- and muscle-type samples of the pig.

    Science.gov (United States)

    Gu, Y R; Li, M Z; Zhang, K; Chen, L; Jiang, A A; Wang, J Y; Li, X W

    2011-08-01

To normalize a set of quantitative real-time PCR (q-PCR) data, it is essential to determine an optimal number/set of housekeeping genes, as the abundance of housekeeping genes can vary across tissues or cells during different developmental stages, or even under certain environmental conditions. In this study, of the 20 commonly used endogenous control genes, 13, 18 and 17 genes exhibited credible stability in 56 different tissues, 10 types of adipose tissue and five types of muscle tissue, respectively. Our analysis clearly showed that three optimal housekeeping genes are adequate for an accurate normalization, which correlated well with the theoretical optimal number (r ≥ 0.94). In terms of economic and experimental feasibility, we recommend the use of the three most stable housekeeping genes for calculating the normalization factor. Based on our results, the three most stable housekeeping genes in all analysed samples (TOP2B, HSPCB and YWHAZ) are recommended for accurate normalization of q-PCR data. We also suggest that two different sets of housekeeping genes are appropriate for 10 types of adipose tissue (the HSPCB, ALDOA and GAPDH genes) and five types of muscle tissue (the TOP2B, HSPCB and YWHAZ genes), respectively. Our report will serve as a valuable reference for other studies aimed at measuring tissue-specific mRNA abundance in porcine samples. © 2011 Blackwell Verlag GmbH.
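The normalization factor recommended here is conventionally the geometric mean of the chosen reference genes' expression values (as in geNorm), and a target gene's level is then divided by that factor. A minimal sketch, with names ours:

```python
import math

def normalization_factor(ref_expr):
    """geNorm-style normalization factor for one sample: the geometric
    mean of the relative expression values of the chosen reference
    genes (e.g. TOP2B, HSPCB and YWHAZ in the pig study)."""
    logs = [math.log(v) for v in ref_expr]
    return math.exp(sum(logs) / len(logs))

def normalize(target_expr, ref_expr):
    """Target gene expression divided by the sample's normalization factor."""
    return target_expr / normalization_factor(ref_expr)
```

The geometric mean is used rather than the arithmetic mean so that a single highly abundant reference gene cannot dominate the factor.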

  16. Computational methods to dissect cis-regulatory transcriptional ...

    Indian Academy of Sciences (India)

The formation of diverse cell types from an invariant set of genes is governed by biochemical and molecular processes that regulate gene activity. A complete understanding of the regulatory mechanisms of gene expression is a major goal of genomics. Computational genomics is a rapidly emerging area for ...

  17. Characterization of 3D PET systems for accurate quantification of myocardial blood flow

    OpenAIRE

    Renaud, Jennifer M.; Yip, Kathy; Guimond, Jean; Trottier, Mikaël; Pibarot, Philippe; Turcotte, Éric; Maguire, Conor; Lalonde, Lucille; Gulenchyn, Karen; Farncombe, Troy; Wisenberg, Gerald; Moody, Jonathan; Lee, Benjamin; Port, Steven C.; Turkington, Timothy G

    2016-01-01

Three-dimensional (3D) mode imaging is the current standard for positron emission tomography-computed tomography (PET-CT) systems. Dynamic imaging for quantification of myocardial blood flow (MBF) with short-lived tracers, such as Rb-82 chloride (Rb-82), requires accuracy to be maintained over a wide range of isotope activities and scanner count-rates. We propose new performance standard measurements to characterize the dynamic range of PET systems for accurate quantitative...

  18. Matrix factorization reveals aging-specific co-expression gene modules in the fat and muscle tissues in nonhuman primates

    Science.gov (United States)

    Wang, Yongcui; Zhao, Weiling; Zhou, Xiaobo

    2016-10-01

Accurate identification of coherent transcriptional modules (subnetworks) in adipose and muscle tissues is important for revealing the related mechanisms and co-regulated pathways involved in the development of aging-related diseases. Here, we propose a systematic computational approach, called ICEGM, to Identify the Co-Expression Gene Modules through a novel mathematical framework of Higher-Order Generalized Singular Value Decomposition (HO-GSVD). ICEGM was applied to the adipose, heart and skeletal muscle tissues of old and young female African green vervet monkeys. Genes associated with the development of inflammation, cardiovascular and skeletal disorder diseases, and cancer were revealed by ICEGM. Meanwhile, genes in the ICEGM modules were also enriched in adipocytes, smooth muscle cells, cardiac myocytes, and immune cells. Comprehensive disease annotation and canonical pathway analysis indicated that immune cells, adipocytes, cardiomyocytes, and smooth muscle cells played a synergistic role in cardiac and physical functions in the aged monkeys by regulating the biological processes associated with metabolism, inflammation, and atherosclerosis. In conclusion, ICEGM provides an efficient, systematic framework for decoding co-expression gene modules in multiple tissues. Analysis of the genes in the ICEGM modules yielded important insights into the cooperative role of multiple tissues in the development of diseases.

  19. Toward an ultra-high resolution community climate system model for the BlueGene platform

    Energy Technology Data Exchange (ETDEWEB)

    Dennis, John M [Computer Science Section, National Center for Atmospheric Research, Boulder, CO (United States); Jacob, Robert [Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL (United States); Vertenstein, Mariana [Climate and Global Dynamics Division, National Center for Atmospheric Research, Boulder, CO (United States); Craig, Tony [Climate and Global Dynamics Division, National Center for Atmospheric Research, Boulder, CO (United States); Loy, Raymond [Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL (United States)

    2007-07-15

Global climate models need to simulate several small, regional-scale processes which affect the global circulation in order to accurately simulate the climate. This is particularly important in the ocean, where small-scale features such as oceanic eddies are currently represented with ad hoc parameterizations. There is also a need for higher resolution to provide climate predictions at small, regional scales. New high-performance computing platforms such as the IBM BlueGene can provide the necessary computational power to perform ultra-high resolution climate model integrations. We have begun to investigate the scaling of the individual components of the Community Climate System Model to prepare it for integrations on BlueGene and similar platforms. Our investigations show that it is possible to successfully utilize O(32K) processors. We describe the scalability of five models: the Parallel Ocean Program (POP), the Community Ice CodE (CICE), the Community Land Model (CLM), and the new CCSM sequential coupler (CPL7), which are components of the next generation Community Climate System Model (CCSM); as well as the High-Order Method Modeling Environment (HOMME), which is a dynamical core currently being evaluated within the Community Atmospheric Model. For our studies we concentrate on 1/10° resolution for the CICE, POP, and CLM models and 1/4° resolution for HOMME. The ability to simulate high resolutions on the massively parallel petascale systems that will dominate high-performance computing for the foreseeable future is essential to the advancement of climate science.

  20. Accurate and efficient spin integration for particle accelerators

    Directory of Open Access Journals (Sweden)

    Dan T. Abell

    2015-02-01

    Full Text Available Accurate spin tracking is a valuable tool for understanding spin dynamics in particle accelerators and can help improve the performance of an accelerator. In this paper, we present a detailed discussion of the integrators in the spin tracking code gpuSpinTrack. We have implemented orbital integrators based on drift-kick, bend-kick, and matrix-kick splits. On top of the orbital integrators, we have implemented various integrators for the spin motion. These integrators use quaternions and Romberg quadratures to accelerate both the computation and the convergence of spin rotations. We evaluate their performance and accuracy in quantitative detail for individual elements as well as for the entire RHIC lattice. We exploit the inherently data-parallel nature of spin tracking to accelerate our algorithms on graphics processing units.
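The quaternion approach mentioned above composes many small spin rotations as unit-quaternion products, which stays numerically well behaved over long tracking runs because renormalizing a quaternion is cheap and drift-free compared with re-orthogonalizing 3×3 matrices. A generic sketch of quaternion rotation (illustrative only, not gpuSpinTrack's implementation):

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians
    about the 3-vector `axis` (normalised here)."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_mul(q, r):
    """Hamilton product; composing rotations = multiplying quaternions."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def rotate(q, v):
    """Rotate spin 3-vector v by unit quaternion q, via q v q*."""
    w, x, y, z = q
    p = quat_mul(quat_mul(q, (0.0, *v)), (w, -x, -y, -z))
    return p[1:]
```

A lattice element's spin kick becomes one quaternion; tracking through the lattice is a running quaternion product applied to the spin vector at the end.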

  1. Rapid and accurate pyrosequencing of angiosperm plastid genomes

    Science.gov (United States)

    Moore, Michael J; Dhingra, Amit; Soltis, Pamela S; Shaw, Regina; Farmerie, William G; Folta, Kevin M; Soltis, Douglas E

    2006-01-01

    Background Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. More importantly, the high accuracy observed in the GS 20 plastid

  2. Rapid and accurate pyrosequencing of angiosperm plastid genomes

    Directory of Open Access Journals (Sweden)

    Farmerie William G

    2006-08-01

Full Text Available Abstract Background Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. More importantly, the high accuracy

  3. A third order accurate Lagrangian finite element scheme for the computation of generalized molecular stress function fluids

    DEFF Research Database (Denmark)

    Fasano, Andrea; Rasmussen, Henrik K.

    2017-01-01

A third order accurate, in time and space, finite element scheme for the numerical simulation of three-dimensional time-dependent flow of the molecular stress function type of fluids in a generalized formulation is presented. The scheme is an extension of the K-BKZ Lagrangian finite element me...

  4. Computed tomography of the pancreas

    International Nuclear Information System (INIS)

    Kolmannskog, F.; Kolbenstvedt, A.; Aakhus, T.; Bergan, A.; Fausa, O.; Elgjo, K.

    1980-01-01

The findings by computed tomography in 203 cases of suspected pancreatic tumours, pancreatitis or peripancreatic abnormalities were evaluated. The appearances of the normal and the diseased pancreas are described. Computed tomography is highly accurate in detecting pancreatic masses, but cannot differentiate neoplastic from inflammatory disease. The only reliable sign of pancreatic carcinoma is a focal mass in the pancreas together with liver metastases. When a pancreatic mass is revealed by computed tomography, CT-guided fine-needle aspiration biopsy of the pancreas is recommended. Thus the need for more invasive diagnostic procedures and explorative laparotomy may be avoided in some patients. (Auth.)

  5. HMM-Based Gene Annotation Methods

    Energy Technology Data Exchange (ETDEWEB)

    Haussler, David; Hughey, Richard; Karplus, Keven

    1999-09-20

Development of new statistical methods and computational tools to identify genes in human genomic DNA, and to provide clues to their functions by identifying features such as transcription factor binding sites, tissue-specific expression and splicing patterns, and remote homologies at the protein level with genes of known function.

  6. Identification of Reference Genes for Normalizing Quantitative Real-Time PCR in Urechis unicinctus

    Science.gov (United States)

    Bai, Yajiao; Zhou, Di; Wei, Maokai; Xie, Yueyang; Gao, Beibei; Qin, Zhenkui; Zhang, Zhifeng

    2018-06-01

Reverse transcription quantitative real-time PCR (RT-qPCR) has become one of the most important techniques for studying gene expression. A set of valid reference genes is essential for the accurate normalization of data. In this study, five candidate genes were analyzed with the geNorm, NormFinder, BestKeeper and ΔCt methods to identify the genes stably expressed in the echiuran Urechis unicinctus, an important commercial marine benthic worm, under abiotic (sulfide stress) and normal (adult tissues, embryos and larvae at different development stages) conditions. The comprehensive results indicated that the expression of TBP was the most stable under sulfide stress and during development, while the expression of EF-1-α was the most stable under sulfide stress and in various tissues. TBP and EF-1-α are recommended as a suitable reference gene combination to accurately normalize the expression of target genes under sulfide stress, and EF-1-α, TBP and TUB as a potential reference gene combination for normalizing the expression of target genes in different tissues. No suitable gene combination was obtained among these five candidate genes for normalizing the expression of target genes during the development of U. unicinctus. Our results provide valuable support for quantifying gene expression using RT-qPCR in U. unicinctus.

  7. Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology

    Science.gov (United States)

    2013-01-01

    Background The Gene Ontology (GO) facilitates the description of the action of gene products in a biological context. Many GO terms refer to chemical entities that participate in biological processes. To facilitate accurate and consistent systems-wide biological representation, it is necessary to integrate the chemical view of these entities with the biological view of GO functions and processes. We describe a collaborative effort between the GO and the Chemical Entities of Biological Interest (ChEBI) ontology developers to ensure that the representation of chemicals in the GO is both internally consistent and in alignment with the chemical expertise captured in ChEBI. Results We have examined and integrated the ChEBI structural hierarchy into the GO resource through computationally-assisted manual curation of both GO and ChEBI. Our work has resulted in the creation of computable definitions of GO terms that contain fully defined semantic relationships to corresponding chemical terms in ChEBI. Conclusions The set of logical definitions using both the GO and ChEBI has already been used to automate aspects of GO development and has the potential to allow the integration of data across the domains of biology and chemistry. These logical definitions are available as an extended version of the ontology from http://purl.obolibrary.org/obo/go/extensions/go-plus.owl. PMID:23895341

  8. Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood.

    Science.gov (United States)

    Wu, Yufeng

    2012-03-01

    Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution © 2011 The Society for the Study of Evolution.

  9. Tools for Accurate and Efficient Analysis of Complex Evolutionary Mechanisms in Microbial Genomes. Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Nakhleh, Luay

    2014-03-12

    I proposed to develop computationally efficient tools for accurate detection and reconstruction of microbes' complex evolutionary mechanisms, thus enabling rapid and accurate annotation, analysis and understanding of their genomes. To achieve this goal, I proposed to address three aspects. (1) Mathematical modeling. A major challenge facing the accurate detection of HGT is that of distinguishing between these events on the one hand and other events that have similar "effects" on the other. I proposed to develop a novel mathematical approach for distinguishing among these events. Further, I proposed to develop a set of novel optimization criteria for the evolutionary analysis of microbial genomes in the presence of these complex evolutionary events. (2) Algorithm design. In this aspect of the project, I proposed to develop an array of efficient and accurate algorithms for analyzing microbial genomes based on the formulated optimization criteria. Further, I proposed to test the viability of the criteria and the accuracy of the algorithms in an experimental setting using both synthetic as well as biological data. (3) Software development. I proposed the final outcome to be a suite of software tools which implements the mathematical models as well as the algorithms developed.

  10. Accurate artificial boundary conditions for the semi-discretized linear Schrödinger and heat equations on rectangular domains

    Science.gov (United States)

    Ji, Songsong; Yang, Yibo; Pang, Gang; Antoine, Xavier

    2018-01-01

    The aim of this paper is to design some accurate artificial boundary conditions for the semi-discretized linear Schrödinger and heat equations in rectangular domains. The Laplace transform in time and discrete Fourier transform in space are applied to obtain the Green's functions of the semi-discretized equations in unbounded domains with a single source. An algorithm is given to compute these Green's functions accurately through some recurrence relations. Furthermore, the finite-difference method is used to discretize the reduced problem with accurate boundary conditions. Numerical simulations are presented to illustrate the accuracy of our method in the case of the linear Schrödinger and heat equations. It is shown that the reflection at the corners is correctly eliminated.

  11. On the relation between gene flow theory and genetic gain

    Directory of Open Access Journals (Sweden)

    Woolliams John A

    2000-01-01

    Full Text Available Abstract In conventional gene flow theory the rate of genetic gain is calculated as the summed products of genetic selection differential and asymptotic proportion of genes deriving from sex-age groups. Recent studies have shown that asymptotic proportions of genes predicted from conventional gene flow theory may deviate considerably from true proportions. However, the rate of genetic gain predicted from conventional gene flow theory was accurate. The current note shows that the connection between asymptotic proportions of genes and rate of genetic gain that is embodied in conventional gene flow theory is invalid, even though genetic gain may be predicted correctly from it.
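    The summed-products rule from conventional gene flow theory described above can be written out directly; the two sex-age groups and their numbers below are purely illustrative.

```python
# Conventional gene flow theory: the rate of genetic gain is the sum over
# sex-age groups of (genetic selection differential x asymptotic proportion
# of genes deriving from that group). Hypothetical numbers for illustration.
groups = [
    {"name": "sires", "selection_differential": 2.0, "gene_proportion": 0.5},
    {"name": "dams",  "selection_differential": 0.8, "gene_proportion": 0.5},
]

rate_of_gain = sum(
    g["selection_differential"] * g["gene_proportion"] for g in groups
)
```

    The note's point is that this formula can predict gain correctly even when the asymptotic gene proportions themselves are predicted inaccurately, so the two quantities are not as tightly connected as conventional gene flow theory implies.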

  12. A simple, robust and efficient high-order accurate shock-capturing scheme for compressible flows: Towards minimalism

    Science.gov (United States)

    Ohwada, Taku; Shibata, Yuki; Kato, Takuma; Nakamura, Taichi

    2018-06-01

    Developed is a high-order accurate shock-capturing scheme for the compressible Euler/Navier-Stokes equations; the formal accuracy is 5th order in space and 4th order in time. The performance and efficiency of the scheme are validated in various numerical tests. The main ingredients of the scheme are nothing special; they are variants of the standard numerical flux, MUSCL, the usual Lagrange polynomial and the conventional Runge-Kutta method. The scheme can compute a boundary layer accurately with a rational resolution and capture a stationary contact discontinuity sharply without inner points. And yet it is endowed with high resistance against shock anomalies (carbuncle phenomenon, post-shock oscillations, etc.). A good balance between high robustness and low dissipation is achieved by blending three types of numerical fluxes according to the physical situation in an intuitively easy-to-understand way. The performance of the scheme is largely comparable to that of WENO5-Rusanov, while its computational cost is 30-40% less than that of the advanced scheme.

  13. Comparative phyloinformatics of virus genes at micro and macro levels in a distributed computing environment.

    Science.gov (United States)

    Singh, Dadabhai T; Trehan, Rahul; Schmidt, Bertil; Bretschneider, Timo

    2008-01-01

    Preparedness for a possible global pandemic caused by viruses such as the highly pathogenic influenza A subtype H5N1 has become a global priority. In particular, it is critical to monitor the appearance of any new emerging subtypes. Comparative phyloinformatics can be used to monitor, analyze, and possibly predict the evolution of viruses. However, in order to utilize the full functionality of available analysis packages for large-scale phyloinformatics studies, a team of computer scientists, biostatisticians and virologists is needed--a requirement which cannot be fulfilled in many cases. Furthermore, the time complexities of many algorithms involved leads to prohibitive runtimes on sequential computer platforms. This has so far hindered the use of comparative phyloinformatics as a commonly applied tool in this area. In this paper the graphical-oriented workflow design system called Quascade and its efficient usage for comparative phyloinformatics are presented. In particular, we focus on how this task can be effectively performed in a distributed computing environment. As a proof of concept, the designed workflows are used for the phylogenetic analysis of neuraminidase of H5N1 isolates (micro level) and influenza viruses (macro level). The results of this paper are hence twofold. Firstly, this paper demonstrates the usefulness of a graphical user interface system to design and execute complex distributed workflows for large-scale phyloinformatics studies of virus genes. Secondly, the analysis of neuraminidase on different levels of complexity provides valuable insights of this virus's tendency for geographical based clustering in the phylogenetic tree and also shows the importance of glycan sites in its molecular evolution. The current study demonstrates the efficiency and utility of workflow systems providing a biologist friendly approach to complex biological dataset analysis using high performance computing. 

  14. An accurate and efficient reliability-based design optimization using the second order reliability method and improved stability transformation method

    Science.gov (United States)

    Meng, Zeng; Yang, Dixiong; Zhou, Huanlin; Yu, Bo

    2018-05-01

    The first order reliability method has been extensively adopted for reliability-based design optimization (RBDO), but it shows inaccuracy in calculating the failure probability with highly nonlinear performance functions. Thus, the second order reliability method is required to evaluate the reliability accurately. However, its application to RBDO is quite challenging owing to the expensive computational cost incurred by the repeated reliability evaluations and Hessian calculations of the probabilistic constraints. In this article, a new improved stability transformation method is proposed to search for the most probable point efficiently, and the Hessian matrix is calculated by the symmetric rank-one update. The computational capability of the proposed method is illustrated and compared to the existing RBDO approaches through three mathematical and two engineering examples. The comparison results indicate that the proposed method is very efficient and accurate, providing an alternative tool for RBDO of engineering structures.
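    The symmetric rank-one (SR1) update used to avoid repeated explicit Hessian evaluation admits a compact sketch; the matrices below are illustrative and not tied to any particular performance function.

```python
def sr1_update(B, s, y, eps=1e-8):
    """Symmetric rank-one update of a Hessian approximation B:
    B+ = B + (y - B s)(y - B s)^T / ((y - B s)^T s),
    skipped when the denominator is near zero (the standard safeguard).
    s is the step taken, y the change in gradient along that step."""
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    r = [y[i] - Bs[i] for i in range(n)]  # residual y - B s
    denom = sum(r[i] * s[i] for i in range(n))
    if abs(denom) < eps:
        return [row[:] for row in B]  # skip the update
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)]
            for i in range(n)]

# Starting from the identity, one update enforces the secant condition
# B_new s = y while keeping B_new symmetric.
B = [[1.0, 0.0], [0.0, 1.0]]
s, y = [1.0, 0.0], [2.0, 1.0]
B_new = sr1_update(B, s, y)
```

    Unlike BFGS, the SR1 update does not force the approximation to stay positive definite, which is one reason it pairs well with second-order reliability evaluations of possibly nonconvex constraints.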

  15. Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

    Directory of Open Access Journals (Sweden)

    Qiusheng Kong

    Full Text Available Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.

  16. Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

    Science.gov (United States)

    Kong, Qiusheng; Yuan, Jingxian; Gao, Lingyun; Zhao, Liqiang; Cheng, Fei; Huang, Yuan; Bie, Zhilong

    2015-01-01

    Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.
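    The geNorm stability measure M behind rankings like the one above can be sketched briefly: for each gene, M is the average standard deviation of its log2 expression ratios with every other candidate. The expression values below are invented for illustration, and "geneX" is a deliberately noisy hypothetical candidate, not one of the study's genes.

```python
from itertools import combinations
from math import log2
from statistics import mean, stdev

# Invented relative expression quantities across five fruit samples;
# "geneX" is a hypothetical unstable candidate added for contrast.
expr = {
    "ClCAC":  [1.00, 1.05, 0.98, 1.02, 1.01],
    "ClACT":  [2.00, 2.08, 1.97, 2.05, 2.02],
    "ClTUA5": [0.50, 0.53, 0.49, 0.51, 0.50],
    "geneX":  [1.00, 1.60, 0.70, 1.90, 0.60],
}

def genorm_m(expr):
    """geNorm stability measure M: the average standard deviation of a
    gene's log2 expression ratios with all other candidate genes.
    Lower M means more stable expression."""
    pair_sd = {g: [] for g in expr}
    for a, b in combinations(expr, 2):
        s = stdev(log2(x / y) for x, y in zip(expr[a], expr[b]))
        pair_sd[a].append(s)
        pair_sd[b].append(s)
    return {g: mean(v) for g, v in pair_sd.items()}

m = genorm_m(expr)
stable = sorted(m, key=m.get)  # lowest M (most stable) first
```

    Because M is built from ratios, two genes that are co-regulated score well together; this is why geNorm is usually cross-checked against a model-based method such as NormFinder, as done in the study.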

  17. Development of dual stream PCRTM-SOLAR for fast and accurate radiative transfer modeling in the cloudy atmosphere with solar radiation

    Science.gov (United States)

    Yang, Q.; Liu, X.; Wu, W.; Kizer, S.; Baize, R. R.

    2016-12-01

    A fast and accurate radiative transfer model is the key to satellite data assimilation and observing system simulation experiments for numerical weather prediction and climate study applications. We proposed and developed a dual-stream PCRTM-SOLAR model which simulates radiative transfer in the cloudy atmosphere with solar radiation quickly and accurately. Multiple scattering by multiple layers of clouds/aerosols is included in the model. The root-mean-square errors are usually less than 5×10⁻⁴ mW/(cm² sr cm⁻¹). The computation speed is 3 to 4 orders of magnitude faster than the medium-speed correlated-k option of MODTRAN5. This model will enable a vast new set of scientific calculations that were previously limited due to the computational expense of available radiative transfer models.

  18. Meta-analysis of Cancer Gene Profiling Data.

    Science.gov (United States)

    Roy, Janine; Winter, Christof; Schroeder, Michael

    2016-01-01

    The simultaneous measurement of thousands of genes gives the opportunity to personalize and improve cancer therapy. In addition, the integration of meta-data such as protein-protein interaction (PPI) information into the analyses helps in the identification and prioritization of genes from these screens. Here, we describe a computational approach that identifies genes prognostic for outcome by combining gene profiling data from any source with a network of known relationships between genes.
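    One common way to combine profiling data with a PPI network, shown here only as a hedged illustration of the idea (the chapter's exact protocol may differ), is to let each gene's prognostic score mix its own outcome association with the scores of its network neighbours:

```python
# A NetRank-style score propagation (an illustrative assumption, not
# necessarily this chapter's exact method): d balances a gene's own
# outcome association against the signal from its network neighbours.
def netrank(corr, edges, d=0.5, iters=50):
    """corr:  gene -> outcome association from the profiling data
    edges: gene -> list of neighbouring genes in the PPI network"""
    scores = dict(corr)
    for _ in range(iters):
        scores = {
            g: (1 - d) * corr[g]
               + d * sum(scores[n] / max(len(edges.get(n, [])), 1)
                         for n in edges.get(g, []))
            for g in corr
        }
    return scores

# Tiny hypothetical network: gene A is strongly associated with outcome
# and connected to B; C hangs off B.
corr = {"A": 0.9, "B": 0.2, "C": 0.1}
edges = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = netrank(corr, edges)
ranked = sorted(scores, key=scores.get, reverse=True)
```

    With enough iterations the scores converge, and a weakly measured gene can still rank highly if its interaction partners carry strong prognostic signal, which is the point of integrating PPI meta-data.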

  19. Computational Science at the Argonne Leadership Computing Facility

    Science.gov (United States)

    Romero, Nichols

    2014-03-01

    The goal of the Argonne Leadership Computing Facility (ALCF) is to extend the frontiers of science by solving problems that require innovative approaches and the largest-scale computing systems. ALCF's most powerful computer - Mira, an IBM Blue Gene/Q system - has nearly one million cores. How does one program such systems? What software tools are available? Which scientific and engineering applications are able to utilize such levels of parallelism? This talk will address these questions and describe a sampling of projects that are using ALCF systems in their research, including ones in nanoscience, materials science, and chemistry. Finally, the ways to gain access to ALCF resources will be presented. This research used resources of the Argonne Leadership Computing Facility at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357.

  20. Evaluation of Reference Genes for Real-Time Quantitative PCR Analysis in Larvae of Spodoptera litura Exposed to Azadirachtin Stress Conditions

    Directory of Open Access Journals (Sweden)

    Benshui Shu

    2018-04-01

    Full Text Available Azadirachtin is an efficient and broad-spectrum botanical insecticide used against more than 150 kinds of agricultural pests, with mortality, antifeedant and growth-regulating effects. Real-time quantitative polymerase chain reaction (RT-qPCR) is a powerful tool to analyze gene expression levels and investigate the mechanism of azadirachtin at the transcriptional level; however, ideal reference genes are needed to normalize the expression profiling of target genes. In the present study, fragments of eight candidate reference genes were cloned and identified from the pest Spodoptera litura. In addition, the expression stability of these genes in different samples from larvae of control and azadirachtin treatments was evaluated with the computational methods NormFinder, BestKeeper, Delta CT, geNorm, and RefFinder. According to our results, two reference genes should be the optimal number for RT-qPCR analysis. Furthermore, the best reference genes for the different samples were as follows: EF-1α and EF2 for cuticle, β-Tubulin and RPL7A for fat body, EF2 and Actin for midgut, EF2 and RPL13A for whole larvae, and RPL13A and RPL7A for all samples. Our results establish a reliable normalization for RT-qPCR experiments in S. litura and make the data more accurate for analysis of the mechanism of azadirachtin.

  1. Evaluation of Reference Genes for Real-Time Quantitative PCR Analysis in Larvae of Spodoptera litura Exposed to Azadirachtin Stress Conditions.

    Science.gov (United States)

    Shu, Benshui; Zhang, Jingjing; Cui, Gaofeng; Sun, Ranran; Sethuraman, Veeran; Yi, Xin; Zhong, Guohua

    2018-01-01

    Azadirachtin is an efficient and broad-spectrum botanical insecticide used against more than 150 kinds of agricultural pests, with mortality, antifeedant and growth-regulating effects. Real-time quantitative polymerase chain reaction (RT-qPCR) is a powerful tool to analyze gene expression levels and investigate the mechanism of azadirachtin at the transcriptional level; however, ideal reference genes are needed to normalize the expression profiling of target genes. In the present study, fragments of eight candidate reference genes were cloned and identified from the pest Spodoptera litura. In addition, the expression stability of these genes in different samples from larvae of control and azadirachtin treatments was evaluated with the computational methods NormFinder, BestKeeper, Delta CT, geNorm, and RefFinder. According to our results, two reference genes should be the optimal number for RT-qPCR analysis. Furthermore, the best reference genes for the different samples were as follows: EF-1α and EF2 for cuticle, β-Tubulin and RPL7A for fat body, EF2 and Actin for midgut, EF2 and RPL13A for whole larvae, and RPL13A and RPL7A for all samples. Our results establish a reliable normalization for RT-qPCR experiments in S. litura and make the data more accurate for analysis of the mechanism of azadirachtin.

  2. Evaluation of endogenous control gene(s) for gene expression studies in human blood exposed to 60Co γ-rays ex vivo.

    Science.gov (United States)

    Vaiphei, S Thangminlal; Keppen, Joshua; Nongrum, Saibadaiahun; Chaubey, R C; Kma, L; Sharan, R N

    2015-01-01

    In gene expression studies, it is critical to normalize data using a stably expressed endogenous control gene in order to obtain accurate and reliable results. However, we currently do not have a universally applied endogenous control gene for normalization of data for gene expression studies, particularly those involving (60)Co γ-ray-exposed human blood samples. In this study, a comparative assessment of the gene expression of six widely used housekeeping endogenous control genes, namely 18S, ACTB, B2M, GAPDH, MT-ATP6 and CDKN1A, was undertaken for a range of (60)Co γ-ray doses (0.5, 1.0, 2.0 and 4.0 Gy) at 8.4 Gy min(-1) at 0 and 24 h post-irradiation time intervals. Using the NormFinder algorithm, real-time PCR data obtained from six individuals (three males and three females) were analyzed with respect to the threshold cycle (Ct) value and abundance, ΔCt pair-wise comparison, intra- and inter-group variability assessments, etc. GAPDH, either alone or in combination with 18S, was found to be the most suitable endogenous control gene and should be used in gene expression studies, especially those involving qPCR of γ-ray-exposed human blood samples. © The Author 2014. Published by Oxford University Press on behalf of The Japan Radiation Research Society and Japanese Society for Radiation Oncology.

  3. Cerebral fat embolism: Use of MR spectroscopy for accurate diagnosis

    Directory of Open Access Journals (Sweden)

    Laxmi Kokatnur

    2015-01-01

    Full Text Available Cerebral fat embolism (CFE is an uncommon but serious complication following orthopedic procedures. It usually presents with altered mental status, and can be a part of fat embolism syndrome (FES if associated with cutaneous and respiratory manifestations. Because of the presence of other common factors affecting the mental status, particularly in the postoperative period, the diagnosis of CFE can be challenging. Magnetic resonance imaging (MRI of brain typically shows multiple lesions distributed predominantly in the subcortical region, which appear as hyperintense lesions on T2 and diffusion weighted images. Although the location offers a clue, the MRI findings are not specific for CFE. Watershed infarcts, hypoxic encephalopathy, disseminated infections, demyelinating disorders, diffuse axonal injury can also show similar changes on MRI of brain. The presence of fat in these hyperintense lesions, identified by MR spectroscopy as raised lipid peaks will help in accurate diagnosis of CFE. Normal brain tissue or conditions producing similar MRI changes will not show any lipid peak on MR spectroscopy. We present a case of CFE initially misdiagnosed as brain stem stroke based on clinical presentation and cranial computed tomography (CT scan, and later, MR spectroscopy elucidated the accurate diagnosis.

  4. Generating Facial Expressions Using an Anatomically Accurate Biomechanical Model.

    Science.gov (United States)

    Wu, Tim; Hung, Alice; Mithraratne, Kumar

    2014-11-01

    This paper presents a computational framework for modelling the biomechanics of human facial expressions. A detailed high-order (Cubic-Hermite) finite element model of the human head was constructed using anatomical data segmented from magnetic resonance images. The model includes a superficial soft-tissue continuum consisting of skin, the subcutaneous layer and the superficial Musculo-Aponeurotic system. Embedded within this continuum mesh, are 20 pairs of facial muscles which drive facial expressions. These muscles were treated as transversely-isotropic and their anatomical geometries and fibre orientations were accurately depicted. In order to capture the relative composition of muscles and fat, material heterogeneity was also introduced into the model. Complex contact interactions between the lips, eyelids, and between superficial soft tissue continuum and deep rigid skeletal bones were also computed. In addition, this paper investigates the impact of incorporating material heterogeneity and contact interactions, which are often neglected in similar studies. Four facial expressions were simulated using the developed model and the results were compared with surface data obtained from a 3D structured-light scanner. Predicted expressions showed good agreement with the experimental data.

  5. Accurate metacognition for visual sensory memory representations.

    Science.gov (United States)

    Vandenbroucke, Annelinde R E; Sligte, Ilja G; Barrett, Adam B; Seth, Anil K; Fahrenfort, Johannes J; Lamme, Victor A F

    2014-04-01

    The capacity to attend to multiple objects in the visual field is limited. However, introspectively, people feel that they see the whole visual world at once. Some scholars suggest that this introspective feeling is based on short-lived sensory memory representations, whereas others argue that the feeling of seeing more than can be attended to is illusory. Here, we investigated this phenomenon by combining objective memory performance with subjective confidence ratings during a change-detection task. This allowed us to compute a measure of metacognition--the degree of knowledge that subjects have about the correctness of their decisions--for different stages of memory. We show that subjects store more objects in sensory memory than they can attend to but, at the same time, have similar metacognition for sensory memory and working memory representations. This suggests that these subjective impressions are not an illusion but accurate reflections of the richness of visual perception.
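    As a hedged illustration of how such a metacognition score can be computed (the paper's actual measure may differ), the type-2 AUROC asks how well confidence ratings discriminate correct from incorrect decisions:

```python
# Type-2 AUROC: a simple metacognition index (an illustrative measure,
# not necessarily the paper's exact one). It is the probability that a
# randomly chosen correct trial received a higher confidence rating than
# a randomly chosen incorrect trial (ties count one half).
def type2_auroc(confidence, correct):
    hits = [c for c, ok in zip(confidence, correct) if ok]
    misses = [c for c, ok in zip(confidence, correct) if not ok]
    total = 0.0
    for h in hits:
        for m in misses:
            total += 1.0 if h > m else 0.5 if h == m else 0.0
    return total / (len(hits) * len(misses))

# Hypothetical change-detection trials: confidence on a 1-4 scale,
# correctness as 0/1.
conf = [4, 3, 4, 2, 1, 2, 3, 1]
acc = [1, 1, 1, 0, 0, 1, 0, 0]
auc = type2_auroc(conf, acc)  # 0.5 = chance, 1.0 = perfect metacognition
```

    Computing this index separately for sensory-memory and working-memory conditions is one way to compare metacognitive access across memory stages, as the study does with its own measure.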

  6. TSaT-MUSIC: a novel algorithm for rapid and accurate ultrasonic 3D localization

    Science.gov (United States)

    Mizutani, Kyohei; Ito, Toshio; Sugimoto, Masanori; Hashizume, Hiromichi

    2011-12-01

    We describe a fast and accurate indoor localization technique using the multiple signal classification (MUSIC) algorithm. The MUSIC algorithm is known as a high-resolution method for estimating directions of arrival (DOAs) or propagation delays. A critical problem in using the MUSIC algorithm for localization is its computational complexity. Therefore, we devised a novel algorithm called Time Space additional Temporal-MUSIC, which can rapidly and simultaneously identify DOAs and delays of multicarrier ultrasonic waves from transmitters. Computer simulations have proved that the computation time of the proposed algorithm is almost constant in spite of increasing numbers of incoming waves and is faster than that of existing methods based on the MUSIC algorithm. The robustness of the proposed algorithm is discussed through simulations. Experiments in real environments showed that the standard deviation of position estimations in 3D space is less than 10 mm, which is satisfactory for indoor localization.

  7. Synthetic analog computation in living cells.

    Science.gov (United States)

    Daniel, Ramiz; Rubens, Jacob R; Sarpeshkar, Rahul; Lu, Timothy K

    2013-05-30

    A central goal of synthetic biology is to achieve multi-signal integration and processing in living cells for diagnostic, therapeutic and biotechnology applications. Digital logic has been used to build small-scale circuits, but other frameworks may be needed for efficient computation in the resource-limited environments of cells. Here we demonstrate that synthetic analog gene circuits can be engineered to execute sophisticated computational functions in living cells using just three transcription factors. Such synthetic analog gene circuits exploit feedback to implement logarithmically linear sensing, addition, ratiometric and power-law computations. The circuits exhibit Weber's law behaviour as in natural biological systems, operate over a wide dynamic range of up to four orders of magnitude and can be designed to have tunable transfer functions. Our circuits can be composed to implement higher-order functions that are well described by both intricate biochemical models and simple mathematical functions. By exploiting analog building-block functions that are already naturally present in cells, this approach efficiently implements arithmetic operations and complex functions in the logarithmic domain. Such circuits may lead to new applications for synthetic biology and biotechnology that require complex computations with limited parts, need wide-dynamic-range biosensing or would benefit from the fine control of gene expression.

  8. Ranked retrieval of segmented nuclei for objective assessment of cancer gene repositioning

    Directory of Open Access Journals (Sweden)

    Cukierski William J

    2012-09-01

    Full Text Available Abstract Background Correct segmentation is critical to many applications within automated microscopy image analysis. Despite the availability of advanced segmentation algorithms, variations in cell morphology, sample preparation, and acquisition settings often lead to segmentation errors. This manuscript introduces a ranked-retrieval approach using logistic regression to automate selection of accurately segmented nuclei from a set of candidate segmentations. The methodology is validated on an application of spatial gene repositioning in breast cancer cell nuclei. Gene repositioning is analyzed in patient tissue sections by labeling sequences with fluorescence in situ hybridization (FISH, followed by measurement of the relative position of each gene from the nuclear center to the nuclear periphery. This technique requires hundreds of well-segmented nuclei per sample to achieve statistical significance. Although the tissue samples in this study contain a surplus of available nuclei, automatic identification of the well-segmented subset remains a challenging task. Results Logistic regression was applied to features extracted from candidate segmented nuclei, including nuclear shape, texture, context, and gene copy number, in order to rank objects according to the likelihood of being an accurately segmented nucleus. The method was demonstrated on a tissue microarray dataset of 43 breast cancer patients, comprising approximately 40,000 imaged nuclei in which the HES5 and FRA2 genes were labeled with FISH probes. Three trained reviewers independently classified nuclei into three classes of segmentation accuracy. In man vs. machine studies, the automated method outperformed the inter-observer agreement between reviewers, as measured by area under the receiver operating characteristic (ROC) curve. Robustness of gene position measurements to boundary inaccuracies was demonstrated by comparing 1086 manually and automatically segmented nuclei.
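    The ranked-retrieval step reduces to scoring each candidate segmentation with a fitted logistic model and sorting. The feature names and weights below are hypothetical stand-ins for the paper's nuclear shape, texture, context and copy-number features.

```python
from math import exp

# Hypothetical hand-set coefficients standing in for the fitted logistic
# regression; the real model was trained on reviewer-labeled nuclei.
WEIGHTS = {"solidity": 4.0, "texture_contrast": -1.5, "copy_number_ok": 2.0}
BIAS = -2.0

def segmentation_quality(nucleus):
    """Logistic score: estimated probability that a candidate is an
    accurately segmented nucleus."""
    z = BIAS + sum(w * nucleus[k] for k, w in WEIGHTS.items())
    return 1.0 / (1.0 + exp(-z))

candidates = [
    {"id": "n1", "solidity": 0.95, "texture_contrast": 0.2, "copy_number_ok": 1.0},
    {"id": "n2", "solidity": 0.40, "texture_contrast": 0.9, "copy_number_ok": 0.0},
]
# Rank candidates by likelihood of correct segmentation; downstream gene
# position measurements then use only the top of the ranked list.
ranked = sorted(candidates, key=segmentation_quality, reverse=True)
```

    Ranking rather than hard-thresholding is what lets the pipeline take exactly as many well-segmented nuclei per sample as the statistics require.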

  9. Computed tomography of human joints and radioactive waste drums

    International Nuclear Information System (INIS)

    Martz, Harry E.; Roberson, G. Patrick; Hollerbach, Karin; Logan, Clinton M.; Ashby, Elaine; Bernardi, Richard

    1999-01-01

    X- and gamma-ray imaging techniques in nondestructive evaluation (NDE) and assay (NDA) have seen increasing use in an array of industrial, environmental, military, and medical applications. Much of this growth in recent years is attributed to the rapid development of computed tomography (CT) and the use of NDE throughout the life-cycle of a product. Two diverse examples of CT are discussed. (1) Our computational approach to normal joint kinematics and prosthetic joint analysis offers an opportunity to evaluate and improve prosthetic human joint replacements before they are manufactured or surgically implanted. Computed tomography data from scanned joints are segmented, resulting in the identification of bone and other tissues of interest, with emphasis on the articular surfaces. (2) We are developing NDE and NDA techniques to analyze closed waste drums accurately and quantitatively. Active and passive computed tomography (A and PCT) is a comprehensive and accurate gamma-ray NDA method that can identify all detectable radioisotopes present in a container and measure their radioactivity.

  10. Identification of a chicken (Gallus gallus) endogenous reference gene (Actb) and its application in meat adulteration.

    Science.gov (United States)

    Xiang, Wenjin; Shang, Ying; Wang, Qin; Xu, Yuancong; Zhu, Pengyu; Huang, Kunlun; Xu, Wentao

    2017-11-01

    The genes commonly used to determine meat species are mainly mitochondrial, but the copy numbers of such genes are high, meaning they cannot be accurately quantified. In this paper, for the first time, the chromosomal gene Actb was selected as an endogenous reference gene for chicken species. It was assayed in four different chicken varieties and 16 other species using both qualitative and quantitative PCR. No amplification of the Actb gene was found in species other than chicken and no allelic variations were detected in chicken. Southern blot and digital PCR confirmed the Actb gene was present as a single copy in the chicken genome. The quantitative detection limit was 10 pg of DNA, which is equivalent to eight copies. All experiments indicated that the Actb gene is a useful endogenous reference gene for chicken, and provides a convenient and accurate approach for detection of chicken in feed and food. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Identification and validation of reference genes for qRT-PCR studies of the obligate aphid pathogenic fungus Pandora neoaphidis during different developmental stages.

    Science.gov (United States)

    Zhang, Shutao; Chen, Chun; Xie, Tingna; Ye, Sudan

    2017-01-01

    The selection of stable reference genes is a critical step for the accurate quantification of gene expression. To identify and validate the reference genes in Pandora neoaphidis, an obligate aphid pathogenic fungus, the expression of 13 classical candidate reference genes was evaluated by quantitative real-time reverse transcriptase polymerase chain reaction (qPCR) at four developmental stages (conidia, conidia with germ tubes, short hyphae and elongated hyphae). Four statistical algorithms, including geNorm, NormFinder, BestKeeper and the Delta Ct method, were used to rank putative reference genes according to their expression stability and to indicate the best reference gene or combination of reference genes for accurate normalization. The comprehensive ranking analysis revealed that ACT1 and 18S were the most stably expressed genes throughout the developmental stages. To further validate the suitability of the reference genes identified in this study, the expression of the cell division control protein 25 (CDC25) and Chitinase 1 (CHI1) genes was used to confirm the validated candidate reference genes. Our study presents the first systematic study of reference gene selection for P. neoaphidis and provides guidelines to obtain more accurate qPCR results for future developmental efforts.
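
    Of the four algorithms named above, the comparative Delta Ct method is the simplest to sketch: each candidate gene is scored by the average standard deviation of its Ct difference against every other candidate across samples, and a lower score means more stable expression. The Ct values below are invented illustration data, not measurements from the study.

```python
# Minimal sketch of the comparative delta-Ct stability ranking:
# for each candidate reference gene, take the standard deviation of its
# Ct difference against every other candidate across samples, then
# average those SDs. Lower score = more stable expression.

from statistics import stdev

ct = {  # gene -> Ct value in each of four samples (made-up data)
    "ACT1":  [18.1, 18.3, 18.0, 18.2],
    "18S":   [10.5, 10.6, 10.4, 10.6],
    "GAPDH": [20.0, 21.5, 19.2, 22.0],
}

def stability(gene):
    sds = []
    for other in ct:
        if other == gene:
            continue
        deltas = [a - b for a, b in zip(ct[gene], ct[other])]
        sds.append(stdev(deltas))
    return sum(sds) / len(sds)

ranking = sorted(ct, key=stability)
print(ranking)  # most stable candidates first
```

In this toy data the two genes that track each other closely across samples rank ahead of the erratic one, which is exactly the behaviour the method exploits.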

  12. A review for detecting gene-gene interactions using machine learning methods in genetic epidemiology.

    Science.gov (United States)

    Koo, Ching Lee; Liew, Mei Jing; Mohamad, Mohd Saberi; Salleh, Abdul Hakim Mohamed

    2013-01-01

    The greatest statistical and computational challenge in genetic epidemiology today is to identify and characterize the genes that interact with other genes and with environmental factors to influence complex multifactorial diseases. These gene-gene interactions are also termed epistasis, a phenomenon that cannot be resolved by traditional statistical methods owing to the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely neural networks (NNs), support vector machines (SVMs), and random forests (RFs), have been used to address such problems by identifying susceptibility genes in common and multifactorial diseases. This paper gives an overview of these machine learning methods, describing the methodology of each and its application in detecting gene-gene and gene-environment interactions. Lastly, it discusses each machine learning method and presents the strengths and weaknesses of each in detecting gene-gene interactions in complex human disease.
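
    A toy example makes the core difficulty concrete: under a purely epistatic model such as XOR, each locus is marginally uninformative, so single-marker tests score zero and only a joint scan over pairs recovers the interacting loci. The sketch below uses binary genotypes and mutual information purely for illustration; it is not one of the machine learning methods the review covers.

```python
# Toy illustration of epistasis: the phenotype is the XOR of SNP0 and
# SNP1, so each SNP is marginally uninformative and only a pairwise
# scan detects the interacting pair.

from itertools import combinations, product
from math import log2
from collections import Counter

# 8 individuals: every combination of three binary SNPs.
genotypes = list(product([0, 1], repeat=3))
phenotype = [g[0] ^ g[1] for g in genotypes]  # pure interaction effect

def mutual_info(keys, y):
    """Mutual information (bits) between a genotype key and phenotype."""
    n = len(y)
    joint = Counter(zip(keys, y))
    px, py = Counter(keys), Counter(y)
    return sum(c / n * log2((c / n) / (px[k] / n * py[v] / n))
               for (k, v), c in joint.items())

# Single-SNP scan: every SNP carries zero information about phenotype.
single = {i: mutual_info([g[i] for g in genotypes], phenotype)
          for i in range(3)}

# Pairwise scan: the interacting pair stands out.
pairs = {p: mutual_info([(g[p[0]], g[p[1]]) for g in genotypes], phenotype)
         for p in combinations(range(3), 2)}
best = max(pairs, key=pairs.get)
print(single, best)  # all single-SNP scores ~0.0; best pair is (0, 1)
```

Real epistasis searches face the same structure at genome scale, which is why exhaustive pairwise scans become infeasible and the reviewed machine learning methods are attractive.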

  13. A Review for Detecting Gene-Gene Interactions Using Machine Learning Methods in Genetic Epidemiology

    Directory of Open Access Journals (Sweden)

    Ching Lee Koo

    2013-01-01

    Full Text Available The greatest statistical and computational challenge in genetic epidemiology today is to identify and characterize the genes that interact with other genes and with environmental factors to influence complex multifactorial diseases. These gene-gene interactions are also termed epistasis, a phenomenon that cannot be resolved by traditional statistical methods owing to the high dimensionality of the data and the occurrence of multiple polymorphisms. Hence, several machine learning methods, namely neural networks (NNs), support vector machines (SVMs), and random forests (RFs), have been used to address such problems by identifying susceptibility genes in common and multifactorial diseases. This paper gives an overview of these machine learning methods, describing the methodology of each and its application in detecting gene-gene and gene-environment interactions. Lastly, it discusses each machine learning method and presents the strengths and weaknesses of each in detecting gene-gene interactions in complex human disease.

  14. A graph-search framework for associating gene identifiers with documents

    Directory of Open Access Journals (Sweden)

    Cohen William W

    2006-10-01

    Full Text Available Abstract Background One step in the model organism database curation process is to find, for each article, the identifier of every gene discussed in the article. We consider a relaxation of this problem suitable for semi-automated systems, in which each article is associated with a ranked list of possible gene identifiers, and experimentally compare methods for solving this geneId ranking problem. In addition to baseline approaches based on combining named entity recognition (NER) systems with a "soft dictionary" of gene synonyms, we evaluate a graph-based method which combines the outputs of multiple NER systems, as well as other sources of information, and a learning method for reranking the output of the graph-based method. Results We show that named entity recognition (NER) systems with similar F-measure performance can have significantly different performance when used with a soft dictionary for geneId ranking. The graph-based approach can outperform any of its component NER systems, even without learning, and learning can further improve the performance of the graph-based ranking approach. Conclusion The utility of a named entity recognition (NER) system for geneId finding may not be accurately predicted by its entity-level F1 performance, the most common performance measure. GeneId-ranking systems are best implemented by combining several NER systems. With appropriate combination methods, usefully accurate geneId-ranking systems can be constructed based on easily available resources, without resorting to problem-specific, engineered components.
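
    The paper's graph-based combiner aggregates multiple evidence sources; as a minimal illustration of why combining rankers helps at all, the sketch below fuses several hypothetical ranked geneId lists with reciprocal-rank fusion. This is an assumed, much simpler scheme than the graph method in the paper, and the identifiers are made up.

```python
# Reciprocal-rank fusion of several rankers' ordered geneId lists:
# identifiers ranked well by multiple systems rise above any single
# system's output. (Illustrative scheme, not the paper's algorithm.)

def rr_fusion(rankings, k=60):
    """rankings: list of ranked geneId lists, best candidate first."""
    scores = {}
    for ranking in rankings:
        for rank, gene_id in enumerate(ranking, start=1):
            scores[gene_id] = scores.get(gene_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

ner_a = ["FBgn001", "FBgn007", "FBgn003"]  # hypothetical NER outputs
ner_b = ["FBgn007", "FBgn001", "FBgn009"]
ner_c = ["FBgn007", "FBgn003", "FBgn001"]
print(rr_fusion([ner_a, ner_b, ner_c]))  # FBgn007 first: high in all three
```

The identifier placed first by two of the three rankers overtakes the one that any single ranker preferred, mirroring the paper's finding that combinations beat individual NER systems.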

  15. Interactive design computation : A case study on quantum design paradigm

    NARCIS (Netherlands)

    Feng, H.

    2013-01-01

    The ever-increasing complexity of design processes fosters novel design computation models to be employed in architectural research and design in order to facilitate accurate data processing and refined decision making. These computation models have enabled designers to work with complex geometry

  16. Towards accurate emergency response behavior

    International Nuclear Information System (INIS)

    Sargent, T.O.

    1981-01-01

    Nuclear reactor operator emergency response behavior has persisted as a training problem through lack of information. The industry needs an accurate definition of operator behavior in adverse stress conditions, and training methods which will produce the desired behavior. Newly assembled information from fifty years of research into human behavior in both high and low stress provides a more accurate definition of appropriate operator response, and supports training methods which will produce the needed control room behavior. The research indicates that operator response in emergencies is divided into two modes, conditioned behavior and knowledge based behavior. Methods which assure accurate conditioned behavior, and provide for the recovery of knowledge based behavior, are described in detail

  17. Development of gene diagnosis for diabetes and cholecystitis based on gene analysis of CCK-A receptor

    International Nuclear Information System (INIS)

    Kono, Akira

    1998-01-01

    The gene structures of the CCK-A type receptor in human, the rat and the mouse were investigated with the aim of clarifying whether aberrations of the gene are involved in the incidence of diabetes and cholecystitis. In fiscal year 1997, the normal structure of the gene and its accurate base sequence were analyzed using DNA fragments bound to ³²P-labelled cDNA of human CCKAR, originating from a leucocyte gene library. This gene contains about 2.2 × 10⁵ base pairs; the base sequence was completely determined and registered in the Japan DNA data bank (D85606). In addition, the genome structures and base sequences of mouse and rat CCKAR were analyzed and registered (D85605 and D50608, respectively). Differences in the base sequence of CCKAR among the species were found in the promoter region and the intron regions, suggesting that there might be differences in splicing among species. (M.N.)

  18. Quantum Monte Carlo: Faster, More Reliable, And More Accurate

    Science.gov (United States)

    Anderson, Amos Gerald

    2010-06-01

    The Schrodinger Equation has been available for about 83 years, but today we still strain to apply it accurately to molecules of interest. The difficulty is not theoretical in nature, but practical, since we are held back by a lack of sufficient computing power. Consequently, effort is applied to finding acceptable approximations to facilitate real-time solutions. In the meantime, computer technology has begun rapidly advancing and changing the way we think about efficient algorithms. For those who can reorganize their formulas to take advantage of these changes and thereby lift some approximations, incredible new opportunities await. Over the last decade, we have seen the emergence of a new kind of computer processor, the graphics card. Designed to accelerate computer games by optimizing for processor quantity instead of quality, graphics cards have become of sufficient quality to be useful to some scientists. In this thesis, we explore the first known use of a graphics card for computational chemistry by rewriting our Quantum Monte Carlo software into the requisite "data parallel" formalism. We find that, notwithstanding precision considerations, we are able to speed up our software by about a factor of 6. The success of a Quantum Monte Carlo calculation depends on more than just processing power. It also requires the scientist to carefully design the trial wavefunction used to guide simulated electrons. We have studied the use of Generalized Valence Bond wavefunctions to simply, and yet effectively, capture the essential static correlation in atoms and molecules. Furthermore, we have developed significantly improved two-particle correlation functions, designed with both flexibility and simplicity in mind, representing an effective and reliable way to add the necessary dynamic correlation. Lastly, we present our method for stabilizing the statistical nature of the calculation by manipulating configuration weights, thus facilitating efficient and robust calculations. Our

  19. Complex regulation of Hsf1-Skn7 activities by the catalytic subunits of PKA in Saccharomyces cerevisiae: experimental and computational evidences.

    Science.gov (United States)

    Pérez-Landero, Sergio; Sandoval-Motta, Santiago; Martínez-Anaya, Claudia; Yang, Runying; Folch-Mallol, Jorge Luis; Martínez, Luz María; Ventura, Larissa; Guillén-Navarro, Karina; Aldana-González, Maximino; Nieto-Sotelo, Jorge

    2015-07-27

    The cAMP-dependent protein kinase regulatory network (PKA-RN) regulates metabolism, memory, learning, development, and response to stress. Previous models of this network considered the catalytic subunits (CS) as a single entity, overlooking their functional individualities. Furthermore, PKA-RN dynamics are often measured through cAMP levels in nutrient-depleted cells shortly after being fed with glucose, dismissing downstream physiological processes. Here we show that temperature stress, along with deletion of PKA-RN genes, significantly affected HSE-dependent gene expression and the dynamics of the PKA-RN in cells growing in exponential phase. Our genetic analysis revealed complex regulatory interactions between the CS that influenced the inhibition of Hsf1/Skn7 transcription factors. Accordingly, we found new roles in growth control and stress response for Hsf1/Skn7 when PKA activity was low (cdc25Δ cells). Experimental results were used to propose an interaction scheme for the PKA-RN and to build an extension of a classic synchronous discrete modeling framework. Our computational model reproduced the experimental data and predicted complex interactions between the CS and the existence of a repressor of Hsf1/Skn7 that is activated by the CS. Additional genetic analysis identified Ssa1 and Ssa2 chaperones as such repressors. Further modeling of the new data foresaw a third repressor of Hsf1/Skn7, active only in the absence of Tpk2. By averaging the network state over all its attractors, a good quantitative agreement between computational and experimental results was obtained, as the averages reflected more accurately the population measurements. The assumption of PKA being one molecular entity has hindered the study of a wide range of behaviors. Additionally, the dynamics of HSE-dependent gene expression cannot be simulated accurately by considering the activity of single PKA-RN components (i.e., cAMP, individual CS, Bcy1, etc.). We show that the differential

  20. Validation of commonly used reference genes for sleep-related gene expression studies

    Directory of Open Access Journals (Sweden)

    Castro Rosa MRPS

    2009-05-01

    Full Text Available Abstract Background Sleep is a restorative process and is essential for maintenance of mental and physical health. In an attempt to understand the complexity of sleep, multidisciplinary strategies, including genetic approaches, have been applied to sleep research. Although quantitative real time PCR has been used in previous sleep-related gene expression studies, proper validation of reference genes is currently lacking. Thus, we examined the effect of total or paradoxical sleep deprivation (TSD or PSD) on the expression stability of the following frequently used reference genes in brain and blood: beta-actin (b-actin), beta-2-microglobulin (B2M), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), and hypoxanthine guanine phosphoribosyl transferase (HPRT). Results Neither TSD nor PSD affected the expression stability of any of the tested genes in either tissue, indicating that b-actin, B2M, GAPDH and HPRT are appropriate reference genes for sleep-related gene expression studies. In order to further verify these results, the relative expression of brain derived neurotrophic factor (BDNF) and glycerol-3-phosphate dehydrogenase 1 (GPD1) was evaluated in brain and blood, respectively. Normalization with each of the four reference genes produced a similar pattern of expression in control and sleep deprived rats, but subtle differences in the magnitude of expression fold change were observed, which might affect statistical significance. Conclusion This study demonstrated that sleep deprivation does not alter the expression stability of commonly used reference genes in brain and blood. Nonetheless, the use of multiple reference genes in quantitative RT-PCR is required for accurate results.

  1. MAGMA: generalized gene-set analysis of GWAS data.

    Science.gov (United States)

    de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle

    2015-04-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.

  2. Statistical physics of fracture: scientific discovery through high-performance computing

    International Nuclear Information System (INIS)

    Kumar, Phani; Nukala, V V; Simunovic, Srdan; Mills, Richard T

    2006-01-01

    The paper presents state-of-the-art algorithmic developments for simulating the fracture of disordered quasi-brittle materials using discrete lattice systems. Large scale simulations are often required to obtain accurate scaling laws; however, due to computational complexity, simulations using the traditional algorithms were limited to small system sizes. We have developed two algorithms: a multiple sparse Cholesky downdating scheme for simulating 2D random fuse model systems, and a block-circulant preconditioner for simulating 3D random fuse model systems. Using these algorithms, we were able to simulate fracture of the largest lattice system sizes to date (L = 1024 in 2D, and L = 64 in 3D) with extensive statistical sampling. Our recent simulations on 1024 processors of Cray-XT3 and IBM Blue Gene/L have further enabled us to explore fracture of 3D lattice systems of size L = 200, which is a significant computational achievement. These largest-ever numerical simulations have enhanced our understanding of the physics of fracture; in particular, we analyze damage localization and its deviation from percolation behavior, scaling laws for damage density, universality of the fracture strength distribution, the size effect on mean fracture strength, and finally the scaling of crack surface roughness

  3. Computed tomographic diagnosis of abdominal abscess in childhood

    International Nuclear Information System (INIS)

    Kuhn, J.P.; Berger, P.E.

    1980-01-01

    Twenty-eight children suspected clinically of having an abdominal abscess were examined by CT. Eighteen had gallium-67 citrate scans and 22 had ultrasound studies. Computed tomography was found to be the most accurate test for diagnosis and evaluation of an abscess, and the computed tomographic appearance of abscesses is illustrated. However, because of cost factors, radiation dose, and clinical considerations, computed tomography is not always the first modality of choice in evaluating a suspected abdominal abscess.

  4. Accurate ab initio vibrational energies of methyl chloride

    International Nuclear Information System (INIS)

    Owens, Alec; Yurchenko, Sergei N.; Yachmenev, Andrey; Tennyson, Jonathan; Thiel, Walter

    2015-01-01

    Two new nine-dimensional potential energy surfaces (PESs) have been generated using high-level ab initio theory for the two main isotopologues of methyl chloride, CH₃³⁵Cl and CH₃³⁷Cl. The respective PESs, CBS-35 HL and CBS-37 HL, are based on explicitly correlated coupled cluster calculations with extrapolation to the complete basis set (CBS) limit, and incorporate a range of higher-level (HL) additive energy corrections to account for core-valence electron correlation, higher-order coupled cluster terms, scalar relativistic effects, and diagonal Born-Oppenheimer corrections. Variational calculations of the vibrational energy levels were performed using the computer program TROVE, whose functionality has been extended to handle molecules of the form XY₃Z. Fully converged energies were obtained by means of a complete vibrational basis set extrapolation. The CBS-35 HL and CBS-37 HL PESs reproduce the fundamental term values with root-mean-square errors of 0.75 and 1.00 cm⁻¹, respectively. An analysis of the combined effect of the HL corrections and CBS extrapolation on the vibrational wavenumbers indicates that both are needed to compute accurate theoretical results for methyl chloride. We believe that it would be extremely challenging to go beyond the accuracy currently achieved for CH₃Cl without empirical refinement of the respective PESs

  5. An accurate approximate solution of optimal sequential age replacement policy for a finite-time horizon

    International Nuclear Information System (INIS)

    Jiang, R.

    2009-01-01

    It is difficult to find the optimal solution of the sequential age replacement policy for a finite-time horizon. This paper presents an accurate approximation for finding an approximate optimal solution of the sequential replacement policy. The proposed approximation is computationally simple and suitable for any failure distribution. Its accuracy is illustrated by two examples. Based on the approximate solution, an approximate estimate for the total cost is derived.
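
    For context, the classical infinite-horizon age replacement policy, which the finite-horizon sequential policy refines, minimizes the long-run cost rate C(T) = (cp·R(T) + cf·F(T)) / ∫₀ᵀ R(t) dt over the preventive replacement age T, where cp and cf are preventive and failure replacement costs and R(t) is reliability. The sketch below evaluates this numerically for a Weibull failure law; all parameter values are made up for illustration and are not from the paper.

```python
# Long-run cost rate of classical age replacement:
#   C(T) = (cp*R(T) + cf*F(T)) / integral_0^T R(t) dt
# for a Weibull reliability R(t) = exp(-(t/scale)^shape).
# Parameters are illustrative only.

from math import exp

def cost_rate(T, cp=1.0, cf=5.0, shape=2.0, scale=1.0, steps=2000):
    R = lambda t: exp(-((t / scale) ** shape))
    dt = T / steps
    integral = sum(R(i * dt) * dt for i in range(steps))  # left Riemann sum
    return (cp * R(T) + cf * (1.0 - R(T))) / integral

# Crude grid search for the optimal preventive replacement age.
grid = [0.1 + 0.01 * i for i in range(200)]
T_opt = min(grid, key=cost_rate)
print(T_opt, cost_rate(T_opt))
```

With an increasing failure rate (shape > 1) and cf > cp, the cost rate has an interior minimum: replacing too early wastes preventive replacements, too late incurs failure costs.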

  6. The prediction of candidate genes for cervix related cancer through gene ontology and graph theoretical approach.

    Science.gov (United States)

    Hindumathi, V; Kranthi, T; Rao, S B; Manimaran, P

    2014-06-01

    With rapidly changing technology, prediction of candidate genes has become an indispensable task in recent years, mainly in the field of biological research. Empirical methods for candidate gene prioritization that help explore the potential pathways between genetic determinants and complex diseases are highly cumbersome and labor intensive. In such a scenario, predicting potential targets for a disease state through in silico approaches is of great interest to researchers. The prodigious availability of protein interaction data, coupled with gene annotation, makes accurate determination of disease-specific candidate genes easier. In our work we have prioritized cervix-related cancer candidate genes by employing the approach of Csaba Ortutay and his co-workers, which identifies candidate genes through graph-theoretical centrality measures and gene ontology. Using the human protein interaction data, cervical cancer gene sets, and ontological terms, we were able to predict 15 novel candidates for cervical carcinogenesis. The disease relevance of the anticipated candidate genes was corroborated through a literature survey. The availability of drugs for these candidates was also checked through the Therapeutic Target Database (TTD) and DrugMap Central (DMC), which affirms that they may serve as potential drug targets for cervical cancer.

  7. Gene expression in early stage cervical cancer

    NARCIS (Netherlands)

    Biewenga, Petra; Buist, Marrije R.; Moerland, Perry D.; van Thernaat, Emiel Ver Loren; van Kampen, Antoine H. C.; ten Kate, Fiebo J. W.; Baas, Frank

    2008-01-01

    Objective. Pelvic lymph node metastases are the main prognostic factor for survival in early stage cervical cancer, yet accurate detection methods before surgery are lacking. In this study, we examined whether gene expression profiling can predict the presence of lymph node metastasis in early stage

  8. An accurate method for the determination of unlike potential parameters from thermal diffusion data

    International Nuclear Information System (INIS)

    El-Geubeily, S.

    1997-01-01

    A new method is introduced by means of which the unlike intermolecular potential parameters can be determined from experimental measurements of the thermal diffusion factor as a function of temperature. The method proved to be easy, accurate, and applicable to two-, three-, and four-parameter potential functions whose collision integrals are available. The potential parameters computed by this method are found to provide a faithful representation of the thermal diffusion data under consideration. 3 figs., 4 tabs

  9. AN ACCURATE ORBITAL INTEGRATOR FOR THE RESTRICTED THREE-BODY PROBLEM AS A SPECIAL CASE OF THE DISCRETE-TIME GENERAL THREE-BODY PROBLEM

    International Nuclear Information System (INIS)

    Minesaki, Yukitaka

    2013-01-01

    For the restricted three-body problem, we propose an accurate orbital integration scheme that retains all conserved quantities of the two-body problem with two primaries and approximately preserves the Jacobi integral. The scheme is obtained by taking the limit as mass approaches zero in the discrete-time general three-body problem. For a long time interval, the proposed scheme precisely reproduces various periodic orbits that cannot be accurately computed by other generic integrators

  10. Ultradian hormone stimulation induces glucocorticoid receptor-mediated pulses of gene transcription.

    Science.gov (United States)

    Stavreva, Diana A; Wiench, Malgorzata; John, Sam; Conway-Campbell, Becky L; McKenna, Mervyn A; Pooley, John R; Johnson, Thomas A; Voss, Ty C; Lightman, Stafford L; Hager, Gordon L

    2009-09-01

    Studies on glucocorticoid receptor (GR) action typically assess gene responses by long-term stimulation with synthetic hormones. As corticosteroids are released from adrenal glands in a circadian and high-frequency (ultradian) mode, such treatments may not provide an accurate assessment of physiological hormone action. Here we demonstrate that ultradian hormone stimulation induces cyclic GR-mediated transcriptional regulation, or gene pulsing, both in cultured cells and in animal models. Equilibrium receptor-occupancy of regulatory elements precisely tracks the ligand pulses. Nascent RNA transcripts from GR-regulated genes are released in distinct quanta, demonstrating a profound difference between the transcriptional programs induced by ultradian and constant stimulation. Gene pulsing is driven by rapid GR exchange with response elements and by GR recycling through the chaperone machinery, which promotes GR activation and reactivation in response to the ultradian hormone release, thus coupling promoter activity to the naturally occurring fluctuations in hormone levels. The GR signalling pathway has been optimized for a prompt and timely response to fluctuations in hormone levels, indicating that biologically accurate regulation of gene targets by GR requires an ultradian mode of hormone stimulation.

  11. Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision.

    Science.gov (United States)

    Battiti, Roberto

    1990-01-01

    This thesis presents new algorithms for low and intermediate level computer vision. The guiding ideas in the presented approach are those of hierarchical and adaptive processing, concurrent computation, and supervised learning. Processing of the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimation of the desired parameters. The presented adaptive multiple scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution that is chosen in order to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that velocity estimation is more accurate over a wide range of motion with respect to the homogeneous scheme. In some cases introduction of explicit discontinuities coupled to the continuous variables can be used to avoid propagation of visual information from areas corresponding to objects with different physical and/or kinematic properties. The human visual system uses concurrent computation in order to process the vast amount of visual data in "real-time." Although with different technological constraints, parallel computation can be used efficiently for computer vision. All the presented algorithms have been implemented on medium grain distributed memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor. Finally, learning algorithms are shown to be a viable technique to engineer computer vision systems for different applications starting from

  12. Selection and validation of endogenous reference genes for qRT-PCR analysis in leafy spurge (Euphorbia esula).

    Directory of Open Access Journals (Sweden)

    Wun S Chao

    Full Text Available Quantitative real-time polymerase chain reaction (qRT-PCR) is the most important tool for measuring levels of gene expression due to its accuracy, specificity, and sensitivity. However, the accuracy of qRT-PCR analysis strongly depends on transcript normalization using stably expressed reference genes. The aim of this study was to find internal reference genes for qRT-PCR analysis under various experimental conditions for seed, adventitious underground bud, and other organs of leafy spurge. Eleven candidate reference genes (BAM4, PU1, TRP-like, FRO1, ORE9, BAM1, SEU, ARF2, KAPP, ZTL, and MPK4) were selected from among 171 genes based on expression stabilities during seed germination and bud growth. The other ten candidate reference genes were selected from three different sources: (1) 3 stably expressed leafy spurge genes (60S, bZIP21, and MD-100) identified from analyses of leafy spurge microarray data; (2) 3 orthologs of Arabidopsis "general purpose" traditional reference genes (GAPDH_1, GAPDH_2, and UBC); and (3) 4 orthologs of Arabidopsis stably expressed genes (UBC9, SAND, PTB, and F-box) identified from Affymetrix ATH1 whole-genome GeneChip studies. The expression stabilities of these 21 genes were ranked based on the C(T) values of 72 samples using four different computation programs, including geNorm, Normfinder, BestKeeper, and the comparative ΔC(T) method. Our analyses revealed SAND, PTB, ORE9, and ARF2 to be the most appropriate reference genes for accurate normalization of gene expression data. Since SAND and PTB were obtained from 4 orthologs of Arabidopsis, while ORE9 and ARF2 were selected from 171 leafy spurge genes, it was more efficient to identify good reference genes from orthologs of other plant species known to be stably expressed than by randomly testing endogenous genes. Nevertheless, the two newly identified leafy spurge genes, ORE9 and ARF2, can serve as orthologous candidates in the search for reference genes

  13. Identification and validation of reference genes for qRT-PCR studies of the obligate aphid pathogenic fungus Pandora neoaphidis during different developmental stages.

    Directory of Open Access Journals (Sweden)

    Shutao Zhang

    Full Text Available The selection of stable reference genes is a critical step for the accurate quantification of gene expression. To identify and validate the reference genes in Pandora neoaphidis, an obligate aphid pathogenic fungus, the expression of 13 classical candidate reference genes was evaluated by quantitative real-time reverse transcriptase polymerase chain reaction (qPCR) at four developmental stages (conidia, conidia with germ tubes, short hyphae and elongated hyphae). Four statistical algorithms, including geNorm, NormFinder, BestKeeper and the Delta Ct method, were used to rank putative reference genes according to their expression stability and to indicate the best reference gene or combination of reference genes for accurate normalization. The comprehensive ranking analysis revealed that ACT1 and 18S were the most stably expressed genes throughout the developmental stages. To further validate the suitability of the reference genes identified in this study, the expression of the cell division control protein 25 (CDC25) and Chitinase 1 (CHI1) genes was used to confirm the validated candidate reference genes. Our study presents the first systematic study of reference gene selection for P. neoaphidis and provides guidelines to obtain more accurate qPCR results for future developmental efforts.

  14. Gene Prediction in Metagenomic Fragments with Deep Learning

    Directory of Open Access Journals (Sweden)

    Shao-Wu Zhang

    2017-01-01

    Full Text Available Next-generation sequencing technologies used in metagenomics yield numerous sequencing fragments which come from thousands of different species. Accurately identifying genes from metagenomics fragments is one of the most fundamental issues in metagenomics. In this article, by fusing multiple features (i.e., monocodon usage, monoamino acid usage, ORF length coverage, and Z-curve features) and using a deep stacking network learning model, we present a novel method (called Meta-MFDL) to predict metagenomic genes. The results of 10-fold cross-validation and independent tests show that Meta-MFDL is a powerful tool for identifying genes from metagenomic fragments.
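Of the fused features, the Z-curve is the easiest to illustrate: a DNA fragment is mapped to three cumulative coordinates that separate purine/pyrimidine, amino/keto, and weak/strong hydrogen-bonding bases. Below is a sketch of the classic Z-curve transform only, not Meta-MFDL's exact feature variant:

```python
def z_curve(seq):
    """Cumulative Z-curve coordinates for a DNA sequence.
    x: purine (A,G) vs pyrimidine (C,T)
    y: amino (A,C) vs keto (G,T)
    z: weak H-bond (A,T) vs strong (G,C)
    """
    a = c = g = t = 0
    coords = []
    for base in seq.upper():
        a += base == "A"
        c += base == "C"
        g += base == "G"
        t += base == "T"
        coords.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)))
    return coords

coords = z_curve("ATGCG")
print(coords[-1])  # (1, -1, -1)
```

The final coordinate summarizes base composition; the full per-position curve captures distributional patterns that differ between coding and non-coding fragments, which is what makes it a useful input feature for gene classifiers.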

  15. Computer simulation of ductile fracture

    International Nuclear Information System (INIS)

    Wilkins, M.L.; Streit, R.D.

    1979-01-01

    Finite difference computer simulation programs are capable of very accurate solutions to problems in plasticity with large deformations and rotations. This opens the possibility of developing models of ductile fracture by correlating experiments with equivalent computer simulations. Selected experiments were done to emphasize different aspects of the model. A difficult problem is the establishment of a fracture-size effect. This paper is a study of the strain field around notched tensile specimens of aluminum 6061-T651. A series of geometrically scaled specimens is tested to fracture. The scaled experiments are conducted for different notch radius-to-diameter ratios. The strains at fracture are determined from computer simulations. An estimate is made of the fracture-size effect.

  16. Screened exchange hybrid density functional for accurate and efficient structures and interaction energies.

    Science.gov (United States)

    Brandenburg, Jan Gerit; Caldeweyher, Eike; Grimme, Stefan

    2016-06-21

    We extend the recently introduced PBEh-3c global hybrid density functional [S. Grimme et al., J. Chem. Phys., 2015, 143, 054107] by a screened Fock exchange variant based on the Henderson-Janesko-Scuseria exchange-hole model. While the excellent performance of the global hybrid is maintained for small covalently bound molecules, its performance for computed condensed-phase mass densities is further improved. Most importantly, a speed-up of 30 to 50% can be achieved, and especially for small orbital-energy-gap cases the method is numerically much more robust. The latter point is important for many applications, e.g., for metal-organic frameworks, organic semiconductors, or protein structures. This enables an accurate density-functional-based electronic structure calculation of a full DNA helix structure on a single-core desktop computer, which is presented as an example in addition to comprehensive benchmark results.

  17. Autonomic Closure for Turbulent Flows Using Approximate Bayesian Computation

    Science.gov (United States)

    Doronina, Olga; Christopher, Jason; Hamlington, Peter; Dahm, Werner

    2017-11-01

    Autonomic closure is a new technique for achieving fully adaptive and physically accurate closure of coarse-grained turbulent flow governing equations, such as those solved in large eddy simulations (LES). Although autonomic closure has been shown in recent a priori tests to more accurately represent unclosed terms than do dynamic versions of traditional LES models, the computational cost of the approach makes it challenging to implement for simulations of practical turbulent flows at realistically high Reynolds numbers. The optimization step used in the approach introduces large matrices that must be inverted and is highly memory intensive. In order to reduce memory requirements, here we propose to use approximate Bayesian computation (ABC) in place of the optimization step, thereby yielding a computationally-efficient implementation of autonomic closure that trades memory-intensive for processor-intensive computations. The latter challenge can be overcome as co-processors such as general purpose graphical processing units become increasingly available on current generation petascale and exascale supercomputers. In this work, we outline the formulation of ABC-enabled autonomic closure and present initial results demonstrating the accuracy and computational cost of the approach.
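The core ABC idea the abstract relies on can be illustrated with rejection ABC: draw candidate parameters from a prior, run a cheap forward simulation for each, and keep only candidates whose summary statistic lands within a tolerance of the reference data. The toy below (estimating the mean of a Gaussian, nothing turbulence-specific) shows the trade the authors describe: no large matrix inversion, just many inexpensive forward simulations.

```python
import random

random.seed(42)

# "Reference data": observations from a process with unknown mean.
true_mean = 2.0
observed = [random.gauss(true_mean, 1.0) for _ in range(200)]
obs_summary = sum(observed) / len(observed)

def simulate(theta, n=200):
    """Cheap forward model: Gaussian samples with mean theta, unit variance."""
    data = [random.gauss(theta, 1.0) for _ in range(n)]
    return sum(data) / n

# Rejection ABC: sample from a wide uniform prior, keep near-matching candidates.
tolerance = 0.1
accepted = []
for _ in range(5000):
    theta = random.uniform(-5.0, 5.0)
    if abs(simulate(theta) - obs_summary) < tolerance:
        accepted.append(theta)

posterior_mean = sum(accepted) / len(accepted)
print(round(posterior_mean, 2))
```

Every prior draw costs one forward simulation rather than memory, which is the processor-intensive/memory-light trade-off that makes the approach attractive on accelerator-heavy machines.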

  18. Computed tomography of the ossicles

    International Nuclear Information System (INIS)

    Chakeres, D.W.; Weider, D.J.

    1985-01-01

    Otologists and otolaryngologists have described in detail the disorders which are unique to the ossicles. However, the anatomy and spectrum of pathology of the ossicles are not familiar to most radiologists. Recent advances in computed tomography (CT) and a systematic approach to evaluation now allow accurate identification of even subtle abnormalities of the ossicles. We present the normal anatomy, ossicular abnormalities, and indications for computed tomographic study. Because of the greater diagnostic capability of CT, the radiologist's role has increased in the evaluation and treatment planning of patients with suspected ossicular abnormalities. (orig.)

  19. Fast and Accurate Prediction of Stratified Steel Temperature During Holding Period of Ladle

    Science.gov (United States)

    Deodhar, Anirudh; Singh, Umesh; Shukla, Rishabh; Gautham, B. P.; Singh, Amarendra K.

    2017-04-01

    Thermal stratification of liquid steel in a ladle during the holding period and the teeming operation has a direct bearing on the superheat available at the caster and hence on the caster set points such as casting speed and cooling rates. The changes in the caster set points are typically carried out based on temperature measurements at the end of the tundish outlet. Thermal prediction models provide advance knowledge of the influence of process and design parameters on the steel temperature at various stages. Therefore, they can be used in making accurate decisions about the caster set points in real time. However, this requires both fast and accurate thermal prediction models. In this work, we develop a surrogate model for the prediction of thermal stratification using data extracted from a set of computational fluid dynamics (CFD) simulations, pre-determined using a design-of-experiments technique. A regression method is used to train the predictor. The model predicts the stratified temperature profile instantaneously, for a given set of process parameters such as initial steel temperature, refractory heat content, slag thickness, and holding time. More than 96 pct of the predicted values are within an error range of ±5 K (±5 °C), when compared against corresponding CFD results. Considering its accuracy and computational efficiency, the model can be extended for thermal control of casting operations. This work also sets a benchmark for developing similar thermal models for downstream processes such as tundish and caster.
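The surrogate strategy described here, fitting a fast regression model to a small designed set of expensive simulations and then predicting instantly, can be sketched with a toy least-squares fit. The "simulation" below is a stand-in analytic cooling law, and the single holding-time feature is illustrative; the paper's actual CFD model, feature set, and regression form are more elaborate.

```python
# Toy stand-in for expensive CFD runs: steel temperature (K) after a holding time,
# here an analytic linear cooling law instead of a real simulation.
def expensive_simulation(holding_time_min):
    return 1873.0 - 0.9 * holding_time_min

# "Design of experiments": a handful of pre-computed simulation runs.
design_points = [0.0, 15.0, 30.0, 45.0, 60.0]
samples = [(t, expensive_simulation(t)) for t in design_points]

# Train a least-squares linear surrogate T(t) = b0 + b1 * t via normal equations.
n = len(samples)
sx = sum(t for t, _ in samples)
sy = sum(T for _, T in samples)
sxx = sum(t * t for t, _ in samples)
sxy = sum(t * T for t, T in samples)
b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b0 = (sy - b1 * sx) / n

# The trained surrogate now predicts instantly, with no further simulation.
predicted = b0 + b1 * 25.0
print(round(predicted, 1))
```

The design choice is the same as in the paper: pay the CFD cost once, offline, over a designed grid of process parameters, so that online prediction reduces to evaluating a cheap fitted function.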

  20. Computing camera heading: A study

    Science.gov (United States)

    Zhang, John Jiaxiang

    2000-08-01

    An accurate estimate of the motion of a camera is a crucial first step for the 3D reconstruction of sites, objects, and buildings from video. Solutions to the camera heading problem can be readily applied to many areas, such as robotic navigation, surgical operation, video special effects, multimedia, and lately even in internet commerce. From image sequences of a real world scene, the problem is to calculate the directions of the camera translations. The presence of rotations makes this problem very hard. This is because rotations and translations can have similar effects on the images, and are thus hard to tell apart. However, the visual angles between the projection rays of point pairs are unaffected by rotations, and their changes over time contain sufficient information to determine the direction of camera translation. We developed a new formulation of the visual angle disparity approach, first introduced by Tomasi, to the camera heading problem. Our new derivation makes theoretical analysis possible. Most notably, a theorem is obtained that locates all possible singularities of the residual function for the underlying optimization problem. This allows identifying all computation trouble spots beforehand, and to design reliable and accurate computational optimization methods. A bootstrap-jackknife resampling method simultaneously reduces complexity and tolerates outliers well. Experiments with image sequences show accurate results when compared with the true camera motion as measured with mechanical devices.

  1. Gene expression inference with deep learning.

    Science.gov (United States)

    Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

    2016-06-15

    Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX. Contact: xhx@ics.uci.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
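The LINCS-style linear baseline that D-GEX improves on can be sketched in a few lines: learn a least-squares linear map from landmark-gene expression to target-gene expression on training profiles, then infer the unmeasured targets for new profiles. This toy uses small synthetic data with made-up dimensions; D-GEX replaces the linear map with a multi-layer network.

```python
import numpy as np

rng = np.random.default_rng(0)
n_profiles, n_landmark, n_target = 500, 10, 3

# Synthetic training compendium: targets are (noisy) linear mixes of landmarks.
X = rng.normal(size=(n_profiles, n_landmark))                    # landmark expression
W_true = rng.normal(size=(n_landmark, n_target))
Y = X @ W_true + 0.01 * rng.normal(size=(n_profiles, n_target))  # target expression

# Baseline: least-squares linear model mapping landmarks to targets
# (solved jointly here; equivalent to independent per-target-gene regressions).
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Infer target-gene expression for a new profile measured only on landmarks.
x_new = rng.normal(size=(1, n_landmark))
y_pred = x_new @ W_hat

mae = np.abs(x_new @ W_true - y_pred).mean()
print(mae < 0.05)
```

When the true landmark-target relationship is nonlinear, as the abstract argues it is for real expression data, this linear map is exactly what leaves accuracy on the table.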

  2. Synthetic RNAs for Gene Regulation: Design Principles and Computational Tools

    International Nuclear Information System (INIS)

    Laganà, Alessandro; Shasha, Dennis; Croce, Carlo Maria

    2014-01-01

    The use of synthetic non-coding RNAs for post-transcriptional regulation of gene expression has not only become a standard laboratory tool for gene functional studies but it has also opened up new perspectives in the design of new and potentially promising therapeutic strategies. Bioinformatics has provided researchers with a variety of tools for the design, the analysis, and the evaluation of RNAi agents such as small-interfering RNA (siRNA), short-hairpin RNA (shRNA), artificial microRNA (a-miR), and microRNA sponges. More recently, a new system for genome engineering based on the bacterial CRISPR-Cas9 system (Clustered Regularly Interspaced Short Palindromic Repeats), was shown to have the potential to also regulate gene expression at both transcriptional and post-transcriptional level in a more specific way. In this mini review, we present RNAi and CRISPRi design principles and discuss the advantages and limitations of the current design approaches.

  3. Synthetic RNAs for Gene Regulation: Design Principles and Computational Tools

    Energy Technology Data Exchange (ETDEWEB)

    Laganà, Alessandro [Department of Molecular Virology, Immunology and Medical Genetics, Comprehensive Cancer Center, The Ohio State University, Columbus, OH (United States); Shasha, Dennis [Courant Institute of Mathematical Sciences, New York University, New York, NY (United States); Croce, Carlo Maria [Department of Molecular Virology, Immunology and Medical Genetics, Comprehensive Cancer Center, The Ohio State University, Columbus, OH (United States)

    2014-12-11

    The use of synthetic non-coding RNAs for post-transcriptional regulation of gene expression has not only become a standard laboratory tool for gene functional studies but it has also opened up new perspectives in the design of new and potentially promising therapeutic strategies. Bioinformatics has provided researchers with a variety of tools for the design, the analysis, and the evaluation of RNAi agents such as small-interfering RNA (siRNA), short-hairpin RNA (shRNA), artificial microRNA (a-miR), and microRNA sponges. More recently, a new system for genome engineering based on the bacterial CRISPR-Cas9 system (Clustered Regularly Interspaced Short Palindromic Repeats), was shown to have the potential to also regulate gene expression at both transcriptional and post-transcriptional level in a more specific way. In this mini review, we present RNAi and CRISPRi design principles and discuss the advantages and limitations of the current design approaches.

  4. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

    Science.gov (United States)

    Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

    2015-01-01

    Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as times series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. They are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the subjacent relations among genes. The results obtained on artificial and real datasets confirm the method effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.

  5. Sequence- vs. chip-assisted genomic selection: accurate biological information is advised.

    Science.gov (United States)

    Pérez-Enciso, Miguel; Rincón, Juan C; Legarra, Andrés

    2015-05-09

    The development of next-generation sequencing technologies (NGS) has made the use of whole-genome sequence data for routine genetic evaluations possible, which has triggered considerable interest in the animal and plant breeding fields. Here, we investigated whether complete or partial sequence data can improve upon existing SNP (single nucleotide polymorphism) array-based selection strategies by simulation, using a mixed coalescence and gene-dropping approach. We simulated 20 or 100 causal mutations (quantitative trait nucleotides, QTN) within 65 predefined 'gene' regions, each 10 kb long, within a genome composed of ten 3-Mb chromosomes. We compared prediction accuracy by cross-validation using a medium-density chip (7.5 k SNPs), a high-density chip (HD, 17 k), and sequence data (335 k). Genetic evaluation was based on a GBLUP method. The simulations showed: (1) a law of diminishing returns with increasing number of SNPs; (2) a modest effect of SNP ascertainment bias in arrays; (3) a small advantage of using whole-genome sequence data vs. HD arrays, i.e. ~4%; (4) a minor effect of NGS errors except when imputation error rates are high (≥20%); and (5) if QTN were known, prediction accuracy approached 1. Since this is obviously unrealistic, we explored milder assumptions. We showed that, if all SNPs within causal genes were included in the prediction model, accuracy could also dramatically increase, by ~40%. However, this criterion was highly sensitive to either misspecification (including wrong genes) or the use of an incomplete gene list; in these cases, accuracy fell rapidly towards that reached when all SNPs from sequence data were blindly included in the model. Our study shows that, unless an accurate prior estimate of the functionality of SNPs can be included in the predictor, there is a law of diminishing returns with increasing SNP density. As a result, use of whole-genome sequence data may not result in a highly increased selection response over high-density arrays.
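GBLUP, the evaluation method used in these simulations, rests on a genomic relationship matrix (GRM) built from the SNP genotypes. A minimal sketch of the widely used VanRaden-style GRM on toy genotypes follows; the dimensions and allele frequencies are illustrative, and the study's own simulation pipeline is far more involved.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy genotype matrix: 8 individuals x 50 SNPs, coded 0/1/2 copies of the minor allele.
n_ind, n_snp = 8, 50
freqs = rng.uniform(0.1, 0.5, size=n_snp)                      # allele frequencies
M = rng.binomial(2, freqs, size=(n_ind, n_snp)).astype(float)  # genotype counts

# VanRaden (2008) GRM: center genotypes by 2p, scale by sum of 2p(1-p).
Z = M - 2.0 * freqs
G = (Z @ Z.T) / np.sum(2.0 * freqs * (1.0 - freqs))

print(G.shape)              # one relationship entry per pair of individuals
print(np.allclose(G, G.T))  # symmetric by construction
```

In GBLUP this G replaces the pedigree relationship matrix in the mixed-model equations, which is why adding ever more SNPs mostly refines G rather than adding new information, consistent with the diminishing returns the authors report.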

  6. Microbiome Data Accurately Predicts the Postmortem Interval Using Random Forest Regression Models

    Directory of Open Access Journals (Sweden)

    Aeriel Belk

    2018-02-01

    Full Text Available Death investigations often include an effort to establish the postmortem interval (PMI) in cases in which the time of death is uncertain. The postmortem interval can lead to the identification of the deceased and the validation of witness statements and suspect alibis. Recent research has demonstrated that microbes provide an accurate clock that starts at death and relies on ecological change in the microbial communities that normally inhabit a body and its surrounding environment. Here, we explore how to build the most robust Random Forest regression models for prediction of PMI by testing models built on different sample types (gravesoil, skin of the torso, skin of the head), gene markers (16S ribosomal RNA (rRNA), 18S rRNA, and internal transcribed spacer (ITS) regions), and taxonomic levels (sequence variants, species, genus, etc.). We also tested whether particular suites of indicator microbes were informative across different datasets. Generally, results indicate that the most accurate models for predicting PMI were built using gravesoil and skin data using the 16S rRNA genetic marker at the taxonomic level of phyla. Additionally, several phyla consistently contributed highly to model accuracy and may be candidate indicators of PMI.

  7. Combining multiple hypothesis testing and affinity propagation clustering leads to accurate, robust and sample size independent classification on gene expression data

    Directory of Open Access Journals (Sweden)

    Sakellariou Argiris

    2012-10-01

    Full Text Available Abstract Background A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. Results We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and the affinity propagation (AP) clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and numbers of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0.91. Furthermore, mAP-KL generates concise yet biologically relevant and informative N-gene expression signatures, which can serve as a valuable tool for diagnostic and prognostic purposes, as well as a source of potential disease biomarkers in a broad range of diseases. Conclusions mAP-KL is a data-driven and classifier-independent hybrid feature selection method, which applies to any disease classification problem based on microarray data, regardless of the available samples. Combining multiple hypothesis testing and AP leads to subsets of genes which classify unknown samples from both small and large patient cohorts with high accuracy.

  8. Touchable Computing: Computing-Inspired Bio-Detection.

    Science.gov (United States)

    Chen, Yifan; Shi, Shaolong; Yao, Xin; Nakano, Tadashi

    2017-12-01

    We propose a new computing-inspired bio-detection framework called touchable computing (TouchComp). Under the rubric of TouchComp, the best solution is the cancer to be detected, the parameter space is the tissue region at high risk of malignancy, and the agents are the nanorobots loaded with contrast medium molecules for tracking purpose. Subsequently, the cancer detection procedure (CDP) can be interpreted from the computational optimization perspective: a population of externally steerable agents (i.e., nanorobots) locate the optimal solution (i.e., cancer) by moving through the parameter space (i.e., tissue under screening), whose landscape (i.e., a prescribed feature of tissue environment) may be altered by these agents but the location of the best solution remains unchanged. One can then infer the landscape by observing the movement of agents by applying the "seeing-is-sensing" principle. The term "touchable" emphasizes the framework's similarity to controlling by touching the screen with a finger, where the external field for controlling and tracking acts as the finger. Given this analogy, we aim to answer the following profound question: can we look to the fertile field of computational optimization algorithms for solutions to achieve effective cancer detection that are fast, accurate, and robust? Along this line of thought, we consider the classical particle swarm optimization (PSO) as an example and propose the PSO-inspired CDP, which differs from the standard PSO by taking into account realistic in vivo propagation and controlling of nanorobots. Finally, we present comprehensive numerical examples to demonstrate the effectiveness of the PSO-inspired CDP for different blood flow velocity profiles caused by tumor-induced angiogenesis. The proposed TouchComp bio-detection framework may be regarded as one form of natural computing that employs natural materials to compute.
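The classical PSO on which the CDP is modeled is compact enough to sketch: each particle keeps a velocity updated toward its own personal best and the swarm's global best. This generic minimizer runs on a toy 2-D quadratic "landscape" with its optimum (the "cancer" in the paper's analogy) at a made-up location; the in vivo propagation and control constraints of the PSO-inspired CDP are not modeled here.

```python
import random

random.seed(0)

def landscape(x, y):
    """Toy tissue 'landscape': global minimum (the target) at (3, -2)."""
    return (x - 3.0) ** 2 + (y + 2.0) ** 2

n_particles, n_iters = 20, 200
w, c1, c2 = 0.7, 1.5, 1.5   # inertia, cognitive, social coefficients

pos = [[random.uniform(-10, 10), random.uniform(-10, 10)] for _ in range(n_particles)]
vel = [[0.0, 0.0] for _ in range(n_particles)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=lambda p: landscape(*p))[:]

for _ in range(n_iters):
    for i in range(n_particles):
        for d in range(2):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (w * vel[i][d]
                         + c1 * r1 * (pbest[i][d] - pos[i][d])
                         + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if landscape(*pos[i]) < landscape(*pbest[i]):
            pbest[i] = pos[i][:]
            if landscape(*pos[i]) < landscape(*gbest):
                gbest = pos[i][:]

print([round(v, 2) for v in gbest])
```

The swarm converges on the minimum without ever evaluating gradients, which is the property the TouchComp framework exploits: nanorobot positions are steered and observed, while the landscape itself is only sensed indirectly.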

  9. Gene Circuit Analysis of the Terminal Gap Gene huckebein

    Science.gov (United States)

    Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes

    2009-01-01

    The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network. PMID:19876378

  10. READSCAN: A fast and scalable pathogen discovery program with accurate genome relative abundance estimation

    KAUST Repository

    Naeem, Raeece

    2012-11-28

    Summary: READSCAN is a highly scalable parallel program to identify non-host sequences (of potential pathogen origin) and estimate their genome relative abundance in high-throughput sequence datasets. READSCAN accurately classified human and viral sequences on a 20.1 million reads simulated dataset in <27 min using a small Beowulf compute cluster with 16 nodes (Supplementary Material). Availability: http://cbrc.kaust.edu.sa/readscan Contact: raeece.naeem@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. 2012 The Author(s).

  11. READSCAN: A fast and scalable pathogen discovery program with accurate genome relative abundance estimation

    KAUST Repository

    Naeem, Raeece; Rashid, Mamoon; Pain, Arnab

    2012-01-01

    Summary: READSCAN is a highly scalable parallel program to identify non-host sequences (of potential pathogen origin) and estimate their genome relative abundance in high-throughput sequence datasets. READSCAN accurately classified human and viral sequences on a 20.1 million reads simulated dataset in <27 min using a small Beowulf compute cluster with 16 nodes (Supplementary Material). Availability: http://cbrc.kaust.edu.sa/readscan Contact: raeece.naeem@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. 2012 The Author(s).

  12. Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization

    Directory of Open Access Journals (Sweden)

    McDonald Karen

    2011-08-01

    Full Text Available Abstract Background Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. Results The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Conclusion Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net.

  13. Accurate Typing of Human Leukocyte Antigen Class I Genes by Oxford Nanopore Sequencing.

    Science.gov (United States)

    Liu, Chang; Xiao, Fangzhou; Hoisington-Lopez, Jessica; Lang, Kathrin; Quenzel, Philipp; Duffy, Brian; Mitra, Robi David

    2018-04-03

    Oxford Nanopore Technologies' MinION has expanded the current DNA sequencing toolkit by delivering long read lengths and extreme portability. The MinION has the potential to enable expedited point-of-care human leukocyte antigen (HLA) typing, an assay routinely used to assess the immunologic compatibility between organ donors and recipients, but the platform's high error rate makes it challenging to type alleles with accuracy. We developed and validated accurate typing of HLA by Oxford nanopore (Athlon), a bioinformatic pipeline that i) maps nanopore reads to a database of known HLA alleles, ii) identifies candidate alleles with the highest read coverage at different resolution levels that are represented as branching nodes and leaves of a tree structure, iii) generates consensus sequences by remapping the reads to the candidate alleles, and iv) calls the final diploid genotype by blasting consensus sequences against the reference database. Using two independent data sets generated on the R9.4 flow cell chemistry, Athlon achieved a 100% accuracy in class I HLA typing at the two-field resolution. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  14. Accurate Simulation of Parametrically Excited Micromirrors via Direct Computation of the Electrostatic Stiffness

    Directory of Open Access Journals (Sweden)

    Attilio Frangi

    2017-04-01

    Full Text Available Electrostatically actuated torsional micromirrors are key elements in Micro-Opto-Electro-Mechanical Systems. When forced by means of in-plane comb-fingers, the dynamics of the main torsional response is known to be strongly non-linear and governed by parametric resonance. Here, in order to also trace unstable branches of the mirror response, we implement a simplified continuation method with arc-length control and propose an innovative technique based on Finite Elements and the concepts of material derivative in order to compute the electrostatic stiffness; i.e., the derivative of the torque with respect to the torsional angle, as required by the continuation approach.

  15. Accurate Simulation of Parametrically Excited Micromirrors via Direct Computation of the Electrostatic Stiffness.

    Science.gov (United States)

    Frangi, Attilio; Guerrieri, Andrea; Boni, Nicoló

    2017-04-06

    Electrostatically actuated torsional micromirrors are key elements in Micro-Opto-Electro-Mechanical Systems. When forced by means of in-plane comb-fingers, the dynamics of the main torsional response is known to be strongly non-linear and governed by parametric resonance. Here, in order to also trace unstable branches of the mirror response, we implement a simplified continuation method with arc-length control and propose an innovative technique based on Finite Elements and the concepts of material derivative in order to compute the electrostatic stiffness; i.e., the derivative of the torque with respect to the torsional angle, as required by the continuation approach.

  16. Inferring Gene Regulatory Networks Using Conditional Regulation Pattern to Guide Candidate Genes.

    Directory of Open Access Journals (Sweden)

    Fei Xiao

    Full Text Available Combining path consistency (PC) algorithms with conditional mutual information (CMI) is widely used in the reconstruction of gene regulatory networks. CMI has many advantages over the Pearson correlation coefficient in measuring the non-linear dependence used to infer gene regulatory networks. It can also discriminate direct regulations from indirect ones. However, it is still a challenge to select the conditional genes in an optimal way, which affects both the performance and the computational complexity of the PC algorithm. In this study, we develop a novel conditional mutual information-based algorithm, namely RPNI (Regulation Pattern based Network Inference), to infer gene regulatory networks. For conditional gene selection, we define the co-regulation pattern, indirect-regulation pattern, and mixture-regulation pattern as three candidate patterns to guide the selection of candidate genes. To demonstrate the potential of our algorithm, we apply it to gene expression data from the DREAM challenge. Experimental results show that RPNI outperforms existing conditional mutual information-based methods in both accuracy and time complexity for different sizes of gene samples. Furthermore, the robustness of our algorithm is demonstrated by noisy interference analysis using different types of noise.
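Conditional mutual information I(X;Y|Z), on which RPNI and earlier PC-based methods rest, measures the dependence between two genes that remains after conditioning on a third. Below is a minimal plug-in estimator for discrete samples, demonstrated on a toy binarized regulatory chain X -> Z -> Y; real methods typically use Gaussian or kernel estimators on continuous expression values.

```python
import random
from collections import Counter
from math import log2

def cmi(xs, ys, zs):
    """Plug-in estimate of I(X;Y|Z) in bits for discrete samples."""
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz = Counter(zip(xs, zs))
    pyz = Counter(zip(ys, zs))
    pz = Counter(zs)
    total = 0.0
    for (x, y, z), c in pxyz.items():
        total += (c / n) * log2((c / n) * (pz[z] / n)
                                / ((pxz[(x, z)] / n) * (pyz[(y, z)] / n)))
    return total

# Toy chain X -> Z -> Y: X and Y look dependent, but conditioning on the
# mediator Z removes (nearly all of) the apparent dependence.
random.seed(3)
xs = [random.randint(0, 1) for _ in range(5000)]
zs = [x if random.random() < 0.9 else 1 - x for x in xs]
ys = [z if random.random() < 0.9 else 1 - z for z in zs]

direct = cmi(xs, ys, [0] * len(xs))   # with constant Z this is just I(X;Y)
conditioned = cmi(xs, ys, zs)         # I(X;Y|Z)
print(direct > conditioned)
```

This collapse of I(X;Y|Z) toward zero for indirect pairs is exactly how CMI-based PC algorithms prune indirect edges, and the choice of which genes to condition on (here, Z) is the selection problem RPNI's regulation patterns address.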

  17. Isotopic analysis of plutonium by computer controlled mass spectrometry

    International Nuclear Information System (INIS)

    1974-01-01

    Isotopic analysis of plutonium chemically purified by ion exchange is achieved using a thermal ionization mass spectrometer. Data acquisition from and control of the instrument are done automatically with a dedicated system computer in real time, with subsequent automatic data reduction and reporting. Separation of isotopes is achieved by varying the ion accelerating high voltage under accurate computer control.

  18. Applications of X-ray Computed Tomography and Emission Computed Tomography

    International Nuclear Information System (INIS)

    Seletchi, Emilia Dana; Sutac, Victor

    2005-01-01

    Computed Tomography is a non-destructive imaging method that allows visualization of internal features within non-transparent objects such as sedimentary rocks. Filtering techniques have been applied to circumvent the artifacts and achieve high-quality images for quantitative analysis. High-resolution X-ray computed tomography (HRXCT) can be used to identify the position of the growth axis in speleothems by detecting subtle changes in calcite density between growth bands. HRXCT imagery reveals the three-dimensional variability of coral banding providing information on coral growth and climate over the past several centuries. The Nuclear Medicine imaging technique uses a radioactive tracer, several radiation detectors, and sophisticated computer technologies to understand the biochemical basis of normal and abnormal functions within the brain. The goal of Emission Computed Tomography (ECT) is to accurately determine the three-dimensional radioactivity distribution resulting from the radiopharmaceutical uptake inside the patient instead of the attenuation coefficient distribution from different tissues as obtained from X-ray Computed Tomography. ECT is a very useful tool for investigating the cognitive functions. Because of the low radiation doses associated with Positron Emission Tomography (PET), this technique has been applied in clinical research, allowing the direct study of human neurological diseases. (authors)

  19. SIGNATURE: A workbench for gene expression signature analysis

    Directory of Open Access Journals (Sweden)

    Chang Jeffrey T

    2011-11-01

    Full Text Available Abstract Background The biological phenotype of a cell, such as a characteristic visual image or behavior, reflects activities derived from the expression of collections of genes. As such, an ability to measure the expression of these genes provides an opportunity to develop more precise and varied sets of phenotypes. However, to use this approach requires computational methods that are difficult to implement and apply, and thus there is a critical need for intelligent software tools that can reduce the technical burden of the analysis. Tools for gene expression analyses are unusually difficult to implement in a user-friendly way because their application requires a combination of biological data curation, statistical computational methods, and database expertise. Results We have developed SIGNATURE, a web-based resource that simplifies gene expression signature analysis by providing software, data, and protocols to perform the analysis successfully. This resource uses Bayesian methods for processing gene expression data coupled with a curated database of gene expression signatures, all carried out within a GenePattern web interface for easy use and access. Conclusions SIGNATURE is available for public use at http://genepattern.genome.duke.edu/signature/.

  20. Methods for monitoring multiple gene expression

    Energy Technology Data Exchange (ETDEWEB)

    Berka, Randy [Davis, CA]; Bachkirova, Elena [Davis, CA]; Rey, Michael [Davis, CA]

    2012-05-01

    The present invention relates to methods for monitoring differential expression of a plurality of genes in a first filamentous fungal cell relative to expression of the same genes in one or more second filamentous fungal cells using microarrays containing Trichoderma reesei ESTs or SSH clones, or a combination thereof. The present invention also relates to computer readable media and substrates containing such array features for monitoring expression of a plurality of genes in filamentous fungal cells.

  1. Methods for monitoring multiple gene expression

    Energy Technology Data Exchange (ETDEWEB)

    Berka, Randy; Bachkirova, Elena; Rey, Michael

    2013-10-01

    The present invention relates to methods for monitoring differential expression of a plurality of genes in a first filamentous fungal cell relative to expression of the same genes in one or more second filamentous fungal cells using microarrays containing Trichoderma reesei ESTs or SSH clones, or a combination thereof. The present invention also relates to computer readable media and substrates containing such array features for monitoring expression of a plurality of genes in filamentous fungal cells.

  2. Original computer aided support system for safe and accurate implant placement—Collaboration with a university-originated venture company

    Directory of Open Access Journals (Sweden)

    Taiji Sohmura

    2010-08-01

    Two clinical cases are introduced: implant placement on the three lower molars by a flap operation using a bone-supported surgical guide, and a flapless operation with a tooth-supported surgical guide and immediate loading with provisional prostheses prepared beforehand. The present simulation and drilling support using the surgical guide may help to perform safe and accurate implant surgery.

  3. Computational prediction of CTCF/cohesin-based intra-TAD loops that insulate chromatin contacts and gene expression in mouse liver.

    Science.gov (United States)

    Matthews, Bryan J; Waxman, David J

    2018-05-14

    CTCF and cohesin are key drivers of 3D-nuclear organization, anchoring the megabase-scale Topologically Associating Domains (TADs) that segment the genome. Here, we present and validate a computational method to predict cohesin-and-CTCF binding sites that form intra-TAD DNA loops. The intra-TAD loop anchors identified are structurally indistinguishable from TAD anchors regarding binding partners, sequence conservation, and resistance to cohesin knockdown; further, the intra-TAD loops retain key functional features of TADs, including chromatin contact insulation, blockage of repressive histone mark spread, and ubiquity across tissues. We propose that intra-TAD loops form by the same loop extrusion mechanism as the larger TAD loops, and that their shorter length enables finer regulatory control in restricting enhancer-promoter interactions, which enables selective, high-level expression of gene targets of super-enhancers and genes located within repressive nuclear compartments. These findings elucidate the role of intra-TAD cohesin-and-CTCF binding in nuclear organization associated with widespread insulation of distal enhancer activity. © 2018, Matthews et al.

  4. Experience of computed tomographic myelography and discography in cervical problem

    Energy Technology Data Exchange (ETDEWEB)

    Nakatani, Shigeru; Yamamoto, Masayuki; Uratsuji, Masaaki; Suzuki, Kunio; Matsui, Eigo [Hyogo Prefectural Awaji Hospital, Sumoto, Hyogo (Japan); Kurihara, Akira

    1983-06-01

    CTM (computed tomographic myelography) was performed on 15 cases of cervical lesions, and on 5 of them, CTD (computed tomographic discography) was also performed. CTM revealed the intervertebral state and, in combination with CTD, provided more accurate information. The combined method of CTM and CTD was useful for soft disc herniation.

  5. Shielding Benchmark Computational Analysis

    International Nuclear Information System (INIS)

    Hunter, H.T.; Slater, C.O.; Holland, L.B.; Tracz, G.; Marshall, W.J.; Parsons, J.L.

    2000-01-01

    Over the past several decades, nuclear science has relied on experimental research to verify and validate information about shielding nuclear radiation for a variety of applications. These benchmarks are compared with results from computer code models and are useful for the development of more accurate cross-section libraries, computer code development of radiation transport modeling, and building accurate tests for miniature shielding mockups of new nuclear facilities. When documenting measurements, one must describe many parts of the experimental results to allow a complete computational analysis. Both old and new benchmark experiments, by any definition, must provide a sound basis for modeling more complex geometries required for quality assurance and cost savings in nuclear project development. Benchmarks may involve one or many materials and thicknesses, types of sources, and measurement techniques. In this paper the benchmark experiments of varying complexity are chosen to study the transport properties of some popular materials and thicknesses. These were analyzed using three-dimensional (3-D) models and continuous energy libraries of MCNP4B2, a Monte Carlo code developed at Los Alamos National Laboratory, New Mexico. A shielding benchmark library provided the experimental data and allowed a wide range of choices for source, geometry, and measurement data. The experimental data had often been used in previous analyses by reputable groups such as the Cross Section Evaluation Working Group (CSEWG) and the Organization for Economic Cooperation and Development/Nuclear Energy Agency Nuclear Science Committee (OECD/NEANSC).

  6. An accurate modelling of the two-diode model of PV module using a hybrid solution based on differential evolution

    International Nuclear Information System (INIS)

    Chin, Vun Jack; Salam, Zainal; Ishaque, Kashif

    2016-01-01

    Highlights: • An accurate computational method for the two-diode model of PV module is proposed. • The hybrid method employs analytical equations and Differential Evolution (DE). • I_PV, I_o1, and R_p are computed analytically, while a_1, a_2, I_o2 and R_s are optimized. • This allows the model parameters to be computed without using costly assumptions. - Abstract: This paper proposes an accurate computational technique for the two-diode model of PV module. Unlike previous methods, it does not rely on assumptions that cause the accuracy to be compromised. The key to this improvement is the implementation of a hybrid solution, i.e. by incorporating the analytical method with the differential evolution (DE) optimization technique. Three parameters, i.e. I_PV, I_o1, and R_p are computed analytically, while the remaining, a_1, a_2, I_o2 and R_s are optimized using the DE. To validate its accuracy, the proposed method is tested on three PV modules of different technologies: mono-crystalline, poly-crystalline and thin film. Furthermore, its performance is evaluated against two popular computational methods for the two-diode model. The proposed method is found to exhibit superior accuracy for the variation in irradiance and temperature for all module types. In particular, the improvement in accuracy is evident at low irradiance conditions; the root-mean-square error is one order of magnitude lower than that of the other methods. In addition, the values of the model parameters are consistent with the physics of PV cell. It is envisaged that the method can be very useful for PV simulation, in which accuracy of the model is of prime concern.
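For context, the two-diode model defines the module current only implicitly, so any parameter-fitting loop must solve it numerically at each trial point. The sketch below evaluates the model with Newton's method; all parameter values in it are generic illustrative numbers, not the paper's fitted ones, and the DE optimization step itself is omitted.

```python
import math

def two_diode_current(V, Ipv, Io1, Io2, a1, a2, Rs, Rp, Ns=36, T=298.15):
    """Solve the implicit two-diode equation
        I = Ipv - Io1*(exp((V+I*Rs)/(a1*Vt)) - 1)
                - Io2*(exp((V+I*Rs)/(a2*Vt)) - 1) - (V+I*Rs)/Rp
    for the module current I at terminal voltage V, by Newton iteration.
    Ns is the number of series cells; Vt the module thermal voltage."""
    k, q = 1.380649e-23, 1.602176634e-19
    Vt = Ns * k * T / q
    I = Ipv                      # short-circuit current as starting guess
    for _ in range(100):
        e1 = math.exp((V + I * Rs) / (a1 * Vt))
        e2 = math.exp((V + I * Rs) / (a2 * Vt))
        g = Ipv - Io1 * (e1 - 1) - Io2 * (e2 - 1) - (V + I * Rs) / Rp - I
        dg = -Io1 * e1 * Rs / (a1 * Vt) - Io2 * e2 * Rs / (a2 * Vt) - Rs / Rp - 1.0
        step = g / dg
        I -= step
        if abs(step) < 1e-12:
            break
    return I
```

In the hybrid scheme described in the abstract, a DE candidate (a_1, a_2, I_o2, R_s) would be scored by the error between such computed currents and measured I-V points.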

  7. Correlating Information Contents of Gene Ontology Terms to Infer Semantic Similarity of Gene Products

    Directory of Open Access Journals (Sweden)

    Mingxin Gan

    2014-01-01

    Full Text Available Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may significantly overestimate semantic similarity between genes that are actually not functionally related, thereby yielding misleading results in applications. To overcome this limitation, we propose to represent a gene product as a vector that is composed of information contents of gene ontology terms annotated for the gene product, and we suggest calculating similarity between two gene products as the relatedness of their corresponding vectors using three measures: Pearson’s correlation coefficient, cosine similarity, and the Jaccard index. We focus on the biological process domain of the gene ontology and annotations of yeast proteins to study the effectiveness of the proposed measures. Results show that semantic similarity scores calculated using the proposed measures are more consistent with known biological knowledge than those derived using a list of existing methods, suggesting the effectiveness of our method in characterizing functional relationships between gene products.
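A minimal sketch of the proposed representation, assuming the information content IC(t) = -log p(t) of each GO term has already been computed: a gene product becomes a vector of term ICs over a fixed vocabulary, and the three relatedness measures follow. The term names and IC values below are invented, and the weighted form of the Jaccard index is one common generalization to real-valued vectors.

```python
import math

def ic_vector(annotated_terms, ic, vocab):
    """IC-weighted presence vector for a gene product over a fixed GO-term list;
    zero for terms not annotated to the gene product."""
    return [ic[t] if t in annotated_terms else 0.0 for t in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    return num / math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))

def weighted_jaccard(u, v):
    # Ruzicka-style generalization of the Jaccard index to nonnegative weights
    return sum(min(a, b) for a, b in zip(u, v)) / sum(max(a, b) for a, b in zip(u, v))
```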

  8. FiGS: a filter-based gene selection workbench for microarray data

    Directory of Open Access Journals (Sweden)

    Yun Taegyun

    2010-01-01

    Full Text Available Abstract Background The selection of genes that discriminate disease classes from microarray data is widely used for the identification of diagnostic biomarkers. Although various gene selection methods are currently available and some of them have shown excellent performance, no single method can retain the best performance for all types of microarray datasets. It is desirable to use a comparative approach to find the best gene selection result after rigorous testing of different methodological strategies for a given microarray dataset. Results FiGS is a web-based workbench that automatically compares various gene selection procedures and provides the optimal gene selection result for an input microarray dataset. FiGS builds up diverse gene selection procedures by aligning different feature selection techniques and classifiers. In addition to the highly reputed techniques, FiGS diversifies the gene selection procedures by incorporating gene clustering options in the feature selection step and different data pre-processing options in the classifier training step. All candidate gene selection procedures are evaluated by the .632+ bootstrap errors and listed with their classification accuracies and selected gene sets. FiGS runs on parallelized computing nodes that support heavy computations. FiGS is freely accessible at http://gexp.kaist.ac.kr/figs. Conclusion FiGS is a web-based application that automates an extensive search for the optimized gene selection analysis for a microarray dataset in a parallel computing environment. FiGS will provide both an efficient and comprehensive means of acquiring optimal gene sets that discriminate disease states from microarray datasets.

  9. Accurate spectroscopic characterization of protonated oxirane: a potential prebiotic species in Titan's atmosphere

    International Nuclear Information System (INIS)

    Puzzarini, Cristina (Dipartimento di Chimica "Giacomo Ciamician", Università di Bologna, Via Selmi 2, I-40126 Bologna (Italy)); Ali, Ashraf; Biczysko, Malgorzata; Barone, Vincenzo

    2014-01-01

    An accurate spectroscopic characterization of protonated oxirane has been carried out by means of state-of-the-art computational methods and approaches. The calculated spectroscopic parameters from our recent computational investigation of oxirane together with the corresponding experimental data available were used to assess the accuracy of our predicted rotational and IR spectra of protonated oxirane. We found an accuracy of about 10 cm⁻¹ for vibrational transitions (fundamentals as well as overtones and combination bands) and, in relative terms, of 0.1% for rotational transitions. We are therefore confident that the spectroscopic data provided herein are a valuable support for the detection of protonated oxirane not only in Titan's atmosphere but also in the interstellar medium.

  10. Accurate spectroscopic characterization of protonated oxirane: a potential prebiotic species in Titan's atmosphere

    Energy Technology Data Exchange (ETDEWEB)

    Puzzarini, Cristina [Dipartimento di Chimica " Giacomo Ciamician," Università di Bologna, Via Selmi 2, I-40126 Bologna (Italy); Ali, Ashraf [NASA Goddard Space Flight Center, Greenbelt, MD 20771 (United States); Biczysko, Malgorzata; Barone, Vincenzo, E-mail: cristina.puzzarini@unibo.it [Scuola Normale Superiore, Piazza dei Cavalieri 7, I-56126 Pisa (Italy)

    2014-09-10

    An accurate spectroscopic characterization of protonated oxirane has been carried out by means of state-of-the-art computational methods and approaches. The calculated spectroscopic parameters from our recent computational investigation of oxirane together with the corresponding experimental data available were used to assess the accuracy of our predicted rotational and IR spectra of protonated oxirane. We found an accuracy of about 10 cm{sup –1} for vibrational transitions (fundamentals as well as overtones and combination bands) and, in relative terms, of 0.1% for rotational transitions. We are therefore confident that the spectroscopic data provided herein are a valuable support for the detection of protonated oxirane not only in Titan's atmosphere but also in the interstellar medium.

  11. Accurate Evaluation of Quantum Integrals

    Science.gov (United States)

    Galant, D. C.; Goorvitch, D.; Witteborn, Fred C. (Technical Monitor)

    1995-01-01

    Combining an appropriate finite difference method with Richardson's extrapolation results in a simple, highly accurate numerical method for solving the Schrödinger equation. Important results are that error estimates are provided and that one can extrapolate expectation values rather than the wavefunctions to obtain highly accurate expectation values. We discuss the eigenvalues and the error growth in repeated Richardson's extrapolation, and show that expectation values calculated on a crude mesh can be extrapolated to obtain expectation values of high accuracy.
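The paper applies the idea to finite-difference solutions of the Schrödinger equation; as a self-contained illustration of the extrapolation mechanism alone, the sketch below builds a Richardson tableau on the O(h²) central-difference derivative, where each extrapolation column cancels the next even power of h. The test function is an arbitrary choice.

```python
import math

def central_diff(f, x, h):
    """Central difference approximation to f'(x), error O(h^2)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def richardson(f, x, h, levels=4):
    """Repeated Richardson extrapolation of the central difference.
    Row i uses step h/2**i; column j of the tableau has error O(h^(2j+2))."""
    T = [[central_diff(f, x, h / 2**i)] for i in range(levels)]
    for j in range(1, levels):
        for i in range(j, levels):
            fac = 4**j  # error ratio for halved step at this order
            T[i].append((fac * T[i][j - 1] - T[i - 1][j - 1]) / (fac - 1))
    return T[levels - 1][levels - 1]
```

The same tableau applies unchanged when the entries are eigenvalues or expectation values computed on successively refined meshes, which is the use the abstract describes.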

  12. Does universal 16S rRNA gene amplicon sequencing of environmental communities provide an accurate description of nitrifying guilds?

    DEFF Research Database (Denmark)

    Diwan, Vaibhav; Albrechtsen, Hans-Jørgen; Smets, Barth F.

    2018-01-01

    amplicon sequencing and from guild-targeted approaches. The universal amplicon sequencing provided 1) accurate estimates of nitrifier composition, 2) clustering of the samples based on these compositions consistent with sample origin, 3) estimates of the relative abundance of the guilds correlated

  13. Services Recommendation System based on Heterogeneous Network Analysis in Cloud Computing

    OpenAIRE

    Junping Dong; Qingyu Xiong; Junhao Wen; Peng Li

    2014-01-01

    Resources are provided mainly in the form of services in cloud computing. In the distributed environment of cloud computing, how to find the needed services efficiently and accurately is the most urgent problem. Services are the intermediary of the cloud platform; they are connected by many service providers and requesters and construct a complex heterogeneous network. The traditional recommendation systems only consider the functional and non-functi...

  14. Chromosomal location and nucleotide sequence of the Escherichia coli dapA gene.

    OpenAIRE

    Richaud, F; Richaud, C; Ratet, P; Patte, J C

    1986-01-01

    In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amin...

  15. Gene expression analysis in prostate cancer: the importance of the endogenous control.

    LENUS (Irish Health Repository)

    Vajda, Alice

    2013-03-01

    Aberrant gene expression is a hallmark of cancer. Quantitative reverse-transcription PCR (qRT-PCR) is the gold-standard for quantifying gene expression, and commonly employs a house-keeping gene (HKG) as an endogenous control to normalize results; the choice of which is critical for accurate data interpretation. Many factors, including sample type, pathological state, and oxygen levels influence gene expression including putative HKGs. The aim of this study was to determine the suitability of commonly used HKGs for qRT-PCR in prostate cancer.

  16. Accurate First-Principles Spectra Predictions for Planetological and Astrophysical Applications at Various T-Conditions

    Science.gov (United States)

    Rey, M.; Nikitin, A. V.; Tyuterev, V.

    2014-06-01

    Knowledge of near infrared intensities of rovibrational transitions of polyatomic molecules is essential for the modeling of various planetary atmospheres, brown dwarfs and for other astrophysical applications 1,2,3. For example, to analyze exoplanets, atmospheric models have been developed, thus creating the need for accurate spectroscopic data. Consequently, the spectral characterization of such planetary objects relies on the necessity of having adequate and reliable molecular data in extreme conditions (temperature, optical path length, pressure). On the other hand, in the modeling of astrophysical opacities, millions of lines are generally involved and the line-by-line extraction is clearly not feasible in laboratory measurements. It is thus suggested that this large amount of data could be interpreted only by reliable theoretical predictions. There exist essentially two theoretical approaches for the computation and prediction of spectra. The first one is based on empirically-fitted effective spectroscopic models. Another way for computing energies, line positions and intensities is based on global variational calculations using ab initio surfaces. They do not yet reach the spectroscopic accuracy stricto sensu but implicitly account for all intramolecular interactions including resonance couplings in a wide spectral range. The final aim of this work is to provide reliable predictions which could be quantitatively accurate with respect to the precision of available observations and as complete as possible. All this thus requires extensive first-principles quantum mechanical calculations essentially based on three necessary ingredients which are (i) accurate intramolecular potential energy surface and dipole moment surface components well-defined in a large range of vibrational displacements and (ii) efficient computational methods combined with suitable choices of coordinates to account for molecular symmetry properties and to achieve a good numerical

  17. Predator-induced defences in Daphnia pulex: Selection and evaluation of internal reference genes for gene expression studies with real-time PCR

    Directory of Open Access Journals (Sweden)

    Gilbert Don

    2010-06-01

    Full Text Available Abstract Background The planktonic microcrustacean Daphnia pulex is among the best-studied animals in ecological, toxicological and evolutionary research. One aspect that has sustained interest in the study system is the ability of D. pulex to develop inducible defence structures when exposed to predators, such as the phantom midge larvae Chaoborus. The available draft genome sequence for D. pulex is accelerating research to identify genes that confer plastic phenotypes that are regularly cued by environmental stimuli. Yet for quantifying gene expression levels, no experimentally validated set of internal control genes exists for the accurate normalization of qRT-PCR data. Results In this study, we tested six candidate reference genes for normalizing transcription levels of D. pulex genes; alpha tubulin (aTub), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), TATA box binding protein (Tbp), syntaxin 16 (Stx16), X-box binding protein 1 (Xbp1) and CAPON, a protein associated with the neuronal nitric oxide synthase, were selected on the basis of an earlier study and from microarray studies. One additional gene, a matrix metalloproteinase (MMP), was tested to validate its transcriptional response to Chaoborus, which was earlier observed in a microarray study. The transcription profiles of these seven genes were assessed by qRT-PCR from RNA of juvenile D. pulex that showed induced defences in comparison to untreated control animals. We tested the individual suitability of genes for expression normalization using the programs geNorm, NormFinder and BestKeeper. Intriguingly, Xbp1, Tbp, CAPON and Stx16 were selected as ideal reference genes. Analyses on the relative expression level using the software REST showed that both classical housekeeping candidate genes (aTub and GAPDH) were significantly downregulated, whereas the MMP gene was shown to be significantly upregulated, as predicted.
aTub is a particularly ill-suited reference gene because five copies are

  18. Accurate ab initio vibrational energies of methyl chloride

    Energy Technology Data Exchange (ETDEWEB)

    Owens, Alec, E-mail: owens@mpi-muelheim.mpg.de [Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr (Germany); Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT London (United Kingdom); Yurchenko, Sergei N.; Yachmenev, Andrey; Tennyson, Jonathan [Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT London (United Kingdom); Thiel, Walter [Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr (Germany)

    2015-06-28

    Two new nine-dimensional potential energy surfaces (PESs) have been generated using high-level ab initio theory for the two main isotopologues of methyl chloride, CH{sub 3}{sup 35}Cl and CH{sub 3}{sup 37}Cl. The respective PESs, CBS-35{sup  HL}, and CBS-37{sup  HL}, are based on explicitly correlated coupled cluster calculations with extrapolation to the complete basis set (CBS) limit, and incorporate a range of higher-level (HL) additive energy corrections to account for core-valence electron correlation, higher-order coupled cluster terms, scalar relativistic effects, and diagonal Born-Oppenheimer corrections. Variational calculations of the vibrational energy levels were performed using the computer program TROVE, whose functionality has been extended to handle molecules of the form XY {sub 3}Z. Fully converged energies were obtained by means of a complete vibrational basis set extrapolation. The CBS-35{sup  HL} and CBS-37{sup  HL} PESs reproduce the fundamental term values with root-mean-square errors of 0.75 and 1.00 cm{sup −1}, respectively. An analysis of the combined effect of the HL corrections and CBS extrapolation on the vibrational wavenumbers indicates that both are needed to compute accurate theoretical results for methyl chloride. We believe that it would be extremely challenging to go beyond the accuracy currently achieved for CH{sub 3}Cl without empirical refinement of the respective PESs.
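The abstract does not state which extrapolation formula was used; a common two-point inverse-cubic (Helgaker-type) scheme for correlation energies, shown here as an assumption, posits E(n) = E_CBS + A/n³ for cardinal number n and can be inverted from two basis-set results:

```python
def cbs_extrapolate(e_x, e_y, x, y):
    """Two-point CBS extrapolation assuming E(n) = E_CBS + A / n**3, with
    cardinal numbers x < y (e.g. triple-zeta -> x=3, quadruple-zeta -> y=4).
    Eliminating A between the two equations gives the closed form below."""
    return (y**3 * e_y - x**3 * e_x) / (y**3 - x**3)
```

When the data exactly follow the assumed inverse-cubic form, the formula recovers the CBS limit exactly; in practice the residual model error is one reason HL corrections of the kind described in the abstract remain necessary.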

  19. Screening Reliable Reference Genes for RT-qPCR Analysis of Gene Expression in Moringa oleifera.

    Science.gov (United States)

    Deng, Li-Ting; Wu, Yu-Ling; Li, Jun-Cheng; OuYang, Kun-Xi; Ding, Mei-Mei; Zhang, Jun-Jie; Li, Shu-Qi; Lin, Meng-Fei; Chen, Han-Bin; Hu, Xin-Sheng; Chen, Xiao-Yang

    2016-01-01

    Moringa oleifera is a promising plant species for oil and forage, but its genetic improvement is limited. Our current breeding program in this species focuses on exploiting the functional genes associated with important agronomical traits. Here, we screened reliable reference genes for accurately quantifying the expression of target genes using the technique of real-time quantitative polymerase chain reaction (RT-qPCR) in M. oleifera. Eighteen candidate reference genes were selected from a transcriptome database, and their expression stabilities were examined in 90 samples collected from the pods in different developmental stages, various tissues, and the roots and leaves under different conditions (low or high temperature, sodium chloride (NaCl)- or polyethyleneglycol (PEG)-simulated water stress). Analyses with geNorm, NormFinder and BestKeeper algorithms revealed that the reliable reference genes differed across sample designs and that ribosomal protein L1 (RPL1) and acyl carrier protein 2 (ACP2) were the most suitable reference genes in all tested samples. The experiment results demonstrated the significance of using the properly validated reference genes and suggested the use of more than one reference gene to achieve reliable expression profiles. In addition, we applied three isotypes of the superoxide dismutase (SOD) gene that are associated with plant adaptation to abiotic stress to confirm the efficacy of the validated reference genes under NaCl and PEG water stresses. Our results provide a valuable reference for future studies on identifying important functional genes from their transcriptional expressions via RT-qPCR technique in M. oleifera.
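Of the three algorithms named here and in the Daphnia study above, geNorm's stability measure M is simple enough to sketch: for each candidate gene it is the mean standard deviation of the pairwise log2 expression ratios against every other candidate, with lower M indicating a more stable reference. The gene names and numbers below are invented for illustration.

```python
import math
import statistics

def genorm_m(expr):
    """geNorm expression-stability measure M for each candidate reference gene.
    `expr` maps gene name -> list of relative quantities, one per sample.
    A gene whose ratio to every other gene is constant across samples
    (perfect co-stability) contributes zero pairwise variation."""
    genes = list(expr)
    m = {}
    for g in genes:
        sds = []
        for h in genes:
            if h == g:
                continue
            ratios = [math.log2(a / b) for a, b in zip(expr[g], expr[h])]
            sds.append(statistics.stdev(ratios))
        m[g] = sum(sds) / len(sds)
    return m
```

geNorm then iteratively drops the gene with the highest M and recomputes, until the most stable pair remains.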

  20. Fast and accurate algorithm for repeated optical trapping simulations on arbitrarily shaped particles based on boundary element method

    International Nuclear Information System (INIS)

    Xu, Kai-Jiang; Pan, Xiao-Min; Li, Ren-Xian; Sheng, Xin-Qing

    2017-01-01

    In optical trapping applications, the optical force should be investigated within a wide range of parameter space in terms of beam configuration to reach the desirable performance. A simple but reliable way of conducting the related investigation is to evaluate optical forces corresponding to all possible beam configurations. Although the optical force exerted on arbitrarily shaped particles can be well predicted by the boundary element method (BEM), such investigation is time-consuming because it involves many repetitions of expensive computation, where the forces are calculated from the equivalent surface currents. An algorithm is proposed to alleviate the difficulty by exploiting our previously developed skeletonization framework. The proposed algorithm succeeds in reducing the number of repetitions. Since the number of skeleton beams is always much less than that of beams in question, the computation can be very efficient. The proposed algorithm is accurate because the skeletonization is accuracy controllable. - Highlights: • A fast and accurate algorithm is proposed in terms of the boundary element method to reduce the number of repetitions of computing the optical forces from the equivalent currents. • The algorithm is accuracy controllable because the accuracy of the associated rank-revealing process is well-controlled. • The acceleration rate can exceed one thousand because the number of skeleton beams can be very small. • The algorithm can be applied to other methods, e.g., FE-BI.

  1. Computed tomography versus invasive coronary angiography

    DEFF Research Database (Denmark)

    Napp, Adriane E.; Haase, Robert; Laule, Michael

    2017-01-01

    Objectives: More than 3.5 million invasive coronary angiographies (ICA) are performed in Europe annually. Approximately 2 million of these invasive procedures might be reduced by noninvasive tests because no coronary intervention is performed. Computed tomography (CT) is the most accurate… angiography (ICA) is the reference standard for detection of CAD. • Noninvasive computed tomography angiography excludes CAD with high sensitivity. • CT may effectively reduce the approximately 2 million negative ICAs in Europe. • DISCHARGE addresses this hypothesis in patients with low-to-intermediate pretest

  2. Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities

    Science.gov (United States)

    Fang, Xin; Sastry, Anand; Mih, Nathan; Kim, Donghyuk; Tan, Justin; Lloyd, Colton J.; Gao, Ye; Yang, Laurence; Palsson, Bernhard O.

    2017-01-01

    Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN—probably the best characterized TRN—several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism’s TRN from disparate data types. PMID:28874552
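
    The cascade tracing described above — differentially expressed genes traced back to a knocked-out TF through regulatory cascades — is essentially a graph-reachability question. A minimal sketch of that idea; the toy edge list below is illustrative only and is not taken from the hiTRN:

```python
from collections import deque

# Toy TRN: edges point from a TF to the transcription units (or TFs) it
# regulates. The specific edges here are invented for illustration.
trn = {
    "arcA": ["fnr", "sdhC"],
    "fnr":  ["narG", "dmsA"],
    "crp":  ["lacZ"],
}

def downstream_of(tf, network):
    """All nodes reachable from `tf` through regulatory cascades (BFS)."""
    seen, queue = set(), deque([tf])
    while queue:
        node = queue.popleft()
        for target in network.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Genes that an arcA knockout could affect, directly or via cascades:
affected = downstream_of("arcA", trn)
```

    Checking whether a differentially expressed gene lies in `affected` is the kind of test behind the "up to 63% of the differentially expressed genes were traced to the knocked-out TF" figure.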

  3. A simplified and accurate detection of the genetically modified wheat MON71800 with one calibrator plasmid.

    Science.gov (United States)

    Kim, Jae-Hwan; Park, Saet-Byul; Roh, Hyo-Jeong; Park, Sunghoon; Shin, Min-Ki; Moon, Gui Im; Hong, Jin-Hwan; Kim, Hae-Yeong

    2015-06-01

    With the increasing number of genetically modified (GM) events, unauthorized GMO releases into the food market have increased dramatically, and many countries have developed detection tools for them. This study described qualitative and quantitative detection methods for the unauthorized GM wheat MON71800 with a reference plasmid (pGEM-M71800). The wheat acetyl-CoA carboxylase (acc) gene was used as the endogenous gene. The plasmid pGEM-M71800, which contains both the acc gene and the event-specific target MON71800, was constructed as a positive control for the qualitative and quantitative analyses. The limit of detection in the qualitative PCR assay was approximately 10 copies. In the quantitative PCR assay, the standard deviation and relative standard deviation repeatability values ranged from 0.06 to 0.25 and from 0.23% to 1.12%, respectively. This study supplies a powerful and very simple but accurate detection strategy for the unauthorized GM wheat MON71800 that utilizes a single calibrator plasmid. Copyright © 2014 Elsevier Ltd. All rights reserved.
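
    Quantification with a single calibrator plasmid typically proceeds via a standard curve: Ct values measured on a dilution series of the plasmid are regressed against log10 copy number, and unknown samples are read off the fitted line for both the event-specific target and the endogenous acc gene. A minimal sketch of that arithmetic; the function names and all Ct values are ours, not from the paper:

```python
def fit_standard_curve(log_copies, ct_values):
    """Ordinary least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log_copies)
    mx = sum(log_copies) / n
    my = sum(ct_values) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_copies, ct_values))
    sxx = sum((x - mx) ** 2 for x in log_copies)
    slope = sxy / sxx
    return slope, my - slope * mx

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a measured Ct."""
    return 10 ** ((ct - intercept) / slope)

# Dilution series of the calibrator plasmid (hypothetical Ct values).
log_copies = [1, 2, 3, 4, 5]            # 10 .. 100,000 copies
cts        = [33.1, 29.8, 26.4, 23.1, 19.7]

slope, intercept = fit_standard_curve(log_copies, cts)

# Unknown sample: estimate copies of both targets, then their ratio.
event_copies = copies_from_ct(25.0, slope, intercept)
acc_copies   = copies_from_ct(24.0, slope, intercept)
gm_ratio = event_copies / acc_copies
```

    Because both targets sit on the same calibrator plasmid, one dilution series calibrates both assays, which is what makes the single-plasmid strategy simple.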

  4. Computational biology for ageing

    Science.gov (United States)

    Wieser, Daniela; Papatheodorou, Irene; Ziehm, Matthias; Thornton, Janet M.

    2011-01-01

    High-throughput genomic and proteomic technologies have generated a wealth of publicly available data on ageing. Easy access to these data, and their computational analysis, is of great importance in order to pinpoint the causes and effects of ageing. Here, we provide a description of the existing databases and computational tools on ageing that are available for researchers. We also describe the computational approaches to data interpretation in the field of ageing including gene expression, comparative and pathway analyses, and highlight the challenges for future developments. We review recent biological insights gained from applying bioinformatics methods to analyse and interpret ageing data in different organisms, tissues and conditions. PMID:21115530

  5. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease.

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles, the relationships between genes of interests and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. http://rged.wall-eva.net. © The Author(s) 2014. Published by Oxford University Press.

  6. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles, the relationships between genes of interests and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net PMID:25252782

  7. Approaching system equilibrium with accurate or not accurate feedback information in a two-route system

    Science.gov (United States)

    Zhao, Xiao-mei; Xie, Dong-fan; Li, Qi

    2015-02-01

    With the development of intelligent transport systems, advanced information feedback strategies have been developed to reduce traffic congestion and enhance capacity. However, previous strategies provide accurate information to travelers, and our simulation results show that accurate information brings negative effects, especially in the delayed case. This is because travelers prefer the route reported to be in the best condition, while delayed information reflects past rather than current traffic conditions. Travelers therefore make wrong routing decisions, causing a decrease in capacity, an increase in oscillations, and deviation of the system from equilibrium. To avoid these negative effects, bounded rationality is taken into account by introducing a boundedly rational threshold BR: when the difference between the two routes is less than BR, the routes are chosen with equal probability. Bounded rationality is helpful for improving efficiency in terms of capacity, oscillation and the gap from system equilibrium.
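
    The boundedly rational routing rule described above can be sketched directly. The function below is our illustration of the rule only (route labels, travel times and the threshold value are invented), not the paper's simulation model:

```python
import random

def choose_route(travel_time_a, travel_time_b, br_threshold):
    """Boundedly rational route choice: if the reported difference between
    the two routes is within the threshold BR, pick either route with equal
    probability; otherwise take the route reported as faster."""
    if abs(travel_time_a - travel_time_b) <= br_threshold:
        return random.choice(["A", "B"])
    return "A" if travel_time_a < travel_time_b else "B"

random.seed(0)
# With a clear difference the faster route is always chosen ...
assert choose_route(10.0, 15.0, 2.0) == "A"
# ... while near-equal reports split the travelers roughly evenly,
# which damps the oscillations caused by everyone chasing the "best" route.
picks = [choose_route(10.0, 10.5, 2.0) for _ in range(1000)]
share_a = picks.count("A") / len(picks)
```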

  8. Microarray MAPH: accurate array-based detection of relative copy number in genomic DNA

    Directory of Open Access Journals (Sweden)

    Chan Alan

    2006-06-01

    Background: Current methods for measurement of copy number do not combine all the desirable qualities of convenience, throughput, economy, accuracy and resolution. In this study, to improve the throughput associated with Multiplex Amplifiable Probe Hybridisation (MAPH), we aimed to develop a modification based on the 3-Dimensional, Flow-Through Microarray Platform from PamGene International. In this new method, electrophoretic analysis of amplified products is replaced with photometric analysis of a probed oligonucleotide array. Copy number analysis of hybridised probes is based on a dual-label approach, comparing the intensity of Cy3-labelled MAPH probes amplified from test samples co-hybridised with similarly amplified Cy5-labelled reference MAPH probes. The key feature of using a hybridisation-based end point with MAPH is that discrimination of amplified probes is based on sequence and not fragment length. Results: In this study we showed that microarray MAPH measurement of PMP22 gene dosage correlates well with PMP22 gene dosage determined by capillary MAPH, and that copy number was accurately reported in analyses of DNA from 38 individuals, 12 of whom were known to have Charcot-Marie-Tooth disease type 1A (CMT1A). Conclusion: Measurement of microarray-based endpoints for MAPH appears to be of comparable accuracy to electrophoretic methods, and holds the prospect of fully exploiting the potential multiplicity of MAPH. The technology has the potential to simplify copy number assays for genes with a large number of exons, or of expanded sets of probes from dispersed genomic locations.

  9. Microarray MAPH: accurate array-based detection of relative copy number in genomic DNA.

    Science.gov (United States)

    Gibbons, Brian; Datta, Parikkhit; Wu, Ying; Chan, Alan; Armour, John A L

    2006-06-30

    Current methods for measurement of copy number do not combine all the desirable qualities of convenience, throughput, economy, accuracy and resolution. In this study, to improve the throughput associated with Multiplex Amplifiable Probe Hybridisation (MAPH) we aimed to develop a modification based on the 3-Dimensional, Flow-Through Microarray Platform from PamGene International. In this new method, electrophoretic analysis of amplified products is replaced with photometric analysis of a probed oligonucleotide array. Copy number analysis of hybridised probes is based on a dual-label approach by comparing the intensity of Cy3-labelled MAPH probes amplified from test samples co-hybridised with similarly amplified Cy5-labelled reference MAPH probes. The key feature of using a hybridisation-based end point with MAPH is that discrimination of amplified probes is based on sequence and not fragment length. In this study we showed that microarray MAPH measurement of PMP22 gene dosage correlates well with PMP22 gene dosage determined by capillary MAPH and that copy number was accurately reported in analyses of DNA from 38 individuals, 12 of whom were known to have Charcot-Marie-Tooth disease type 1A (CMT1A). Measurement of microarray-based endpoints for MAPH appears to be of comparable accuracy to electrophoretic methods, and holds the prospect of fully exploiting the potential multiplicity of MAPH. The technology has the potential to simplify copy number assays for genes with a large number of exons, or of expanded sets of probes from dispersed genomic locations.
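
    The dual-label readout common to both records can be sketched numerically: the Cy3(test)/Cy5(reference) ratio of a probe is scaled by the typical ratio of control probes, so a normal two-copy locus reports a dosage near 1.0 and a CMT1A duplication (three copies of PMP22) near 1.5. All intensities below are invented for illustration:

```python
def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

def relative_copy_number(test_intensity, ref_intensity, normaliser):
    """Cy3(test)/Cy5(reference) ratio for one probe, scaled by the median
    ratio of control probes so a two-copy locus reports ~1.0."""
    return (test_intensity / ref_intensity) / normaliser

# Hypothetical (Cy3, Cy5) intensities for control probes at two-copy loci.
controls = [(1020, 980), (1540, 1490), (880, 910), (2010, 1970)]
normaliser = median([t / r for t, r in controls])

# A PMP22 probe in a CMT1A duplication carrier: three copies instead of
# two, so the normalised dual-label ratio should come out near 1.5.
pmp22_cy3, pmp22_cy5 = 3100, 2000
dosage = relative_copy_number(pmp22_cy3, pmp22_cy5, normaliser)
```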

  10. Desk-top computer assisted processing of thermoluminescent dosimeters

    International Nuclear Information System (INIS)

    Archer, B.R.; Glaze, S.A.; North, L.B.; Bushong, S.C.

    1977-01-01

    An accurate dosimetric system utilizing a desk-top computer and high sensitivity ribbon type TLDs has been developed. The system incorporates an exposure history file and procedures designed for constant spatial orientation of each dosimeter. Processing of information is performed by two computer programs. The first calculates relative response factors to insure that the corrected response of each TLD is identical following a given dose of radiation. The second program computes a calibration factor and uses it and the relative response factor to determine the actual dose registered by each TLD. (U.K.)
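
    The two-program flow can be sketched as simple arithmetic. This is a minimal illustration with invented reader counts, assuming (as the abstract states) that the first program computes relative response factors that equalise each TLD's corrected response to a common dose, and the second converts corrected readings to dose via a calibration factor:

```python
def relative_response_factors(readings):
    """Program 1: per-chip factors chosen so that, after correction, every
    TLD given the same dose reports the batch-mean reading."""
    mean = sum(readings) / len(readings)
    return [mean / r for r in readings]

def calibration_factor(readings, factors, known_dose):
    """Program 2: dose units per corrected reader count, from chips that
    received a known calibration dose."""
    corrected = [r * f for r, f in zip(readings, factors)]
    return known_dose / (sum(corrected) / len(corrected))

# Hypothetical irradiation of four chips with a known 10 mGy dose.
cal_readings = [980.0, 1020.0, 1005.0, 995.0]
rrf = relative_response_factors(cal_readings)
cf = calibration_factor(cal_readings, rrf, 10.0)

# Field measurement: dose = raw reading * that chip's RRF * calibration factor.
field_dose = 1010.0 * rrf[1] * cf
```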

  11. A computational clonal analysis of the developing mouse limb bud.

    Directory of Open Access Journals (Sweden)

    Luciano Marcon

    A comprehensive spatio-temporal description of the tissue movements underlying organogenesis would be an extremely useful resource for developmental biology. Clonal analysis and fate mapping are popular experiments for studying tissue movement during morphogenesis. Such experiments allow cell populations to be labeled at an early stage of development and their spatial evolution to be followed over time. However, disentangling the cumulative effects of the multiple events responsible for the expansion of the labeled cell population is not always straightforward. To overcome this problem, we develop a novel computational method that combines accurate quantification of 2D limb bud morphologies and growth modeling to analyze mouse clonal data of early limb development. Firstly, we explore various tissue movements that match experimental limb bud shape changes. Secondly, by comparing computational clones with newly generated mouse clonal data, we are able to choose and characterize the tissue movement map that best matches the experimental data. Our computational analysis produces for the first time a two-dimensional model of limb growth based on experimental data that can be used to better characterize limb tissue movement in space and time. The model shows that the distribution and shapes of clones can be described as a combination of anisotropic growth with isotropic cell mixing, without the need for lineage compartmentalization along the AP and PD axes. Lastly, we show that this comprehensive description can be used to reassess spatio-temporal gene regulation taking tissue movement into account and to investigate PD patterning hypotheses.

  12. Neonatal tolerance induction enables accurate evaluation of gene therapy for MPS I in a canine model.

    Science.gov (United States)

    Hinderer, Christian; Bell, Peter; Louboutin, Jean-Pierre; Katz, Nathan; Zhu, Yanqing; Lin, Gloria; Choa, Ruth; Bagel, Jessica; O'Donnell, Patricia; Fitzgerald, Caitlin A; Langan, Therese; Wang, Ping; Casal, Margret L; Haskins, Mark E; Wilson, James M

    2016-09-01

    High fidelity animal models of human disease are essential for preclinical evaluation of novel gene and protein therapeutics. However, these studies can be complicated by exaggerated immune responses against the human transgene. Here we demonstrate that dogs with a genetic deficiency of the enzyme α-l-iduronidase (IDUA), a model of the lysosomal storage disease mucopolysaccharidosis type I (MPS I), can be rendered immunologically tolerant to human IDUA through neonatal exposure to the enzyme. Using MPS I dogs tolerized to human IDUA as neonates, we evaluated intrathecal delivery of an adeno-associated virus serotype 9 vector expressing human IDUA as a therapy for the central nervous system manifestations of MPS I. These studies established the efficacy of the human vector in the canine model, and allowed for estimation of the minimum effective dose, providing key information for the design of first-in-human trials. This approach can facilitate evaluation of human therapeutics in relevant animal models, and may also have clinical applications for the prevention of immune responses to gene and protein replacement therapies. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. CMEIAS color segmentation: an improved computing technology to process color images for quantitative microbial ecology studies at single-cell resolution.

    Science.gov (United States)

    Gross, Colin A; Reddy, Chandan K; Dazzo, Frank B

    2010-02-01

    Quantitative microscopy and digital image analysis are underutilized in microbial ecology, largely because of the laborious task of segmenting foreground object pixels from background, especially in complex color micrographs of environmental samples. In this paper, we describe an improved computing technology developed to alleviate this limitation. The system's uniqueness is its ability to edit digital images accurately when presented with the difficult yet commonplace challenge of removing background pixels whose three-dimensional color space overlaps the range that defines foreground objects. Image segmentation is accomplished by utilizing algorithms that address color and spatial relationships of user-selected foreground object pixels. Performance of the color segmentation algorithm evaluated on 26 complex micrographs at single-pixel resolution had an overall pixel classification accuracy of 99+%. Several applications illustrate how this improved computing technology can successfully resolve numerous challenges of complex color segmentation, producing images from which quantitative information can be accurately extracted and thereby yielding new perspectives on the in situ ecology of microorganisms. Examples include improvements in the quantitative analysis of (1) microbial abundance and phylotype diversity of single cells classified by their discriminating color within heterogeneous communities, (2) cell viability, (3) spatial relationships and intensity of bacterial gene expression involved in cellular communication between individual cells within rhizoplane biofilms, and (4) biofilm ecophysiology based on ribotype-differentiated radioactive substrate utilization. The stand-alone executable file plus user manual and tutorial images for this color segmentation computing application are freely available at http://cme.msu.edu/cmeias/. This improved computing technology opens new opportunities for imaging applications where discriminating colors really matter.

  14. Chromosomal location and nucleotide sequence of the Escherichia coli dapA gene.

    Science.gov (United States)

    Richaud, F; Richaud, C; Ratet, P; Patte, J C

    1986-04-01

    In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amino acid polypeptide of 31,372 daltons.

  15. Chromosomal location and nucleotide sequence of the Escherichia coli dapA gene.

    Science.gov (United States)

    Richaud, F; Richaud, C; Ratet, P; Patte, J C

    1986-01-01

    In Escherichia coli, the first enzyme of the diaminopimelate and lysine pathway is dihydrodipicolinate synthetase, which is feedback-inhibited by lysine and encoded by the dapA gene. The location of the dapA gene on the bacterial chromosome has been determined accurately with respect to the neighboring purC and dapE genes. The complete nucleotide sequence and the transcriptional start of the dapA gene were determined. The results show that dapA consists of a single cistron encoding a 292-amino acid polypeptide of 31,372 daltons. PMID:3514578

  16. Variational-moment method for computing magnetohydrodynamic equilibria

    International Nuclear Information System (INIS)

    Lao, L.L.

    1983-08-01

    A fast yet accurate method to compute magnetohydrodynamic equilibria is provided by the variational-moment method, which is similar to the classical Rayleigh-Ritz-Galerkin approximation. The equilibrium solution sought is decomposed into a spectral representation. The partial differential equations describing the equilibrium are then recast into their equivalent variational form and systematically reduced to an optimum finite set of coupled ordinary differential equations. An appropriate spectral decomposition can make the series representing the solution converge rapidly and hence substantially reduces the amount of computational time involved. The moment method was first developed to compute fixed-boundary inverse equilibria in axisymmetric toroidal geometry, and was demonstrated to be both efficient and accurate. The method has since been generalized to calculate free-boundary axisymmetric equilibria, to include toroidal plasma rotation and pressure anisotropy, and to treat three-dimensional toroidal geometry. In all these formulations, the flux surfaces are assumed to be smooth and nested so that the solutions can be decomposed in Fourier series in inverse coordinates. These recent developments and the advantages and limitations of the moment method are reviewed. The use of alternate coordinates for decomposition is discussed.
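
    For the axisymmetric inverse-coordinate case, the kind of spectral decomposition the abstract refers to can be written as a Fourier series for the flux-surface shape (the notation below is ours, not the paper's):

```latex
R(\rho,\theta) = \sum_{m=0}^{M} R_m(\rho)\cos m\theta, \qquad
Z(\rho,\theta) = \sum_{m=1}^{M} Z_m(\rho)\sin m\theta .
```

    Substituting this ansatz into the variational form of the equilibrium equations and requiring stationarity with respect to each amplitude yields the finite set of coupled ordinary differential equations for the moments R_m(ρ) and Z_m(ρ) mentioned in the abstract.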

  17. Fast and accurate automated cell boundary determination for fluorescence microscopy

    Science.gov (United States)

    Arce, Stephen Hugo; Wu, Pei-Hsun; Tseng, Yiider

    2013-07-01

    Detailed measurement of cell phenotype information from digital fluorescence images has the potential to greatly advance biomedicine in various disciplines such as patient diagnostics or drug screening. Yet, the complexity of cell conformations presents a major barrier preventing effective determination of cell boundaries, and introduces measurement error that propagates throughout subsequent assessment of cellular parameters and statistical analysis. State-of-the-art image segmentation techniques that require user-interaction, prolonged computation time and specialized training cannot adequately provide the support for high content platforms, which often sacrifice resolution to foster the speedy collection of massive amounts of cellular data. This work introduces a strategy that allows us to rapidly obtain accurate cell boundaries from digital fluorescent images in an automated format. Hence, this new method has broad applicability to promote biotechnology.

  18. Accurate Calculation of Fringe Fields in the LHC Main Dipoles

    CERN Document Server

    Kurz, S; Siegel, N

    2000-01-01

    The ROXIE program developed at CERN for the design and optimization of the superconducting LHC magnets has recently been extended, in a collaboration with the University of Stuttgart, Germany, with a field computation method based on the coupling between the boundary element (BEM) and finite element (FEM) techniques. This avoids the meshing of the coils and the air regions, as well as artificial far-field boundary conditions. The method is therefore especially suited for the accurate calculation of fields in superconducting magnets in which the field is dominated by the coil. We present fringe field calculations in both 2D and 3D geometries to evaluate the effect of the connections and the cryostat on the field quality and the flux density to which auxiliary bus-bars are exposed.

  19. RIO: a new computational framework for accurate initial data of binary black holes

    Science.gov (United States)

    Barreto, W.; Clemente, P. C. M.; de Oliveira, H. P.; Rodriguez-Mueller, B.

    2018-06-01

    We present a computational framework (Rio) in the ADM 3+1 approach for numerical relativity. This work enables us to carry out high-resolution calculations for initial data of two arbitrary black holes. We use the transverse conformal treatment and the Bowen-York and puncture methods. For the numerical solution of the Hamiltonian constraint we use domain decomposition and the spectral decomposition of Galerkin-Collocation. The nonlinear numerical code solves the set of equations for the spectral modes using the standard Newton-Raphson method, LU decomposition and Gaussian quadratures. We show the convergence of the Rio code, which allows for easy deployment of large calculations. We show how the spin of one of the black holes is manifest in the conformal factor.

  20. Identifying the impact of G-quadruplexes on Affymetrix 3' arrays using cloud computing.

    Science.gov (United States)

    Memon, Farhat N; Owen, Anne M; Sanchez-Graillet, Olivia; Upton, Graham J G; Harrison, Andrew P

    2010-01-15

    A tetramer quadruplex structure is formed by four parallel strands of DNA/RNA containing runs of guanine. These quadruplexes are able to form because guanine can Hoogsteen hydrogen bond to other guanines, and a tetrad of guanines can form a stable arrangement. Recently we have discovered that probes on Affymetrix GeneChips that contain runs of guanine do not measure gene expression reliably. We associate this finding with the likelihood that quadruplexes are forming on the surface of GeneChips. In order to cope with the rapidly expanding size of GeneChip array datasets in the public domain, we are exploring the use of cloud computing to replicate our experiments on 3' arrays to look at the effect of the location of G-spots (runs of guanines). Cloud computing is a recently introduced high-performance solution that takes advantage of the computational infrastructure of large organisations such as Amazon and Google. We expect that cloud computing will become widely adopted because it enables bioinformaticians to avoid capital expenditure on expensive computing resources and to pay a cloud computing provider only for what is used. Moreover, as well as financial efficiency, cloud computing is an ecologically friendly technology that enables efficient data-sharing, and we expect it to be faster for development purposes. Here we propose the advantageous use of cloud computing to perform a large data-mining analysis of public domain 3' arrays.
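
    Flagging probes that contain a G-spot is a simple pattern-matching step that can be sketched before any large-scale analysis. A minimal sketch; the probe names and sequences below are invented, and the choice of four guanines as the minimum run length is an assumption (four stacked guanines form one tetrad plane):

```python
import re

# A run of four or more guanines can support G-quadruplex formation on the
# array surface, so such probes may not track gene expression reliably.
G_RUN = re.compile(r"G{4,}")

def has_g_spot(probe_seq):
    """Flag a probe whose sequence contains a run of >= 4 guanines."""
    return bool(G_RUN.search(probe_seq.upper()))

probes = {
    "probe_1": "ATCGGGGTACGT",   # contains GGGG -> flagged
    "probe_2": "ATCGGGTACGTA",   # only GGG     -> kept
}
flagged = [name for name, seq in probes.items() if has_g_spot(seq)]
```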

  1. GeneBreak: detection of recurrent DNA copy number aberration-associated chromosomal breakpoints within genes [version 2; referees: 2 approved

    Directory of Open Access Journals (Sweden)

    Evert van den Broek

    2017-07-01

    Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is a lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. 'GeneBreak' was developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks using a genome-wide approach, which can be applied to DNA copy number data obtained by array Comparative Genomic Hybridization (CGH) or by (low-pass) whole-genome sequencing (WGS). First, 'GeneBreak' collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm applied to obtain the CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics are incorporated, with correction for covariates that influence the probability of being a breakpoint gene. In addition, multiple-testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, 'GeneBreak', is implemented in R (www.cran.r-project.org) and is available from Bioconductor (www.bioconductor.org/packages/release/bioc/html/GeneBreak.html).
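
    The core of breakpoint-to-gene mapping is an interval-overlap test: a gene is a candidate "breakpoint gene" when its genomic interval spans the location of a CNA-associated break. A minimal sketch of that step (GeneBreak itself is an R package; the gene coordinates and breakpoints below are invented for illustration):

```python
# Toy gene annotation: (name, chromosome, start, end). Invented coordinates.
genes = [
    ("GENE_A", "chr1", 1_000, 5_000),
    ("GENE_B", "chr1", 8_000, 12_000),
    ("GENE_C", "chr2", 2_000, 6_000),
]

def genes_at_breakpoint(chrom, pos, gene_table):
    """Return the names of genes whose interval contains the breakpoint."""
    return [name for name, c, start, end in gene_table
            if c == chrom and start <= pos <= end]

# Breakpoints recovered from a (hypothetical) segmented copy-number profile:
# one falls inside GENE_B, the other in an intergenic region.
breakpoints = [("chr1", 9_500), ("chr2", 7_000)]
hits = [genes_at_breakpoint(c, p, genes) for c, p in breakpoints]
```

    Counting such hits per gene across a tumor cohort, and then testing the counts against covariate-corrected expectations, is the cohort-based statistics step the abstract describes.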

  2. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases, including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
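
    At its core, ERC scores a gene pair by correlating their branch-specific evolutionary rates over the same set of branches of a species tree. A minimal sketch of that computation; the rate vectors below are invented, and real ERC pipelines additionally normalise rates and assess significance against a null distribution:

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length rate vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Relative evolutionary rates of two genes on the same six branches of a
# species tree (values invented). A high correlation is the ERC signature
# suggesting the two genes contribute to a shared function.
rates_gene1 = [0.8, 1.3, 0.6, 1.1, 1.6, 0.7]
rates_gene2 = [0.9, 1.2, 0.5, 1.0, 1.7, 0.8]
erc = pearson(rates_gene1, rates_gene2)
```

    Candidate-gene prioritization then amounts to ranking each candidate by its ERC score against the known disease genes.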

  3. Computational identification of putative cytochrome P450 genes in ...

    African Journals Online (AJOL)

    Chattha

    Economically, legumes represent the second most important family of crop plants after Poaceae (the grass family), accounting for ... further characterization of P450 genes with both known and unknown functions. MATERIALS AND METHODS ..... Cytochrome P450. In: Somerville CR, Meyerowitz EM (eds) The Arabidopsis Book,

  4. An accurate solver for forward and inverse transport

    International Nuclear Information System (INIS)

    Monard, Francois; Bal, Guillaume

    2010-01-01

    This paper presents a robust and accurate way to solve steady-state linear transport (radiative transfer) equations numerically. Our main objective is to address the inverse transport problem, in which the optical parameters of a domain of interest are reconstructed from measurements performed at the domain's boundary. This inverse problem has important applications in medical and geophysical imaging, and more generally in any field involving high frequency waves or particles propagating in scattering environments. Stable solutions of the inverse transport problem require that the singularities of the measurement operator, which maps the optical parameters to the available measurements, be captured with sufficient accuracy. This in turn requires that the free propagation of particles be calculated with care, which is a difficult problem on a Cartesian grid. A standard discrete ordinates method is used for the direction of propagation of the particles. Our methodology to address spatial discretization is based on rotating the computational domain so that each direction of propagation is always aligned with one of the grid axes. Rotations are performed in the Fourier domain to achieve spectral accuracy. The numerical dispersion of the propagating particles is therefore minimal. As a result, the ballistic and single scattering components of the transport solution are calculated robustly and accurately. Physical blurring effects, such as small angular diffusion, are also incorporated into the numerical tool. Forward and inverse calculations performed in a two-dimensional setting exemplify the capabilities of the method. Although the methodology might not be the fastest way to solve transport equations, its physical accuracy provides us with a numerical tool to assess what can and cannot be reconstructed in inverse transport theory.

  5. A computationally simple and robust method to detect determinism in a time series

    DEFF Research Database (Denmark)

    Lu, Sheng; Ju, Ki Hwan; Kanters, Jørgen K.

    2006-01-01

    We present a new, simple, and fast computational technique, termed the incremental slope (IS), that can accurately distinguish deterministic from stochastic systems even when the variance of the noise is as large as or greater than that of the signal, and that remains robust for time-varying signals. The IS ...

  6. Semi-supervised prediction of gene regulatory networks using ...

    Indian Academy of Sciences (India)

    2015-09-28

    Sep 28, 2015 ... Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging ... two types of methods differ primarily based on whether ... negligible, allowing us to draw the qualitative conclusions ... research will be conducted to develop additional biologically ...

  7. Efficient strategy for detecting gene × gene joint action and its application in schizophrenia.

    Science.gov (United States)

    Won, Sungho; Kwon, Min-Seok; Mattheisen, Manuel; Park, Suyeon; Park, Changsoon; Kihara, Daisuke; Cichon, Sven; Ophoff, Roel; Nöthen, Markus M; Rietschel, Marcella; Baur, Max; Uitterlinden, Andre G; Hofmann, A; Lange, Christoph

    2014-01-01

    We propose a new approach to detect gene × gene joint action in genome-wide association studies (GWASs) for case-control designs. This approach offers an exhaustive search for all two-way joint action (including, as a special case, single gene action) that is computationally feasible at the genome-wide level and has reasonable statistical power under most genetic models. We found that the presence of any gene × gene joint action may imply differences in three types of genetic components: the minor allele frequencies and the amounts of Hardy-Weinberg disequilibrium may differ between cases and controls, and between the two genetic loci the degree of linkage disequilibrium may differ between cases and controls. Using Fisher's method, it is possible to combine the different sources of genetic information in an overall test for detecting gene × gene joint action. The proposed statistical analysis is efficient and its simplicity makes it applicable to GWASs. In the current study, we applied the proposed approach to a GWAS on schizophrenia and found several potential gene × gene interactions. Our application illustrates the practical advantage of the proposed method. © 2013 WILEY PERIODICALS, INC.
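
    The overall test described above combines the different sources of genetic information with Fisher's method. A minimal stdlib sketch of that combination step (the three p-values below are hypothetical, not taken from the study):

    ```python
    import math

    def fisher_combine(pvalues):
        """Combine independent p-values with Fisher's method.

        X = -2 * sum(ln p_i) follows a chi-square distribution with
        2k degrees of freedom under the global null hypothesis.
        """
        k = len(pvalues)
        x = -2.0 * sum(math.log(p) for p in pvalues)
        # Chi-square survival function for even df = 2k has a closed form:
        # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
        half = x / 2.0
        term, total = 1.0, 1.0
        for i in range(1, k):
            term *= half / i
            total += term
        return math.exp(-half) * total

    # Combining evidence from three sources (hypothetical p-values):
    print(fisher_combine([0.01, 0.04, 0.10]))
    ```

    With a single p-value the combination is the identity, which is a convenient sanity check on the closed-form survival function.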

  8. Cross-scale Efficient Tensor Contractions for Coupled Cluster Computations Through Multiple Programming Model Backends

    Energy Technology Data Exchange (ETDEWEB)

    Ibrahim, Khaled Z. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Epifanovsky, Evgeny [Q-Chem, Inc., Pleasanton, CA (United States); Williams, Samuel W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Krylov, Anna I. [Univ. of Southern California, Los Angeles, CA (United States). Dept. of Chemistry

    2016-07-26

    Coupled-cluster methods provide highly accurate models of molecular structure by explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix-matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has previously been achieved via dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed-memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the best optimized shared-memory implementation. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 & XC40, BlueGene/Q) and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from compute-bound DGEMMs to communication-bound collectives with increasing molecular system size, we adopt two radically different parallelization approaches for handling load imbalance. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.
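
    The reduction the abstract leans on — expressing a tensor contraction as one matrix-matrix multiplication by flattening index groups — can be shown in a toy, pure-Python sketch (real libraries such as Libtensor dispatch the flattened product to optimized DGEMM kernels rather than nested loops; the tensors below are illustrative):

    ```python
    def contract(T, M):
        """Tensor contraction C[i][j][a] = sum_k T[i][j][k] * M[k][a],
        computed by flattening (i, j) into a single row index so the whole
        contraction becomes one matrix-matrix multiplication."""
        I, J, K = len(T), len(T[0]), len(T[0][0])
        A = len(M[0])
        flat = [T[i][j] for i in range(I) for j in range(J)]       # (I*J) x K
        prod = [[sum(row[k] * M[k][a] for k in range(K)) for a in range(A)]
                for row in flat]                                   # (I*J) x A
        # un-flatten the row index back into (i, j)
        return [[prod[i * J + j] for j in range(J)] for i in range(I)]

    T = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # shape 2x2x2
    M = [[1, 0], [0, 1]]                       # identity: contraction returns T
    print(contract(T, M) == T)  # → True
    ```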

  9. Probability-based collaborative filtering model for predicting gene–disease associations

    OpenAIRE

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-01-01

    Background: Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene–disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expense. Methods: We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data of humans and data of other nonhuman species, are integrated in our model ...

  10. Investigating Gene Function in Cereal Rust Fungi by Plant-Mediated Virus-Induced Gene Silencing.

    Science.gov (United States)

    Panwar, Vinay; Bakkeren, Guus

    2017-01-01

    Cereal rust fungi are destructive pathogens, threatening grain production worldwide. Targeted breeding for resistance utilizing host resistance genes has been effective. However, breakdown of resistance occurs frequently, and continued efforts are needed to understand how these fungi overcome resistance and to expand the range of available resistance genes. Whole-genome sequencing, transcriptomic and proteomic studies followed by genome-wide computational and comparative analyses have identified a large repertoire of genes in rust fungi, among which are candidates predicted to code for pathogenicity and virulence factors. Some of these genes represent defence-triggering avirulence effectors. However, the functions of most genes still need to be assessed to understand the biology of these obligate biotrophic pathogens. Since genetic manipulations such as gene deletion and genetic transformation are not yet feasible in rust fungi, performing functional gene studies is challenging. Recently, host-induced gene silencing (HIGS) has emerged as a useful tool to characterize gene function in rust fungi while they infect and grow in host plants. We utilized Barley stripe mosaic virus-mediated virus-induced gene silencing (BSMV-VIGS) to induce HIGS of candidate rust fungal genes in the wheat host to determine their role in plant-fungal interactions. Here, we describe the methods for using BSMV-VIGS in wheat for functional genomics studies of cereal rust fungi.

  11. Simple Comparative Analyses of Differentially Expressed Gene Lists May Overestimate Gene Overlap.

    Science.gov (United States)

    Lawhorn, Chelsea M; Schomaker, Rachel; Rowell, Jonathan T; Rueppell, Olav

    2018-04-16

    Comparing the overlap between sets of differentially expressed genes (DEGs) within or between transcriptome studies is regularly used to infer similarities between biological processes. Significant overlap between two sets of DEGs is usually determined by a simple test. The number of potentially overlapping genes is compared to the number of genes that actually occur in both lists, treating every gene as equal. However, gene expression is controlled by transcription factors that bind to a variable number of transcription factor binding sites, leading to variation among genes in general variability of their expression. Neglecting this variability could therefore lead to inflated estimates of significant overlap between DEG lists. With computer simulations, we demonstrate that such biases arise from variation in the control of gene expression. Significant overlap commonly arises between two lists of DEGs that are randomly generated, assuming that the control of gene expression is variable among genes but consistent between corresponding experiments. More overlap is observed when transcription factors are specific to their binding sites and when the number of genes is considerably higher than the number of different transcription factors. In contrast, overlap between two DEG lists is always lower than expected when the genetic architecture of expression is independent between the two experiments. Thus, the current methods for determining significant overlap between DEGs are potentially confounding biologically meaningful overlap with overlap that arises due to variability in control of expression among genes, and more sophisticated approaches are needed.
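
    The "simple test" the authors critique is typically a hypergeometric overlap test that treats every gene as equally likely to appear in a list — exactly the equal-weight assumption the study questions. A minimal sketch of that baseline test (the gene counts below are illustrative):

    ```python
    from math import comb

    def overlap_pvalue(n_universe, n_list1, n_list2, n_overlap):
        """Hypergeometric upper-tail probability of observing at least
        `n_overlap` shared genes between two DEG lists drawn from a common
        gene universe, treating every gene as equal."""
        total = comb(n_universe, n_list2)
        p = 0.0
        for k in range(n_overlap, min(n_list1, n_list2) + 1):
            p += comb(n_list1, k) * comb(n_universe - n_list1, n_list2 - k) / total
        return p

    # 10 shared genes between two 100-gene lists from a 10,000-gene universe
    # (expected overlap under the equal-gene null is only 1):
    print(overlap_pvalue(10000, 100, 100, 10))
    ```

    The paper's point is that this p-value is anti-conservative when expression variability differs among genes, not that the arithmetic above is wrong.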

  12. Accurate collision integrals for the attractive static screened Coulomb potential with application to electrical conductivity

    International Nuclear Information System (INIS)

    Macdonald, J.

    1991-01-01

    The results of accurate calculations of collision integrals for the attractive static screened Coulomb potential are presented. To obtain high accuracy with minimal computational cost, the integrals are evaluated by a quadrature method based on the Whittaker cardinal function. The collision integrals for the attractive potential are needed for calculation of the electrical conductivity of a dense fully or partially ionized plasma, and the results presented here are appropriate for the conditions in the nondegenerate envelopes of white dwarf stars. 25 refs
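
    The abstract does not detail the Whittaker-cardinal quadrature, but the underlying idea — integrating the sinc (cardinal-series) interpolant of the integrand term by term, which collapses to a plain trapezoidal sum with exponential convergence for analytic, rapidly decaying integrands — can be sketched as follows (step size and truncation are illustrative choices, not the paper's):

    ```python
    import math

    def sinc_quadrature(f, h=0.5, n=40):
        """Integrate f over the real line by summing the Whittaker
        cardinal-series interpolant term by term, which reduces to the
        trapezoidal rule h * sum f(k h) on an infinite grid (truncated
        here at |k| <= n)."""
        return h * sum(f(k * h) for k in range(-n, n + 1))

    # Gaussian test integrand: the exact integral of exp(-x^2) is sqrt(pi)
    approx = sinc_quadrature(lambda x: math.exp(-x * x))
    print(approx)  # ≈ 1.7724538509 (sqrt(pi)), accurate to near machine precision
    ```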

  13. A numerical method to compute interior transmission eigenvalues

    International Nuclear Information System (INIS)

    Kleefeld, Andreas

    2013-01-01

    In this paper the numerical calculation of eigenvalues of the interior transmission problem arising in acoustic scattering for constant contrast in three dimensions is considered. From the computational point of view existing methods are very expensive, and are only able to show the existence of such transmission eigenvalues. Furthermore, they have trouble finding them if two or more eigenvalues are situated closely together. We present a new method based on complex-valued contour integrals and the boundary integral equation method which is able to calculate highly accurate transmission eigenvalues. So far, this is the first paper providing such accurate values for various surfaces different from a sphere in three dimensions. Additionally, the computational cost is even lower than those of existing methods. Furthermore, the algorithm is capable of finding complex-valued eigenvalues for which no numerical results have been reported yet. Until now, the proof of existence of such eigenvalues is still open. Finally, highly accurate eigenvalues of the interior Dirichlet problem are provided and might serve as test cases to check newly derived Faber–Krahn type inequalities for larger transmission eigenvalues that are not yet available. (paper)
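
    The contour-integral idea can be illustrated with its simplest relative: counting the zeros of an analytic function inside a contour via the argument principle. The paper's boundary-integral eigenvalue solver is considerably more involved, so the sketch below (with an invented test function) only shows why contour integrals locate eigenvalues robustly even when several lie close together:

    ```python
    import cmath

    def count_zeros(f, df, radius=1.0, n=2000):
        """Count zeros of an analytic function inside |z| < radius using the
        argument principle, N = (1/2πi) ∮ f'(z)/f(z) dz, evaluated with the
        trapezoidal rule on the circle (spectrally accurate for smooth
        periodic integrands)."""
        total = 0.0j
        for k in range(n):
            t = 2 * cmath.pi * k / n
            z = radius * cmath.exp(1j * t)
            dz = 1j * z * (2 * cmath.pi / n)   # z'(t) dt
            total += df(z) / f(z) * dz
        return round((total / (2j * cmath.pi)).real)

    # f has roots at 0.3 and -0.5j (inside) and 2.0 (outside the unit circle)
    f = lambda z: (z - 0.3) * (z - 2.0) * (z + 0.5j)
    df = lambda z: (z - 2.0) * (z + 0.5j) + (z - 0.3) * (z + 0.5j) + (z - 0.3) * (z - 2.0)
    print(count_zeros(f, df))  # → 2
    ```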

  14. An accurate projection algorithm for array processor based SPECT systems

    International Nuclear Information System (INIS)

    King, M.A.; Schwinger, R.B.; Cool, S.L.

    1985-01-01

    A data re-projection algorithm has been developed for use in single photon emission computed tomography (SPECT) on an array processor based computer system. The algorithm makes use of an accurate representation of pixel activity (uniform square pixel model of intensity distribution), and is rapidly performed due to the efficient handling of an array based algorithm and the Fast Fourier Transform (FFT) on parallel processing hardware. The algorithm consists of using a pixel driven nearest neighbour projection operation to an array of subdivided projection bins. This result is then convolved with the projected uniform square pixel distribution before being compressed to original bin size. This distribution varies with projection angle and is explicitly calculated. The FFT combined with a frequency space multiplication is used instead of a spatial convolution for more rapid execution. The new algorithm was tested against other commonly used projection algorithms by comparing the accuracy of projections of a simulated transverse section of the abdomen against analytically determined projections of that transverse section. The new algorithm was found to yield comparable or better standard error and yet result in easier and more efficient implementation on parallel hardware. Applications of the algorithm include iterative reconstruction and attenuation correction schemes and evaluation of regions of interest in dynamic and gated SPECT
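
    The key speed trick above — replacing a spatial convolution by an FFT and a frequency-space multiplication — can be sketched with a toy circular convolution (a from-scratch radix-2 FFT; the pixel profile and kernel below are illustrative, not the algorithm's actual projected-pixel distribution):

    ```python
    import cmath

    def fft(x, inverse=False):
        """Radix-2 Cooley-Tukey FFT (input length must be a power of two)."""
        n = len(x)
        if n == 1:
            return list(x)
        sign = 1 if inverse else -1
        even = fft(x[0::2], inverse)
        odd = fft(x[1::2], inverse)
        out = [0j] * n
        for k in range(n // 2):
            w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
            out[k] = even[k] + w
            out[k + n // 2] = even[k] - w
        return out

    def circular_convolve(a, b):
        """Convolution theorem: transform, multiply pointwise in frequency
        space, inverse-transform -- the replacement for a slower spatial
        convolution."""
        prod = [x * y for x, y in zip(fft(a), fft(b))]
        return [v.real / len(a) for v in fft(prod, inverse=True)]

    pixel = [1.0, 2.0, 0.0, 0.0]    # toy pixel intensity profile
    kernel = [0.5, 0.5, 0.0, 0.0]   # toy projected-pixel response
    print(circular_convolve(pixel, kernel))  # ≈ [0.5, 1.5, 1.0, 0.0]
    ```

    For an n-point signal this costs O(n log n) per projection angle instead of O(n²), which is where the reported efficiency on parallel hardware comes from.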

  15. The duplicated genes database: identification and functional annotation of co-localised duplicated genes across genomes.

    Directory of Open Access Journals (Sweden)

    Marion Ouedraogo

    Full Text Available BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD) was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on the same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Genes Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Genes Database can be freely accessed through the DGD website at http://dgd.genouest.org.
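
    The co-expression score attached to each group of duplicated genes is a Pearson correlation between expression profiles. A stdlib sketch with hypothetical profiles (the tissue values below are invented for illustration):

    ```python
    import math

    def pearson(x, y):
        """Pearson correlation between two expression profiles, as used to
        score co-expression within a group of duplicated genes."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = math.sqrt(sum((a - mx) ** 2 for a in x))
        sy = math.sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    # Hypothetical expression of two duplicated genes across five tissues:
    gene_a = [2.0, 4.1, 6.0, 8.2, 10.1]
    gene_b = [1.1, 2.0, 2.9, 4.2, 5.0]
    print(pearson(gene_a, gene_b))  # close to 1: strongly co-expressed
    ```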

  16. Quick, Accurate, Smart: 3D Computer Vision Technology Helps Assessing Confined Animals' Behaviour.

    Directory of Open Access Journals (Sweden)

    Shanis Barnard

    Full Text Available Mankind directly controls the environment and lifestyles of several domestic species for purposes ranging from production and research to conservation and companionship. These environments and lifestyles may not offer these animals the best quality of life. Behaviour is a direct reflection of how the animal is coping with its environment. Behavioural indicators are thus among the preferred parameters to assess welfare. However, behavioural recording (usually from video) can be very time consuming and the accuracy and reliability of the output rely on the experience and background of the observers. The outburst of new video technology and computer image processing gives the basis for promising solutions. In this pilot study, we present a new prototype software able to automatically infer the behaviour of dogs housed in kennels from 3D visual data and through structured machine learning frameworks. Depth information acquired through 3D features, body part detection and training are the key elements that allow the machine to recognise postures, trajectories inside the kennel and patterns of movement that can be later labelled at convenience. The main innovation of the software is its ability to automatically cluster frequently observed temporal patterns of movement without any pre-set ethogram. Conversely, when common patterns are defined through training, a deviation from normal behaviour in time or between individuals could be assessed. The software accuracy in correctly detecting the dogs' behaviour was checked through a validation process. An automatic behaviour recognition system, independent from human subjectivity, could add scientific knowledge on animals' quality of life in confinement as well as saving time and resources. This 3D framework was designed to be invariant to the dog's shape and size and could be extended to farm, laboratory and zoo quadrupeds in artificial housing. The computer vision technique applied to this software is

  17. Quick, Accurate, Smart: 3D Computer Vision Technology Helps Assessing Confined Animals’ Behaviour

    Science.gov (United States)

    Calderara, Simone; Pistocchi, Simone; Cucchiara, Rita; Podaliri-Vulpiani, Michele; Messori, Stefano; Ferri, Nicola

    2016-01-01

    Mankind directly controls the environment and lifestyles of several domestic species for purposes ranging from production and research to conservation and companionship. These environments and lifestyles may not offer these animals the best quality of life. Behaviour is a direct reflection of how the animal is coping with its environment. Behavioural indicators are thus among the preferred parameters to assess welfare. However, behavioural recording (usually from video) can be very time consuming and the accuracy and reliability of the output rely on the experience and background of the observers. The outburst of new video technology and computer image processing gives the basis for promising solutions. In this pilot study, we present a new prototype software able to automatically infer the behaviour of dogs housed in kennels from 3D visual data and through structured machine learning frameworks. Depth information acquired through 3D features, body part detection and training are the key elements that allow the machine to recognise postures, trajectories inside the kennel and patterns of movement that can be later labelled at convenience. The main innovation of the software is its ability to automatically cluster frequently observed temporal patterns of movement without any pre-set ethogram. Conversely, when common patterns are defined through training, a deviation from normal behaviour in time or between individuals could be assessed. The software accuracy in correctly detecting the dogs’ behaviour was checked through a validation process. An automatic behaviour recognition system, independent from human subjectivity, could add scientific knowledge on animals’ quality of life in confinement as well as saving time and resources. This 3D framework was designed to be invariant to the dog’s shape and size and could be extended to farm, laboratory and zoo quadrupeds in artificial housing. The computer vision technique applied to this software is innovative in non

  18. Quick, Accurate, Smart: 3D Computer Vision Technology Helps Assessing Confined Animals' Behaviour.

    Science.gov (United States)

    Barnard, Shanis; Calderara, Simone; Pistocchi, Simone; Cucchiara, Rita; Podaliri-Vulpiani, Michele; Messori, Stefano; Ferri, Nicola

    2016-01-01

    Mankind directly controls the environment and lifestyles of several domestic species for purposes ranging from production and research to conservation and companionship. These environments and lifestyles may not offer these animals the best quality of life. Behaviour is a direct reflection of how the animal is coping with its environment. Behavioural indicators are thus among the preferred parameters to assess welfare. However, behavioural recording (usually from video) can be very time consuming and the accuracy and reliability of the output rely on the experience and background of the observers. The outburst of new video technology and computer image processing gives the basis for promising solutions. In this pilot study, we present a new prototype software able to automatically infer the behaviour of dogs housed in kennels from 3D visual data and through structured machine learning frameworks. Depth information acquired through 3D features, body part detection and training are the key elements that allow the machine to recognise postures, trajectories inside the kennel and patterns of movement that can be later labelled at convenience. The main innovation of the software is its ability to automatically cluster frequently observed temporal patterns of movement without any pre-set ethogram. Conversely, when common patterns are defined through training, a deviation from normal behaviour in time or between individuals could be assessed. The software accuracy in correctly detecting the dogs' behaviour was checked through a validation process. An automatic behaviour recognition system, independent from human subjectivity, could add scientific knowledge on animals' quality of life in confinement as well as saving time and resources. This 3D framework was designed to be invariant to the dog's shape and size and could be extended to farm, laboratory and zoo quadrupeds in artificial housing. The computer vision technique applied to this software is innovative in non

  19. Harnessing diversity towards the reconstructing of large scale gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Takeshi Hase

    Full Text Available Elucidating gene regulatory networks (GRNs) from large-scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus-driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy of combining many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet, integrating only high-performance algorithms, provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful for determining potentially optimal algorithms for reconstruction of unknown regulatory networks: if the expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, the optimal algorithms determined for the known network can be repurposed to infer the unknown one. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study, provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.
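
    TopkNet's integration scheme is not specified in the abstract; as a generic illustration of consensus inference, edge rankings from multiple algorithms can be combined by rank averaging (the edge names and scores below are invented):

    ```python
    def consensus_ranking(score_lists):
        """Rank-average consensus, a common 'wisdom of the crowds' scheme:
        each algorithm ranks candidate edges by its own score, and edges are
        re-ordered by their mean rank across algorithms (best first)."""
        edges = list(score_lists[0].keys())
        mean_rank = {e: 0.0 for e in edges}
        for scores in score_lists:
            # rank 1 = strongest edge under this algorithm
            ordered = sorted(edges, key=lambda e: -scores[e])
            for r, e in enumerate(ordered, start=1):
                mean_rank[e] += r / len(score_lists)
        return sorted(edges, key=lambda e: mean_rank[e])

    # Edge confidence scores from two hypothetical GRN inference algorithms:
    algo1 = {"A->B": 0.9, "A->C": 0.2, "B->C": 0.5}
    algo2 = {"A->B": 0.7, "A->C": 0.6, "B->C": 0.1}
    print(consensus_ranking([algo1, algo2]))  # "A->B" ranks first for both
    ```

    The paper's observation that adding weak algorithms can hurt shows up naturally in schemes like this: a poorly performing algorithm contributes noisy ranks with the same weight as a strong one.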

  20. Diagnostic accuracy of GeneXpert MTB/RIF in musculoskeletal ...

    African Journals Online (AJOL)

    GeneXpert MTB/RIF is an accurate test for the detection of TB in tissue samples of HIV-infected ... continuous data were summarised by means and 95% CIs and non- ... One sample was excluded as the culture sample was sent in formalin.