WorldWideScience

Sample records for accurate genome alignment

  1. Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner

    DEFF Research Database (Denmark)

    Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan;

    2009-01-01

    MOTIVATION: The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary...

  2. Faster and More Accurate Sequence Alignment with SNAP

    CERN Document Server

    Zaharia, Matei; Curtis, Kristal; Fox, Armando; Patterson, David; Shenker, Scott; Stoica, Ion; Karp, Richard M; Sittler, Taylor

    2011-01-01

    We present the Scalable Nucleotide Alignment Program (SNAP), a new short and long read aligner that is both more accurate (i.e., aligns more reads with fewer errors) and 10-100x faster than state-of-the-art tools such as BWA. Unlike recent aligners based on the Burrows-Wheeler transform, SNAP uses a simple hash index of short seed sequences from the genome, similar to BLAST's. However, SNAP greatly reduces the number and cost of local alignment checks performed through several measures: it uses longer seeds to reduce the false positive locations considered, leverages larger memory capacities to speed index lookup, and excludes most candidate locations without fully computing their edit distance to the read. The result is an algorithm that scales well for reads from one hundred to thousands of bases long and provides a rich error model that can match classes of mutations (e.g., longer indels) that today's fast aligners ignore. We calculate that SNAP can align a dataset with 30x coverage of a human genome in le...
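
    The seed-index idea in this abstract can be illustrated with a minimal sketch. It is not SNAP's implementation: the function names are hypothetical, the seeds are non-overlapping, and a Hamming-distance check with early exit stands in for SNAP's edit-distance filtering, so indels are not handled here.

```python
from collections import defaultdict


def build_seed_index(genome, seed_len):
    """Map every seed_len-mer in the genome to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - seed_len + 1):
        index[genome[pos:pos + seed_len]].append(pos)
    return index


def hamming_within(a, b, limit):
    """Return the mismatch count between a and b, or None once it exceeds limit."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > limit:
                return None  # give up early instead of scoring the full candidate
    return mismatches


def align_read(read, genome, index, seed_len, max_mismatches):
    """Return (position, mismatches) for the best candidate, or None if none qualifies."""
    candidates = set()
    # Look up non-overlapping seeds of the read (an assumption of this sketch).
    for offset in range(0, len(read) - seed_len + 1, seed_len):
        seed = read[offset:offset + seed_len]
        for hit in index.get(seed, ()):
            start = hit - offset  # where the read would begin in the genome
            if 0 <= start <= len(genome) - len(read):
                candidates.add(start)
    best = None
    for start in sorted(candidates):
        window = genome[start:start + len(read)]
        distance = hamming_within(read, window, max_mismatches)
        if distance is not None and (best is None or distance < best[1]):
            best = (start, distance)
    return best
```

    Longer seeds produce fewer candidate locations per lookup, which is the trade-off the abstract describes; the early exit in `hamming_within` plays the role of rejecting a candidate without computing its full distance.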

  3. Genome Update: alignment of bacterial chromosomes

    DEFF Research Database (Denmark)

    Ussery, David; Jensen, Mette; Poulsen, Tine Rugh;

    2004-01-01

    There are four new microbial genomes listed in this month's Genome Update, three belonging to Gram-positive bacteria and one belonging to an archaeon that lives at pH 0; all of these genomes are listed in Table 1. The method of genome comparison this month is that of genome alignment and, as an ...

  4. Cactus: Algorithms for genome multiple sequence alignment

    OpenAIRE

    Paten, Benedict; Earl, Dent; Nguyen, Ngan; Diekhans, Mark; Zerbino, Daniel; Haussler, David

    2011-01-01

    Much attention has been given to the problem of creating reliable multiple sequence alignments in a model incorporating substitutions, insertions, and deletions. Far less attention has been paid to the problem of optimizing alignments in the presence of more general rearrangement and copy number variation. Using Cactus graphs, recently introduced for representing sequence alignments, we describe two complementary algorithms for creating genomic alignments. We have implemented these algorithms...

  5. Strategies and tools for whole genome alignments

    Energy Technology Data Exchange (ETDEWEB)

    Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas; Ishkhanov, Tigran; Ryaboy, Dmitriy; Rubin, Edward; Pachter, Lior; Dubchak, Inna

    2002-11-25

    The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.

  6. Accurate genome relative abundance estimation based on shotgun metagenomic reads.

    Directory of Open Access Journals (Sweden)

    Li C Xia

    Accurate estimation of microbial community composition based on metagenomic sequencing data is fundamental for subsequent metagenomics analysis. Prevalent estimation methods are mainly based on directly summarizing alignment results or variants of them, and often result in biased and/or unstable estimates. We have developed a unified probabilistic framework (named GRAMMy) by explicitly modeling read assignment ambiguities, genome size biases and read distributions along the genomes. A maximum likelihood method is employed to compute the Genome Relative Abundance of microbial communities using the Mixture Model theory (GRAMMy). GRAMMy has been demonstrated to give estimates that are accurate and robust across both simulated and real read benchmark datasets. We applied GRAMMy to a collection of 34 metagenomic read sets from four metagenomics projects and identified 99 frequent species (minimally 0.5% abundant in at least 50% of the datasets) in the human gut samples. Our results show substantial improvements over previous studies, such as adjusting the over-estimated abundance for Bacteroides species for human gut samples, by providing a new reference-based strategy for metagenomic sample comparisons. GRAMMy can be used flexibly with many read assignment tools (mapping, alignment or composition-based), even with low-sensitivity mapping results from huge short-read datasets. It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes.
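
    The mixture-model estimate described above can be sketched with a small expectation-maximization loop. This is an illustration under stated assumptions, not the GRAMMy code: the function names are hypothetical, `read_probs[i][g]` is assumed to be a precomputed probability of read i given genome g, and the genome-size correction is assumed to be a simple division of each mixing proportion by genome length.

```python
def estimate_mixing(read_probs, n_iter=200):
    """EM estimate of the fraction of reads drawn from each genome.

    read_probs[i][g] is the (assumed, precomputed) probability of read i
    given genome g; ambiguous reads have several non-zero entries.
    """
    n_genomes = len(read_probs[0])
    pi = [1.0 / n_genomes] * n_genomes  # start from a uniform mixture
    for _ in range(n_iter):
        totals = [0.0] * n_genomes
        for probs in read_probs:
            # E-step: share this read among genomes in proportion to pi * probability.
            weighted = [p * w for p, w in zip(probs, pi)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # the read matches no genome under the current estimate
            for g, w in enumerate(weighted):
                totals[g] += w / norm
        assigned = sum(totals)
        if assigned == 0.0:
            break
        # M-step: new mixing proportions are the normalized expected read counts.
        pi = [t / assigned for t in totals]
    return pi


def relative_abundance(pi, genome_lengths):
    """Correct read fractions for genome size (longer genomes yield more reads)."""
    per_base = [p / length for p, length in zip(pi, genome_lengths)]
    total = sum(per_base)
    return [x / total for x in per_base]
```

    Dividing by genome length converts a share of reads into a share of genome copies, which is one way to read the "genome size biases" the abstract mentions.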

  7. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    BACKGROUND: The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. METHODOLOGY: We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired-end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. CONCLUSIONS: We compare BFAST to a selection of large-scale alignment tools (BLAT, MAQ, SHRiMP, and SOAP) in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
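
    The final gapped local-alignment step mentioned above follows the textbook Smith-Waterman recurrence, sketched here for illustration. The scoring values and the linear gap penalty are assumptions of this sketch; BFAST's actual scoring scheme is not given in the abstract.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])  # base in a aligned to a gap in b
            bottom.append("-")
            i -= 1
        else:
            top.append("-")  # gap in a aligned to a base in b
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

    Because the full matrix costs time proportional to the product of the two lengths, an aligner would typically run this only on the candidate locations returned by its index, as the abstract describes.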

  8. Multiple Whole Genome Alignments Without a Reference Organism

    Energy Technology Data Exchange (ETDEWEB)

    Dubchak, Inna; Poliakov, Alexander; Kislyuk, Andrey; Brudno, Michael

    2009-01-16

    Multiple sequence alignments have become one of the most commonly used resources in genomics research. Most algorithms for multiple alignment of whole genomes rely either on a reference genome, against which all of the other sequences are laid out, or require a one-to-one mapping between the nucleotides of the genomes, preventing the alignment of recently duplicated regions. Both approaches have drawbacks for whole-genome comparisons. In this paper we present a novel symmetric alignment algorithm. The resulting alignments not only represent all of the genomes equally well, but also include all relevant duplications that occurred since the divergence from the last common ancestor. Our algorithm, implemented as a part of the VISTA Genome Pipeline (VGP), was used to align seven vertebrate and six Drosophila genomes. The resulting whole-genome alignments demonstrate a higher sensitivity and specificity than the pairwise alignments previously available through the VGP and have higher exon alignment accuracy than comparable public whole-genome alignments. Of the multiple alignment methods tested, ours performed the best at aligning genes from multigene families, perhaps the most challenging test for whole-genome alignments. Our whole-genome multiple alignments are available through the VISTA Browser at http://genome.lbl.gov/vista/index.shtml.

  9. Improving pan-genome annotation using whole genome multiple alignment

    Directory of Open Access Journals (Sweden)

    Salzberg Steven L

    2011-06-01

    BACKGROUND: Rapid annotation and comparison of genomes from multiple isolates (pan-genomes) is becoming commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and errors that hinder comparative analysis even within a single species. Tools are needed to compare and improve annotation quality across sets of closely related genomes. RESULTS: We introduce a new tool, Mugsy-Annotator, that identifies orthologs and evaluates annotation quality in prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated gene structures, including inconsistently located translation initiation sites and disrupted genes due to draft genome sequencing or pseudogenes. An evaluation of species pan-genomes using the tool indicates that such anomalies are common, especially at translation initiation sites. Mugsy-Annotator reports alternate annotations that improve consistency and are candidates for further review. CONCLUSIONS: Whole genome multiple alignment can be used to efficiently identify orthologs and annotation problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool Mugsy-Annotator assists re-annotation efforts by highlighting edits that improve annotation consistency.

  10. JAGuaR: junction alignments to genome for RNA-seq reads.

    Directory of Open Access Journals (Sweden)

    Yaron S Butterfield

    JAGuaR is an alignment protocol for RNA-seq reads that uses an extended reference to increase alignment sensitivity. It uses BWA to align reads to the genome and reference transcript models (including annotated exon-exon junctions), specifically allowing for the possibility of a single read spanning multiple exons. Reads aligned to the transcript models are then re-mapped onto genomic coordinates, transforming alignments that span multiple exons into large-gapped alignments on the genome. While JAGuaR does not detect novel junctions, we demonstrate how JAGuaR generates fast and accurate transcriptome alignments, which allows for both sensitive and specific SNV calling.
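
    The re-mapping step described above can be sketched as a coordinate conversion. This is an illustrative assumption of how such a conversion might look, not JAGuaR's code: the function names are hypothetical, exons are given as half-open genomic intervals on the forward strand, and the read is assumed to align to the transcript without mismatches or indels.

```python
def transcript_to_genome(exons, t_start, length):
    """Convert an ungapped transcript alignment into genomic blocks.

    exons: sorted list of (genomic_start, genomic_end), half-open, forward strand.
    t_start: 0-based start of the alignment on the spliced transcript.
    length: number of aligned bases.
    """
    if t_start < 0 or length <= 0:
        raise ValueError("t_start must be >= 0 and length must be > 0")
    blocks = []
    consumed = 0  # transcript bases covered by the exons already passed
    pos = t_start
    remaining = length
    for g_start, g_end in exons:
        exon_len = g_end - g_start
        if remaining > 0 and pos < consumed + exon_len:
            offset = pos - consumed  # where the alignment enters this exon
            take = min(exon_len - offset, remaining)
            blocks.append((g_start + offset, g_start + offset + take))
            pos += take
            remaining -= take
        consumed += exon_len
    if remaining:
        raise ValueError("alignment extends past the end of the transcript")
    return blocks


def blocks_to_cigar(blocks):
    """Render genomic blocks as a CIGAR-like string, with N for skipped introns."""
    parts = []
    for k, (start, end) in enumerate(blocks):
        if k > 0:
            parts.append(f"{start - blocks[k - 1][1]}N")
        parts.append(f"{end - start}M")
    return "".join(parts)
```

    For exons `[(100, 150), (200, 260), (300, 340)]`, a 30-base alignment starting at transcript position 40 becomes the blocks `[(140, 150), (200, 220)]`, which `blocks_to_cigar` renders as `10M50N20M`: the large-gapped genomic alignment the abstract refers to.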

  11. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    Directory of Open Access Journals (Sweden)

    Arabi E. keshk

    2014-05-01

    The merging of biology and computer science has created a new field called computational biology, or bioinformatics, that explores the capacities of computers to gain knowledge from biological data. Computational biology is rooted in the life sciences as well as in computer science, information science, and technology. The main problem in computational biology is sequence alignment, a way of arranging DNA, RNA or protein sequences to identify regions of similarity and relationships between sequences. This paper introduces an enhancement of the dynamic algorithm for genome sequence alignment, called EDAGSA. It fills the three main diagonals without filling the entire matrix with unused data. It obtains the optimal solution while decreasing execution time, and performance therefore increases. To illustrate the effectiveness of the proposed algorithm, it is compared with traditional methods such as the Needleman-Wunsch, Smith-Waterman and longest common subsequence algorithms. In addition, a database is implemented so that the algorithm can be used in multi-sequence alignments to search for the optimal sequence that matches a given sequence.
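
    Filling only a few diagonals of the dynamic-programming matrix is commonly done with a banded global alignment, sketched below. This is a generic banded Needleman-Wunsch scorer, not the EDAGSA algorithm itself: the function name and scoring values are assumptions, and `band=1` corresponds to the three main diagonals. The result matches the full-matrix score only when an optimal path stays inside the band.

```python
NEG_INF = float("-inf")


def banded_global_score(a, b, band=1, match=1, mismatch=-1, gap=-1):
    """Global alignment score using only cells with |i - j| <= band."""
    n, m = len(a), len(b)
    if abs(n - m) > band:
        raise ValueError("length difference exceeds the band; no path fits")
    # Each row is stored sparsely as {column: score}; cells outside the band are absent.
    prev = {j: j * gap for j in range(0, min(m, band) + 1)}
    for i in range(1, n + 1):
        curr = {}
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if j == 0:
                curr[j] = i * gap  # first column: a[:i] aligned to gaps only
                continue
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev.get(j - 1, NEG_INF) + sub,   # diagonal: (mis)match
                prev.get(j, NEG_INF) + gap,       # up: gap in b
                curr.get(j - 1, NEG_INF) + gap,   # left: gap in a
            )
        prev = curr
    return prev[m]
```

    With `band=1` each row touches at most three cells, so the work grows linearly with sequence length instead of quadratically; widening the band trades speed for tolerance of longer indels.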

  12. BBMap: A Fast, Accurate, Splice-Aware Aligner

    Energy Technology Data Exchange (ETDEWEB)

    Bushnell, Brian

    2014-03-17

    Alignment of reads is one of the primary computational tasks in bioinformatics. Of paramount importance to resequencing, alignment is also crucial to other areas: quality control, scaffolding, string-graph assembly, homology detection, assembly evaluation, error-correction, expression quantification, and even as a tool to evaluate other tools. An optimal aligner would greatly improve virtually any sequencing process, but optimal alignment is prohibitively expensive for gigabases of data. Here, we will present BBMap [1], a fast splice-aware aligner for short and long reads. We will demonstrate that BBMap has superior speed, sensitivity, and specificity to alternative high-throughput aligners bowtie2 [2], bwa [3], smalt [4], GSNAP [5], and BLASR [6].

  13. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    BACKGROUND: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics, the practice of comparing genomic sequences from different species, plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To compensate for the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. RESULTS: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as to refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  14. Optimizing cell arrays for accurate functional genomics

    Directory of Open Access Journals (Sweden)

    Fengler Sven

    2012-07-01

    BACKGROUND: Cellular responses emerge from a complex network of dynamic biochemical reactions. In order to investigate them, it is necessary to develop methods that allow perturbing a high number of gene products in a flexible and fast way. Cell arrays (CA) enable such experiments on microscope slides via reverse transfection of cellular colonies growing on spotted genetic material. In contrast to multi-well plates, CA are susceptible to contamination among neighboring spots, hindering accurate quantification in cell-based screening projects. Here we have developed a quality control protocol for quantifying and minimizing contamination in CA. RESULTS: We imaged checkered CA that express two distinct fluorescent proteins and segmented images into single cells to quantify the transfection efficiency and interspot contamination. Compared with standard procedures, we measured a 3-fold reduction of contaminants when arrays containing HeLa cells were washed shortly after cell seeding. We proved that nucleic acid uptake during cell seeding, rather than migration among neighboring spots, was the major source of contamination. Arrays of MCF7 cells developed without the washing step showed a 7-fold lower percentage of contaminant cells, demonstrating that contamination is dependent on specific cell properties. CONCLUSIONS: Previously published methodological works have focused on achieving a high transfection rate in densely packed CA. Here, we focused on an equally important parameter: the interspot contamination. The presented quality control is essential for estimating the rate of contamination, a major source of false positives and negatives in current microscopy-based functional genomics screenings. We have demonstrated that a washing step after seeding enhances CA quality for HeLa but is not necessary for MCF7. The described method provides a way to find optimal seeding protocols for cell lines intended to be used for the first time in CA.

  15. A fast and accurate initial alignment method for strapdown inertial navigation system on stationary base

    Institute of Scientific and Technical Information of China (English)

    Xinlong WANG; Gongxun SHEN

    2005-01-01

    In this work, a fast and accurate stationary alignment method for strapdown inertial navigation system (SINS) is proposed. It has been demonstrated that the stationary alignment of SINS can be improved by employing the multiposition technique, but the alignment time of the azimuth error is relatively long. Here, the two-position alignment principle is presented. On the basis of this SINS error model, a fast estimation algorithm of the azimuth error for the initial alignment of SINS on stationary base is derived fully from the horizontal velocity outputs and the output rates, and the novel azimuth error estimation algorithm is used for the two-position alignment. Consequently, the speed and accuracy of the SINS's initial alignment are greatly enhanced. The computer simulation results illustrate the efficiency of this alignment method.

  16. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.;

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  17. Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment?

    Directory of Open Access Journals (Sweden)

    Hartmann Stefanie

    2008-03-01

    BACKGROUND: While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood) and studied ways to improve the accuracy of trees obtained from such datasets. RESULTS: We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. CONCLUSION: These results demonstrate that partial gene

  18. Rapid and accurate pyrosequencing of angiosperm plastid genomes

    Directory of Open Access Journals (Sweden)

    Farmerie William G

    2006-08-01

    BACKGROUND: Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). RESULTS: More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. CONCLUSION: Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. More importantly, the high accuracy

  19. SOAP3-dp: fast, accurate and sensitive GPU-based short read aligner.

    Directory of Open Access Journals (Sweden)

    Ruibang Luo

    To tackle the exponentially increasing throughput of Next-Generation Sequencing (NGS), most of the existing short-read aligners can be configured to favor speed at the expense of accuracy and sensitivity. SOAP3-dp, through leveraging the computational power of both CPU and GPU with optimized algorithms, delivers high speed and sensitivity simultaneously. Compared with widely adopted aligners including BWA, Bowtie2, SeqAlto, CUSHAW2, GEM and GPU-based aligners BarraCUDA and CUSHAW, SOAP3-dp was found to be two to tens of times faster, while maintaining the highest sensitivity and lowest false discovery rate (FDR) on Illumina reads with different lengths. Transcending its predecessor SOAP3, which does not allow gapped alignment, SOAP3-dp by default tolerates alignment similarity as low as 60%. Real data evaluation using human genome demonstrates SOAP3-dp's power to enable more authentic variants and longer indels to be discovered. Fosmid sequencing shows a 9.1% FDR on newly discovered deletions. SOAP3-dp natively supports BAM file format and provides the same scoring scheme as BWA, which enables it to be integrated into existing analysis pipelines. SOAP3-dp has been deployed on Amazon-EC2, NIH-Biowulf and Tianhe-1A.

  20. Volume visualization of multiple alignment of genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shah, Nameeta; Weber, Gunther H.; Dillard, Scott E.; Hamann, Bernd

    2004-05-01

    Genomes of hundreds of species have been sequenced to date and many more are being sequenced. As more and more sequence data sets become available, and as the challenge of comparing these massive "billion basepair DNA sequences" becomes substantial, so does the need for more powerful tools supporting the exploration of these data sets. Similarity score data used to compare aligned DNA sequences is inherently one-dimensional. One-dimensional (1D) representations of these data sets do not effectively utilize screen real estate. We present a technique to arrange 1D data in 3D space to allow us to apply state-of-the-art interactive volume visualization techniques for data exploration. We provide results for aligned DNA sequence data and compare it with traditional 1D line plots. Our technique, coupled with 1D line plots, results in effective multiresolution visualization of very large aligned sequence data sets.

  1. Alignment-free phylogeny of whole genomes using underlying subwords

    Directory of Open Access Journals (Sweden)

    Comin Matteo

    2012-12-01

    BACKGROUND: With the progress of modern sequencing technologies a large number of complete genomes are now available. Traditionally the comparison of two related genomes is carried out by sequence alignment. There are cases where these techniques cannot be applied, for example if two genomes do not share the same set of genes, or if they are not alignable to each other due to low sequence similarity, rearrangements and inversions, or more specifically to their lengths when the organisms belong to different species. For these cases the comparison of complete genomes can be carried out only with ad hoc methods that are usually called alignment-free methods. METHODS: In this paper we propose a distance function based on subword compositions called Underlying Approach (UA). We prove that the matching statistics, a popular concept in the field of string algorithms able to capture the statistics of common words between two sequences, can be derived from a small set of “independent” subwords, namely the irredundant common subwords. We define a distance-like measure based on these subwords, such that each region of genomes contributes only once, thus avoiding counting shared subwords multiple times. In a nutshell, this filter discards subwords occurring in regions covered by other more significant subwords. RESULTS: The Underlying Approach (UA) builds a scoring function based on this set of patterns, called underlying. We prove that this set is by construction linear in the size of input, without overlaps, and can be efficiently constructed. Results show the validity of our method in the reconstruction of phylogenetic trees, where the Underlying Approach outperforms the current state of the art methods. Moreover, we show that the accuracy of UA is achieved with a very small number of subwords, which in some cases carry meaningful biological information. AVAILABILITY: http://www.dei.unipd.it/∼ciompin/main/underlying.html
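
    The matching statistics mentioned above can be computed naively as follows. This sketch only illustrates the definition (for each position of one sequence, the length of the longest substring starting there that also occurs in the other sequence); it does not implement the irredundant common subwords or the UA distance, and the function name is an assumption. Practical tools would use a suffix tree or similar index rather than repeated substring searches.

```python
def matching_statistics(s, t):
    """ms[i] = length of the longest prefix of s[i:] that occurs somewhere in t."""
    ms = []
    length = 0
    for i in range(len(s)):
        # Dropping the first character of the previous match leaves a substring
        # that still occurs in t, so the new match is at least one shorter.
        length = max(length - 1, 0)
        while i + length < len(s) and s[i:i + length + 1] in t:
            length += 1
        ms.append(length)
    return ms
```

    For `s = "ACGTA"` and `t = "CGT"` this returns `[0, 3, 2, 1, 0]`: the shared word `CGT` is counted at three consecutive positions, which is the kind of repeated contribution the abstract's filtering step aims to avoid.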

  2. Is multiple-sequence alignment required for accurate inference of phylogeny?

    Science.gov (United States)

    Höhl, Michael; Ragan, Mark A

    2007-04-01

    the correct phylogeny as accurately as does an approach based on maximum-likelihood distance estimates of multiply aligned sequences.

  3. Genome comparison without alignment using shortest unique substrings

    Directory of Open Access Journals (Sweden)

    Möller Friedrich

    2005-05-01

    BACKGROUND: Sequence comparison by alignment is a fundamental tool of molecular biology. In this paper we show how a number of sequence comparison tasks, including the detection of unique genomic regions, can be accomplished efficiently without an alignment step. Our procedure for nucleotide sequence comparison is based on shortest unique substrings. These are substrings which occur only once within the sequence or set of sequences analysed and which cannot be further reduced in length without losing the property of uniqueness. Such substrings can be detected using generalized suffix trees. RESULTS: We find that the shortest unique substrings in Caenorhabditis elegans, human and mouse are no longer than 11 bp in the autosomes of these organisms. In mouse and human these unique substrings are significantly clustered in upstream regions of known genes. Moreover, the probability of finding such short unique substrings in the genomes of human or mouse by chance is extremely small. We derive an analytical expression for the null distribution of shortest unique substrings, given the GC-content of the query sequences. Furthermore, we apply our method to rapidly detect unique genomic regions in the genome of Staphylococcus aureus strain MSSA476 compared to four other staphylococcal genomes. CONCLUSION: We combine a method to rapidly search for shortest unique substrings in DNA sequences and a derivation of their null distribution. We show that unique regions in an arbitrary sample of genomes can be efficiently detected with this method. The corresponding programs shustring (SHortest Unique subSTRING) and shulen are written in C and available at http://adenine.biz.fh-weihenstephan.de/shustring/.
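
    The definition of a shortest unique substring can be made concrete with a brute-force sketch. It is not the shustring program and does not use generalized suffix trees as the authors do; the function name is hypothetical, and the approach (counting all substrings of length k for increasing k) is only practical for short sequences.

```python
from collections import Counter


def shortest_unique_substrings(seq):
    """Return (k, [(position, substring), ...]) for the smallest k with a unique k-mer."""
    for k in range(1, len(seq) + 1):
        n_windows = len(seq) - k + 1
        counts = Counter(seq[i:i + k] for i in range(n_windows))
        unique = [(i, seq[i:i + k]) for i in range(n_windows)
                  if counts[seq[i:i + k]] == 1]
        if unique:
            # Every shorter length had no unique substring, so these cannot be
            # shortened without losing uniqueness.
            return k, unique
    return 0, []  # only reached for an empty sequence
```

    For `"ACGTAC"` the result is `(1, [(2, 'G'), (3, 'T')])`, since `G` and `T` each occur once while `A` and `C` occur twice.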

  4. Alignment of capillary electrophoresis-mass spectrometry datasets using accurate mass information.

    Science.gov (United States)

    Nevedomskaya, Ekaterina; Derks, Rico; Deelder, André M; Mayboroda, Oleg A; Palmblad, Magnus

    2009-12-01

    Capillary electrophoresis-mass spectrometry (CE-MS) is a powerful technique for the analysis of small soluble compounds in biological fluids. A major drawback of CE is the poor migration time reproducibility, which makes it difficult to combine data from different experiments and correctly assign compounds. A number of alignment algorithms have been developed but not all of them can cope with large and irregular time shifts between CE-MS runs. Here we present a genetic algorithm designed for alignment of CE-MS data using accurate mass information. The utility of the algorithm was demonstrated on real data, and the results were compared with one of the existing packages. The new algorithm showed a significant reduction of elution time variation in the aligned datasets. The importance of mass accuracy for the performance of the algorithm was also demonstrated by comparing alignments of datasets from a standard time-of-flight (TOF) instrument with those from the new ultrahigh resolution TOF maXis (Bruker Daltonics).

  5. Volume visualization of multiple alignment of large genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shah, Nameeta; Dillard, Scott E.; Weber, Gunther H.; Hamann, Bernd

    2005-07-25

    Genomes of hundreds of species have been sequenced to date, and many more are being sequenced. As more and more sequence data sets become available, and as the challenge of comparing these massive "billion basepair DNA sequences" becomes substantial, so does the need for more powerful tools supporting the exploration of these data sets. Similarity score data used to compare aligned DNA sequences is inherently one-dimensional. One-dimensional (1D) representations of these data sets do not effectively utilize screen real estate. As a result, tools using 1D representations are incapable of providing an informative overview of extremely large data sets. We present a technique to arrange 1D data in 3D space to allow us to apply state-of-the-art interactive volume visualization techniques for data exploration. We demonstrate our technique using multi-millions-basepair-long aligned DNA sequence data and compare it with traditional 1D line plots. The results show that our technique is superior in providing an overview of entire data sets. Our technique, coupled with 1D line plots, results in effective multi-resolution visualization of very large aligned sequence data sets.

  6. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  7. FAMSA: Fast and accurate multiple sequence alignment of huge protein families

    Science.gov (United States)

    Deorowicz, Sebastian; Debudaj-Grabysz, Agnieszka; Gudyś, Adam

    2016-01-01

    Rapid development of modern sequencing platforms has contributed to the unprecedented growth of protein families databases. The abundance of sets containing hundreds of thousands of sequences is a formidable challenge for multiple sequence alignment algorithms. The article introduces FAMSA, a new progressive algorithm designed for fast and accurate alignment of thousands of protein sequences. Its features include the utilization of the longest common subsequence measure for determining pairwise similarities, a novel method of evaluating gap costs, and a new iterative refinement scheme. Importantly, its implementation is highly optimized and parallelized to make the most of modern computer platforms. Thanks to the above, quality indicators, i.e. sum-of-pairs and total-column scores, show FAMSA to be superior to competing algorithms, such as Clustal Omega or MAFFT, for datasets exceeding a few thousand sequences. This quality does not come at the cost of time or memory requirements, which are an order of magnitude lower than those of the existing solutions. For example, a family of 415519 sequences was analyzed in less than two hours and required no more than 8 GB of RAM. FAMSA is available for free at http://sun.aei.polsl.pl/REFRESH/famsa. PMID:27670777
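
    The longest common subsequence (LCS) measure mentioned above can be sketched with the textbook dynamic program. This is a plain illustration, not FAMSA's optimized implementation, and the normalization in `lcs_similarity` (LCS length divided by the longer sequence length) is an assumption made here for the example; the abstract does not say how FAMSA turns the LCS length into a similarity score.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Classic O(len(a) * len(b)) dynamic program keeping one row at a time.
    """
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)   # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: LCS length over the longer sequence."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 1.0


if __name__ == "__main__":
    print(lcs_length("AGGTAB", "GXTXAYB"), lcs_similarity("MKVLA", "MKLA"))
```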

  8. Analysis of chimpanzee history based on genome sequence alignments.

    Directory of Open Access Journals (Sweden)

    Jennifer L Caswell

    2008-04-01

    Full Text Available Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated approximately 1,290,000 years ago, western and other common chimpanzees approximately 510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.

  9. Implicit Hitting Set Problems and Multi-genome Alignment

    Science.gov (United States)

    Karp, Richard M.

    Let U be a finite set and S a family of subsets of U. Define a hitting set as a subset of U that intersects every element of S. The optimal hitting set problem is: given a positive weight for each element of U, find a hitting set of minimum total weight. This problem is equivalent to the classic weighted set cover problem. We consider the optimal hitting set problem in the case where the set system S is not explicitly given, but there is an oracle that will supply members of S satisfying certain conditions; for example, we might ask the oracle for a minimum-cardinality set in S that is disjoint from a given set Q. The problems of finding a minimum feedback arc set or minimum feedback vertex set in a digraph are examples of implicit hitting set problems. Our interest is in the number of oracle queries required to find an optimal hitting set. After presenting some generic algorithms for this problem we focus on our computational experience with an implicit hitting set problem related to multi-genome alignment in genomics. This is joint work with Erick Moreno Centeno.
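
    One plausible way to picture an implicit hitting set computation is the loop below: solve the problem optimally on the sets seen so far, ask the oracle for a member of S that the candidate misses, and repeat until the oracle finds none. This is a sketch under stated assumptions and not necessarily one of the generic algorithms the abstract refers to. The inner solver is exhaustive (exponential in |U|), the oracle in the demo simply returns the first missed set rather than a minimum-cardinality one, and all names (`min_weight_hitting_set`, `implicit_hitting_set`, `oracle`) are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set


def min_weight_hitting_set(weights: Dict[str, float],
                           sets: Iterable[FrozenSet[str]]) -> Optional[Set[str]]:
    """Exhaustively find a minimum-weight subset of U hitting every given set."""
    sets = list(sets)
    elements = list(weights)
    best, best_weight = None, float("inf")
    for r in range(len(elements) + 1):
        for combo in combinations(elements, r):
            chosen = set(combo)
            if all(chosen & s for s in sets):
                w = sum(weights[e] for e in combo)
                if w < best_weight:
                    best, best_weight = chosen, w
    return best


def implicit_hitting_set(weights: Dict[str, float],
                         oracle: Callable[[Set[str]], Optional[Set[str]]]) -> Set[str]:
    """Grow the list of known sets until the oracle finds nothing missed.

    The candidate is optimal for a subset of the constraints, so once it
    hits every member of S it is optimal for the full problem.
    """
    known: List[FrozenSet[str]] = []
    while True:
        candidate = min_weight_hitting_set(weights, known)
        missed = oracle(candidate)
        if missed is None:
            return candidate
        known.append(frozenset(missed))


if __name__ == "__main__":
    family = [{"a", "b"}, {"b", "c"}, {"c", "d"}]   # S, hidden behind the oracle
    demo_oracle = lambda q: next((s for s in family if not (s & q)), None)
    print(implicit_hitting_set({"a": 1, "b": 2, "c": 3, "d": 1}, demo_oracle))
```

    In the demo the loop needs four oracle queries: three that each reveal one member of S and a final one that confirms the candidate hits everything.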

  10. Genomic divergences among cattle, dog and human estimated from large-scale alignments of genomic sequences

    Directory of Open Access Journals (Sweden)

    Shade Larry L

    2006-06-01

    Full Text Available Abstract Background Approximately 11 Mb of finished high quality genomic sequences were sampled from cattle, dog and human to estimate genomic divergences and their regional variation among these lineages. Results Optimal three-way multi-species global sequence alignments for 84 cattle clones or loci (each >50 kb of genomic sequence) were constructed using the human and dog genome assemblies as references. Genomic divergences and substitution rates were examined for each clone and for various sequence classes under different functional constraints. Analysis of these alignments revealed that the overall genomic divergences are relatively constant (0.32–0.37 change/site) for pairwise comparisons among cattle, dog and human; however substitution rates vary across genomic regions and among different sequence classes. A neutral mutation rate (2.0–2.2 × 10^-9 change/site/year) was derived from ancestral repetitive sequences, whereas the substitution rate in coding sequences (1.1 × 10^-9 change/site/year) was approximately half of the overall rate (1.9–2.0 × 10^-9 change/site/year). Relative rate tests also indicated that cattle have a significantly faster rate of substitution as compared to dog and that this difference is about 6%. Conclusion This analysis provides a large-scale and unbiased assessment of genomic divergences and regional variation of substitution rates among cattle, dog and human. It is expected that these data will serve as a baseline for future mammalian molecular evolution studies.

  11. Constrained-DFT method for accurate energy-level alignment of metal/molecule interfaces

    KAUST Repository

    Souza, A. M.

    2013-10-07

    We present a computational scheme for extracting the energy-level alignment of a metal/molecule interface, based on constrained density functional theory and local exchange and correlation functionals. The method, applied here to benzene on Li(100), allows us to evaluate charge-transfer energies, as well as the spatial distribution of the image charge induced on the metal surface. We systematically study the energies for charge transfer from the molecule to the substrate as function of the molecule-substrate distance, and investigate the effects arising from image-charge confinement and local charge neutrality violation. For benzene on Li(100) we find that the image-charge plane is located at about 1.8 Å above the Li surface, and that our calculated charge-transfer energies compare perfectly with those obtained with a classical electrostatic model having the image plane located at the same position. The methodology outlined here can be applied to study any metal/organic interface in the weak coupling limit at the computational cost of a total energy calculation. Most importantly, as the scheme is based on total energies and not on correcting the Kohn-Sham quasiparticle spectrum, accurate results can be obtained with local/semilocal exchange and correlation functionals. This enables a systematic approach to convergence.

  12. Whole genome phylogeny of Prochlorococcus marinus group of cyanobacteria: genome alignment and overlapping gene approach.

    Science.gov (United States)

    Prabha, Ratna; Singh, Dhananjaya P; Gupta, Shailendra K; Rai, Anil

    2014-06-01

    Prochlorococcus is the smallest known oxygenic phototrophic marine cyanobacterium dominating the mid-latitude oceans. Physiologically and genetically distinct P. marinus isolates from many oceans in the world were assigned two different groups, a tightly clustered high-light (HL)-adapted and a divergent low-light (LL-) adapted clade. Phylogenetic analysis of this cyanobacterium on the basis of 16S rRNA and other conserved genes did not show consistency with its phenotypic behavior. We analyzed phylogeny of this genus on the basis of complete genome sequences through genome alignment, overlapping-gene content and gene-order approach. Phylogenetic tree of P. marinus obtained by comparing whole genome sequences in contrast to that based on 16S rRNA gene, corresponded well with the HL/LL ecotypic distinction of twelve strains and showed consistency with phenotypic classification of P. marinus. Evidence for the horizontal descent and acquisition of genes within and across the genus was observed. Many genes involved in metabolic functions were found to be conserved across these genomes and many were continuously gained by different strains as per their needs during the course of their evolution. Consistency in the physiological and genetic phylogeny based on whole genome sequence is established. These observations improve our understanding about the adaptation and diversification of these organisms under evolutionary pressure.

  13. OCPAT: an online codon-preserved alignment tool for evolutionary genomic analysis of protein coding sequences

    Directory of Open Access Journals (Sweden)

    Grossman Lawrence I

    2007-09-01

    Full Text Available Abstract Background Rapidly accumulating genome sequence data from multiple species offer powerful opportunities for the detection of DNA sequence evolution. Phylogenetic tree construction and codon-based tests for natural selection are the prevailing tools used to detect functionally important evolutionary change in protein coding sequences. These analyses often require multiple DNA sequence alignments that maintain the correct reading frame for each collection of putative orthologous sequences. Since this feature is not available in most alignment tools, codon reading frames often must be checked manually before evolutionary analyses can commence. Results Here we report an online codon-preserved alignment tool (OCPAT) that generates multiple sequence alignments automatically from the coding sequences of any list of human gene IDs and their putative orthologs from genomes of other vertebrate tetrapods. OCPAT is programmed to extract putative orthologous genes from genomes and to align the orthologs with the reading frame maintained in all species. OCPAT also optimizes the alignment by trimming the most variable alignment regions at the 5' and 3' ends of each gene. The resulting output of alignments is returned in several formats, which facilitates further molecular evolutionary analyses by appropriate available software. Alignments are generally robust and reliable, retaining the correct reading frame. The tool can serve as the first step for comparative genomic analyses of protein-coding gene sequences including phylogenetic tree reconstruction and detection of natural selection. We aligned 20,658 human RefSeq mRNAs using OCPAT. Most alignments are missing sequence(s) from at least one species; however, functional annotation clustering of the ~1700 transcripts that were alignable to all species shows that genes involved in multi-subunit protein complexes are highly conserved. Conclusion The OCPAT program facilitates large-scale evolutionary and
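
    A common way to keep the reading frame intact in a DNA alignment is to align the translated proteins first and then thread each coding sequence back onto its gapped protein, three bases per residue. The sketch below shows only that threading step. Whether OCPAT works this way internally is not stated in the abstract, so treat it as an assumption; `thread_codons` is a hypothetical helper, and it does not check that the codons actually translate to the given residues.

```python
def thread_codons(aligned_protein: str, cds: str) -> str:
    """Map an ungapped coding sequence onto a gapped protein alignment row.

    Every residue consumes one codon and every '-' becomes '---', so gaps
    in the DNA alignment always fall on codon boundaries.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    residues = aligned_protein.replace("-", "")
    if len(codons) != len(residues):
        raise ValueError("codon count does not match residue count")
    remaining = iter(codons)
    return "".join("---" if aa == "-" else next(remaining)
                   for aa in aligned_protein)


if __name__ == "__main__":
    # Two toy rows of a protein alignment and their coding sequences.
    print(thread_codons("MK-L", "ATGAAACTG"))
    print(thread_codons("MKAL", "ATGAAGGCTCTG"))
```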

  14. Using a priori knowledge to align sequencing reads to their exact genomic position

    NARCIS (Netherlands)

    Böttcher, René; Amberg, Ronny; Ruzius, F P; Guryev, V; Verhaegh, Wim F J; Beyerlein, Peter; van der Zaag, P J

    2012-01-01

    The use of a priori knowledge in the alignment of targeted sequencing data is investigated using computational experiments. Adapting a Needleman-Wunsch algorithm to incorporate the genomic position information from the targeted capture, we demonstrate that alignment can be done to just the target re

  15. Adenoviral vector DNA for accurate genome editing with engineered nucleases.

    Science.gov (United States)

    Holkers, Maarten; Maggio, Ignazio; Henriques, Sara F D; Janssen, Josephine M; Cathomen, Toni; Gonçalves, Manuel A F V

    2014-10-01

    Engineered sequence-specific nucleases and donor DNA templates can be customized to edit mammalian genomes via the homologous recombination (HR) pathway. Here we report that the nature of the donor DNA greatly affects the specificity and accuracy of the editing process following site-specific genomic cleavage by transcription activator-like effector nucleases (TALENs) and clustered, regularly interspaced, short palindromic repeats (CRISPR)-Cas9 nucleases. By applying these designer nucleases together with donor DNA delivered as protein-capped adenoviral vector (AdV), free-ended integrase-defective lentiviral vector or nonviral vector templates, we found that the vast majority of AdV-modified human cells underwent scarless homology-directed genome editing. In contrast, a significant proportion of cells exposed to free-ended or to covalently closed HR substrates were subjected to random and illegitimate recombination events. These findings are particularly relevant for genome engineering approaches aiming at high-fidelity genetic modification of human cells.

  16. Multiple Whole Genome Alignments and Novel Biomedical Applications at the VISTA Portal

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Minovitsky, Simon; Ratnere, Igor; Dubchak, Inna

    2007-02-01

    The VISTA portal for comparative genomics is designed to give biomedical scientists a unified set of tools to lead them from the raw DNA sequences through the alignment and annotation to the visualization of the results. The VISTA portal also hosts alignments of a number of genomes computed by our group, allowing users to study regions of their interest without having to manually download the individual sequences. Here we describe various algorithmic and functional improvements implemented in the VISTA portal over the last two years. The VISTA Portal is accessible at http://genome.lbl.gov/vista.

  17. Abundance of ultramicro inversions within local alignments between human and chimpanzee genomes

    Directory of Open Access Journals (Sweden)

    Hara Yuichiro

    2011-10-01

    Full Text Available Abstract Background Chromosomal inversion is one of the most important mechanisms of evolution. Recent studies of comparative genomics have revealed that chromosomal inversions are abundant in the human genome. While such previously characterized inversions are large enough to be identified as a single alignment or a string of local alignments, the impact of ultramicro inversions, which are so short that the local alignments completely cover them, on evolution is still uncertain. Results In this study, we developed a method for identifying ultramicro inversions by scanning of local alignments. This technique achieved a high sensitivity and a very low rate of false positives. We identified 2,377 ultramicro inversions ranging from five to 125 bp within the orthologous alignments between the human and chimpanzee genomes. The false positive rate was estimated to be around 4%. Based on phylogenetic profiles using the primate outgroups, 479 ultramicro inversions were inferred to have specifically inverted in the human lineage. Ultramicro inversions exclusively involving adenine and thymine were the most frequent: 461 inversions (19.4% of the total). Furthermore, the density of ultramicro inversions in chromosome Y and the neighborhoods of transposable elements was higher than average. Sixty-five ultramicro inversions were identified within the exons of human protein-coding genes. Conclusions We defined ultramicro inversions as the inverted regions equal to or smaller than 125 bp buried within local alignments. Our observations suggest that ultramicro inversions are abundant among the human and chimpanzee genomes, and that location of the inversions correlated with the genome structural instability. Some of the ultramicro inversions may contribute to gene evolution. Our inversion-identification method is also applicable in the fine-tuning of genome alignments by distinguishing ultramicro inversions from nucleotide substitutions and indels.
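
    To make the notion of an inversion "buried within a local alignment" concrete, the sketch below scans two gap-free aligned rows for windows where one row is the reverse complement of the other. It is a naive illustration, not the authors' method: it ignores gaps, applies no statistical filtering (very short windows can match by chance), and skips windows where both rows are identical, so reverse-palindromic segments are not reported. The 5 to 125 bp bounds follow the size range quoted in the abstract; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_inversions(query: str, target: str,
                         min_len: int = 5, max_len: int = 125) -> list[tuple[int, int]]:
    """Return (start, length) of windows where target == revcomp(query).

    Both rows must be gap-free and equally long. For each start position
    only the longest qualifying window is kept.
    """
    if len(query) != len(target):
        raise ValueError("aligned rows must have equal length")
    n = len(query)
    hits = []
    for start in range(n):
        for length in range(min(max_len, n - start), min_len - 1, -1):
            q = query[start:start + length]
            t = target[start:start + length]
            if q != t and t == revcomp(q):
                hits.append((start, length))
                break
    return hits


if __name__ == "__main__":
    q = "GGGGG" + "AACCG" + "TTTTT"
    t = "GGGGG" + revcomp("AACCG") + "TTTTT"   # planted 5 bp inversion
    print(candidate_inversions(q, t))
```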

  18. Measurement of word frequencies in genomic DNA sequences based on partial alignment and fuzzy set.

    Science.gov (United States)

    Shida, Fumiya; Mizuta, Satoshi

    2014-08-01

    With the rapid increase in the amount of data registered in the databases of biological sequences, the need for a fast method of sequence comparison applicable to sequences of large size is also increasing. In general, alignment is used for sequence comparison. However, alignment may not be appropriate for comparison of sequences of large size, such as whole genome sequences, due to its large time complexity. In this article, we propose a semi alignment-free method of sequence comparison based on word frequency distributions, in which we partially use the alignment to measure word frequencies along with the idea of fuzzy set theory. Experiments with ten bacterial genome sequences demonstrated that the fuzzy measurements facilitate discrimination between close relatives and distant relatives.
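
    As a baseline for what "word frequency distributions" means here, the sketch below counts exact k-letter words and compares two sequences by the Euclidean distance between their frequency vectors. It deliberately leaves out the partial-alignment and fuzzy-set components that the article proposes, since the abstract does not give their details; the word length `k=2` and the choice of Euclidean distance are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def word_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequency of every overlapping k-letter word in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {word: c / total for word, c in counts.items()}


def frequency_distance(a: str, b: str, k: int = 2) -> float:
    """Euclidean distance between the word-frequency vectors of a and b."""
    fa, fb = word_frequencies(a, k), word_frequencies(b, k)
    words = fa.keys() | fb.keys()
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))


if __name__ == "__main__":
    print(frequency_distance("ACGTACGTAC", "ACGTTCGTAC"))
    print(frequency_distance("AAAAAAAA", "CCCCCCCC"))
```

    The cost is linear in sequence length, which is the main attraction of word-based comparison over full alignment for genome-sized inputs.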

  19. Accurate alignment of functional EPI data to anatomical MRI using a physics-based distortion model.

    Science.gov (United States)

    Studholme, C; Constable, R T; Duncan, J S

    2000-11-01

    Mapping of functional magnetic resonance imaging (fMRI) to conventional anatomical MRI is a valuable step in the interpretation of fMRI activations. One of the main limits on the accuracy of this alignment arises from differences in the geometric distortion induced by magnetic field inhomogeneity. This paper describes an approach to the registration of echo planar image (EPI) data to conventional anatomical images which takes into account this difference in geometric distortion. We make use of an additional spin echo EPI image and use the known signal conservation in spin echo distortion to derive a specialized multimodality nonrigid registration algorithm. We also examine a plausible modification using log-intensity evaluation of the criterion to provide increased sensitivity in areas of low EPI signal. A phantom-based imaging experiment is used to evaluate the behavior of the different criteria, comparing nonrigid displacement estimates to those provided by a magnetic field mapping acquisition. The algorithm is then applied to a range of nine brain imaging studies illustrating global and local improvement in the anatomical alignment and localization of fMRI activations.

  20. Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing.

    Science.gov (United States)

    Smith, Harold E; Yun, Sijung

    2017-01-01

    Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and then variants are called to generate a list of candidate alleles. A number of software pipelines for mutation identification have been targeted to C. elegans, with particular emphasis on ease of use, incorporation of mapping strain data, subtraction of background variants, and similar criteria. Although success is predicated upon the sensitive and accurate detection of candidate alleles, relatively little effort has been invested in evaluating the underlying software components that are required for mutation identification. Therefore, we have benchmarked a number of commonly used tools for sequence alignment and variant calling, in all pair-wise combinations, against both simulated and actual datasets. We compared the accuracy of those pipelines for mutation identification in C. elegans, and found that the combination of BBMap for alignment plus FreeBayes for variant calling offers the most robust performance.

  1. Base-By-Base: Single nucleotide-level analysis of whole viral genome alignments

    Directory of Open Access Journals (Sweden)

    Tcherepanov Vasily

    2004-07-01

    Full Text Available Abstract Background With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. Results A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to (1) rapidly identify and correct alignment errors in large, multiple genome alignments; and (2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Conclusion Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.

  2. TMO: time and memory optimized algorithm applicable for more accurate alignment of trinucleotide repeat disorders associated genes

    Directory of Open Access Journals (Sweden)

    Done Stojanov

    2016-03-01

    Full Text Available In this study, a time and memory optimized (TMO) algorithm is presented. Compared with Smith–Waterman's algorithm, TMO is applicable for a more accurate detection of continuous insertion/deletions (indels) in genes' fragments, associated with disorders caused by over-repetition of a certain codon. The improvement comes from the tendency to pinpoint indels in the least preserved nucleotide pairs. All nucleotide pairs that occur less frequently are classified as less preserved and they are considered as mutated codons whose mid-nucleotides were deleted. Another benefit of the proposed algorithm is its general tendency to maximize the number of matching nucleotides included per alignment, regardless of any specific alignment metrics. Since the structure of the solution, when applying Smith–Waterman, depends on the adjustment of the alignment parameters and, therefore, an incomplete (shortened) solution may be derived, our algorithm does not reject any of the consistent matching nucleotides that can be included in the final solution. In terms of computational aspects, our algorithm runs faster than Smith–Waterman for very similar DNA and requires less memory than the most memory efficient dynamic programming algorithms. The speed-up comes from the reduced number of nucleotide comparisons that have to be performed, without having to imperil the completeness of the solution. Due to the fact that four integers (16 Bytes) are required for tracking a matching fragment, regardless of its length, our algorithm requires less memory than Huang's algorithm.

  3. Accurate determination of DNA yield from individual mosquitoes for population genomic applications

    Institute of Scientific and Technical Information of China (English)

    Craig S. Wilding; D. Weetman; K. Steen; M. J. Donnelly

    2009-01-01

    Accurate estimates of DNA quantity are likely to become increasingly important for successful genomic screening of insect populations via recently developed, highly multiplexed genotyping assays and high-throughput sequencing methods. Here we show that genomic DNA extractions from single Anopheles gambiae Giles using a standard commercial kit-based methodology yield extracts with concentrations below the linear range of spectrophotometric absorbance at 260 nm. Concentrations determined by spectrophotometry were not reproducible, and are therefore neither accurate nor reliable. However, DNA quantification using a fluorescent nucleic acid stain (PicoGreen®) gave highly reproducible concentration estimates, and indicated that, on average, single mosquitoes yielded approximately 300 ng of DNA. Such a total yield is currently insufficient for many high-throughput genome screening applications, necessitating whole genome amplification of all or most individuals in a population prior to genotyping.

  4. Tools for Accurate and Efficient Analysis of Complex Evolutionary Mechanisms in Microbial Genomes. Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Nakhleh, Luay

    2014-03-12

    I proposed to develop computationally efficient tools for accurate detection and reconstruction of microbes' complex evolutionary mechanisms, thus enabling rapid and accurate annotation, analysis and understanding of their genomes. To achieve this goal, I proposed to address three aspects. (1) Mathematical modeling. A major challenge facing the accurate detection of HGT is that of distinguishing between these two events on the one hand and other events that have similar "effects." I proposed to develop a novel mathematical approach for distinguishing among these events. Further, I proposed to develop a set of novel optimization criteria for the evolutionary analysis of microbial genomes in the presence of these complex evolutionary events. (2) Algorithm design. In this aspect of the project, I proposed to develop an array of efficient and accurate algorithms for analyzing microbial genomes based on the formulated optimization criteria. Further, I proposed to test the viability of the criteria and the accuracy of the algorithms in an experimental setting using both synthetic as well as biological data. (3) Software development. I proposed the final outcome to be a suite of software tools which implements the mathematical models as well as the algorithms developed.

  5. A powerful test of independent assortment that determines genome-wide significance quickly and accurately.

    Science.gov (United States)

    Stewart, W C L; Hager, V R

    2016-08-01

    In the analysis of DNA sequences on related individuals, most methods strive to incorporate as much information as possible, with little or no attention paid to the issue of statistical significance. For example, a modern workstation can easily handle the computations needed to perform a large-scale genome-wide inheritance-by-descent (IBD) scan, but accurate assessment of the significance of that scan is often hindered by inaccurate approximations and computationally intensive simulation. To address these issues, we developed gLOD, a test of co-segregation that, for large samples, models chromosome-specific IBD statistics as a collection of stationary Gaussian processes. With this simple model, the parametric bootstrap yields an accurate and rapid assessment of significance: the genome-wide corrected P-value. Furthermore, we show that (i) under the null hypothesis, the limiting distribution of the gLOD is the standard Gumbel distribution; (ii) our parametric bootstrap simulator is approximately 40 000 times faster than gene-dropping methods, and it is more powerful than methods that approximate the adjusted P-value; and (iii) the gLOD has the same statistical power as the widely used maximum Kong and Cox LOD. Thus, our approach gives researchers the ability to determine quickly and accurately the significance of most large-scale IBD scans, which may contain multiple traits, thousands of families and tens of thousands of DNA sequences.
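
    Point (i) above implies a closed-form tail probability: the standard Gumbel distribution has CDF F(x) = exp(-exp(-x)), so an upper-tail P-value is 1 - F(x). The sketch below evaluates that formula. It assumes the statistic passed in has already been centered and scaled so that the standard Gumbel limit applies; the abstract does not give that normalization, and the function name `gumbel_upper_tail` is hypothetical rather than part of the gLOD software.

```python
import math


def gumbel_upper_tail(x: float) -> float:
    """P(X >= x) for a standard Gumbel variable, i.e. 1 - exp(-exp(-x)).

    Uses expm1 so that very small tail probabilities are not lost to
    floating-point cancellation.
    """
    try:
        inner = math.exp(-x)
    except OverflowError:
        return 1.0  # x is extremely negative, so the tail is the whole mass
    return -math.expm1(-inner)


if __name__ == "__main__":
    for stat in (0.0, 3.0, 6.0):
        print(stat, gumbel_upper_tail(stat))
```

    For large x the tail behaves like exp(-x), so each additional unit of the normalized statistic lowers the P-value by roughly a factor of e.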

  6. Genomic Signal Processing Methods for Computation of Alignment-Free Distances from DNA Sequences

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E. Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P.; Morales, J. Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments. PMID:25393409

  7. Genomic signal processing methods for computation of alignment-free distances from DNA sequences.

    Science.gov (United States)

    Borrayo, Ernesto; Mendizabal-Ruiz, E Gerardo; Vélez-Pérez, Hugo; Romo-Vázquez, Rebeca; Mendizabal, Adriana P; Morales, J Alejandro

    2014-01-01

    Genomic signal processing (GSP) refers to the use of digital signal processing (DSP) tools for analyzing genomic data such as DNA sequences. A possible application of GSP that has not been fully explored is the computation of the distance between a pair of sequences. In this work we present GAFD, a novel GSP alignment-free distance computation method. We introduce a DNA sequence-to-signal mapping function based on the employment of doublet values, which increases the number of possible amplitude values for the generated signal. Additionally, we explore the use of three DSP distance metrics as descriptors for categorizing DNA signal fragments. Our results indicate the feasibility of employing GAFD for computing sequence distances and the use of descriptors for characterizing DNA fragments.

  8. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

    Directory of Open Access Journals (Sweden)

    Dewey Colin N

    2011-08-01

    Background: RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. Results: We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. Conclusions: RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost
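
    The handling of reads that map to several transcripts can be illustrated with a small expectation-maximization loop. The following is only a minimal sketch under simplifying assumptions (no transcript lengths, no sequencing-error model), not RSEM's actual model or code; the function name `em_abundances` and the toy input are hypothetical.

```python
def em_abundances(read_hits, n_transcripts, n_iter=200):
    """Estimate transcript fractions from reads that may map to several transcripts.

    read_hits[i] lists the transcript indices that read i is compatible with.
    Simplifying assumption: every compatible transcript is equally likely
    a priori apart from its current abundance estimate.
    """
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read among its transcripts in proportion to theta
        for hits in read_hits:
            if not hits:
                continue  # unmapped read, contributes nothing
            total = sum(theta[t] for t in hits)
            for t in hits:
                counts[t] += theta[t] / total
        # M-step: renormalize the fractional counts into abundances
        n = sum(counts)
        theta = [c / n for c in counts]
    return theta


# Toy data: 6 reads unique to transcript 0, 2 unique to transcript 1,
# and 4 reads that are compatible with both.
reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
theta = em_abundances(reads, 2)
print(theta)
```

    In this toy case the ambiguous reads end up shared in the same 3:1 ratio as the unique reads, which is the intuition behind using ambiguously mapping reads rather than discarding them.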

  9. READSCAN: A fast and scalable pathogen discovery program with accurate genome relative abundance estimation

    KAUST Repository

    Naeem, Raeece

    2012-11-28

    Summary: READSCAN is a highly scalable parallel program to identify non-host sequences (of potential pathogen origin) and estimate their genome relative abundance in high-throughput sequence datasets. READSCAN accurately classified human and viral sequences on a 20.1 million reads simulated dataset in <27 min using a small Beowulf compute cluster with 16 nodes (Supplementary Material). Availability: http://cbrc.kaust.edu.sa/readscan Contact: or raeece.naeem@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.

  10. Construction, alignment and analysis of twelve framework physical maps that represent the ten genome types of the genus Oryza.

    Science.gov (United States)

    Kim, HyeRan; Hurwitz, Bonnie; Yu, Yeisoo; Collura, Kristi; Gill, Navdeep; SanMiguel, Phillip; Mullikin, James C; Maher, Christopher; Nelson, William; Wissotski, Marina; Braidotti, Michele; Kudrna, David; Goicoechea, José Luis; Stein, Lincoln; Ware, Doreen; Jackson, Scott A; Soderlund, Carol; Wing, Rod A

    2008-01-01

    We describe the establishment and analysis of a genus-wide comparative framework composed of 12 bacterial artificial chromosome fingerprint and end-sequenced physical maps representing the 10 genome types of Oryza aligned to the O. sativa ssp. japonica reference genome sequence. Over 932 Mb of end sequence was analyzed for repeats, simple sequence repeats, miRNA and single nucleotide variations, providing the most extensive analysis of Oryza sequence to date.

  11. Rapid genome mapping in nanochannel arrays for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome.

    Science.gov (United States)

    Hastie, Alex R; Dong, Lingli; Smith, Alexis; Finklestein, Jeff; Lam, Ernest T; Huo, Naxin; Cao, Han; Kwok, Pui-Yan; Deal, Karin R; Dvorak, Jan; Luo, Ming-Cheng; Gu, Yong; Xiao, Ming

    2013-01-01

    Next-generation sequencing (NGS) technologies have enabled high-throughput and low-cost generation of sequence data; however, de novo genome assembly remains a great challenge, particularly for large genomes. NGS short reads are often insufficient to create large contigs that span repeat sequences and to facilitate unambiguous assembly. Plant genomes are notorious for containing high quantities of repetitive elements, which combined with huge genome sizes, makes accurate assembly of these large and complex genomes intractable thus far. Using two-color genome mapping of tiling bacterial artificial chromosomes (BAC) clones on nanochannel arrays, we completed high-confidence assembly of a 2.1-Mb, highly repetitive region in the large and complex genome of Aegilops tauschii, the D-genome donor of hexaploid wheat (Triticum aestivum). Genome mapping is based on direct visualization of sequence motifs on single DNA molecules hundreds of kilobases in length. With the genome map as a scaffold, we anchored unplaced sequence contigs, validated the initial draft assembly, and resolved instances of misassembly, some involving contigs <2 kb long, to dramatically improve the assembly from 75% to 95% complete.

  12. Rapid genome mapping in nanochannel arrays for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome.

    Directory of Open Access Journals (Sweden)

    Alex R Hastie

    Next-generation sequencing (NGS) technologies have enabled high-throughput and low-cost generation of sequence data; however, de novo genome assembly remains a great challenge, particularly for large genomes. NGS short reads are often insufficient to create large contigs that span repeat sequences and to facilitate unambiguous assembly. Plant genomes are notorious for containing high quantities of repetitive elements, which combined with huge genome sizes, makes accurate assembly of these large and complex genomes intractable thus far. Using two-color genome mapping of tiling bacterial artificial chromosomes (BAC) clones on nanochannel arrays, we completed high-confidence assembly of a 2.1-Mb, highly repetitive region in the large and complex genome of Aegilops tauschii, the D-genome donor of hexaploid wheat (Triticum aestivum). Genome mapping is based on direct visualization of sequence motifs on single DNA molecules hundreds of kilobases in length. With the genome map as a scaffold, we anchored unplaced sequence contigs, validated the initial draft assembly, and resolved instances of misassembly, some involving contigs <2 kb long, to dramatically improve the assembly from 75% to 95% complete.

  13. Alignment of leading-edge and peak-picking time of arrival methods to obtain accurate source locations

    Energy Technology Data Exchange (ETDEWEB)

    Roussel-Dupre, R.; Symbalisty, E.; Fox, C.; Vanderlinde, O.

    2009-08-01

    The location of a radiating source can be determined by time-tagging the arrival of the radiated signal at a network of spatially distributed sensors. The accuracy of this approach depends strongly on the particular time-tagging algorithm employed at each of the sensors. If different techniques are used across the network, then the time tags must be referenced to a common fiducial for maximum location accuracy. In this report we derive the time corrections needed to temporally align leading-edge, time-tagging techniques with peak-picking algorithms. We focus on broadband radio frequency (RF) sources, an ionospheric propagation channel, and narrowband receivers, but the final results can be generalized to apply to any source, propagation environment, and sensor. Our analytic results are checked against numerical simulations for a number of representative cases and agree with the specific leading-edge algorithm studied independently by Kim and Eng (1995) and Pongratz (2005 and 2007).

  14. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Felix Krueger

    2016-06-01

    Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in an allele-specific fashion. SNPsplit is an allele-specific alignment sorter designed to read files in SAM/BAM format and determine the allelic origin of reads or read-pairs that cover known single nucleotide polymorphic (SNP) positions. For this to work, libraries must have been aligned to a genome in which all known SNP positions were masked with the ambiguity base ’N’ and aligned using a suitable mapping program such as Bowtie2, TopHat, STAR, HISAT2, HiCUP or Bismark. SNPsplit also provides an automated solution to generate N-masked reference genomes for hybrid mouse strains based on the variant call information provided by the Mouse Genomes Project. The unique ability of SNPsplit to work with various different kinds of sequencing data including RNA-Seq, ChIP-Seq, Bisulfite-Seq or Hi-C opens new avenues for the integrative exploration of allele-specific data.
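
    The idea of sorting reads by the bases they show at known SNP positions can be sketched as follows. This is an illustrative simplification, not SNPsplit's implementation: it assumes an ungapped read with a known 0-based start coordinate instead of parsing SAM/BAM records, and the function name `assign_allele` and the category labels are hypothetical.

```python
def assign_allele(read_seq, ref_start, snps):
    """Classify an ungapped read by the bases it shows at known SNP positions.

    read_seq:  read bases as aligned to the N-masked reference (no indels assumed)
    ref_start: 0-based reference coordinate of the first read base
    snps:      dict mapping reference position -> (genome1_base, genome2_base)
    """
    votes1 = votes2 = 0
    for offset, base in enumerate(read_seq):
        alleles = snps.get(ref_start + offset)
        if alleles is None:
            continue  # not a known SNP position
        if base == alleles[0]:
            votes1 += 1
        elif base == alleles[1]:
            votes2 += 1
        # any other base (for example a sequencing error) casts no vote
    if votes1 and votes2:
        return "conflicting"
    if votes1:
        return "genome1"
    if votes2:
        return "genome2"
    return "unassigned"


# Two known SNPs at reference positions 102 and 107 (hypothetical values).
snps = {102: ("A", "G"), 107: ("C", "T")}
for read in ("TTACCGTCAA", "TTGCCGTTAA", "TTACCGTTAA"):
    print(read, assign_allele(read, 100, snps))
```

    Masking the SNP positions with N in the reference matters here because it keeps the aligner from favoring whichever allele happens to be in the reference sequence.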

  15. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes [version 2; referees: 3 approved]

    Directory of Open Access Journals (Sweden)

    Felix Krueger

    2016-07-01

    Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in an allele-specific fashion. SNPsplit is an allele-specific alignment sorter designed to read files in SAM/BAM format and determine the allelic origin of reads or read-pairs that cover known single nucleotide polymorphic (SNP) positions. For this to work, libraries must have been aligned to a genome in which all known SNP positions were masked with the ambiguity base 'N' and aligned using a suitable mapping program such as Bowtie2, TopHat, STAR, HISAT2, HiCUP or Bismark. SNPsplit also provides an automated solution to generate N-masked reference genomes for hybrid mouse strains based on the variant call information provided by the Mouse Genomes Project. The unique ability of SNPsplit to work with various different kinds of sequencing data including RNA-Seq, ChIP-Seq, Bisulfite-Seq or Hi-C opens new avenues for the integrative exploration of allele-specific data.

  16. When whole-genome alignments just won't work: kSNP v2 software for alignment-free SNP discovery and phylogenetics of hundreds of microbial genomes.

    Science.gov (United States)

    Gardner, Shea N; Hall, Barry G

    2013-01-01

    Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from Genbank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four "raw read" genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths.
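
    Alignment-free SNP discovery from k-mers can be pictured as looking for k-mers that share both flanks but differ at the central base. The sketch below illustrates that general idea only; it ignores reverse complements, draft assemblies and raw reads, and it should not be read as kSNP's actual algorithm. The function name `kmer_snps` and the two toy genomes are hypothetical.

```python
from collections import defaultdict


def kmer_snps(genomes, k=7):
    """Find candidate SNPs as k-mers that differ only at the central base.

    genomes: dict mapping genome name -> sequence string
    Returns {(left_flank, right_flank): {genome_name: central_base}} for loci
    where at least two different central bases are observed across genomes.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    half = k // 2
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            loci[(kmer[:half], kmer[half + 1:])][name].add(kmer[half])
    snps = {}
    for flanks, per_genome in loci.items():
        # Skip loci that are ambiguous within one genome (repeated flanks)
        if any(len(bases) > 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[flanks] = alleles
    return snps


toy = {"a": "ACGTACGGTCA", "b": "ACGTATGGTCA"}
print(kmer_snps(toy, k=7))
```

    No reference genome or multiple alignment is needed in this scheme, because the flanking sequence itself serves as the locus identifier. The choice of k trades specificity of the flanks against tolerance for nearby variation.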

  17. C-Sibelia: an easy-to-use and highly accurate tool for bacterial genome comparison [v1; ref status: indexed, http://f1000r.es/27n]

    Directory of Open Access Journals (Sweden)

    Ilya Minkin

    2013-11-01

    We present C-Sibelia, a highly accurate and easy-to-use software tool for comparing two closely related bacterial genomes, which can be presented as either finished sequences or fragmented assemblies. C-Sibelia takes as input two FASTA files and produces: (1) a VCF file containing all identified single nucleotide variations and indels; (2) an XMFA file containing alignment information. The software also produces Circos diagrams visualizing high level genomic architecture for rearrangement analyses. C-Sibelia is a part of the Sibelia comparative genomics suite, which is freely available under the GNU GPL v.2 license at http://sourceforge.net/projects/sibelia-bio. C-Sibelia is compatible with Unix-like operating systems. A web-based version of the software is available at http://etool.me/software/csibelia.

  18. Accurate Prediction of the Statistics of Repetitions in Random Sequences: A Case Study in Archaea Genomes.

    Science.gov (United States)

    Régnier, Mireille; Chassignet, Philippe

    2016-01-01

    Repetitive patterns in genomic sequences have great biological significance and also algorithmic implications. Analytic combinatorics allows the derivation of formulas for the expected length of repetitions in a random sequence. Asymptotic results, which generalize previous work on a binary alphabet, are easily computable. Simulations on random sequences show their accuracy. As an application, the sample case of Archaea genomes illustrates how biological sequences may differ from random sequences.

  19. Can a semi-automated surface matching and principal axis-based algorithm accurately quantify femoral shaft fracture alignment in six degrees of freedom?

    Science.gov (United States)

    Crookshank, Meghan C; Beek, Maarten; Singh, Devin; Schemitsch, Emil H; Whyne, Cari M

    2013-07-01

    Accurate alignment of femoral shaft fractures treated with intramedullary nailing remains a challenge for orthopaedic surgeons. The aim of this study is to develop and validate a cone-beam CT-based, semi-automated algorithm to quantify the malalignment in six degrees of freedom (6DOF) using a surface matching and principal axes-based approach. Complex comminuted diaphyseal fractures were created in nine cadaveric femora and cone-beam CT images were acquired (27 cases total). Scans were cropped and segmented using intensity-based thresholding, producing superior, inferior and comminution volumes. Cylinders were fit to estimate the long axes of the superior and inferior fragments. The angle and distance between the two cylindrical axes were calculated to determine flexion/extension and varus/valgus angulation and medial/lateral and anterior/posterior translations, respectively. Both surfaces were unwrapped about the cylindrical axes. Three methods of matching the unwrapped surface for determination of periaxial rotation were compared based on minimizing the distance between features. The calculated corrections were compared to the input malalignment conditions. All 6DOF were calculated to within current clinical tolerances for all but two cases. This algorithm yielded accurate quantification of malalignment of femoral shaft fractures for fracture gaps up to 60 mm, based on a single CBCT image of the fractured limb.

  20. Multidimensional Genome-wide Analyses Show Accurate FVIII Integration by ZFN in Primary Human Cells

    Science.gov (United States)

    Sivalingam, Jaichandran; Kenanov, Dimitar; Han, Hao; Nirmal, Ajit Johnson; Ng, Wai Har; Lee, Sze Sing; Masilamani, Jeyakumar; Phan, Toan Thang; Maurer-Stroh, Sebastian; Kon, Oi Lian

    2016-01-01

    Costly coagulation factor VIII (FVIII) replacement therapy is a barrier to optimal clinical management of hemophilia A. Therapy using FVIII-secreting autologous primary cells is potentially efficacious and more affordable. Zinc finger nucleases (ZFN) mediate transgene integration into the AAVS1 locus but comprehensive evaluation of off-target genome effects is currently lacking. In light of serious adverse effects in clinical trials which employed genome-integrating viral vectors, this study evaluated potential genotoxicity of ZFN-mediated transgenesis using different techniques. We employed deep sequencing of predicted off-target sites, copy number analysis, whole-genome sequencing, and RNA-seq in primary human umbilical cord-lining epithelial cells (CLECs) with AAVS1 ZFN-mediated FVIII transgene integration. We combined molecular features to enhance the accuracy and activity of ZFN-mediated transgenesis. Our data showed a low frequency of ZFN-associated indels, no detectable off-target transgene integrations or chromosomal rearrangements. ZFN-modified CLECs had very few dysregulated transcripts and no evidence of activated oncogenic pathways. We also showed AAVS1 ZFN activity and durable FVIII transgene secretion in primary human dermal fibroblasts, bone marrow- and adipose tissue-derived stromal cells. Our study suggests that, with close attention to the molecular design of genome-modifying constructs, AAVS1 ZFN-mediated FVIII integration in several primary human cell types may be safe and efficacious. PMID:26689265

  1. Bisulfite-based epityping on pooled genomic DNA provides an accurate estimate of average group DNA methylation

    Directory of Open Access Journals (Sweden)

    Docherty Sophia J

    2009-03-01

    Background: DNA methylation plays a vital role in normal cellular function, with aberrant methylation signatures being implicated in a growing number of human pathologies and complex human traits. Methods based on the modification of genomic DNA with sodium bisulfite are considered the 'gold-standard' for DNA methylation profiling on genomic DNA; however, they require relatively large amounts of DNA and may be prohibitively expensive when used on the large sample sizes necessary to detect small effects. We propose that a high-throughput DNA pooling approach will facilitate the use of emerging methylomic profiling techniques in large samples. Results: Compared with data generated from 89 individual samples, our analysis of 205 CpG sites spanning nine independent regions of the genome demonstrates that DNA pools can be used to provide an accurate and reliable quantitative estimate of average group DNA methylation. Comparison of data generated from the pooled DNA samples with results averaged across the individual samples comprising each pool revealed highly significant correlations for individual CpG sites across all nine regions, with an average overall correlation across all regions and pools of 0.95 (95% bootstrapped confidence intervals: 0.94 to 0.96). Conclusion: In this study we demonstrate the validity of using pooled DNA samples to accurately assess group DNA methylation averages. Such an approach can be readily applied to the assessment of disease phenotypes, reducing the time, cost and amount of DNA starting material required for large-scale epigenetic analyses.

  2. Creating and evaluating accurate CRISPR-Cas9 scalpels for genomic surgery.

    Science.gov (United States)

    Bolukbasi, Mehmet Fatih; Gupta, Ankit; Wolfe, Scot A

    2016-01-01

    The simplicity of site-specific genome targeting by type II clustered, regularly interspaced, short palindromic repeat (CRISPR)-Cas9 nucleases, along with their robust activity profile, has changed the landscape of genome editing. These favorable properties have made the CRISPR-Cas9 system the technology of choice for sequence-specific modifications in vertebrate systems. For many applications, whether the focus is on basic science investigations or therapeutic efficacy, activity and precision are important considerations when one is choosing a nuclease platform, target site and delivery method. Here we review recent methods for increasing the activity and accuracy of Cas9 and assessing the extent of off-target cleavage events.

  3. Accurate DNA assembly and genome engineering with optimized uracil excision cloning

    DEFF Research Database (Denmark)

    Cavaleiro, Mafalda; Kim, Se Hyeuk; Seppala, Susanna;

    2015-01-01

    Simple and reliable DNA editing by uracil excision (a.k.a. USER cloning) has been described by several research groups, but the optimal design of cohesive DNA ends for multigene assembly remains elusive. Here, we use two model constructs based on expression of gfp and a four-gene pathway that produces β-carotene to optimize assembly junctions and the uracil excision protocol. By combining uracil excision cloning with a genomic integration technology, we demonstrate that up to six DNA fragments can be assembled in a one-tube reaction for direct genome integration with high accuracy, greatly...

  4. PipMaker—A Web Server for Aligning Two Genomic DNA Sequences

    OpenAIRE

    Schwartz, Scott; Zhang, Zheng; Frazer, Kelly A; Smit, Arian; Riemer, Cathy; Bouck, John; Gibbs, Richard; Hardison, Ross; Miller, Webb

    2000-01-01

    PipMaker (http://bio.cse.psu.edu) is a World-Wide Web site for comparing two long DNA sequences to identify conserved segments and for producing informative, high-resolution displays of the resulting alignments. One display is a percent identity plot (pip), which shows both the position in one sequence and the degree of similarity for each aligning segment between the two sequences in a compact and easily understandable form. Positions along the horizontal axis can be labeled with features su...
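
    The data behind a percent identity plot can be pictured as one (position, identity) record per aligning segment. The sketch below assumes that a segment is a gap-free run of a gapped pairwise alignment, which is a simplification chosen for illustration; the function name `pip_segments` and the toy alignment are hypothetical and this is not PipMaker's code.

```python
def pip_segments(aln1, aln2):
    """Split a gapped pairwise alignment into gap-free segments.

    aln1, aln2: equal-length aligned strings, with '-' marking gaps.
    Returns (start, end, percent_identity) tuples, where start and end are
    1-based positions in the first (ungapped) sequence.
    """
    segments = []
    pos1 = 0        # bases of sequence 1 consumed so far
    start = None    # sequence-1 position where the current segment began
    matches = length = 0
    for c1, c2 in zip(aln1, aln2):
        if c1 != "-":
            pos1 += 1
        if c1 == "-" or c2 == "-":
            # A gap in either sequence closes the current segment
            if length:
                segments.append((start, start + length - 1, 100.0 * matches / length))
            start, matches, length = None, 0, 0
            continue
        if start is None:
            start = pos1
        length += 1
        matches += c1 == c2
    if length:
        segments.append((start, start + length - 1, 100.0 * matches / length))
    return segments


for seg in pip_segments("ACGT-ACGTA", "ACCTTAC-TA"):
    print(seg)
```

    Plotting each tuple as a horizontal line from start to end at the height of its percent identity gives the compact display the abstract describes.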

  5. CVTree3 Web Server for Whole-genome-based and Alignment-free Prokaryotic Phylogeny and Taxonomy.

    Science.gov (United States)

    Zuo, Guanghong; Hao, Bailin

    2015-10-01

    A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial on lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks as well as produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of taxonomic placement of some newly-sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements.

  6. CVTree3 Web Server for Whole-genome-based and Alignment-free Prokaryotic Phylogeny and Taxonomy

    Directory of Open Access Journals (Sweden)

    Guanghong Zuo

    2015-10-01

    A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial on lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks as well as produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of taxonomic placement of some newly-sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements.

  7. An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.

    Science.gov (United States)

    Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir

    2013-01-01

    DNA sequence alignment is a cardinal process in computational biology, but it is computationally expensive on traditional platforms such as CPUs. Of the many off-the-shelf platforms explored for speeding up the computation, the FPGA stands out as the best candidate because of its performance per dollar and performance per watt. These two advantages make the FPGA the most appropriate choice for realizing the aim of personal genomics. Previous implementations of DNA sequence alignment did not take into consideration the price of the device on which the optimization was performed. This paper presents optimizations over a previous FPGA implementation that increase the overall speed-up achieved while also taking the price of the optimized platform into account. The optimizations are: (1) the array of processing elements runs on changes in the input value rather than on the clock, eliminating the need for tight clock synchronization; (2) the implementation is not restricted by the size of the sequences to be aligned; (3) the waiting time required to load the sequences onto the FPGA is reduced to the minimum possible; and (4) an efficient method is devised to store the output matrix, which makes it possible to save the diagonal elements for use in the next pass in parallel with the computation of the output matrix. Implemented on a Spartan3 FPGA, this design achieved a 20-fold performance improvement in terms of CUPS over a GPP implementation.
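
    Why an array of processing elements suits this problem, and what CUPS (cell updates per second) measures, can be shown in software. The abstract does not name the alignment algorithm or scoring scheme, so the sketch below assumes a Smith-Waterman-style local alignment recurrence with a linear gap penalty; the function name `sw_score_wavefront` and the scoring values are illustrative choices, not taken from the paper.

```python
import time


def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score, filling the matrix one anti-diagonal at a time."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    # Cells with the same i + j depend only on earlier anti-diagonals,
    # so a hardware array could update all of them in parallel.
    for d in range(2, rows + cols - 1):
        for i in range(max(1, d - cols + 1), min(rows, d)):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(0, h[i - 1][j - 1] + s, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best


# CUPS = number of matrix cells updated / elapsed seconds
a, b = "ACGTTGCA" * 20, "ACGATGCA" * 20
t0 = time.perf_counter()
score = sw_score_wavefront(a, b)
elapsed = time.perf_counter() - t0
print(f"score={score}, CUPS={len(a) * len(b) / elapsed:.3g}")
```

    The Python loop visits the cells of each anti-diagonal one after another. Dedicated hardware can update them simultaneously, which is where speed-ups measured in CUPS come from.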

  8. High-throughput automated microfluidic sample preparation for accurate microbial genomics

    Science.gov (United States)

    Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B.; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P.; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C.

    2017-01-01

    Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells to sequence library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications. PMID:28128213

  9. Sequencing and alignment of mitochondrial genomes of Tibetan chicken and two lowland chicken breeds

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Tibetan chicken lives in high-altitude area and has adapted well to hypoxia genetically. Shouguang chicken and Silky chicken are both lowland chicken breeds. In the present study, the complete mitochondrial genome sequences of the three chicken breeds were all sequenced. The results showed that the mitochondrial DNAs (mtDNAs) of Shouguang chicken and Silky chicken consist of 16784 bp and 16785 bp respectively, and Tibetan chicken mitochondrial genome varies from 16784 bp to 16786 bp. After sequence analysis, 120 mutations, including 4 single nucleotide polymorphisms (SNPs) in tRNA genes, 9 SNPs and 1 insertion in rRNA genes, 38 SNPs and 1 deletion in D-LOOP, 66 SNPs in protein-coding genes, were found. This work will provide clues for the future study on the association between mitochondrial genes and the adaptation to hypoxia. Keywords: Tibetan chicken, lowland chicken, mitochondrial genome, hypoxia.

  10. Sequencing and alignment of mitochondrial genomes of Tibetan chicken and two lowland chicken breeds

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Tibetan chicken lives in high-altitude area and has adapted well to hypoxia genetically. Shouguang chicken and Silky chicken are both lowland chicken breeds. In the present study, the complete mitochondrial genome sequences of the three chicken breeds were all sequenced. The results showed that the mitochondrial DNAs (mtDNAs) of Shouguang chicken and Silky chicken consist of 16784 bp and 16785 bp respectively, and Tibetan chicken mitochondrial genome varies from 16784 bp to 16786 bp. After sequence analysis, 120 mutations, including 4 single nucleotide polymorphisms (SNPs) in tRNA genes, 9 SNPs and 1 insertion in rRNA genes, 38 SNPs and 1 deletion in D-LOOP, 66 SNPs in protein-coding genes, were found. This work will provide clues for the future study on the association between mitochondrial genes and the adaptation to hypoxia.

  11. Splign: algorithms for computing spliced alignments with identification of paralogs

    Directory of Open Access Journals (Sweden)

    Tatusova Tatiana

    2008-05-01

    Background: The computation of accurate alignments of cDNA sequences against a genome is at the foundation of modern genome annotation pipelines. Several factors, such as the presence of paralogs, small exons, non-consensus splice signals, sequencing errors and polymorphic sites, pose recognized difficulties to existing spliced alignment algorithms. Results: We describe a set of algorithms behind a tool called Splign for computing cDNA-to-genome alignments. The algorithms include a high-performance preliminary alignment, a compartment identification based on a formally defined model of adjacent duplicated regions, and a refined sequence alignment. In a series of tests, Splign has produced more accurate results than other tools commonly used to compute spliced alignments, in a reasonable amount of time. Conclusion: Splign's ability to deal with various issues complicating the spliced alignment problem makes it a helpful tool in eukaryotic genome annotation processes and alternative splicing studies. Its performance is sufficient to align the largest currently available pools of cDNA data, such as the human EST set, on a moderate-sized computing cluster in a matter of hours. The duplication identification (compartmentization) algorithm can be used independently in other areas such as the study of pseudogenes. Reviewers: This article was reviewed by Steven Salzberg, Arcady Mushegian and Andrey Mironov (nominated by Mikhail Gelfand).

  12. Alignment validation

    Energy Technology Data Exchange (ETDEWEB)

    ALICE; ATLAS; CMS; LHCb; Golling, Tobias

    2008-09-06

    The four experiments, ALICE, ATLAS, CMS and LHCb, are currently under construction at CERN. They will study the products of proton-proton collisions at the Large Hadron Collider. All experiments are equipped with sophisticated tracking systems, unprecedented in size and complexity. Full exploitation of both the inner detector and the muon system requires an accurate alignment of all detector elements. Alignment information is deduced from dedicated hardware alignment systems and the reconstruction of charged particles. However, the system is degenerate, which means the data are insufficient to constrain all alignment degrees of freedom, so the techniques are prone to converging on wrong geometries. This deficiency necessitates validation and monitoring of the alignment. An exhaustive discussion of means of validation is the subject of this document, including examples and plans from all four LHC experiments, as well as other high energy experiments.

  13. Input/Output Scalability of Genomic Alignment: How to Configure a Computational Biology Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Vaidyanathan, P; Madhyastha, T M; Jones, T

    2001-10-03

    Many scientific applications are I/O-intensive, which makes optimization and scaling difficult, especially on parallel architectures. The I/O requirements of computational biology applications are different from other scientific applications. The main difference is that many computational biology applications are embarrassingly parallel and require repeated read-only access to a large global database. In this paper we examine the scalability of an embarrassingly parallel computational biology application: psLayout, which played a crucial role in the mapping of the human genome. This study was carried out on three architectures: the native UCSC Linux cluster, a Linux cluster at Lawrence Livermore National Labs with a faster interconnect and NFS server, and the ASCI Blue-Pacific supercomputer. We show that a cluster equipped with a fast network and parallel file system or a scalable NFS server has reasonable I/O scalability. We believe that replication is an important issue when scaling to larger numbers of processors, and we introduce the design of a library for automatic data replication to address this issue.

  14. The Oryza map alignment project: Construction, alignment and analysis of 12 BAC fingerprint/end sequence framework physical maps that represent the 10 genome types of genus Oryza

    Science.gov (United States)

    The Oryza Map Alignment Project (OMAP) provides the first comprehensive experimental system for understanding the evolution, physiology and biochemistry of a full genus in plants or animals. We have constructed twelve deep-coverage BAC libraries that are representative of both diploid and tetraploid...

  15. Long Read Alignment with Parallel MapReduce Cloud Platform.

    Science.gov (United States)

    Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing gene sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of the Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.
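
    As a rough illustration of the Smith-Waterman step named in the abstract above, the following minimal Python sketch computes a local alignment score by dynamic programming. The function name, the scoring values (match, mismatch, gap) and the linear gap penalty are assumptions chosen for brevity; this is not the BWA-SW or BWASW-PMR implementation, which adds indexing, heuristics and parallel execution.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Hypothetical helper: linear gap penalty, score only (no traceback).
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

    Only two rows of the matrix are kept at a time, so memory grows with the length of the shorter input rather than with the product of both lengths.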

  16. Archived neonatal dried blood spot samples can be used for accurate whole genome and exome-targeted next-generation sequencing

    DEFF Research Database (Denmark)

    Hollegaard, Mads Vilhelm; Grauholm, Jonas; Nielsen, Ronni;

    2013-01-01

    Dried blood spot samples (DBSS) have been collected and stored for decades as part of newborn screening programmes worldwide. Representing almost an entire population under a certain age and collected with virtually no bias, the Newborn Screening Biobanks are of immense value in medical studies......, for example, to examine the genetics of various disorders. We have previously demonstrated that DNA extracted from a fraction (2×3.2mm discs) of an archived DBSS can be whole genome amplified (wgaDNA) and used for accurate array genotyping. However, until now, it has been uncertain whether wgaDNA from DBSS...... can be used for accurate whole genome sequencing (WGS) and exome sequencing (WES). This study examined two individuals represented by three different types of samples each: whole-blood (reference samples), 3-year-old DBSS spotted with reference material (refDBSS), and 27- to 29-year-old archived...

  17. Accurate Localization of the Integration Sites of Two Genomic Islands at Single-Nucleotide Resolution in the Genome of Bacillus cereus ATCC 10987

    Directory of Open Access Journals (Sweden)

    Ren Zhang

    2008-01-01

    We have identified two genomic islands, that is, BCEGI-1 and BCEGI-2, in the genome of Bacillus cereus ATCC 10987, based on comparative analysis with Bacillus cereus ATCC 14579. Furthermore, by using the cumulative GC profile and performing homology searches between the two genomes, the integration sites of the two genomic islands were determined at single-nucleotide resolution. BCEGI-1 is integrated between 159705 bp and 198000 bp, whereas BCEGI-2 is integrated between the end of ORF BCE4594 and the start of the intergenic sequence immediately following BCE4626, that is, from 4256803 bp to 4285534 bp. BCEGI-1 harbors two bacterial Tn7 transposons, which have two sets of genes encoding TnsA, B, C, and D. It is generally believed that unlike the TnsABC+E pathway, the TnsABC+D pathway would only promote vertical transmission to daughter cells. The evidence presented in this paper, however, suggests a role of the TnsABC+D pathway in the horizontal transfer of some genomic islands.

  18. Long Read Alignment with Parallel MapReduce Cloud Platform

    Directory of Open Access Journals (Sweden)

    Ahmed Abdulhakim Al-Absi

    2015-01-01

    Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing gene sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner’s Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of the Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms.

  19. A context dependent pair hidden Markov model for statistical alignment

    CERN Document Server

    Arribas-Gil, Ana

    2011-01-01

    This article proposes a novel approach to statistical alignment of nucleotide sequences by introducing a context dependent structure on the substitution process in the underlying evolutionary model. We propose to estimate alignments and context dependent mutation rates relying on the observation of two homologous sequences. The procedure is based on a generalized pair-hidden Markov structure, where conditional on the alignment path, the nucleotide sequences follow a Markov distribution. We use a stochastic approximation expectation maximization (saem) algorithm to give accurate estimators of parameters and alignments. We provide results both on simulated data and vertebrate genomes, which are known to have a high mutation rate from CG dinucleotide. In particular, we establish that the method improves the accuracy of the alignment of a human pseudogene and its functional gene.

  20. Roles of the Y-family DNA polymerase Dbh in accurate replication of the Sulfolobus genome at high temperature.

    Science.gov (United States)

    Sakofsky, Cynthia J; Foster, Patricia L; Grogan, Dennis W

    2012-04-01

    The intrinsically thermostable Y-family DNA polymerases of Sulfolobus spp. have revealed detailed three-dimensional structure and catalytic mechanisms of trans-lesion DNA polymerases, yet their functions in maintaining their native genomes remain largely unexplored. To identify functions of the Y-family DNA polymerase Dbh in replicating the Sulfolobus genome under extreme conditions, we disrupted the dbh gene in Sulfolobus acidocaldarius and characterized the resulting mutant strains phenotypically. Disruption of dbh did not cause any obvious growth defect, sensitivity to any of several DNA-damaging agents, or change in overall rate of spontaneous mutation at a well-characterized target gene. Loss of dbh did, however, cause significant changes in the spectrum of spontaneous forward mutation in each of two orthologous target genes of different sequence. Relative to wild-type strains, dbh(-) constructs exhibited fewer frame-shift and other small insertion-deletion mutations, but exhibited more base-pair substitutions that converted G:C base pairs to T:A base pairs. These changes, which were confirmed to be statistically significant, indicate two distinct activities of the Dbh polymerase in Sulfolobus cells growing under nearly optimal culture conditions (78-80°C and pH 3). The first activity promotes slipped-strand events within simple repetitive motifs, such as mononucleotide runs or triplet repeats, and the second promotes insertion of C opposite a potentially miscoding form of G, thereby avoiding G:C to T:A transversions.

  1. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.

    Directory of Open Access Journals (Sweden)

    Bryan N Howie

    2009-06-01

    Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%-20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions.

  2. Accurate Breakpoint Mapping in Apparently Balanced Translocation Families with Discordant Phenotypes Using Whole Genome Mate-Pair Sequencing

    Science.gov (United States)

    Aristidou, Constantia; Koufaris, Costas; Theodosiou, Athina; Bak, Mads; Mehrjouy, Mana M.; Behjati, Farkhondeh; Tanteles, George; Christophidou-Anastasiadou, Violetta; Tommerup, Niels

    2017-01-01

    Familial apparently balanced translocations (ABTs) segregating with discordant phenotypes are extremely challenging for interpretation and counseling due to the scarcity of publications and lack of routine techniques for quick investigation. Recently, next generation sequencing has emerged as an efficacious methodology for precise detection of translocation breakpoints. However, studies so far have mainly focused on de novo translocations. The present study focuses specifically on familial cases in order to shed some light on this diagnostic dilemma. Whole-genome mate-pair sequencing (WG-MPS) was applied to map the breakpoints in nine two-way ABT carriers from four families. Translocation breakpoints and patient-specific structural variants were validated by Sanger sequencing and quantitative Real Time PCR, respectively. Identical sequencing patterns and breakpoints were identified in affected and non-affected members carrying the same translocations. PTCD1, ATP5J2-PTCD1, CADPS2, and STPG1 were disrupted by the translocations in three families, rendering them initially as possible disease candidate genes. However, subsequent mutation screening and structural variant analysis did not reveal any pathogenic mutations or unique variants in the affected individuals that could explain the phenotypic differences between carriers of the same translocations. In conclusion, we suggest that NGS-based methods, such as WG-MPS, can be successfully used for detailed mapping of translocation breakpoints, which can also be used in routine clinical investigation of ABT cases. Unlike de novo translocations, no associations were determined here between familial two-way ABTs and the phenotype of the affected members, in which the presence of cryptic imbalances and complex chromosomal rearrangements has been excluded. Future whole-exome or whole-genome sequencing will potentially reveal unidentified mutations in the patients underlying the discordant phenotypes within each family. In

  3. Accurate variant detection across non-amplified and whole genome amplified DNA using targeted next generation sequencing

    Directory of Open Access Journals (Sweden)

    ElSharawy Abdou

    2012-09-01

    Background: Many hypothesis-driven genetic studies require the ability to comprehensively and efficiently target specific regions of the genome to detect sequence variations. Often, sample availability is limited, requiring the use of whole genome amplification (WGA). We evaluated a high-throughput microdroplet-based PCR approach in combination with next generation sequencing (NGS) to target 384 discrete exons from 373 genes involved in cancer. In our evaluation, we compared the performance of six non-amplified gDNA samples from two HapMap family trios. Three of these samples were also preamplified by WGA and evaluated. We tested sample pooling or multiplexing strategies at different stages of the tested targeted NGS (T-NGS) workflow. Results: The results demonstrated comparable sequence performance between non-amplified and preamplified samples and between different indexing strategies [sequence specificity of 66.0% ± 3.4%, uniformity (coverage at 0.2× of the mean) of 85.6% ± 0.6%]. The average genotype concordance maintained across all the samples was 99.5% ± 0.4%, regardless of sample type or pooling strategy. We did not detect any errors in the Mendelian patterns of inheritance of genotypes between the parents and offspring within each trio. We also demonstrated the ability to detect minor allele frequencies within the pooled samples that conform to predicted models. Conclusion: Our described PCR-based sample multiplex approach and the ability to use WGA material for NGS may enable researchers to perform deep resequencing studies and explore variants at very low frequencies and cost.

  4. Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

    Science.gov (United States)

    Kosugi, Shunichi; Natsume, Satoshi; Yoshida, Kentaro; MacLean, Daniel; Cano, Liliana; Kamoun, Sophien; Terauchi, Ryohei

    2013-01-01

    Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. PMID:24116042

  5. Coval: improving alignment quality and variant calling accuracy for next-generation sequencing data.

    Directory of Open Access Journals (Sweden)

    Shunichi Kosugi

    Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in 'targeted' alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.

  6. SPA: a probabilistic algorithm for spliced alignment.

    Directory of Open Access Journals (Sweden)

    Erik van Nimwegen

    2006-04-01

    Recent large-scale cDNA sequencing efforts show that elaborate patterns of splice variation are responsible for much of the proteome diversity in higher eukaryotes. To obtain an accurate account of the repertoire of splice variants, and to gain insight into the mechanisms of alternative splicing, it is essential that cDNAs are very accurately mapped to their respective genomes. Currently available algorithms for cDNA-to-genome alignment do not reach the necessary level of accuracy because they use ad hoc scoring models that cannot correctly trade off the likelihoods of various sequencing errors against the probabilities of different gene structures. Here we develop a Bayesian probabilistic approach to cDNA-to-genome alignment. Gene structures are assigned prior probabilities based on the lengths of their introns and exons, and based on the sequences at their splice boundaries. A likelihood model for sequencing errors takes into account the rates at which misincorporation, as well as insertions and deletions of different lengths, occurs during sequencing. The parameters of both the prior and likelihood model can be automatically estimated from a set of cDNAs, thus enabling our method to adapt itself to different organisms and experimental procedures. We implemented our method in a fast cDNA-to-genome alignment program, SPA, and applied it to the FANTOM3 dataset of over 100,000 full-length mouse cDNAs and a dataset of over 20,000 full-length human cDNAs. Comparison with the results of four other mapping programs shows that SPA produces alignments of significantly higher quality. In particular, the quality of the SPA alignments near splice boundaries and SPA's mapping of the 5' and 3' ends of the cDNAs are highly improved, allowing for more accurate identification of transcript starts and ends, and accurate identification of subtle splice variations. Finally, our splice boundary analysis on the human dataset suggests the existence of a novel non

  7. A rank-based sequence aligner with applications in phylogenetic analysis.

    Directory of Open Access Journals (Sweden)

    Liviu P Dinu

    Recent tools for aligning short DNA reads have been designed to optimize the trade-off between correctness and speed. This paper introduces a method for assigning a set of short DNA reads to a reference genome, under Local Rank Distance (LRD). The rank-based aligner proposed in this work aims to improve correctness over speed. However, some indexing strategies to speed up the aligner are also investigated. The LRD aligner is improved in terms of speed by storing [Formula: see text]-mer positions in a hash table for each read. Another improvement, which produces an approximate LRD aligner, is to consider only the positions in the reference that are likely to represent a good positional match of the read. The proposed aligner is evaluated and compared to other state-of-the-art alignment tools in several experiments. One set of experiments is conducted to determine the precision and the recall of the proposed aligner in the presence of contaminated reads. In another set of experiments, the proposed aligner is used to find the order, the family, or the species of a new (or unknown) organism, given only a set of short Next-Generation Sequencing DNA reads. The empirical results show that the aligner proposed in this work is highly accurate from a biological point of view. Compared to the other evaluated tools, the LRD aligner has the important advantage of being very accurate even for a very low base coverage. Thus, the LRD aligner can be considered a good alternative to standard alignment tools, especially when the accuracy of the aligner is of high importance. Source code and UNIX binaries of the aligner are freely available for future development and use at http://lrd.herokuapp.com/aligners. The software is implemented in C++ and Java, being supported on UNIX and MS Windows.
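
    The indexing idea mentioned in the abstract above (storing the positions of short subsequences in a hash table so that only promising reference positions are examined) can be sketched in Python as follows. This is a generic seed-and-vote illustration, not the LRD aligner: the function names and the parameter k are hypothetical, the abstract's own symbol is elided, and the sketch indexes the reference rather than each read, which is an assumption made for simplicity.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map every length-k substring of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_positions(read, index, k):
    """Vote for reference start positions implied by exact k-mer hits.

    Returns (start, votes) pairs, most supported first. A full aligner
    would then score only these candidates with its distance measure.
    """
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset  # where the read would begin in the reference
            if start >= 0:
                votes[start] += 1
    return votes.most_common()


if __name__ == "__main__":
    ref = "ACGTACGTTTGACCA"
    idx = build_kmer_index(ref, 3)
    print(candidate_positions("TTGAC", idx, 3))
```

    Restricting the expensive comparison to the top-voted positions is what makes such an aligner approximate: a true match with no exact seed hit is never examined.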

  8. Carbohydrate catabolic flexibility in the mammalian intestinal commensal Lactobacillus ruminis revealed by fermentation studies aligned to genome annotations

    LENUS (Irish Health Repository)

    2011-08-30

    Background: Lactobacillus ruminis is a poorly characterized member of the Lactobacillus salivarius clade that is part of the intestinal microbiota of pigs, humans and other mammals. Its variable abundance in humans and animals may be linked to historical changes over time and geographical differences in dietary intake of complex carbohydrates. Results: In this study, we investigated the ability of nine L. ruminis strains of human and bovine origin to utilize fifty carbohydrates including simple sugars, oligosaccharides, and prebiotic polysaccharides. The growth patterns were compared with metabolic pathways predicted by annotation of a high quality draft genome sequence of ATCC 25644 (human isolate) and the complete genome of ATCC 27782 (bovine isolate). All of the strains tested utilized prebiotics including fructooligosaccharides (FOS), soybean-oligosaccharides (SOS) and 1,3:1,4-β-D-gluco-oligosaccharides to varying degrees. Six strains isolated from humans utilized FOS-enriched inulin, as well as FOS. In contrast, three strains isolated from cows grew poorly in FOS-supplemented medium. In general, carbohydrate utilisation patterns were strain-dependent and also varied depending on the degree of polymerisation or complexity of structure. Six putative operons were identified in the genome of the human isolate ATCC 25644 for the transport and utilisation of the prebiotics FOS, galacto-oligosaccharides (GOS), SOS, and 1,3:1,4-β-D-gluco-oligosaccharides. One of these comprised a novel FOS utilisation operon with predicted capacity to degrade chicory-derived FOS. However, only three of these operons were identified in the ATCC 27782 genome, which might account for the utilisation of only SOS and 1,3:1,4-β-D-gluco-oligosaccharides. Conclusions: This study has provided definitive genome-based evidence to support the fermentation patterns of nine strains of Lactobacillus ruminis, and has linked it to gene distribution patterns in strains from different sources

  9. A fast and accurate method to detect allelic genomic imbalances underlying mosaic rearrangements using SNP array data

    Directory of Open Access Journals (Sweden)

    Pique-Regi Roger

    2011-05-01

    Background: Mosaicism for copy number and copy neutral chromosomal rearrangements has been recently identified as a relatively common source of genetic variation in the normal population. However, its prevalence is poorly defined, since it has been studied systematically in only one large-scale study, using non-optimal ad hoc SNP array data analysis tools and uncovering rather large alterations (> 1 Mb) affecting a high proportion of cells. Here we propose a novel methodology, Mosaic Alteration Detection (MAD), by providing a software tool that is effective for capturing previously described alterations as well as new variants that are smaller in size and/or affect a low percentage of cells. Results: The developed method identified all previously known mosaic abnormalities reported in SNP array data obtained from controls, bladder cancer and HapMap individuals. In addition, the MAD tool was able to detect new mosaic variants not reported before that were smaller in size and affected a lower percentage of cells. The performance of the tool was analysed by studying simulated data for different scenarios. Our method showed high sensitivity and specificity for all assessed scenarios. Conclusions: The tool presented here has the ability to identify mosaic abnormalities with high sensitivity and specificity. Our results confirm the lack of sensitivity of former methods by identifying new mosaic variants not reported in previously utilised datasets. Our work suggests that the prevalence of mosaic alterations could be higher than initially thought. The use of appropriate SNP array data analysis methods would help in defining the human genome mosaic map.

  10. Whole genome sequences of the USMARC beef cattle diversity panel v2.9 aligned to the bovine reference genome assembly

    Science.gov (United States)

    A searchable and publicly viewable set of mapped genomes from 96 beef sires from 19 popular breeds of U.S. cattle was created. These sires, with minimal pedigree relationships, represent >99% of the germplasm used in the US beef industry circa 2000. The group is estimated to contain more than 187 u...

  11. Horizontally Transferred Genetic Elements in the Tsetse Fly Genome: An Alignment-Free Clustering Approach Using Batch Learning Self-Organising Map (BLSOM)

    Science.gov (United States)

    Nakao, Ryo; Funayama, Shunsuke

    2016-01-01

    Tsetse flies (Glossina spp.) are the primary vectors of trypanosomes, which can cause human and animal African trypanosomiasis in Sub-Saharan African countries. The objective of this study was to explore the genome of Glossina morsitans morsitans for evidence of horizontal gene transfer (HGT) from microorganisms. We employed an alignment-free clustering method, that is, batch learning self-organising map (BLSOM), in which sequence fragments are clustered based on the similarity of oligonucleotide frequencies independently of sequence homology. After an initial scan of HGT events using BLSOM, we identified 3.8% of the tsetse fly genome as HGT candidates. The predicted donors of these HGT candidates included known symbionts, such as Wolbachia, as well as bacteria that have not previously been associated with the tsetse fly. We detected HGT candidates from diverse bacteria such as Bacillus and Flavobacteria, suggesting a past association between these taxa. Functional annotation revealed that the HGT candidates encoded loci in various functional pathways, such as metabolic and antibiotic biosynthesis pathways. These findings provide a basis for understanding the coevolutionary history of the tsetse fly and its microbes and establish the effectiveness of BLSOM for the detection of HGT events. PMID:28074180
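
    The alignment-free feature described in the abstract above, oligonucleotide frequency, can be sketched in Python as below. The function name and the default word length k=4 are assumptions for illustration; the sketch only builds the frequency vector that a clustering method such as a self-organising map would take as input, and it does not implement BLSOM itself.

```python
from itertools import product


def oligo_frequencies(seq, k=4):
    """Return the relative frequency of every length-k word over A, C, G, T.

    Words containing other symbols (for example N) are skipped. The vector
    has 4**k entries in lexicographic order (AAAA, AAAC, ...).
    """
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
            total += 1
    # Normalise so fragments of different lengths are comparable.
    return [counts[w] / total if total else 0.0 for w in words]


if __name__ == "__main__":
    vec = oligo_frequencies("ACGTACGTNACGT", k=2)
    print(len(vec), round(sum(vec), 6))
```

    Because the vector depends only on word composition, two fragments can be compared without aligning them, which is what allows sequences with no detectable homology to be clustered together.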

  12. AGORA: Assembly Guided by Optical Restriction Alignment

    Directory of Open Access Journals (Sweden)

    Lin Henry C

    2012-08-01

    Background: Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly software resolve repeats, thereby leading to a more complete reconstruction of genomes. Prior work has used optical maps for validating assemblies and scaffolding contigs, after an initial assembly has been produced. However, optical maps have not previously been used within the genome assembly process. Here, we use optical map information within the popular de Bruijn graph assembly paradigm to eliminate paths in the de Bruijn graph which are not consistent with the optical map and help determine the correct reconstruction of the genome. Results: We developed a new algorithm called AGORA: Assembly Guided by Optical Restriction Alignment. AGORA is the first algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. Our simulations on bacterial genomes show that AGORA is effective at producing assemblies closely matching the reference sequences. Additionally, we show that noise in the optical map can have a strong impact on the final assembly quality for some complex genomes, and we also measure how various characteristics of the starting de Bruijn graph may impact the quality of the final assembly. Lastly, we show that a proper choice of restriction enzyme for the optical map may substantially improve the quality of the final assembly. Conclusions: Our work shows that optical maps can be used effectively to assemble genomes within the de Bruijn graph assembly framework. Our experiments also provide insights into the characteristics of the mapping data that most affect the performance of our algorithm, indicating the

  13. Asexual populations of the human malaria parasite, Plasmodium falciparum, use a two-step genomic strategy to acquire accurate, beneficial DNA amplifications.

    Directory of Open Access Journals (Sweden)

    Jennifer L Guler

    Malaria drug resistance contributes to up to a million annual deaths. Judicious deployment of new antimalarials and vaccines could benefit from an understanding of early molecular events that promote the evolution of parasites. Continuous in vitro challenge of Plasmodium falciparum parasites with a novel dihydroorotate dehydrogenase (DHODH) inhibitor reproducibly selected for resistant parasites. Genome-wide analysis of independently-derived resistant clones revealed a two-step strategy to evolutionary success. Some haploid blood-stage parasites first survive antimalarial pressure through fortuitous DNA duplications that always included the DHODH gene. Independently-selected parasites had different sized amplification units but they were always flanked by distant A/T tracks. Higher level amplification and resistance was attained using a second, more efficient and more accurate, mechanism for head-to-tail expansion of the founder unit. This second homology-based process could faithfully tune DNA copy numbers in either direction, always retaining the unique DNA amplification sequence from the original A/T-mediated duplication for that parasite line. Pseudo-polyploidy at relevant genomic loci sets the stage for gaining additional mutations at the locus of interest. Overall, we reveal a population-based genomic strategy for mutagenesis that operates in human stages of P. falciparum to efficiently yield resistance-causing genetic changes at the correct locus in a successful parasite. Importantly, these founding events arise with precision; no other new amplifications are seen in the resistant haploid blood stage parasite. This minimizes the need for meiotic genetic cleansing that can only occur in sexual stage development of the parasite in mosquitoes.

  14. General Alignment Concept of the CMS experiment

    CERN Document Server

    Lampen, T

    2006-01-01

    Efficient and accurate track reconstruction requires proper alignment of the tracking devices used. Here we describe the general alignment strategy envisaged for the CMS experiment. The hardware alignment devices of CMS are presented as well as the different track based alignment approaches.

  15. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions.

    Science.gov (United States)

    Bendl, Jaroslav; Musil, Miloš; Štourač, Jan; Zendulka, Jaroslav; Damborský, Jiří; Brezovský, Jan

    2016-05-01

    An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. 
To

  16. SAMMate: a GUI tool for processing short read alignments in SAM/BAM format

    Directory of Open Access Journals (Sweden)

    Flemington Erik

    2011-01-01

    Background: Next Generation Sequencing (NGS) technology generates tens of millions of short reads for each DNA/RNA sample. A key step in NGS data analysis is the short read alignment of the generated sequences to a reference genome. Although storing alignment information in the Sequence Alignment/Map (SAM) or Binary SAM (BAM) format is now standard, biomedical researchers still have difficulty accessing this information. Results: We have developed a Graphical User Interface (GUI) software tool named SAMMate. SAMMate allows biomedical researchers to quickly process SAM/BAM files and is compatible with both single-end and paired-end sequencing technologies. SAMMate also automates some standard procedures in DNA-seq and RNA-seq data analysis. Using either standard or customized annotation files, SAMMate allows users to accurately calculate the short read coverage of genomic intervals. In particular, for RNA-seq data SAMMate can accurately calculate the gene expression abundance scores for customized genomic intervals using short reads originating from both exons and exon-exon junctions. Furthermore, SAMMate can quickly calculate a whole-genome signal map at base-wise resolution allowing researchers to solve an array of bioinformatics problems. Finally, SAMMate can export both a wiggle file for alignment visualization in the UCSC genome browser and an alignment statistics report. The biological impact of these features is demonstrated via several case studies that predict miRNA targets using short read alignment information files. Conclusions: With just a few mouse clicks, SAMMate will provide biomedical researchers easy access to important alignment information stored in SAM/BAM files. Our software is constantly updated and will greatly facilitate the downstream analysis of NGS data. Both the source code and the GUI executable are freely available under the GNU General Public License at http://sammate.sourceforge.net.
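    The base-wise signal map mentioned in this abstract is, at its core, a per-position read depth. The following is a minimal sketch of that calculation, not SAMMate's implementation. It assumes alignments have already been reduced to 0-based, half-open (start, end) intervals on one chromosome, so SAM's 1-based POS field would need converting first, and it ignores CIGAR details such as spliced reads. The function name is hypothetical.

```python
def basewise_coverage(alignments, region_start, region_end):
    """Per-base read depth over [region_start, region_end), 0-based half-open.

    alignments: iterable of (start, end) intervals in the same coordinate system.
    Uses a difference array so the cost is O(reads + region length).
    """
    length = region_end - region_start
    diff = [0] * (length + 1)
    for start, end in alignments:
        lo = max(start, region_start)   # clip each read to the region
        hi = min(end, region_end)
        if lo < hi:
            diff[lo - region_start] += 1
            diff[hi - region_start] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta                # prefix sum turns deltas into depth
        depth.append(running)
    return depth


if __name__ == "__main__":
    reads = [(0, 5), (3, 8)]
    print(basewise_coverage(reads, 0, 10))
```

    Averaging the returned depths over an annotated interval gives one simple notion of the coverage of that interval.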

  17. A cross-species alignment tool (CAT)

    DEFF Research Database (Denmark)

    Li, Heng; Guan, Liang; Liu, Tao;

    2007-01-01

    sensitive methods which are usually applied in aligning inter-species sequences. RESULTS: Here we present a new algorithm called CAT (for Cross-species Alignment Tool). It is designed to align mRNA sequences to mammalian-sized genomes. CAT is implemented using C scripts and is freely available on the web...

  18. Pyro-Align: Sample-Align based Multiple Alignment system for Pyrosequencing Reads of Large Number

    CERN Document Server

    Saeed, Fahad

    2009-01-01

    Pyro-Align is a multiple alignment program specifically designed for very large numbers of pyrosequencing reads. Multiple sequence alignment has been shown to be NP-hard, and heuristics are designed for approximate solutions. Multiple sequence alignment of pyrosequencing reads is complex mainly because of two factors. The first is the huge number of reads, which makes traditional heuristics, which scale very poorly to large numbers of sequences, unsuitable. The second is that the alignment cannot be performed arbitrarily, because the position of the reads with respect to the original genome is important and has to be taken into account. In this report we present a short description of the multiple alignment system for pyrosequencing reads.

  19. Overcoming low-alignment signal contrast induced alignment failure by alignment signal enhancement

    Science.gov (United States)

    Lee, Byeong Soo; Kim, Young Ha; Hwang, Hyunwoo; Lee, Jeongjin; Kong, Jeong Heung; Kang, Young Seog; Paarhuis, Bart; Kok, Haico; de Graaf, Roelof; Weichselbaum, Stefan; Droste, Richard; Mason, Christopher; Aarts, Igor; de Boeij, Wim P.

    2016-03-01

    Overlay is one of the key factors which enables optical lithography extension to 1X node DRAM manufacturing. It is natural that accurate wafer alignment is a prerequisite for good device overlay. However, alignment failures or misalignments are commonly observed in a fab. There are many factors which could induce alignment problems. Low alignment signal contrast is one of the main issues. Alignment signal contrast can be degraded by opaque stack materials or by alignment mark degradation due to processes like CMP. This issue can be compounded by mark sub-segmentation from design rules in combination with double or quadruple spacer process. Alignment signal contrast can be improved by applying new material or process optimization, which sometimes leads to the addition of another process step with higher costs. If we can amplify the signal components containing the position information and reduce other unwanted signal and background contributions, then we can improve alignment performance without process change. In this paper we use ASML's new alignment sensor (as was introduced and released on the NXT:1980Di) and sample wafers with special stacks which can induce poor alignment signal to demonstrate alignment and overlay improvement.

  20. Use of Alignment-Free Phylogenetics for Rapid Genome Sequence-Based Typing of Helicobacter pylori Virulence Markers and Antibiotic Susceptibility

    NARCIS (Netherlands)

    van Vliet, Arnoud H M; Kusters, Johannes G.

    2015-01-01

    Whole-genome sequencing is becoming a leading technology in the typing and epidemiology of microbial pathogens, but the increase in genomic information necessitates significant investment in bioinformatic resources and expertise, and currently used methodologies struggle with genetically heterogeneo

  1. Accurate Dna Assembly And Direct Genome Integration With Optimized Uracil Excision Cloning To Facilitate Engineering Of Escherichia Coli As A Cell Factory

    DEFF Research Database (Denmark)

    Cavaleiro, Mafalda; Kim, Se Hyeuk; Nørholm, Morten

    2015-01-01

    Plants produce a vast diversity of valuable compounds with medical properties, but these are often difficult to purify from the natural source or produce by organic synthesis. An alternative is to transfer the biosynthetic pathways to an efficient production host like the bacterium Escherichia co......-excision-based cloning and combining it with a genome-engineering approach to allow direct integration of whole metabolic pathways into the genome of E. coli, to facilitate the advanced engineering of cell factories....

  2. Beyond Alignment

    DEFF Research Database (Denmark)

    Beyond Alignment: Applying Systems Thinking to Architecting Enterprises is a comprehensive reader about how enterprises can apply systems thinking in their enterprise architecture practice, for business transformation and for strategic execution. The book's contributors find that systems thinking...... is a valuable way of thinking about the viable enterprise and how to architect it....

  3. Mulan: Multiple-Sequence Local Alignment and Visualization for Studying Function and Evolution

    Energy Technology Data Exchange (ETDEWEB)

    Ovcharenko, I; Loots, G; Giardine, B; Hou, M; Ma, J; Hardison, R; Stubbs, L; Miller, W

    2004-07-14

    Multiple sequence alignment analysis is a powerful approach for understanding phylogenetic relationships, annotating genes and detecting functional regulatory elements. With a growing number of partly or fully sequenced vertebrate genomes, effective tools for performing multiple comparisons are required to accurately and efficiently assist biological discoveries. Here we introduce Mulan (http://mulan.dcode.org/), a novel method and a network server for comparing multiple draft and finished-quality sequences to identify functional elements conserved over evolutionary time. Mulan brings together several novel algorithms: the tba multi-aligner program for rapid identification of local sequence conservation and the multiTF program for detecting evolutionarily conserved transcription factor binding sites in multiple alignments. In addition, Mulan supports two-way communication with the GALA database; alignments of multiple species dynamically generated in GALA can be viewed in Mulan, and conserved transcription factor binding sites identified with Mulan/multiTF can be integrated and overlaid with extensive genome annotation data using GALA. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Mulan allows for interactive modification of critical conservation parameters to differentially predict conserved regions in comparisons of both closely and distantly related species. We illustrate the uses and applications of the Mulan tool through multi-species comparisons of the GATA3 gene locus and the identification of elements that are conserved differently in avians than in other genomes, allowing speculation on the evolution of birds. Source code for the aligners and the aligner-evaluation software can be freely downloaded from http://bio.cse.psu.edu/.

  4. Plant Genome Duplication Database.

    Science.gov (United States)

    Lee, Tae-Ho; Kim, Junah; Robertson, Jon S; Paterson, Andrew H

    2017-01-01

    Genome duplication, widespread in flowering plants, is a driving force in evolution. Genome alignments between/within genomes facilitate identification of homologous regions and individual genes to investigate evolutionary consequences of genome duplication. PGDD (the Plant Genome Duplication Database), a public web service database, provides intra- or interplant genome alignment information. At present, PGDD contains information for 47 plants whose genome sequences have been released. Here, we describe methods for identification and estimation of dates of genome duplication and speciation by functions of PGDD. The database is freely available at http://chibba.agtec.uga.edu/duplication/.

  5. Accurate overlaying for mobile augmented reality

    NARCIS (Netherlands)

    Pasman, W; van der Schaaf, A; Lagendijk, RL; Jansen, F.W.

    1999-01-01

    Mobile augmented reality requires accurate alignment of virtual information with objects visible in the real world. We describe a system for mobile communications to be developed to meet these strict alignment criteria using a combination of computer vision, inertial tracking and low-latency renderi

  6. Handling Permutation in Sequence Comparison: Genome-Wide Enhancer Prediction in Vertebrates by a Novel Non-Linear Alignment Scoring Principle.

    Directory of Open Access Journals (Sweden)

    Dirk Dolle

    Enhancers have been described to evolve by permutation without changing function. This has posed the problem of how to predict enhancer elements that are hidden from alignment-based approaches due to the loss of co-linearity. Alignment-free algorithms have been proposed as one possible solution. However, this approach is hampered by several problems inherent to its underlying working principle. Here we present a new approach, which combines the power of alignment and alignment-free techniques into one algorithm. It allows the prediction of enhancers based on the query and target sequence only, no matter whether the regulatory logic is co-linear or reshuffled. To test our novel approach, we employ it for the prediction of enhancers across the evolutionary distance of ~450Myr between human and medaka. We demonstrate its efficacy by subsequent in vivo validation resulting in 82% (9/11) of the predicted medaka regions showing reporter activity. These include five candidates with partially co-linear and four with reshuffled motif patterns. Orthology in flanking genes and conservation of the detected co-linear motifs indicates that those candidates are likely functionally equivalent enhancers. In sum, our results demonstrate that the proposed principle successfully predicts mutated as well as permuted enhancer regions at an encouragingly high rate.

  7. AlignHUSH: Alignment of HMMs using structure and hydrophobicity information

    OpenAIRE

    Krishnadev Oruganty; Srinivasan Narayanaswamy

    2011-01-01

    Background: Sensitive remote homology detection and accurate alignments, especially in the midnight zone of sequence similarity, are needed for better function annotation and structural modeling of proteins. An algorithm, AlignHUSH, for HMM-HMM alignment has been developed which is capable of recognizing distantly related domain families. The method uses structural information, in the form of predicted secondary structure probabilities, and hydrophobicity of amino acids to align HMMs of t...

  8. Magnetic alignment and the Poisson alignment reference system

    Science.gov (United States)

    Griffith, L. V.; Schenz, R. F.; Sommargren, G. E.

    1990-08-01

    Three distinct metrological operations are necessary to align a free-electron laser (FEL): the magnetic axis must be located, a straight line reference (SLR) must be generated, and the magnetic axis must be related to the SLR. This article begins with a review of the motivation for developing an alignment system that will assure better than 100-μm accuracy in the alignment of the magnetic axis throughout an FEL. The 100-μm accuracy is an error circle about an ideal axis for 300 m or more. The article describes techniques for identifying the magnetic axes of solenoids, quadrupoles, and wiggler poles. Propagation of a laser beam is described to the extent of revealing sources of nonlinearity in the beam. Development of a straight-line reference based on the Poisson line, a diffraction effect, is described in detail. Spheres in a large-diameter laser beam create Poisson lines and thus provide a necessary mechanism for gauging between the magnetic axis and the SLR. Procedures for installing FEL components and calibrating alignment fiducials to the magnetic axes of the components are also described. The Poisson alignment reference system should be accurate to 25 μm over 300 m, which is believed to be a factor-of-4 improvement over earlier techniques. An error budget shows that only 25% of the total budgeted tolerance is used for the alignment reference system, so the remaining tolerances should fall within the allowable range for FEL alignment.

  9. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

    Science.gov (United States)

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-02-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp.

  10. DIDA: Distributed Indexing Dispatched Alignment.

    Directory of Open Access Journals (Sweden)

    Hamid Mohamadi

    One essential application in bioinformatics that is affected by the high-throughput sequencing data deluge is the sequence alignment problem, where nucleotide or amino acid sequences are queried against targets to find regions of close similarity. When queries are too many and/or targets are too large, the alignment process becomes computationally challenging. This is usually addressed by preprocessing techniques, where the queries and/or targets are indexed for easy access while searching for matches. When the target is static, such as in an established reference genome, the cost of indexing is amortized by reusing the generated index. However, when the targets are non-static, such as contigs in the intermediate steps of a de novo assembly process, a new index must be computed for each run. To address such scalability problems, we present DIDA, a novel framework that distributes the indexing and alignment tasks into smaller subtasks over a cluster of compute nodes. It provides a workflow beyond the common practice of embarrassingly parallel implementations. DIDA is a cost-effective, scalable and modular framework for the sequence alignment problem in terms of memory usage and runtime. It can be employed in large-scale alignments to draft genomes and intermediate stages of de novo assembly runs. The DIDA source code, sample files and user manual are available through http://www.bcgsc.ca/platform/bioinfo/software/dida. The software is released under the British Columbia Cancer Agency License (BCCA), and is free for academic use.
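    The idea of splitting the targets, indexing each part separately, and sending each query only to the parts that might contain it can be sketched in a few lines. This is an illustration of the general pattern under stated assumptions, not DIDA's code: the round-robin partitioning, the plain Python set standing in for a per-partition index, the k-mer size, and all function names are hypothetical, and everything runs in a single process rather than on a cluster.

```python
def partition_targets(targets: dict[str, str], n_parts: int) -> list[dict[str, str]]:
    """Split named target sequences round-robin into n_parts groups (assumed scheme)."""
    parts = [{} for _ in range(n_parts)]
    for i, (name, seq) in enumerate(sorted(targets.items())):
        parts[i % n_parts][name] = seq
    return parts


def index_partition(part: dict[str, str], k: int) -> set[str]:
    """Stand-in index: the set of all k-mers present in one partition."""
    return {seq[i:i + k] for seq in part.values() for i in range(len(seq) - k + 1)}


def dispatch(query: str, indexes: list[set[str]], k: int) -> list[int]:
    """Return the partitions that share at least one k-mer with the query."""
    kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    return [p for p, idx in enumerate(indexes) if kmers & idx]


if __name__ == "__main__":
    targets = {"contig1": "ACGTACGT", "contig2": "TTTTGGGG"}
    parts = partition_targets(targets, n_parts=2)
    indexes = [index_partition(part, k=4) for part in parts]
    print(dispatch("GTAC", indexes, k=4))  # only the partition holding contig1
```

    In a distributed setting each partition's index and its dispatched queries would live on a separate node, so no single machine needs to hold the index for all targets.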

  11. Seeking the perfect alignment

    CERN Multimedia

    2002-01-01

    The first full-scale tests of the ATLAS Muon Spectrometer are about to begin in Prévessin. The set-up includes several layers of Monitored Drift Tubes Chambers (MDTs) and will allow tests of the performance of the detectors and of their highly accurate alignment system. [Photo caption: Monitored Drift Chambers in Building 887 in Prévessin, where they are just about to be tested.] Muon chambers are keeping the ATLAS Muon Spectrometer team quite busy this summer. Now that most people go on holiday, the beam and alignment tests for these chambers are just starting. These chambers will measure with high accuracy the momentum of high-energy muons, and this implies very demanding requirements for their alignment. The MDT chambers consist of drift tubes, which are gas-filled metal tubes, 3 cm in diameter, with wires running down their axes. With high voltage between the wire and the tube wall, the ionisation due to traversing muons is detected as electrical pulses. With careful timing of the pulses, the position of the muon t...

  12. Review of alignment and SNP calling algorithms for next-generation sequencing data.

    Science.gov (United States)

    Mielczarek, M; Szyda, J

    2016-02-01

    Application of the massive parallel sequencing technology has become one of the most important issues in life sciences. Therefore, it was crucial to develop bioinformatics tools for next-generation sequencing (NGS) data processing. Currently, two of the most significant tasks include alignment to a reference genome and detection of single nucleotide polymorphisms (SNPs). In many types of genomic analyses, great numbers of reads need to be mapped to the reference genome; therefore, selection of the aligner is an essential step in NGS pipelines. Two main algorithms, suffix tries and hash tables, have been introduced for this purpose. Suffix array-based aligners are memory-efficient and work faster than hash-based aligners, but they are less accurate. In contrast, hash table algorithms tend to be slower, but more sensitive. SNP and genotype callers may also be divided into two main different approaches: heuristic and probabilistic methods. A variety of software has been subsequently developed over the past several years. In this paper, we briefly review the current development of NGS data processing algorithms and present the available software.
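    To make the contrast between the two index families concrete, the sketch below looks up exact seed positions in a reference with a hash table of k-mers and with a suffix array. It is a toy illustration only: the suffix array is built naively, production aligners use compressed structures and tolerate mismatches, and the function names and the example reference are assumptions introduced here.

```python
from collections import defaultdict


def build_hash_index(reference: str, k: int) -> dict[str, list[int]]:
    """Map every k-mer to the sorted list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return dict(index)


def build_suffix_array(reference: str) -> list[int]:
    """Naive suffix array: start positions sorted by the suffix they begin."""
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def suffix_array_find(reference: str, sa: list[int], pattern: str) -> list[int]:
    """All start positions of pattern, found by two binary searches over sa."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:                      # first suffix whose prefix >= pattern
        mid = (lo + hi) // 2
        if reference[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first, hi = lo, len(sa)
    while lo < hi:                      # first suffix whose prefix > pattern
        mid = (lo + hi) // 2
        if reference[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[first:lo])


if __name__ == "__main__":
    ref = "ACGTACGTTACG"
    print(build_hash_index(ref, 3).get("ACG", []))
    print(suffix_array_find(ref, build_suffix_array(ref), "ACG"))
```

    The hash index answers only for the seed length it was built with, whereas the suffix array can be queried with patterns of any length.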

  13. STELLAR: fast and exact local alignments

    Directory of Open Access Journals (Sweden)

    Weese David

    2011-10-01

    Background: Large-scale comparison of genomic sequences requires reliable tools for the search of local alignments. Practical local aligners are in general fast, but heuristic, and hence sometimes miss significant matches. Results: We present here the local pairwise aligner STELLAR that has full sensitivity for ε-alignments, i.e. guarantees to report all local alignments of a given minimal length and maximal error rate. The aligner is composed of two steps, filtering and verification. We apply the SWIFT algorithm for lossless filtering, and have developed a new verification strategy that we prove to be exact. Our results on simulated and real genomic data confirm and quantify the conjecture that heuristic tools like BLAST or BLAT miss a large percentage of significant local alignments. Conclusions: STELLAR is very practical and fast on very long sequences which makes it a suitable new tool for finding local alignments between genomic sequences under the edit distance model. Binaries are freely available for Linux, Windows, and Mac OS X at http://www.seqan.de/projects/stellar. The source code is freely distributed with the SeqAn C++ library version 1.3 and later at http://www.seqan.de.
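    The acceptance criterion described here, a minimal length together with a maximal error rate, can be written as a small predicate. The sketch below checks an already computed gapped alignment against both thresholds; it does not perform the filtering or verification steps. Measuring length in alignment columns and counting each mismatch or gap column as one error are assumptions made for illustration, as is the function name.

```python
def is_epsilon_match(row_a: str, row_b: str, min_length: int, max_error_rate: float) -> bool:
    """Check a gapped pairwise alignment against length and error-rate thresholds.

    row_a, row_b: the two alignment rows, equal length, '-' marking a gap.
    An error is any column with a mismatch or a gap (edit distance model).
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same number of columns")
    columns = len(row_a)
    errors = sum(1 for a, b in zip(row_a, row_b) if a != b or a == "-")
    return columns >= min_length and errors <= max_error_rate * columns


if __name__ == "__main__":
    # One mismatch in ten columns, checked against a 20% error threshold.
    print(is_epsilon_match("ACGTACGTAC", "ACGTTCGTAC", min_length=10, max_error_rate=0.2))
```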

  14. Alignment method for solar collector arrays

    Science.gov (United States)

    Driver, Jr., Richard B

    2012-10-23

    The present invention is directed to an improved method for establishing camera fixture location for aligning mirrors on a solar collector array (SCA) comprising multiple mirror modules. The method aligns the mirrors on a module by comparing the location of the receiver image in photographs with the predicted theoretical receiver image location. To accurately align an entire SCA, a common reference is used for all of the individual module images within the SCA. The improved method can use relative pixel location information in digital photographs along with alignment fixture inclinometer data to calculate relative locations of the fixture between modules. The absolute locations are determined by minimizing alignment asymmetry for the SCA. The method inherently aligns all of the mirrors in an SCA to the receiver, even with receiver position and module-to-module alignment errors.

  15. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    Directory of Open Access Journals (Sweden)

    Katelyn McNair

    2015-06-01

    As more and more prokaryotic sequencing takes place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  16. Considerations for clinical read alignment and mutational profiling using next-generation sequencing

    Directory of Open Access Journals (Sweden)

    Gavin R Oliver

    2012-07-01

    Next-generation sequencing technologies are increasingly being applied in clinical settings; however, the data are characterized by a range of platform-specific artifacts, making downstream analysis problematic and error prone. One major application of NGS is in the profiling of clinically relevant mutations, whereby sequences are aligned to a reference genome and potential mutations assessed and scored. Accurate sequence alignment is pivotal in reliable assessment of potential mutations; however, selection of appropriate alignment tools is a non-trivial task complicated by the availability of multiple solutions, each with its own performance characteristics. Using BRCA1 as an example, we have simulated and mutated a test dataset based on Illumina sequencing technology. Our findings reveal key differences in the performances of a range of common commercial and open source tools and will be of importance to anyone using NGS to profile mutations in clinical or basic research.

  17. Two genome sequences of the same bacterial strain, Gluconacetobacter diazotrophicus PAl 5, suggest a new standard in genome sequence submission.

    Science.gov (United States)

    Giongo, Adriana; Tyler, Heather L; Zipperer, Ursula N; Triplett, Eric W

    2010-06-15

    Gluconacetobacter diazotrophicus PAl 5 is of agricultural significance due to its ability to provide fixed nitrogen to plants. Consequently, its genome sequence has been eagerly anticipated to enhance understanding of endophytic nitrogen fixation. Two groups have sequenced the PAl 5 genome from the same source (ATCC 49037), though the resulting sequences contain a surprisingly high number of differences. Therefore, an optical map of PAl 5 was constructed in order to determine which genome assembly more closely resembles the chromosomal DNA by aligning each sequence against a physical map of the genome. While one sequence aligned very well, over 98% of the second sequence contained numerous rearrangements. The many differences observed between these two genome sequences could be owing to either assembly errors or rapid evolutionary divergence. The extent of the differences derived from sequence assembly errors could be assessed if the raw sequencing reads were provided by both genome centers at the time of genome sequence submission. Hence, a new genome sequence standard is proposed whereby the investigator supplies the raw reads along with the closed sequence so that the community can make more accurate judgments on whether differences observed in a single strain may be of biological origin or are simply caused by differences in genome assembly procedures.

  18. Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.

    Directory of Open Access Journals (Sweden)

    Lőrinc S Pongor

    Next generation sequencing (NGS) of metagenomic samples is becoming a standard approach to detect individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed and sensitivity and, as a result, species or strain level identification is often inaccurate and low abundance pathogens can sometimes be missed. We have developed Taxoner, an open source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST, but requires two orders of magnitude less running time, meaning that it can be run on desktop or laptop computers. Taxoner is slower than the approaches that use small marker databases but is more sensitive due to the comprehensive reference database. In addition, it can be easily tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the genes mapped and can provide strain level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.

  19. R3D Align: global pairwise alignment of RNA 3D structures using local superpositions

    Science.gov (United States)

    Rahrig, Ryan R.; Leontis, Neocles B.; Zirbel, Craig L.

    2010-01-01

    Motivation: Comparing 3D structures of homologous RNA molecules yields information about sequence and structural variability. To compare large RNA 3D structures, accurate automatic comparison tools are needed. In this article, we introduce a new algorithm and web server to align large homologous RNA structures nucleotide by nucleotide using local superpositions that accommodate the flexibility of RNA molecules. Local alignments are merged to form a global alignment by employing a maximum clique algorithm on a specially defined graph that we call the ‘local alignment’ graph. Results: The algorithm is implemented in a program suite and web server called ‘R3D Align’. The R3D Align alignment of homologous 3D structures of 5S, 16S and 23S rRNA was compared to a high-quality hand alignment. A full comparison of the 16S alignment with the other state-of-the-art methods is also provided. The R3D Align program suite includes new diagnostic tools for the structural evaluation of RNA alignments. The R3D Align alignments were compared to those produced by other programs and were found to be the most accurate, in comparison with a high quality hand-crafted alignment and in conjunction with a series of other diagnostics presented. The number of aligned base pairs as well as measures of geometric similarity are used to evaluate the accuracy of the alignments. Availability: R3D Align is freely available through a web server http://rna.bgsu.edu/R3DAlign. The MATLAB source code of the program suite is also freely available for download at that location. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: r-rahrig@onu.edu PMID:20929913

  20. Bacterial genome reengineering.

    Science.gov (United States)

    Zhou, Jindan; Rudd, Kenneth E

    2011-01-01

    The web application PrimerPair at ecogene.org generates large sets of paired DNA sequences surrounding all protein and RNA genes of Escherichia coli K-12. Many DNA fragments, which these primers amplify, can be used to implement a genome reengineering strategy using complementary in vitro cloning and in vivo recombineering. The integration of a primer design tool with a model organism database increases the level of quality control. Computer-assisted design of gene primer pairs relies upon having highly accurate genomic DNA sequence information that exactly matches the DNA of the cells being used in the laboratory to ensure predictable DNA hybridizations. It is equally crucial to have confidence that the predicted start codons define the locations of genes accurately. Annotations in the EcoGene database are queried by PrimerPair to eliminate pseudogenes, IS elements, and other problematic genes before the design process starts. These projects progressively familiarize users with the EcoGene content, scope, and application interfaces that are useful for genome reengineering projects. The first protocol leads to the design of a pair of primer sequences that were used to clone and express a single gene. The N-terminal protein sequence was experimentally verified and the protein was detected in the periplasm. This is followed by instructions to design PCR primer pairs for cloning gene fragments encoding 50 periplasmic proteins without their signal peptides. The design process begins with the user simply designating one pair of forward and reverse primer endpoint positions relative to all start and stop codon positions. The gene name, genomic coordinates, and primer DNA sequences are reported to the user. When making chromosomal deletions, the integrity of the provisional primer design is checked to see whether it will generate any unwanted double deletions with adjacent genes. The bad designs are recalculated and replacement primers are provided alongside the

  1. FANSe: an accurate algorithm for quantitative mapping of large scale sequencing reads.

    Science.gov (United States)

    Zhang, Gong; Fedyunin, Ivan; Kirchner, Sebastian; Xiao, Chuanle; Valleriani, Angelo; Ignatova, Zoya

    2012-06-01

    The most crucial step in data processing from high-throughput sequencing applications is the accurate and sensitive alignment of the sequencing reads to reference genomes or transcriptomes. The accurate detection of insertions and deletions (indels) and errors introduced by the sequencing platform or by misreading of modified nucleotides is essential for the quantitative processing of the RNA-based sequencing (RNA-Seq) datasets and for the identification of genetic variations and modification patterns. We developed a new, fast and accurate algorithm for nucleic acid sequence analysis, FANSe, with adjustable mismatch allowance settings and the ability to handle indels, to accurately and quantitatively map millions of reads to small or large reference genomes. It is a seed-based algorithm which uses the whole read information for mapping, and high sensitivity and low ambiguity are achieved by using short and non-overlapping reads. Furthermore, FANSe uses a hotspot score to prioritize the processing of highly possible matches and implements a modified Smith-Waterman refinement with a reduced scoring matrix to accelerate the calculation without compromising its sensitivity. The FANSe algorithm stably processes datasets from various sequencing platforms, masked or unmasked and small or large genomes. It shows a remarkable coverage of low-abundance mRNAs, which is important for quantitative processing of RNA-Seq datasets.
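
    The refinement step mentioned above builds on Smith-Waterman local alignment. The following is a minimal sketch of the textbook algorithm only, not FANSe's modified version with a reduced scoring matrix; the function name `smith_waterman` and the scoring values (match, mismatch, and linear gap penalty) are illustrative assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook Smith-Waterman local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not parameters taken from FANSe.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; local alignment clamps scores at 0.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

    For example, `smith_waterman("TTACGTTT", "GGACGGG")` recovers the shared core `ACG`. Production aligners typically restrict this quadratic computation to candidate regions found by seeding, which is the role the abstract assigns to the hotspot score.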

  2. Shuttle onboard IMU alignment methods

    Science.gov (United States)

    Henderson, D. M.

    1976-01-01

    The current approach to the shuttle IMU alignment is based solely on the Apollo Deterministic Method. This method is simple, fast, reliable and provides an accurate estimate for the present cluster to mean of 1950 transformation matrix. If four or more star sightings are available, least squares analysis can be applied. The least squares method offers the next level of sophistication to the IMU alignment solution. The least squares method studied shows that a more accurate estimate for the misalignment angles is computed, and the IMU drift rates are a free by-product of the analysis. Core storage requirements are considerably greater: an estimated 20 to 30 times the core required for the Apollo Deterministic Method. The least squares method offers an intermediate solution, utilizing as much data as is available without a complete statistical analysis as in Kalman filtering.

  3. Multiple sequence alignment accuracy and phylogenetic inference.

    Science.gov (United States)

    Ogden, T Heath; Rosenberg, Michael S

    2006-04-01

    Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.

  4. Descriptive Statistics of the Genome: Phylogenetic Classification of Viruses.

    Science.gov (United States)

    Hernandez, Troy; Yang, Jie

    2016-10-01

    The typical process for classifying and submitting a newly sequenced virus to the NCBI database involves two steps. First, a BLAST search is performed to determine likely family candidates. That is followed by checking the candidate families with the pairwise sequence alignment tool for similar species. The submitter's judgment is then used to determine the most likely species classification. The aim of this article is to show that this process can be automated into a fast, accurate, one-step process using the proposed alignment-free method and properly implemented machine learning techniques. We present a new family of alignment-free vectorizations of the genome, the generalized vector, that maintains the speed of existing alignment-free methods while outperforming all available methods. This new alignment-free vectorization uses the frequency of genomic words (k-mers), as is done in the composition vector, and incorporates descriptive statistics of those k-mers' positional information, as inspired by the natural vector. We analyze five different characterizations of genome similarity using k-nearest neighbor classification and evaluate these on two collections of viruses totaling over 10,000 viruses. We show that our proposed method performs better than, or as well as, other methods at every level of the phylogenetic hierarchy. The data and R code are available upon request.
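
    The idea of combining k-mer frequencies with descriptive statistics of k-mer positions can be sketched as follows. This is an illustration under stated assumptions, not the authors' generalized vector (whose exact definition and R code are not given here): the function name `kmer_features` and the choice of count, mean position, and population variance as the statistics are hypothetical.

```python
from collections import defaultdict


def kmer_features(seq, k=2):
    """Map each k-mer in seq to (count, mean_position, position_variance).

    Hypothetical illustration: the count plays the role of a composition
    vector entry, while the positional mean and variance stand in for the
    "descriptive statistics of positional information" in the abstract.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)  # 0-based start of each occurrence

    features = {}
    for kmer, pos in positions.items():
        n = len(pos)
        mean = sum(pos) / n
        var = sum((p - mean) ** 2 for p in pos) / n  # population variance
        features[kmer] = (n, mean, var)
    return features
```

    Flattening these tuples in a fixed k-mer order would give a numeric vector per genome, which a k-nearest neighbor classifier could then compare; how the features are scaled and ordered is left open in this sketch.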

  5. The twilight zone of cis element alignments.

    Science.gov (United States)

    Sebastian, Alvaro; Contreras-Moreira, Bruno

    2013-02-01

    Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein-DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein-DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.

  6. Accelerated large-scale multiple sequence alignment

    Directory of Open Access Journals (Sweden)

    Lloyd Scott

    2011-12-01

    Background: Multiple sequence alignment (MSA) is a fundamental analysis method used in bioinformatics and many comparative genomic applications. Prior MSA acceleration attempts with reconfigurable computing have only addressed the first stage of progressive alignment and consequently exhibit performance limitations according to Amdahl's Law. This work is the first known to accelerate the third stage of progressive alignment on reconfigurable hardware. Results: We reduce subgroups of aligned sequences into discrete profiles before they are pairwise aligned on the accelerator. Using an FPGA accelerator, an overall speedup of up to 150 has been demonstrated on a large data set when compared to a 2.4 GHz Core2 processor. Conclusions: Our parallel algorithm and architecture accelerate large-scale MSA with reconfigurable computing and allow researchers to solve the larger problems that confront biologists today. Program source is available from http://dna.cs.byu.edu/msa/.
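
    The Amdahl's Law limitation mentioned above can be made concrete with a short calculation. This is a sketch only: the function name `amdahl_speedup` is hypothetical, and the runtime fractions in the usage note are made-up illustrations, not measurements from this work.

```python
def amdahl_speedup(accelerated_fraction, stage_speedup):
    """Overall speedup when only part of a program is accelerated.

    accelerated_fraction: share of the original runtime spent in the
        accelerated stage (between 0 and 1).
    stage_speedup: factor by which that stage alone becomes faster.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    serial = 1.0 - accelerated_fraction
    return 1.0 / (serial + accelerated_fraction / stage_speedup)
```

    If, hypothetically, an accelerated stage accounted for half of the runtime, even an arbitrarily fast accelerator could not push the overall speedup past 2, because the untouched half still runs at its original speed. This is why accelerating only one stage of progressive alignment caps the achievable gain.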

  7. Alignment-free phylogenetics and population genetics.

    Science.gov (United States)

    Haubold, Bernhard

    2014-05-01

    Phylogenetics and population genetics are central disciplines in evolutionary biology. Both are based on comparative data, today usually DNA sequences. These have become so plentiful that alignment-free sequence comparison is of growing importance in the race between scientists and sequencing machines. In phylogenetics, efficient distance computation is the major contribution of alignment-free methods. A distance measure should reflect the number of substitutions per site, which underlies classical alignment-based phylogeny reconstruction. Alignment-free distance measures are either based on word counts or on match lengths, and I apply examples of both approaches to simulated and real data to assess their accuracy and efficiency. While phylogeny reconstruction is based on the number of substitutions, in population genetics, the distribution of mutations along a sequence is also considered. This distribution can be explored by match lengths, thus opening the prospect of alignment-free population genomics.

  8. Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes.

    Directory of Open Access Journals (Sweden)

    Kris Popendorf

    BACKGROUND: With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grow. METHODOLOGY/PRINCIPAL FINDINGS: Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy. CONCLUSIONS/SIGNIFICANCE: Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with

  9. Vibrating wire alignment technique

    CERN Document Server

    Xiao-Long, Wang; lei, Wu; Chun-Hua, Li

    2013-01-01

    The vibrating wire alignment technique performs alignment by measuring the spatial distribution of a magnetic field, and it can achieve very high alignment accuracy. It can be applied to magnet fiducialization and to the alignment of accelerator straight-section components, making it a necessary supplement to conventional alignment methods. This article systematically reviews international research achievements in the vibrating wire alignment technique, including vibrating wire model analysis, system frequency calculation, wire sag calculation, and the relation between wire amplitude and magnetic induction intensity. On the basis of the model analysis, the article introduces an alignment method based on magnetic field measurement and an alignment method based on amplitude and phase measurement. Finally, some basic questions are discussed and solutions are given.

  10. nGASP - the nematode genome annotation assessment project

    Energy Technology Data Exchange (ETDEWEB)

    Coghlan, A; Fiedler, T J; McKay, S J; Flicek, P; Harris, T W; Blasiar, D; Allen, J; Stein, L D

    2008-12-19

    While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets for 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase. The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second place. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy as reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs were the most challenging for gene-finders.

  11. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez.

    Since June of 2009, the muon alignment group has focused on providing new alignment constants and on finalizing the hardware alignment reconstruction. Alignment constants for DTs and CSCs were provided for CRAFT09 data reprocessing. For DT chambers, the track-based alignment was repeated using CRAFT09 cosmic ray muons and validated using segment extrapolation and split cosmic tools. One difference with respect to the previous alignment is that only five degrees of freedom were aligned, leaving the rotation around the local x-axis to be better determined by the hardware system. Similarly, DT chambers poorly aligned by tracks (due to limited statistics) were aligned by a combination of photogrammetry and hardware-based alignment. For the CSC chambers, the hardware system provided alignment in global z and rotations about local x. Entire muon endcap rings were further corrected in the transverse plane (global x and y) by the track-based alignment. Single chamber track-based alignment suffers from poor statistic...

  12. SinicView: A visualization environment for comparisons of multiple nucleotide sequence alignment tools

    OpenAIRE

    Wong Chun-Yi; Wu Yu-Wei; Chen Shiang-Heng; Peng Chin-Lin; Lin Laurent; Lee DT; Shih Arthur; Chou Meng-Yuan; Shiao Tze-Chang; Hsieh Mu-Fen

    2006-01-01

    Background: Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indee...

  13. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez and J. Pivarski

    2011-01-01

    Alignment efforts in the first few months of 2011 have shifted away from providing alignment constants (now a well established procedure) and focussed on some critical remaining issues. The single most important task left was to understand the systematic differences observed between the track-based (TB) and hardware-based (HW) barrel alignments: a systematic difference in r-φ and in z, which grew as a function of z, and which amounted to ~4-5 mm differences going from one end of the barrel to the other. This difference is now understood to be caused by the tracker alignment. The systematic differences disappear when the track-based barrel alignment is performed using the new “twist-free” tracker alignment. This removes the largest remaining source of systematic uncertainty. Since the barrel alignment is based on hardware, it does not suffer from the tracker twist. However, untwisting the tracker causes endcap disks (which are aligned ...

  14. Accurate discrimination of conserved coding and non-coding regions through multiple indicators of evolutionary dynamics

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2009-09-01

    Background: The conservation of sequences between related genomes has long been recognised as an indication of functional significance, and recognition of sequence homology is one of the principal approaches used in the annotation of newly sequenced genomes. In the context of recent findings that the number of non-coding transcripts in higher organisms is likely to be much higher than previously imagined, discrimination between conserved coding and non-coding sequences is a topic of considerable interest. Additionally, it should be considered desirable to discriminate between coding and non-coding conserved sequences without recourse to sequence similarity searches of protein databases, as such approaches exclude the identification of novel conserved proteins without characterized homologs and may be influenced by the presence in databases of sequences which are erroneously annotated as coding. Results: Here we present a machine learning-based approach for the discrimination of conserved coding sequences. Our method calculates various statistics related to the evolutionary dynamics of two aligned sequences. These features are considered by a Support Vector Machine which designates the alignment coding or non-coding with an associated probability score. Conclusion: We show that our approach is both sensitive and accurate with respect to comparable methods and illustrate several situations in which it may be applied, including the identification of conserved coding regions in genome sequences and the discrimination of coding from non-coding cDNA sequences.

  15. Implementation of a Parallel Protein Structure Alignment Service on Cloud

    Directory of Open Access Journals (Sweden)

    Che-Lun Hung

    2013-01-01

    Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform.

  16. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez

    2010-01-01

    The main developments in muon alignment since March 2010 have been the production, approval and deployment of alignment constants for the ICHEP data reprocessing. In the barrel, a new geometry, combining information from both hardware and track-based alignment systems, has been developed for the first time. The hardware alignment provides an initial DT geometry, which is then anchored as a rigid solid, using the link alignment system, to a reference frame common to the tracker. The “GlobalPositionRecords” for both the Tracker and Muon systems are being used for the first time, and the initial tracker-muon relative positioning, based on the link alignment, yields good results within the photogrammetry uncertainties of the Tracker and alignment ring positions. For the first time, the optical and track-based alignments show good agreement between them; the optical alignment being refined by the track-based alignment. The resulting geometry is the most complete to date, aligning all 250 DTs, ...

  17. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    Z. Szillasi and G. Gomez.

    2013-01-01

    When CMS is opened up, major components of the Link and Barrel Alignment systems will be removed. This operation, besides allowing for maintenance of the detector underneath, is needed for making interventions that will reinforce the alignment measurements and make the operation of the alignment system more reliable. For that purpose and also for their general maintenance and recalibration, the alignment components will be transferred to the Alignment Lab situated in the ISR area. For the track-based alignment, attention is focused on the determination of systematic uncertainties, which have become dominant, since now there is a large statistics of muon tracks. This will allow for an improved Monte Carlo misalignment scenario and updated alignment position errors, crucial for high-momentum muon analysis such as Z′ searches.

  18. A novel approach to multiple sequence alignment using hadoop data grids.

    Science.gov (United States)

    Sudha Sadasivam, G; Baktavatchalam, G

    2010-01-01

    Multiple alignment of protein sequences helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of alignment. Although dynamic programming algorithms produce accurate alignments, they are computation intensive. In this paper we propose a time-efficient approach to sequence alignment that also produces quality alignments. The dynamic nature of the algorithm, coupled with the data and computational parallelism of Hadoop data grids, improves the accuracy and speed of sequence alignment. The principle of block splitting in Hadoop, coupled with its scalability, facilitates alignment of very large sequences.

  19. Enhanced de novo assembly of high throughput pyrosequencing data using whole genome mapping.

    Science.gov (United States)

    Onmus-Leone, Fatma; Hang, Jun; Clifford, Robert J; Yang, Yu; Riley, Matthew C; Kuschner, Robert A; Waterman, Paige E; Lesho, Emil P

    2013-01-01

    Despite major advances in next-generation sequencing, assembly of sequencing data, especially data from novel microorganisms or re-emerging pathogens, remains constrained by the lack of suitable reference sequences. De novo assembly is the best approach to achieve an accurate finished sequence, but multiple sequencing platforms or paired-end libraries are often required to achieve full genome coverage. In this study, we demonstrated a method to assemble complete bacterial genome sequences by integrating shotgun Roche 454 pyrosequencing with optical whole genome mapping (WGM). The whole genome restriction map (WGRM) was used as the reference to scaffold de novo assembled sequence contigs through a stepwise process. Large de novo contigs were placed in the correct order and orientation through alignment to the WGRM. De novo contigs that were not aligned to the WGRM were merged into scaffolds using contig branching structure information. These extended scaffolds were then aligned to the WGRM to identify the overlaps to be eliminated and the gaps and mismatches to be resolved with unused contigs. The process was repeated until a sequence with full coverage and alignment with the whole genome map was achieved. Using this method we were able to achieve 100% WGRM coverage without a paired-end library. We assembled complete sequences for three distinct genetic components of a clinical isolate of Providencia stuartii: a bacterial chromosome, a novel bla NDM-1 plasmid, and a novel bacteriophage, without separately purifying them to homogeneity.

  20. Shod wear and foot alignment in clinical gait analysis.

    Science.gov (United States)

    Louey, Melissa Gar Yee; Sangeux, Morgan

    2016-09-01

    Sagittal plane alignment of the foot presents challenges when the subject wears shoes during gait analysis. Typically, visual alignment is performed by positioning two markers, the heel and toe markers, aligned with the foot within the shoe. Alternatively, software alignment is possible when the sole of the shoe lies parallel to the ground, and the change in the shoe's sole thickness is measured and entered as a parameter. The aim of this technical note was to evaluate the accuracy of visual and software foot alignment during shod gait analysis. We calculated the static standing ankle angles of 8 participants (mean age: 8.7 years, SD: 2.9 years) wearing bilateral solid ankle foot orthoses (BSAFOs) with and without shoes using the visual and software alignment methods. All participants were able to stand with flat feet in both static trials, and the ankle angles obtained in BSAFOs without shoes were considered the reference. We showed that the current implementation of software alignment introduces a bias towards more ankle dorsiflexion, mean=3°, SD=3.4°, p=0.006, and proposed an adjusted software alignment method. We found no statistical differences using visual alignment and adjusted software alignment between the shoe and shoeless conditions, p=0.19 for both. Visual alignment or adjusted software alignment is advised to represent foot alignment accurately.

  1. Scaling statistical multiple sequence alignment to large datasets

    Directory of Open Access Journals (Sweden)

    Michael Nute

    2016-11-01

    Background: Multiple sequence alignment is an important task in bioinformatics, and alignments of large datasets containing hundreds or thousands of sequences are increasingly of interest. While many alignment methods exist, the most accurate alignments are likely to be based on stochastic models where sequences evolve down a tree with substitutions, insertions, and deletions. While some methods have been developed to estimate alignments under these stochastic models, only the Bayesian method BAli-Phy has been able to run on even moderately large datasets, containing 100 or so sequences. A technique to extend BAli-Phy to enable alignments of thousands of sequences could potentially improve alignment and phylogenetic tree accuracy on large-scale data beyond the best-known methods today. Results: We use simulated data with up to 10,000 sequences representing a variety of model conditions, including some that are significantly divergent from the statistical models used in BAli-Phy and elsewhere. We give a method for incorporating BAli-Phy into PASTA and UPP, two strategies for enabling alignment methods to scale to large datasets, and give alignment and tree accuracy results measured against the ground truth from simulations. Comparable results are also given for other methods capable of aligning this many sequences. Conclusions: Extensions of BAli-Phy using PASTA and UPP produce significantly more accurate alignments and phylogenetic trees than the current leading methods.

  2. Whole genome phylogenies for multiple Drosophila species

    Directory of Open Access Journals (Sweden)

    Seetharam Arun

    2012-12-01

    Background: Reconstructing the evolutionary history of organisms using traditional phylogenetic methods may suffer from inaccurate sequence alignment. An alternative approach, particularly effective when whole genome sequences are available, is to employ methods that don’t use explicit sequence alignments. We extend a novel phylogenetic method based on Singular Value Decomposition (SVD) to reconstruct the phylogeny of 12 sequenced Drosophila species. SVD analysis provides accurate comparisons for a high fraction of sequences within whole genomes without the prior identification of orthologs or homologous sites. With this method all protein sequences are converted to peptide frequency vectors within a matrix that is decomposed to provide simplified vector representations for each protein of the genome in a reduced dimensional space. These vectors are summed together to provide a vector representation for each species, and the angle between these vectors provides distance measures that are used to construct species trees. Results: An unfiltered whole genome analysis (193,622 predicted proteins) strongly supports the currently accepted phylogeny for 12 Drosophila species at higher dimensions except for the generally accepted but difficult to discern sister relationship between D. erecta and D. yakuba. Also, in accordance with previous studies, many sequences appear to support alternative phylogenies. In this case, we observed grouping of D. erecta with D. sechellia when approximately 55% to 95% of the proteins were removed using a filter based on projection values or by reducing resolution by using fewer dimensions. Similar results were obtained when just the melanogaster subgroup was analyzed. Conclusions: These results indicate that using our novel phylogenetic method, it is possible to consult and interpret all predicted protein sequences within multiple whole genomes to produce accurate phylogenetic estimations of relatedness between

  3. Universal seeds for cDNA-to-genome comparison

    Directory of Open Access Journals (Sweden)

    Florea Liliana

    2008-01-01

    Background: To meet the needs of gene annotation for newly sequenced organisms, optimized spaced seeds can be implemented into cross-species sequence alignment programs to accurately align gene sequences to the genome of a related species. So far, seed performance has been tested for comparisons between closely related species, such as human and mouse, or on simulated data. As the number and variety of genomes increases, it becomes desirable to identify a small set of universal seeds that perform optimally or near-optimally on a large range of comparisons. Results: Using statistical regression methods, we investigate the sensitivity of seeds, in particular good seeds, between four cDNA-to-genome comparisons at different evolutionary distances (human-dog, human-mouse, human-chicken and human-zebrafish), and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed. In addition, we find that with high confidence good seeds for more distant comparisons perform well on closer comparisons, within 98–99% of the optimal seeds, and thus represent universal good seeds. Conclusion: We show for the first time that optimal and near-optimal seeds for distant species-to-species comparisons are more generally applicable to a wide range of comparisons. This finding will be instrumental in developing practical and user-friendly cDNA-to-genome alignment applications, to aid in the annotation of new model organisms.
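
    For readers unfamiliar with spaced seeds, the sketch below shows the general idea: a seed is a pattern of `1` (positions that must match) and `0` (positions that are ignored), and a hit is a pair of windows that agree at every `1`. The function name and the pattern `1101` are hypothetical and chosen only for illustration; they are not the optimized seeds studied in the paper above.

```python
def seed_hits(query, target, seed="1101"):
    """Return (i, j) pairs where query[i:] and target[j:] agree at every '1' of the seed.

    The default seed pattern is an illustrative assumption.
    """
    care = [k for k, c in enumerate(seed) if c == "1"]
    span = len(seed)
    # Index the target by the characters found at the seed's must-match positions.
    index = {}
    for j in range(len(target) - span + 1):
        key = "".join(target[j + k] for k in care)
        index.setdefault(key, []).append(j)
    # Look up each query window in the index.
    hits = []
    for i in range(len(query) - span + 1):
        key = "".join(query[i + k] for k in care)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits
```

    With `query="ACGT"` and `target="ACTT"`, the spaced seed `1101` tolerates the substitution at the third position and reports a hit, whereas the contiguous seed `1111` reports none. This tolerance of mismatches at chosen positions is what makes spaced seeds more sensitive on diverged sequences.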

  4. Alignment modification for pencil eye shields

    Energy Technology Data Exchange (ETDEWEB)

    Evans, M.D.; Pla, M.; Podgorsak, E.B. (McGill Univ., Quebec (Canada))

    1989-01-01

    Accurate alignment of pencil beam eye shields to protect the lens of the eye may be made easier by means of a simple modification of existing apparatus. This involves drilling a small hole through the center of the shield to isolate the rayline directed to the lens and fabricating a suitable plug for this hole.

  5. Using local alignments for relation recognition

    NARCIS (Netherlands)

    S. Katrenko; P. Adriaans; M. van Someren

    2010-01-01

    This paper discusses the problem of marrying structural similarity with semantic relatedness for Information Extraction from text. Aiming at accurate recognition of relations, we introduce local alignment kernels and explore various possibilities of using them for this task. We give a definition of

  6. Ontology alignment with OLA

    OpenAIRE

    Euzenat, Jérôme; Loup, David; Touzani, Mohamed; Valtchev, Petko

    2004-01-01

    Using ontologies is the standard way to achieve interoperability of heterogeneous systems within the Semantic web. However, as the ontologies underlying two systems are not necessarily compatible, they may in turn need to be aligned. Similarity-based approaches to alignment seem to be both powerful and flexible enough to match the expressive power of languages like OWL. We present an alignment tool that follows the similarity-based paradigm, called OLA. ...

  7. Erasing errors due to alignment ambiguity when estimating positive selection.

    Science.gov (United States)

    Redelings, Benjamin

    2014-08-01

    Current estimates of diversifying positive selection rely on first having an accurate multiple sequence alignment. Simulation studies have shown that under biologically plausible conditions, relying on a single estimate of the alignment from commonly used alignment software can lead to unacceptably high false-positive rates in detecting diversifying positive selection. We present a novel statistical method that eliminates excess false positives resulting from alignment error by jointly estimating the degree of positive selection and the alignment under an evolutionary model. Our model treats both substitutions and insertions/deletions as sequence changes on a tree and allows site heterogeneity in the substitution process. We conduct inference starting from unaligned sequence data by integrating over all alignments. This approach naturally accounts for ambiguous alignments without requiring ambiguously aligned sites to be identified and removed prior to analysis. We take a Bayesian approach and conduct inference using Markov chain Monte Carlo to integrate over all alignments on a fixed evolutionary tree topology. We introduce a Bayesian version of the branch-site test and assess the evidence for positive selection using Bayes factors. We compare two models of differing dimensionality using a simple alternative to reversible-jump methods. We also describe a more accurate method of estimating the Bayes factor using Rao-Blackwellization. We then show using simulated data that jointly estimating the alignment and the presence of positive selection solves the problem with excessive false positives from erroneous alignments and has nearly the same power to detect positive selection as when the true alignment is known. We also show that samples taken from the posterior alignment distribution using the software BAli-Phy have substantially lower alignment error compared with MUSCLE, MAFFT, PRANK, and FSA alignments.

  8. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez

    2010-01-01

    Most of the work in muon alignment since December 2009 has focused on the geometry reconstruction from the optical systems and improvements in the internal alignment of the DT chambers. The barrel optical alignment system has progressively evolved from reconstruction of single active planes to super-planes (December 09) to a new, full barrel reconstruction. Initial validation studies comparing this full barrel alignment at 0T with photogrammetry provide promising results. In addition, the method has been applied to CRAFT09 data, and the resulting alignment at 3.8T yields residuals from tracks (extrapolated from the tracker) which look smooth, suggesting a good internal barrel alignment with a small overall offset with respect to the tracker. This is a significant improvement, which should allow the optical system to provide a start-up alignment for 2010. The end-cap optical alignment has made considerable progress in the analysis of transfer line data. The next set of alignment constants for CSCs will there...

  9. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    Directory of Open Access Journals (Sweden)

    Kevin R Ramkissoon

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  10. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    Science.gov (United States)

    Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  11. Backup Alignment Devices on Shuttle: Heads-Up Display or Crew Optical Alignment Sight

    Science.gov (United States)

    Chavez, Melissa A.

    2011-01-01

    NASA's Space Shuttle was built to withstand multiple failures while still keeping the crew and vehicle safe. Although the design of the Space Shuttle had a great deal of redundancy built into each system, there were often additional ways to keep systems in the best configuration if a failure were to occur. One such method was to use select pieces of hardware in a way for which they were not primarily intended. The primary function of the Heads-Up Display (HUD) was to provide the crew with a display of flight-critical information during the entry phase. The primary function of the Crew Optical Alignment Sight (COAS) was to provide the crew an optical alignment capability for rendezvous and docking phases. An alignment device was required to keep the Inertial Measurement Units (IMUs) well aligned for a safe Entry; nominally this alignment device would be the two on-board Star Trackers. However, in the event of a Star Tracker failure, the HUD or COAS could also be used as a backup alignment device, but only if the device had been calibrated beforehand. Once the HUD or COAS was calibrated and verified, it was considered an adequate backup to the Star Trackers for entry IMU alignment. There were procedures in place and the astronauts were trained on how to accurately calibrate the HUD or COAS and how to use them as an alignment device. The calibration procedure for the HUD and COAS had been performed on many Shuttle missions. Many of the first calibrations performed were for data gathering purposes to determine which device was more accurate as a backup alignment device, HUD or COAS. Once this was determined, the following missions would frequently calibrate the HUD in order to be one step closer to having the device ready in case it was needed as a backup alignment device.

  12. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    Science.gov (United States)

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  13. An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast

    Directory of Open Access Journals (Sweden)

    Taneda Akito

    2008-12-01

    Background: Aligning RNA sequences with low sequence identity has been a challenging problem, since such a computation essentially needs an algorithm of high complexity to take structural conservation into account. Although many sophisticated algorithms for the purpose have been proposed to date, further improvement in efficiency is necessary to accelerate large-scale applications, including non-coding RNA (ncRNA) discovery. Results: We developed a new genetic algorithm, Cofolga2, for simultaneously computing pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that our new algorithm is accurate and efficient in both time and memory usage. Then, combining it with the originally trained SVM, we applied the new algorithm to novel ncRNA discovery, where we compared the S. cerevisiae genome with six related genomes in a pairwise manner. By focusing our search on the relatively short regions (50 bp to 2,000 bp) sandwiched by conserved sequences, we successfully predicted 714 intergenic and 1,311 sense or antisense ncRNA candidates, which were found in the pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%). By comparing with the previous predictions, we found that > 92% of the candidates are novel. The estimated rate of false positives in the predicted candidates is 51%. Twenty-five percent of the intergenic candidates have support for expression in cells, i.e. their genomic positions overlap those of the experimentally determined transcripts in the literature. By manual inspection of the results, moreover, we obtained four multiple alignments with low sequence identity which reveal consensus structures shared by three species/sequences. Conclusion: The present method gives an efficient tool complementary to sequence-alignment-based ncRNA finders.

  14. Efficient Word Alignment with Markov Chain Monte Carlo

    Directory of Open Access Journals (Sweden)

    Östling Robert

    2016-10-01

    We present EFMARAL, a new system for efficient and accurate word alignment using a Bayesian model with Markov Chain Monte Carlo (MCMC) inference. Through careful selection of data structures and model architecture we are able to surpass the fast_align system, commonly used for performance-critical word alignment, both in computational efficiency and alignment accuracy. Our evaluation shows that a phrase-based statistical machine translation (SMT) system produces translations of higher quality when using word alignments from EFMARAL than from fast_align, and that translation quality is on par with what is obtained using GIZA++, a tool requiring orders of magnitude more processing time. More generally we hope to convince the reader that Monte Carlo sampling, rather than being viewed as a slow method of last resort, should actually be the method of choice for the SMT practitioner and others interested in word alignment.

  15. SOAP2: an improved ultrafast tool for short read alignment

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Yu, Chang; Li, Yingrui

    2009-01-01

    SUMMARY: SOAP2 is a significantly improved version of the short oligonucleotide alignment program that both reduces computer memory usage and increases alignment speed at an unprecedented rate. We used a Burrows Wheeler Transformation (BWT) compression index to substitute the seed strategy for indexing the reference sequence in the main memory. We tested it on the whole human genome and found that this new algorithm reduced memory usage from 14.7 to 5.4 GB and improved alignment speed by 20-30 times. SOAP2 is compatible with both single- and paired-end reads. Additionally, this tool now supports multiple text and compressed file formats. A consensus builder has also been developed for consensus assembly and SNP detection from alignment of short reads on a reference genome. AVAILABILITY: http://soap.genomics.org.cn.
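
    As background on the BWT index mentioned above, the following is a textbook sketch of the Burrows-Wheeler transform and of exact-match counting by backward search. It makes no claim about SOAP2's actual implementation: the function names are hypothetical, the transform is built naively by sorting rotations, and the rank query rescans the string, whereas a production index would precompute occurrence tables.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, using '$' as a unique end marker."""
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_occurrences(bwt_str, pattern):
    """Count exact occurrences of pattern in the original text via backward search."""
    # first_pos[c]: number of characters in the text that sort strictly before c.
    counts = {}
    for c in bwt_str:
        counts[c] = counts.get(c, 0) + 1
    first_pos, total = {}, 0
    for c in sorted(counts):
        first_pos[c] = total
        total += counts[c]

    def rank(c, i):
        # Occurrences of c in bwt_str[:i]; naive here, precomputed in a real index.
        return bwt_str[:i].count(c)

    lo, hi = 0, len(bwt_str)
    # Extend the match one character at a time, from the pattern's last character.
    for c in reversed(pattern):
        if c not in first_pos:
            return 0
        lo = first_pos[c] + rank(c, lo)
        hi = first_pos[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

    For example, `count_occurrences(bwt("banana"), "ana")` evaluates to 2. The search touches the index once per pattern character, independent of the reference length, which is one reason BWT-based indexes suit short-read alignment against large genomes.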

  16. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez

    2011-01-01

    The Muon Alignment work now focuses on producing a new track-based alignment with higher track statistics, making systematic studies between the results of the hardware and track-based alignment methods and aligning the barrel using standalone muon tracks. Currently, the muon track reconstruction software uses a hardware-based alignment in the barrel (DT) and a track-based alignment in the endcaps (CSC). An important task is to assess the muon momentum resolution that can be achieved using the current muon alignment, especially for highly energetic muons. For this purpose, cosmic ray muons are used, since the rate of high-energy muons from collisions is very low and the event statistics are still limited. Cosmics have the advantage of higher statistics in the pT region above 100 GeV/c, but they have the disadvantage of having a mostly vertical topology, resulting in very few global endcap muons. Only the barrel alignment has therefore been tested so far. Cosmic muons traversing CMS from top to bottom are s...

  17. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    Gervasio Gomez

    The main progress of the muon alignment group since March has been in the refinement of both the track-based alignment for the DTs and the hardware-based alignment for the CSCs. For DT track-based alignment, there has been significant improvement in the internal alignment of the superlayers inside the DTs. In particular, the distance between superlayers is now corrected, eliminating the residual dependence on track impact angles, and good agreement is found between survey and track-based corrections. The new internal geometry has been approved to be included in the forthcoming reprocessing of CRAFT samples. The alignment of DTs with respect to the tracker using global tracks has also improved significantly, since the algorithms use the latest B-field mapping, better run selection criteria, optimized momentum cuts, and an alignment is now obtained for all six degrees of freedom (three spatial coordinates and three rotations) of the aligned DTs. This work is ongoing and at a stage where we are trying to unders...

  18. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez

    Since December, the muon alignment community has focused on analyzing the data recorded so far in order to produce new DT and CSC Alignment Records for the second reprocessing of CRAFT data. Two independent algorithms were developed which align the DT chambers using global tracks, thus providing, for the first time, a relative alignment of the barrel with respect to the tracker. These results are an important ingredient for the second CRAFT reprocessing and allow, for example, a more detailed study of any possible mis-modelling of the magnetic field in the muon spectrometer. Both algorithms are constructed in such a way that the resulting alignment constants are not affected, to first order, by any such mis-modelling. The CSC chambers have not yet been included in this global track-based alignment due to a lack of statistics, since only a few cosmics go through the tracker and the CSCs. A strategy exists to align the CSCs using the barrel as a reference until collision tracks become available. Aligning the ...

  19. SPEAR3 Construction Alignment

    Energy Technology Data Exchange (ETDEWEB)

    LeCocq, Catherine; Banuelos, Cristobal; Fuss, Brian; Gaudreault, Francis; Gaydosh, Michael; Griffin, Levirt; Imfeld, Hans; McDougal, John; Perry, Michael; Rogers,; /SLAC

    2005-08-17

    An ambitious seven-month shutdown of the existing SPEAR2 synchrotron radiation facility was successfully completed in March 2004 when the first synchrotron light was observed in the new SPEAR3 ring. SPEAR3 completely replaced SPEAR2 with new components aligned on a new highly flat concrete floor. Devices such as magnets and vacuum chambers had to be fiducialized and later aligned on girder rafts that were then placed into the ring over pre-aligned support plates. Key to the success of aligning this new ring was to ensure that the new beam orbit matched the old SPEAR2 orbit so that existing experimental beamlines would not have to be reoriented. In this presentation a pictorial summary of the Alignment Engineering Group's surveying tasks for the construction of the SPEAR3 ring is provided. Details on the networking and analysis of various surveys throughout the project can be found in the accompanying paper.

  20. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez

    2011-01-01

    A new set of muon alignment constants was approved in August. The relative position between muon chambers is essentially unchanged, indicating good detector stability. The main changes concern the global positioning of the barrel and of the endcap rings to match the new Tracker geometry. Detailed studies of the differences between track-based and optical alignment of DTs have proven to be a valuable tool for constraining Tracker alignment weak modes, and this information is now being used as part of the alignment procedure. In addition to the “split-cosmic” analysis used to investigate the muon momentum resolution at high momentum, a new procedure based on reconstructing the invariant mass of di-muons from boosted Zs is under development. Both procedures show an improvement in the momentum precision of Global Muons with respect to Tracker-only Muons. Recent developments in track-based alignment include a better treatment of the tails of residual distributions and accounting for correla...

  1. Physics of Grain Alignment

    CERN Document Server

    Lazarian, A

    2000-01-01

    Aligned grains provide one of the easiest ways to study magnetic fields in diffuse gas and molecular clouds. How reliable our conclusions about the inferred magnetic field are depends critically on our understanding of the physics of grain alignment. Although grain alignment is a problem of half a century's standing, recent progress achieved in the field makes us believe that we are approaching the solution of this mystery. I review basic physical processes involved in grain alignment and show why mechanisms that were favored for decades do not look so promising right now. I also discuss why the radiative torque mechanism, ignored for more than 20 years, now looks like the most powerful means of grain alignment.

  2. Rapid protein alignment in the cloud: HAMOND combines fast DIAMOND alignments with Hadoop parallelism.

    Science.gov (United States)

    Yu, Jia; Blom, Jochen; Sczyrba, Alexander; Goesmann, Alexander

    2017-02-21

    The introduction of next generation sequencing has caused a steady increase in the amounts of data that have to be processed in modern life science. Sequence alignment plays a key role in the analysis of sequencing data, e.g. within whole genome sequencing or metagenome projects. BLAST is a commonly used alignment tool that was the standard approach for more than two decades, but in recent years faster alternatives have been proposed, including RapSearch, GHOSTX, and DIAMOND. Here we introduce HAMOND, an application that uses Apache Hadoop to parallelize DIAMOND computation in order to scale out the calculation of alignments. HAMOND is fault tolerant and scalable by utilizing large cloud computing infrastructures like Amazon Web Services. HAMOND has been tested in comparative genomics analyses and showed promising results both in efficiency and accuracy.

  3. MSA-PAD: DNA multiple sequence alignment framework based on PFAM accessed domain information.

    Science.gov (United States)

    Balech, Bachir; Vicario, Saverio; Donvito, Giacinto; Monaco, Alfonso; Notarangelo, Pasquale; Pesole, Graziano

    2015-08-01

    Here we present the MSA-PAD application, a DNA multiple sequence alignment framework that uses PFAM protein domain information to align DNA sequences encoding either single or multiple protein domains. MSA-PAD has two alignment options: gene and genome mode.

  4. DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

    Directory of Open Access Journals (Sweden)

    Steven Kelly

    The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.

  5. Galaxy alignments: An overview

    CERN Document Server

    Joachimi, Benjamin; Kitching, Thomas D; Leonard, Adrienne; Mandelbaum, Rachel; Schäfer, Björn Malte; Sifón, Cristóbal; Hoekstra, Henk; Kiessling, Alina; Kirk, Donnacha; Rassat, Anais

    2015-01-01

    The alignments between galaxies, their underlying matter structures, and the cosmic web constitute vital ingredients for a comprehensive understanding of gravity, the nature of matter, and structure formation in the Universe. We provide an overview on the state of the art in the study of these alignment processes and their observational signatures, aimed at a non-specialist audience. The development of the field over the past one hundred years is briefly reviewed. We also discuss the impact of galaxy alignments on measurements of weak gravitational lensing, and discuss avenues for making theoretical and observational progress over the coming decade.

  6. Discriminative Shape Alignment

    DEFF Research Database (Denmark)

    Loog, M.; de Bruijne, M.

    2009-01-01

    The alignment of shape data to a common mean before its subsequent processing is a ubiquitous step within the area of shape analysis. Current approaches to shape analysis or, as more specifically considered in this work, shape classification perform the alignment in a fully unsupervised way, not taking into account that eventually the shapes are to be assigned to two or more different classes. This work introduces a discriminative variation to well-known Procrustes alignment and demonstrates its benefit over this classical method in shape classification tasks. The focus is on two-dimensional shapes from a two-class recognition problem....
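
    For orientation, the classical (unsupervised) Procrustes step that this record takes as its baseline can be sketched as follows. This is only an illustration of the well-known method, not the discriminative variation the authors propose. The function name `procrustes_align` and the representation of 2-D landmarks as complex numbers are assumptions made here for brevity.

```python
def procrustes_align(shape, reference):
    """Align 2-D landmarks (given as complex numbers) to a reference shape.

    Both shapes are centred and scaled to unit norm; the first is then
    rotated by the angle that minimises the summed squared distance.
    Assumes equal landmark counts and non-degenerate shapes.
    """
    def normalize(points):
        centroid = sum(points) / len(points)
        centered = [p - centroid for p in points]
        norm = sum(abs(p) ** 2 for p in centered) ** 0.5
        return [p / norm for p in centered]

    z, w = normalize(shape), normalize(reference)
    # The optimal rotation is the unit complex number with the phase of sum(conj(z_k) * w_k).
    s = sum(zk.conjugate() * wk for zk, wk in zip(z, w))
    rotation = s / abs(s)
    return [rotation * zk for zk in z], w
```

    Because the rotation is chosen without looking at class labels, two classes that differ mainly in orientation would be aligned onto each other, which is the limitation the abstract points to.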

  7. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez

    Since September, the muon alignment system has shifted from a mode of hardware installation and commissioning to operation and data taking. All three optical subsystems (Barrel, Endcap and Link alignment) have recorded data before, during and after CRAFT, at different magnetic fields and during ramps of the magnet. This first data-taking experience has several interesting goals:
    • study detector deformations and movements under the influence of the huge magnetic forces;
    • study the stability of detector structures and of the alignment system over long periods;
    • study geometry reproducibility at equal fields (especially at 0T and 3.8T);
    • reconstruct B=0T geometry and compare to nominal/survey geometries;
    • reconstruct B=3.8T geometry and provide DT and CSC alignment records for CMSSW.
    However, the main goal is to recons...

  8. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    Directory of Open Access Journals (Sweden)

    Yandell Mark

    2010-07-01

    Background: In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda. Results: We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. Conclusions: This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is

  9. CloudAligner: A fast and full-featured MapReduce based tool for sequence mapping

    Directory of Open Access Journals (Sweden)

    Shi Weisong

    2011-06-01

    Background: Research in genetics has developed rapidly recently due to the aid of next generation sequencing (NGS). However, massively-parallel NGS produces enormous amounts of data, which leads to storage, compatibility, scalability, and performance issues. The Cloud Computing and MapReduce framework, which utilizes hundreds or thousands of shared computers to map sequencing reads quickly and efficiently to reference genome sequences, appears to be a very promising solution for these issues. Consequently, it has been adopted by many organizations recently, and the initial results are very promising. However, since these are only initial steps toward this trend, the developed software does not adequately provide primary functions, such as bisulfite and pair-end mapping, that are found in on-site software such as RMAP or BS Seeker. In addition, existing MapReduce-based applications were not designed to process the long reads produced by the most recent second-generation and third-generation NGS instruments and, therefore, are inefficient. Last, it is difficult for a majority of biologists untrained in programming skills to use these tools because most were developed on Linux with a command line interface. Results: To encourage the use of Cloud technologies in genomics and prepare for advances in second- and third-generation DNA sequencing, we have built a Hadoop MapReduce-based application, CloudAligner, which achieves higher performance, covers most primary features, is more accurate, and has a user-friendly interface. It was also designed to be able to deal with long sequences. The performance gain of CloudAligner over Cloud-based counterparts (35 to 80%) mainly comes from the omission of the reduce phase. In comparison to local-based approaches, the performance gain of CloudAligner is from the partition and parallel processing of the huge reference genome as well as the reads. The source code of CloudAligner is available at http

  10. The UCSC Archaeal Genome Browser: 2012 update

    OpenAIRE

    Chan, Patricia P.; Holmes, Andrew D.; Smith, Andrew M.; Tran, Danny; Lowe, Todd M.

    2011-01-01

    The UCSC Archaeal Genome Browser (http://archaea.ucsc.edu) offers a graphical web-based resource for exploration and discovery within archaeal and other selected microbial genomes. By bringing together existing gene annotations, gene expression data, multiple-genome alignments, pre-computed sequence comparisons and other specialized analysis tracks, the genome browser is a powerful aggregator of varied genomic information. The genome browser environment maintains the current look-and-feel of ...

  11. Multiple sequence alignment with user-defined anchor points

    Directory of Open Access Journals (Sweden)

    Pöhler Dirk

    2006-04-01

    Background: Automated software tools for multiple alignment often fail to produce biologically meaningful results. In such situations, expert knowledge can help to improve the quality of alignments. Results: Herein, we describe a semi-automatic version of the alignment program DIALIGN that can take pre-defined constraints into account. It is possible for the user to specify parts of the sequences that are assumed to be homologous and should therefore be aligned to each other. Our software program can use these sites as anchor points by creating a multiple alignment respecting these constraints. This way, our alignment method can produce alignments that are biologically more meaningful than alignments produced by fully automated procedures. As a demonstration of how our method works, we apply our approach to genomic sequences around the Hox gene cluster and to a set of DNA-binding proteins. As a by-product, we obtain insights about the performance of the greedy algorithm that our program uses for multiple alignment and about the underlying objective function. This information will be useful for the further development of DIALIGN. The described alignment approach has been integrated into the TRACKER software system.

  12. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez

    2012-01-01

    A new muon alignment has been produced for 2012 A+B data reconstruction. It uses the latest Tracker alignment and single-muon data samples to align both DTs and CSCs. Physics validation has been performed and shows a modest improvement in stand-alone muon momentum resolution in the barrel, where the alignment is essentially unchanged from the previous version. The reference-target track-based algorithm using only collision muons is employed for the first time to align the CSCs, and a substantial improvement in resolution is observed in the endcap and overlap regions for stand-alone muons. This new alignment is undergoing the approval process and is expected to be deployed as part of a new global tag in the beginning of December. The pT dependence of the φ-bias in curvature observed in Monte Carlo was traced to a relative vertical misalignment between the Tracker and barrel muon systems. Moving the barrel as a whole to match the Tracker cures this pT dependence, leaving only the φ...

  13. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    S. Szillasi

    2013-01-01

    The CMS detector has been gradually opened and whenever a wheel became exposed the first operation was the removal of the MABs, the sensor structures of the Hardware Barrel Alignment System. By the last days of June all 36 MABs had arrived at the Alignment Lab at the ISR where, as part of the Alignment Upgrade Project, they are being refurbished with new Survey target holders. Their electronic checkout is under way and finally they will be recalibrated. During LS1 the alignment system will be upgraded in order to allow more precise reconstruction of the MB4 chambers in Sector 10 and Sector 4. This requires new sensor components, so-called MiniMABs, that have already been assembled and calibrated. For the track-based alignment, the systematic uncertainties of the algorithm are under scrutiny: this study will enable the production of an improved Monte Carlo misalignment scenario and to update alignment position errors eventually, crucial...

  14. Incremental Alignment Manifold Learning

    Institute of Scientific and Technical Information of China (English)

    Zhi Han; De-Yu Meng; Zong-Sen Xu; Nan-Nan Gu

    2011-01-01

    A new manifold learning method, called incremental alignment method (IAM), is proposed for nonlinear dimensionality reduction of high dimensional data with intrinsic low dimensionality. The main idea is to incrementally align low-dimensional coordinates of input data patch-by-patch to iteratively generate the representation of the entire dataset. The method consists of two major steps, the incremental step and the alignment step. The incremental step incrementally searches for the neighborhood patch to be aligned in the next step, and the alignment step iteratively aligns the low-dimensional coordinates of the neighborhood patch found to generate the embeddings of the entire dataset. Compared with the existing manifold learning methods, the proposed method has advantages in several aspects: high efficiency, easy out-of-sample extension, good metric preservation, and avoidance of the local minima issue. All these properties are supported by a series of experiments performed on the synthetic and real-life datasets. In addition, the computational complexity of the proposed method is analyzed, and its efficiency is theoretically argued and experimentally demonstrated.

  15. Evaluation of alignment marks using ASML ATHENA alignment system in 90nm BEOL process

    CERN Document Server

    Tan Chin Boon; Koh Hui Peng; Koo Chee, Kiong; Siew Yong Kong; Yeo Swee Hock

    2003-01-01

    As the critical dimension (CD) in integrated circuit (IC) devices shrinks, the total overlay budget needs to be more stringent. Typically, the allowable overlay error is 1/3 of the CD in the IC device. In this case, robustness of the alignment mark is critical, as an accurate signal is required by the scanner's alignment system to precisely align a layer of pattern to the previous layer. The alignment issue is more severe in the back-end process, partly due to the influence of Chemical Mechanical Polishing (CMP), which contributes to the asymmetry or total destruction of the alignment marks. Alignment marks on the wafer can be placed along the scribe-line of the IC pattern. The ASML scanner allows such wafer alignment using a phase grating mark, known as the Scribe-line Primary Mark (SPM), which can fit into a standard 80 µm scribe-line. In this paper, we have studied the feasibility of introducing a Narrow SPM (NSPM) to enable a smaller scribe-line. The width of NSPM has been shrunk down to 70% of the SPM and the length remain...

  16. Read clouds uncover variation in complex regions of the human genome.

    Science.gov (United States)

    Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E; West, Robert; Sidow, Arend; Batzoglou, Serafim

    2015-10-01

    Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies.

  17. HAMSA: Highly Accelerated Multiple Sequence Aligner

    Directory of Open Access Journals (Sweden)

    Naglaa M. Reda

    2016-06-01

    For biologists, the existence of an efficient tool for multiple sequence alignment is essential. This work presents a new parallel aligner called HAMSA. HAMSA is a bioinformatics application designed for highly accelerated alignment of multiple sequences of proteins and DNA/RNA on a multi-core cluster system. The design of HAMSA is based on a combination of our recently proposed optimized algorithms for vectorization, partitioning, and scheduling. It mainly operates on a distance vector instead of a distance matrix. It accomplishes similarity computations and generates the guide tree in a highly accelerated and accurate manner. HAMSA outperforms MSAProbs with a 21.9-fold speedup and ClustalW-MPI with an 11-fold speedup. It can be considered an essential tool for structure prediction, protein classification, motif finding and drug design studies.

  18. Using structure to explore the sequence alignment space of remote homologs.

    Directory of Open Access Journals (Sweden)

    Andrew Kuziemko

    2011-10-01

    Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
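
    To make the notion of a DP-optimal alignment score concrete, here is a minimal sketch of global (Needleman-Wunsch style) alignment scoring. The function name `nw_score` and the scoring values (match +1, mismatch -1, linear gap -2) are illustrative assumptions, not the parameters or the method used by the authors.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b."""
    # prev is the previous DP row: best scores for a[:i-1] against every prefix of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align a[i-1] with b[j-1], gap in b, gap in a.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]
```

    The score depends only on the sequences and the scoring parameters, so the maximising alignment carries no structural information; this is why, as the abstract notes, the structurally correct alignment can score lower than the DP optimum.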

  19. Curriculum Alignment Research Suggests that Alignment Can Improve Student Achievement

    Science.gov (United States)

    Squires, David

    2012-01-01

    Curriculum alignment research has developed showing the relationship among three alignment categories: the taught curriculum, the tested curriculum and the written curriculum. Each pair (for example, the taught and the written curriculum) shows a positive impact for aligning those results. Following this, alignment results from the Third…

  20. Karect: accurate correction of substitution, insertion and deletion errors for next-generation sequencing data

    KAUST Repository

    Allam, Amin

    2015-07-14

    Motivation: Next-generation sequencing generates large amounts of data affected by errors in the form of substitutions, insertions or deletions of bases. Error correction based on the high-coverage information typically improves de novo assembly. Most existing tools can correct substitution errors only; some support insertions and deletions, but accuracy in many cases is low. Results: We present Karect, a novel error correction technique based on multiple alignment. Our approach supports substitution, insertion and deletion errors. It can handle non-uniform coverage as well as moderately covered areas of the sequenced genome. Experiments with data from Illumina, 454 FLX and Ion Torrent sequencing machines demonstrate that Karect is more accurate than previous methods, both in terms of correcting individual-base errors (up to 10% increase in accuracy gain) and post de novo assembly quality (up to 10% increase in NGA50). We also introduce an improved framework for evaluating the quality of error correction.

  1. MaxAlign: maximizing usable data in an alignment

    DEFF Research Database (Denmark)

    Oliveira, Rodrigo Gouveia; Sackett, Peter Wad; Pedersen, Anders Gorm

    2007-01-01

    BACKGROUND: The presence of gaps in an alignment of nucleotide or protein sequences is often an inconvenience for bioinformatical studies. In phylogenetic and other analyses, for instance, gapped columns are often discarded entirely from the alignment. RESULTS: MaxAlign is a program that optimizes the alignment prior to such analyses. Specifically, it maximizes the number of nucleotide (or amino acid) symbols that are present in gap-free columns - the alignment area - by selecting the optimal subset of sequences to exclude from the alignment. MaxAlign can be used prior to phylogenetic and bioinformatical analyses as well as in other situations where this form of alignment improvement is useful. In this work we test MaxAlign's performance in these tasks and compare the accuracy of phylogenetic estimates including and excluding gapped columns from the analysis, with and without processing with MaxAlign...
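
    The "alignment area" objective described above can be illustrated with a small sketch. The exhaustive subset search below is an assumption chosen for clarity and is only practical for a handful of sequences; it is not MaxAlign's actual search procedure, and the function names are hypothetical.

```python
from itertools import combinations


def alignment_area(rows):
    """Count symbols in gap-free columns: gap-free columns times number of rows."""
    if not rows:
        return 0
    gap_free = sum(1 for col in zip(*rows) if "-" not in col)
    return gap_free * len(rows)


def best_subset(rows):
    """Exhaustively find the rows to keep that maximise the alignment area.

    Returns (area, indices_of_kept_rows). Exponential in len(rows).
    """
    best = (0, ())
    for k in range(1, len(rows) + 1):
        for keep in combinations(range(len(rows)), k):
            area = alignment_area([rows[i] for i in keep])
            if area > best[0]:
                best = (area, keep)
    return best
```

    For the four aligned rows `ACGT-A`, `ACGTTA`, `A--TTA` and `ACGTTA`, excluding the third row raises the area from 12 to 15, because two gapped columns become gap-free while only one sequence is lost.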

  2. An Introduction to Genome Annotation.

    Science.gov (United States)

    Campbell, Michael S; Yandell, Mark

    2015-12-17

    Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation.

  3. Parameter Identification Method for SINS Initial Alignment under Inertial Frame

    Directory of Open Access Journals (Sweden)

    Haijian Xue

    2016-01-01

    The performance of a strapdown inertial navigation system (SINS) largely depends on the accuracy and speed of the initial alignment. The conventional alignment method with parameter identification has already been widely applied, but it needs to calculate the gyroscope drifts through a two-position method, which greatly prolongs the initial alignment. To address this issue, a novel self-alignment algorithm based on parameter identification under the inertial frame is proposed for SINS in this paper. Firstly, a coarse alignment method using the gravity in the inertial frame as a reference is discussed to overcome the limit of dynamic disturbance on a rocking base and fulfill the requirement for the fine alignment. Secondly, the fine alignment method by parameter identification under the inertial frame is formulated. The theoretical analysis results show that the fine alignment model is fully self-aligned with no external reference information and the gyro drifts can be estimated in real time. The simulation results demonstrate that the proposed method can achieve rapid and highly accurate initial alignment for SINS.

  4. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    M. Dallavalle

    2013-01-01

    A new Muon misalignment scenario for 2011 (7 TeV) Monte Carlo re-processing was released. The scenario is based on running the standard track-based reference-target algorithm (exactly as in data) using a single-muon simulated sample (with the transverse-momentum spectrum matching data). It used statistics similar to what was used for alignment with 2011 data, starting from an initially misaligned Muon geometry from uncertainties of hardware measurements and using the latest Tracker misalignment geometry. Validation of the scenario (with muons from Z decay and high-pT simulated muons) shows that it describes data well. The study of systematic uncertainties (dominant by now due to the huge amount of data collected by CMS and used for muon alignment) is finalised. Realistic alignment position errors are being obtained from the estimated uncertainties and are expected to improve the muon reconstruction performance. Concerning the Hardware Alignment System, the upgrade of the Barrel Alignment is in progress. By now, d...

  5. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez

    2010-01-01

    For the last three months, the Muon Alignment group has focussed on providing a new, improved set of alignment constants for the end-of-year data reprocessing. These constants were delivered on time and approved by the CMS physics validation team on November 17. The new alignment incorporates several improvements over the previous one from March for nearly all sub-systems. Motivated by the loss of information from a hardware failure in May (an entire MAB was lost), the optical barrel alignment has moved from a modular, super-plane reconstruction, to a full, single loop calculation of the entire geometry for all DTs in stations 1, 2 and 3. This makes better use of the system redundancy, mitigating the effect of the information loss. Station 4 is factorised and added afterwards to make the system smaller (and therefore faster to run), and also because the MAB calibration at the MB4 zone is less precise. This new alignment procedure was tested at 0 T against photogrammetry resulting in precisions of the order...

  6. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    Gervasio Gomez

    2012-01-01

    The new alignment for the DT chambers has been successfully used in physics analysis starting with the 52X Global Tag. The remaining main areas of development over the next few months will be preparing a new track-based CSC alignment and producing realistic APEs (alignment position errors) and MC misalignment scenarios to match the latest muon alignment constants. Work on these items has been delayed from the intended timeline, mostly due to a large involvement of the muon alignment man-power in physics analyses over the first half of this year. As CMS keeps probing higher and higher energies, special attention must be paid to the reconstruction of very-high-energy muons. Recent muon POG reports from mid-June show a φ-dependence in curvature bias in Monte Carlo samples. This bias is observed already at the tracker level, where it is constant with muon pT, while it grows with pT as muon chamber information is added to the tracks. Similar studies show a much smaller effect in data, at le...

  7. Ergodic Secret Alignment

    CERN Document Server

    Bassily, Raef

    2010-01-01

    In this paper, we introduce two new achievable schemes for the fading multiple access wiretap channel (MAC-WT). In the model that we consider, we assume that perfect knowledge of the state of all channels is available at all the nodes in a causal fashion. Our schemes use this knowledge together with the time-varying nature of the channel model to align the interference from different users at the eavesdropper perfectly in a one-dimensional space while creating a higher-dimensionality space for the interfering signals at the legitimate receiver, hence allowing for a better chance of recovery. While we achieve this alignment through signal scaling at the transmitters in our first scheme (scaling based alignment (SBA)), we let nature provide this alignment through the ergodicity of the channel coefficients in the second scheme (ergodic secret alignment (ESA)). For each scheme, we obtain the resulting achievable secrecy rate region. We show that the secrecy rates achieved by both schemes scale with SNR as 1/2log(SNR...

  8. Syntenator: Multiple gene order alignments with a gene-specific scoring function

    Directory of Open Access Journals (Sweden)

    Dieterich Christoph

    2008-11-01

    Background: Identification of homologous regions or conserved syntenies across genomes is one crucial step in comparative genomics. This task is usually performed by genome alignment software like WABA or blastz. In case of conserved syntenies, such regions are defined as conserved gene orders. On the gene order level, homologous regions can even be found between distantly related genomes, which do not align on the nucleotide sequence level. Results: We present a novel approach to identify regions of conserved synteny across multiple genomes. Syntenator represents genomes and alignments thereof as partial order graphs (POGs). These POGs are aligned by a dynamic programming approach employing a gene-specific scoring function. The scoring function reflects the level of protein sequence similarity for each possible gene pair. Our method consistently defines larger homologous regions in pairwise gene order alignments than nucleotide-level comparisons. Our method is superior to methods that work on predefined homology gene sets (as implemented in Blockfinder). Syntenator successfully reproduces 80% of the EnsEMBL man-mouse conserved syntenic blocks. The full potential of our method becomes visible by comparing remotely related genomes and multiple genomes. Gene order alignments potentially resolve up to 75% of the EnsEMBL 1:many orthology relations and 27% of the many:many orthology relations. Conclusion: We propose Syntenator as a software solution to reliably infer conserved syntenies among distantly related genomes. The software is available from http://www2.tuebingen.mpg.de/abt4/plone.

  9. Secure Fingerprint Alignment and Matching Protocols

    OpenAIRE

    Bayatbabolghani, Fattaneh; Blanton, Marina; Aliasgari, Mehrdad; Goodrich, Michael

    2017-01-01

    We present three secure privacy-preserving protocols for fingerprint alignment and matching, based on what are considered to be the most precise and efficient fingerprint recognition algorithms-those based on the geometric matching of "landmarks" known as minutia points. Our protocols allow two or more honest-but-curious parties to compare their respective privately-held fingerprints in a secure way such that they each learn nothing more than a highly-accurate score of how well the fingerprin...

  10. Mango: multiple alignment with N gapped oligos.

    Science.gov (United States)

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2008-06-01

    Multiple sequence alignment is a classical and challenging task. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state-of-the-art works suffer from the "once a gap, always a gap" phenomenon. Is there a radically new way to do multiple sequence alignment? In this paper, we introduce a novel and orthogonal multiple sequence alignment method, using both multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole and tries to build the alignment vertically, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds have proved significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks, showing that MANGO compares favorably, in both accuracy and speed, against state-of-the-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, ProbConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0, and Kalign 2.0. We have further demonstrated the scalability of MANGO on very large datasets of repeat elements. MANGO can be downloaded at http://www.bioinfo.org.cn/mango/ and is free for academic usage.
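
    The difference between consecutive k-mers and spaced seeds mentioned above can be sketched as follows. The seed patterns `1111` and `1101` and the helper names are hypothetical choices for illustration; they are not the optimized seeds or the algorithms used by MANGO. A `1` marks a position that must match and a `0` marks a "don't care" position.

```python
def seed_keys(seq, pattern):
    """Yield (position, key) pairs; key keeps only bases at '1' positions."""
    care = [i for i, c in enumerate(pattern) if c == "1"]
    for start in range(len(seq) - len(pattern) + 1):
        yield start, "".join(seq[start + i] for i in care)


def shared_hits(a, b, pattern):
    """Return sorted (pos_a, pos_b) pairs whose seed keys are identical."""
    index = {}
    for pos, key in seed_keys(a, pattern):
        index.setdefault(key, []).append(pos)
    return sorted((pa, pb) for pb, key in seed_keys(b, pattern)
                  for pa in index.get(key, []))
```

    With `a = "ACGTACGT"` and `b = "ACGAACGT"` (one substitution at index 3), the contiguous pattern `1111` finds no hit covering the substitution, whereas the spaced pattern `1101` reports the hit `(1, 1)` because the mismatch falls on its "don't care" position.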

  11. Concurrent and Accurate Short Read Mapping on Multicore Processors.

    Science.gov (United States)

    Martínez, Héctor; Tárraga, Joaquín; Medina, Ignacio; Barrachina, Sergio; Castillo, Maribel; Dopazo, Joaquín; Quintana-Ortí, Enrique S

    2015-01-01

    We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, HPG Aligner SA (HPG Aligner SA is an open-source application. The software is available at http://www.opencb.org), exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), and leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of HPG Aligner SA on RNA reads of 100-400 nucleotides, where it excels in execution time/sensitivity over state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR.
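
    As a rough illustration of how a suffix array supports exact read lookup, the sketch below builds the array naively and finds matches by binary search. The function names are hypothetical, the quadratic-memory construction is only suitable for short toy references, and nothing here reflects the actual data structures or code of HPG Aligner SA.

```python
def build_suffix_array(text):
    """Return suffix start positions sorted lexicographically (naive build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, read):
    """Return sorted start positions where read occurs exactly in text."""
    lo, hi = 0, len(sa)
    # Binary search for the first suffix whose length-limited prefix is >= read.
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(read)] < read:
            lo = mid + 1
        else:
            hi = mid
    # Matching suffixes are contiguous in the suffix array from that point on.
    hits = []
    while lo < len(sa) and text.startswith(read, sa[lo]):
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)
```

    Reads that cannot be placed by exact lookup of this kind are the ones the abstract hands over to Smith-Waterman.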

  12. Accuracy of structure-based sequence alignment of automatic methods

    Directory of Open Access Journals (Sweden)

    Lee Byungkook

    2007-09-01

    Full Text Available Abstract Background Accurate sequence alignments are essential for homology searches and for building three-dimensional structural models of proteins. Since structure is better conserved than sequence, structure alignments have been used to guide sequence alignments and are commonly used as the gold standard for sequence alignment evaluation. Nonetheless, as far as we know, there is no report of a systematic evaluation of pairwise structure alignment programs in terms of the sequence alignment accuracy. Results In this study, we evaluate CE, DaliLite, FAST, LOCK2, MATRAS, SHEBA and VAST in terms of the accuracy of the sequence alignments they produce, using sequence alignments from NCBI's human-curated Conserved Domain Database (CDD) as the standard of truth. We find that 4 to 9% of the residues on average are either not aligned or aligned with more than 8 residues of shift error and that an additional 6 to 14% of residues on average are misaligned by 1–8 residues, depending on the program and the data set used. The fraction of correctly aligned residues generally decreases as the sequence similarity decreases or as the RMSD between the Cα positions of the two structures increases. It varies significantly across CDD superfamilies whether shift error is allowed or not. Also, alignments with different shift errors occur between proteins within the same CDD superfamily, leading to inconsistent alignments between superfamily members. In general, residue pairs that are more than 3.0 Å apart in the reference alignment are heavily (>= 25% on average) misaligned in the test alignments. In addition, each method shows a different pattern of relative weaknesses for different SCOP classes. CE gives relatively poor results for β-sheet-containing structures (all-β, α/β, and α+β classes), DaliLite for the "others" class where all but the major four classes are combined, and LOCK2 and VAST for all-β and "others" classes. Conclusion When the sequence

  13. Choice of reference sequence and assembler for alignment of Listeria monocytogenes short-read sequence data greatly influences rates of error in SNP analyses.

    Directory of Open Access Journals (Sweden)

    Arthur W Pightling

    Full Text Available The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including (i) depth of sequencing coverage, (ii) choice of reference-guided short-read sequence assembler, (iii) choice of reference genome, and (iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. In total, this study demonstrates that researchers

  14. Strategic Alignment of Business Intelligence

    OpenAIRE

    Cederberg, Niclas

    2010-01-01

    This thesis is about the concept of strategic alignment of business intelligence. It is based on a theoretical foundation that is used to define and explain business intelligence, data warehousing and strategic alignment. By combining a number of different methods for strategic alignment a framework for alignment of business intelligence is suggested. This framework addresses all different aspects of business intelligence identified as relevant for strategic alignment of business intelligence...

  15. Orientation and Alignment Echoes

    CERN Document Server

    Karras, G; Billard, F; Lavorel, B; Hartmann, J -M; Faucher, O; Gershnabel, E; Prior, Y; Averbukh, I Sh

    2015-01-01

    We present what is probably the simplest classical system featuring the echo phenomenon - a collection of randomly oriented free rotors with dispersed rotational velocities. Following excitation by a pair of time-delayed impulsive kicks, the mean orientation/alignment of the ensemble exhibits multiple echoes and fractional echoes. We elucidate the mechanism of the echo formation by kick-induced filamentation of phase space, and provide the first experimental demonstration of classical alignment echoes in a thermal gas of CO_2 molecules excited by a pair of femtosecond laser pulses.

  16. Group Based Interference Alignment

    CERN Document Server

    Ma, Yanjun; Chen, Rui; Yao, Junliang

    2010-01-01

    In $K$-user single-input single-output (SISO) frequency selective fading interference channels, it is shown that the achievable multiplexing gain is almost surely $K/2$ by using interference alignment (IA). However, when the signaling dimensions are limited, allocating all the resources to all the users simultaneously is not optimal. To address this problem, a group based interference alignment (GIA) scheme is proposed and a search algorithm is designed to get the group patterns and the resource allocation among them. Analysis results show that our proposed scheme achieves a higher multiplexing gain when the resources are limited.

  17. PILOT optical alignment

    Science.gov (United States)

    Longval, Y.; Mot, B.; Ade, P.; André, Y.; Aumont, J.; Baustista, L.; Bernard, J.-Ph.; Bray, N.; de Bernardis, P.; Boulade, O.; Bousquet, F.; Bouzit, M.; Buttice, V.; Caillat, A.; Charra, M.; Chaigneau, M.; Crane, B.; Crussaire, J.-P.; Douchin, F.; Doumayrou, E.; Dubois, J.-P.; Engel, C.; Etcheto, P.; Gélot, P.; Griffin, M.; Foenard, G.; Grabarnik, S.; Hargrave, P.; Hughes, A.; Laureijs, R.; Lepennec, Y.; Leriche, B.; Maestre, S.; Maffei, B.; Martignac, J.; Marty, C.; Marty, W.; Masi, S.; Mirc, F.; Misawa, R.; Montel, J.; Montier, L.; Narbonne, J.; Nicot, J.-M.; Pajot, F.; Parot, G.; Pérot, E.; Pimentao, J.; Pisano, G.; Ponthieu, N.; Ristorcelli, I.; Rodriguez, L.; Roudil, G.; Salatino, M.; Savini, G.; Simonella, O.; Saccoccio, M.; Tapie, P.; Tauber, J.; Torre, J.-P.; Tucker, C.

    2016-07-01

    PILOT is a balloon-borne astronomy experiment designed to study the polarization of dust emission in the diffuse interstellar medium in our Galaxy at a wavelength of 240 μm with an angular resolution of about two arcminutes. The PILOT optics are composed of an off-axis Gregorian-type telescope and a refractive re-imager system. All optical elements, except the primary mirror, are in a cryostat cooled to 3 K. We combined optical and three-dimensional measurement methods with thermo-elastic modeling to perform the optical alignment. The talk describes the system analysis, the alignment procedure, and finally the performance obtained during the first flight in September 2015.

  18. SinicView: A visualization environment for comparisons of multiple nucleotide sequence alignment tools

    Directory of Open Access Journals (Sweden)

    Wong Chun-Yi

    2006-03-01

    Full Text Available Abstract Background Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indeed not homologous or are just artifacts of inappropriate alignment tools or scoring systems. Although several systematic evaluations of multiple sequence alignment (MSA) programs have been proposed, they may not provide a standard-bearer for most biologists because the poorly aligned regions in these evaluations are never discussed. Thus, a tool that allows simultaneous cross comparison of the alignment results obtained by different tools could help a biologist evaluate their correctness and accuracy. Results In this paper, we present a versatile alignment visualization system called SinicView (for Sequence-aligning INnovative and Interactive Comparison VIEWer), which allows the user to efficiently compare and evaluate assorted nucleotide alignment results obtained by different tools. SinicView calculates similarity of the alignment outputs under a fixed window using the sum-of-pairs method and provides scoring profiles of each set of aligned sequences. The user can visually compare alignment results either in graphic scoring profiles or in plain text format of the aligned nucleotides along with the annotation information. We illustrate the capabilities of our visualization system by comparing alignment results obtained by MLAGAN, MAVID, and MULTIZ, respectively. Conclusion With SinicView, users can use their own data sequences to compare various alignment tools or scoring systems and select the most suitable one to perform alignment in the

  19. Speaking Fluently And Accurately

    Institute of Scientific and Technical Information of China (English)

    JosephDeVeto

    2004-01-01

    Even after many years of study, students make frequent mistakes in English. In addition, many students still need a long time to think of what they want to say. For some reason, in spite of all the studying, students are still not quite fluent. When I teach, I use one technique that helps students not only speak more accurately, but also more fluently. That technique is dictations.

  20. Aligning Theory with Practice

    Science.gov (United States)

    Kurz, Terri L.; Batarelo, Ivana

    2009-01-01

    This article describes a structure to help preservice teachers get invaluable field experience by aligning theory with practice supported by the integration of elementary school children into their university mathematics methodology course. This course structure allowed preservice teachers to learn about teaching mathematics in a nonthreatening…

  1. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez and Y. Pakhotin

    2012-01-01

      A new track-based alignment for the DT chambers is ready for deployment: an offline tag has already been produced which will become part of the 52X Global Tag. This alignment was validated within the muon alignment group both at low and high momentum using a W/Z skim sample. It shows an improved mass resolution for pairs of stand-alone muons, improved curvature resolution at high momentum, and improved DT segment extrapolation residuals. The validation workflow for high-momentum muons used to depend solely on the “split cosmics” method, looking at the curvature difference between muon tracks reconstructed in the upper or lower half of CMS. The validation has now been extended to include energetic muons decaying from heavily boosted Zs: the di-muon invariant mass for global and stand-alone muons is reconstructed, and the invariant mass resolution is compared for different alignments. The main areas of development over the next few months will be preparing a new track-based C...

  2. Alignment of concerns

    DEFF Research Database (Denmark)

    Andersen, Tariq Osman; Bansler, Jørgen P.; Kensing, Finn;

    2014-01-01

    The emergence of patient-centered eHealth systems introduces new challenges, where patients come to play an increasingly important role. Realizing the promises requires an in-depth understanding of not only the technology, but also the needs of both clinicians and patients. However, insights from...... as a design rationale for successful eHealth, termed 'alignment of concerns'....

  3. Aligning Mental Representations

    DEFF Research Database (Denmark)

    Kano Glückstad, Fumiko

    2013-01-01

    This work introduces a framework that implements asymmetric communication theory proposed by Sperber and Wilson [1]. The framework applies a generalization model known as the Bayesian model of generalization (BMG) [2] for aligning knowledge possessed by two communicating parties. The work focuses...

  4. BSMAP: whole genome bisulfite sequence MAPping program

    Directory of Open Access Journals (Sweden)

    Li Wei

    2009-07-01

    Full Text Available Abstract Background Bisulfite sequencing is a powerful technique to study DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased searching space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation. Results We developed an efficient bisulfite reads mapping algorithm BSMAP to address the above issues. BSMAP combines genome hashing and bitwise masking to achieve fast and accurate bisulfite mapping. Compared with existing bisulfite mapping approaches, BSMAP is faster, more sensitive and more flexible. Conclusion BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at whole genome level with feasible memory and CPU usage. It is freely available under GPL v3 license at http://code.google.com/p/bsmap/.

  5. Detection of Off-normal Images for NIF Automatic Alignment

    Energy Technology Data Exchange (ETDEWEB)

    Candy, J V; Awwal, A S; McClay, W A; Ferguson, S W; Burkhart, S C

    2005-07-11

    One of the major purposes of the National Ignition Facility at Lawrence Livermore National Laboratory is to accurately focus 192 high energy laser beams on a nanoscale (mm) fusion target at the precise location and time. The automatic alignment system developed for NIF is used to align the beams in order to achieve the required focusing effect. However, if a distorted image is inadvertently created by a faulty camera shutter or some other opto-mechanical malfunction, the resulting image, termed "off-normal", must be detected and rejected before further alignment processing occurs. Thus the off-normal processor acts as a preprocessor to automatic alignment image processing. In this work, we discuss the development of an "off-normal" pre-processor capable of rapidly detecting the off-normal images and performing the rejection. A wide variety of off-normal images for each loop is used to develop an accurate rejection criterion.

  6. Robust local intervertebral disc alignment for spinal MRI

    Science.gov (United States)

    Reisman, James; Höppner, Jan; Huang, Szu-Hao; Zhang, Li; Lai, Shang-Hong; Odry, Benjamin; Novak, Carol L.

    2006-03-01

    Magnetic resonance (MR) imaging is frequently used to diagnose abnormalities in the spinal intervertebral discs. Owing to the non-isotropic resolution of typical MR spinal scans, physicians prefer to align the scanner plane with the disc in order to maximize the diagnostic value and to facilitate comparison with prior and follow-up studies. Commonly a planning scan is acquired of the whole spine, followed by a diagnostic scan aligned with selected discs of interest. Manual determination of the optimal disc plane is tedious and prone to operator variation. A fast and accurate method to automatically determine the disc alignment can decrease examination time and increase the reliability of diagnosis. We present a validation study of an automatic spine alignment system for determining the orientation of intervertebral discs in MR studies. In order to measure the effectiveness of the automatic alignment system, we compared its performance with human observers. 12 MR spinal scans of adult spines were tested. Two observers independently indicated the intervertebral plane for each disc, and then repeated the procedure on another day, in order to determine the inter- and intra-observer variability associated with manual alignment. Results were also collected for the observers utilizing the automatic spine alignment system, in order to determine the method's consistency and its accuracy with respect to human observers. We found that the results from the automatic alignment system are comparable with the alignment determined by human observers, with the computer showing greater speed and consistency.

  7. ABS: Sequence alignment by scanning

    KAUST Repository

    Bonny, Mohamed Talal

    2011-08-01

    Sequence alignment is an essential tool in almost any computational biology research. It processes large sequence databases and is considered a heavy consumer of computation time. Heuristic algorithms are used to get approximate but fast results. We introduce a fast alignment algorithm, called Alignment By Scanning (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with two well-known alignment algorithms, FASTA (which is heuristic) and Needleman-Wunsch (which is optimal). The proposed algorithm achieves up to 76% enhancement in alignment score when it is compared with the FASTA algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.
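
    For reference, the optimal baseline named above, Needleman-Wunsch global alignment, can be sketched as a score-only dynamic program. The scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; the abstract does not state which scoring scheme was used in the comparison, and this sketch says nothing about how ABS itself works.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of a and b under a linear gap penalty."""
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

    The table has one cell per pair of prefixes, so time grows with the product of the two sequence lengths, which is the cost that heuristic methods try to avoid.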

  8. Fast global sequence alignment technique

    KAUST Repository

    Bonny, Mohamed Talal

    2011-11-01

    Bioinformatics databases are growing exponentially in size. Processing these large amounts of data may take hours even if supercomputers are used. One of the most important processing tools in bioinformatics is sequence alignment. We introduce a fast alignment algorithm, called Alignment By Scanning (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with two well-known sequence alignment algorithms, GAP (which is heuristic) and Needleman-Wunsch (which is optimal). The proposed algorithm achieves up to 51% enhancement in alignment score when it is compared with the GAP algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.

  9. Efficient oligonucleotide probe selection for pan-genomic tiling arrays

    Directory of Open Access Journals (Sweden)

    Zhang Wei

    2009-09-01

    Full Text Available Abstract Background Array comparative genomic hybridization is a fast and cost-effective method for detecting, genotyping, and comparing the genomic sequence of unknown bacterial isolates. This method, as with all microarray applications, requires adequate coverage of probes targeting the regions of interest. An unbiased tiling of probes across the entire length of the genome is the most flexible design approach. However, such a whole-genome tiling requires that the genome sequence is known in advance. For the accurate analysis of uncharacterized bacteria, an array must query a fully representative set of sequences from the species' pan-genome. Prior microarrays have included only a single strain per array or the conserved sequences of gene families. These arrays omit potentially important genes and sequence variants from the pan-genome. Results This paper presents a new probe selection algorithm (PanArray) that can tile multiple whole genomes using a minimal number of probes. Unlike arrays built on clustered gene families, PanArray uses an unbiased, probe-centric approach that does not rely on annotations, gene clustering, or multi-alignments. Instead, probes are evenly tiled across all sequences of the pan-genome at a consistent level of coverage. To minimize the required number of probes, probes conserved across multiple strains in the pan-genome are selected first, and additional probes are used only where necessary to span polymorphic regions of the genome. The viability of the algorithm is demonstrated by array designs for seven different bacterial pan-genomes and, in particular, the design of a 385,000 probe array that fully tiles the genomes of 20 different Listeria monocytogenes strains with overlapping probes at greater than twofold coverage. Conclusion PanArray is an oligonucleotide probe selection algorithm for tiling multiple genome sequences using a minimal number of probes. It is capable of fully tiling all genomes of a species on

  10. Absorber Alignment Measurement Tool for Solar Parabolic Trough Collectors: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Stynes, J. K.; Ihas, B.

    2012-04-01

    As we pursue efforts to lower the capital and installation costs of parabolic trough solar collectors, it is essential to maintain high optical performance. While there are many optical tools available to measure the reflector slope errors of parabolic trough solar collectors, there are few tools to measure the absorber alignment. A new method is presented here to measure the absorber alignment in two dimensions to within 0.5 cm. The absorber alignment is measured using a digital camera and four photogrammetric targets. Physical contact with the receiver absorber or glass is not necessary. The alignment of the absorber is measured along its full length so that sagging of the absorber can be quantified with this technique. The resulting absorber alignment measurement provides critical information required to accurately determine the intercept factor of a collector.

  11. The Laser Shaft Alignment System with Dual PSDs

    Institute of Scientific and Technical Information of China (English)

    JIAO Guohua; LI Yulin; ZHANG Dongbo; LI Tonghai; HU Baowen

    2006-01-01

    Shaft alignment is an important technique during installation and maintenance of a rotating machine. A high-precision laser alignment system has been designed with dual PSDs (Position Sensing Detectors) to replace the traditional manual method of shaft alignment and to make the measurement easier and more accurate. The system comprises two small measuring units (laser transmitter and detector) and a PDA (Personal Digital Assistant) with the measurement software. The dual-PSD laser alignment system improves on a single-PSD system, achieves higher measurement accuracy than the previous design, and has been successfully designed and implemented for actual shaft alignment. In the system, the range of offset measurement is ±4 mm, the resolution is 1.5 μm, and the accuracy is better than 2 μm.

  12. A laser shaft alignment system with dual PSDs

    Institute of Scientific and Technical Information of China (English)

    JIAO Guo-hua; LI Yu-lin; ZHANG Dong-bo; LI Tong-hai; HU Bao-wen

    2006-01-01

    Shaft alignment is an important technique during installation and maintenance of a rotating machine. A high-precision laser alignment system has been designed with dual PSDs (Position Sensing Detectors) to replace the traditional manual method of shaft alignment and to make the measurement easier and more accurate. The system comprises two small measuring units (laser transmitter and detector) and a PDA (Personal Digital Assistant) with measurement software. The dual-PSD laser alignment system improves on a single-PSD system, yields higher measurement accuracy than the previous design, and has been successfully designed and implemented for actual shaft alignment. In the system, the range of offset measurement is ±4 mm and the resolution is 1.5 μm, with accuracy better than 2 μm.

  13. A fast cross-validation method for alignment of electron tomography images based on Beer-Lambert law

    Science.gov (United States)

    Yan, Rui; Edwards, Thomas J.; Pankratz, Logan M.; Kuhn, Richard J.; Lanman, Jason K.; Liu, Jun; Jiang, Wen

    2015-01-01

    In electron tomography, accurate alignment of tilt series is an essential step in attaining high-resolution 3D reconstructions. Nevertheless, quantitative assessment of alignment quality has remained a challenging issue, even though many alignment methods have been reported. Here, we report a fast and accurate method, tomoAlignEval, based on the Beer-Lambert law, for the evaluation of alignment quality. Our method is able to globally estimate the alignment accuracy by measuring the goodness of log-linear relationship of the beam intensity attenuations at different tilt angles. Extensive tests with experimental data demonstrated its robust performance with stained and cryo samples. Our method is not only significantly faster but also more sensitive than measurements of tomogram resolution using Fourier shell correlation method (FSCe/o). From these tests, we also conclude that while current alignment methods are sufficiently accurate for stained samples, inaccurate alignments remain a major limitation for high resolution cryo-electron tomography. PMID:26455556
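
    The idea of scoring a tilt series by the goodness of a log-linear relationship can be sketched as follows. This is a simplified illustration, not the tomoAlignEval implementation: it assumes a flat slab sample, so that the Beer-Lambert path length scales as 1/cos(θ) and ln(I) is linear in 1/cos(θ), and it uses the coefficient of determination R² as the goodness measure. The function name and both assumptions are hypothetical.

```python
import math


def log_linear_r_squared(tilt_degrees, intensities):
    """R^2 of a least-squares line through (1/cos(theta), ln(I)) points."""
    x = [1.0 / math.cos(math.radians(t)) for t in tilt_degrees]
    y = [math.log(v) for v in intensities]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    # Sum of squared residuals around the fitted line.
    residual = sum(
        (yi - mean_y - slope * (xi - mean_x)) ** 2 for xi, yi in zip(x, y)
    )
    return 1.0 - residual / syy
```

    Intensities that follow the attenuation law exactly give a value of essentially 1, and any deviation at a given tilt angle lowers it, which is the sense in which a single number can summarize the whole series.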

  14. MUSE alignment onto VLT

    Science.gov (United States)

    Laurent, Florence; Renault, Edgard; Boudon, Didier; Caillier, Patrick; Daguisé, Eric; Dupuy, Christophe; Jarno, Aurélien; Lizon, Jean-Louis; Migniau, Jean-Emmanuel; Nicklas, Harald; Piqueras, Laure

    2014-07-01

    MUSE (Multi Unit Spectroscopic Explorer) is a second generation Very Large Telescope (VLT) integral field spectrograph developed for the European Southern Observatory (ESO). It combines a 1' x 1' field of view sampled at 0.2 arcsec for its Wide Field Mode (WFM) and a 7.5"x7.5" field of view for its Narrow Field Mode (NFM). Both modes will operate with the improved spatial resolution provided by GALACSI (Ground Atmospheric Layer Adaptive Optics for Spectroscopic Imaging), which will use the VLT deformable secondary mirror and 4 Laser Guide Stars (LGS) foreseen in 2015. MUSE operates in the visible wavelength range (0.465-0.93 μm). A consortium of seven institutes is currently commissioning MUSE at the Very Large Telescope for the Preliminary Acceptance in Chile, scheduled for September 2014. MUSE is composed of several subsystems which are under the responsibility of each institute. The Fore Optics derotates and anamorphoses the image at the focal plane. A Splitting and Relay Optics feed the 24 identical Integral Field Units (IFU), which are mounted within a large monolithic structure. Each IFU incorporates an image slicer, a fully refractive spectrograph with VPH-grating and a detector system connected to a global vacuum and cryogenic system. During 2012 and 2013, all MUSE subsystems were integrated, aligned and tested at the P.I. institute in Lyon. After successful PAE in September 2013, the MUSE instrument was shipped to the Very Large Telescope in Chile, where it was aligned and tested in the ESO integration hall at Paranal. Afterwards, MUSE was directly transported, fully aligned and without any optomechanical dismounting, onto the VLT telescope, where first light was achieved on 7 February 2014. This paper describes the alignment procedure of the whole MUSE instrument with respect to the Very Large Telescope (VLT). It describes how 6 tons could be moved with an accuracy better than 0.025 mm and less than 0.25 arcmin in order to reach alignment requirements. The success

  15. Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment

    Directory of Open Access Journals (Sweden)

    Daniels Noah M

    2012-10-01

    Full Text Available Abstract Background The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results We present Formatt, a multiple structure alignment program based on the purely geometric Matt multiple structure alignment program, that also takes into account sequence similarity when constructing alignments. We show that Formatt outperforms Matt and other popular structure alignment programs on the popular HOMSTRAD benchmark. For the SABMark twilight zone benchmark set that captures more remote homology, Formatt and Matt outperform other programs; depending on choice of embedded sequence aligner, Formatt produces either better sequence and structural alignments with a smaller core size than Matt, or similarly sized alignments with better sequence similarity, for a small cost in average RMSD. Conclusions Considering sequence information as well as purely geometric information seems to improve quality of multiple structure alignments, though defining what constitutes the best alignment when sequence and structural measures would suggest different alignments remains a difficult open question.

  16. Intramedullary versus extramedullary alignment of the tibial component in the Triathlon knee

    LENUS (Irish Health Repository)

    Cashman, James P

    2011-08-20

    Abstract Background Long term survivorship in total knee arthroplasty is significantly dependent on prosthesis alignment. Our aim was to determine which alignment guide was more accurate in positioning of the tibial component in total knee arthroplasty. We also aimed to assess whether there was any difference in short term patient outcome. Method A comparison of intramedullary versus extramedullary alignment jigs was performed. Radiological alignment of tibial components and patient outcomes of 103 Triathlon total knee arthroplasties were analysed. Results Use of the intramedullary jig was found to be significantly more accurate in determining coronal alignment (p = 0.02) while use of the extramedullary jig was found to give more accurate results in sagittal alignment (p = 0.04). There was no significant difference in WOMAC or SF-36 at six months. Conclusion Use of an intramedullary jig is preferable for positioning of the tibial component using this knee system.

  17. Intramedullary versus extramedullary alignment of the tibial component in the Triathlon knee

    Directory of Open Access Journals (Sweden)

    Synnott Keith

    2011-08-01

    Full Text Available Abstract Background Long term survivorship in total knee arthroplasty is significantly dependent on prosthesis alignment. Our aim was to determine which alignment guide was more accurate in positioning of the tibial component in total knee arthroplasty. We also aimed to assess whether there was any difference in short term patient outcome. Method A comparison of intramedullary versus extramedullary alignment jigs was performed. Radiological alignment of tibial components and patient outcomes of 103 Triathlon total knee arthroplasties were analysed. Results Use of the intramedullary jig was found to be significantly more accurate in determining coronal alignment (p = 0.02) while use of the extramedullary jig was found to give more accurate results in sagittal alignment (p = 0.04). There was no significant difference in WOMAC or SF-36 at six months. Conclusion Use of an intramedullary jig is preferable for positioning of the tibial component using this knee system.

  18. Three-time rapid transfer alignment method of SINS/GPS navigation system of high-speed marine missile

    Institute of Scientific and Technical Information of China (English)

    WANG Si; DENG Zheng-long; SU Ling-feng

    2008-01-01

    The transfer alignment of the SINS/GPS navigation system of a high-speed marine missile was investigated. Exploiting the large acceleration of a high-speed missile, the transfer alignment was split into a three-time alignment. The azimuth alignment was coarsely finished in 10 s in the first alignment, the horizontal alignment was accurately and rapidly finished in the second alignment, and the azimuth alignment was accurately finished in the third alignment. Because the second and third alignments were performed using GPS after the missile was launched, the horizontal alignment and the second azimuth alignment were free of the influence of the flexural deformation of the warship body. The precision and rapidity of the horizontal alignment were markedly increased by the vertical launch of the marine missile with its large acceleration. Simulation verifies the effectiveness of the proposed alignment method.

  19. Aligning component upgrades

    Directory of Open Access Journals (Sweden)

    Roberto Di Cosmo

    2011-08-01

    Full Text Available Modern software systems, like GNU/Linux distributions or Eclipse-based development environments, are often deployed by selecting components out of large component repositories. Maintaining such software systems by performing component upgrades is a complex task, and users need an expressive preference language at their disposal to specify the kind of upgrades they are interested in. Recent research has shown that it is possible to develop solvers that handle preferences expressed as a combination of a few basic criteria used in the MISC competition, ranging from the number of new components to the freshness of the final configuration. In this work we introduce a set of new criteria that allow users to specify their preferences for solutions with components aligned to the same upstream sources, provide an efficient encoding, and report experimental results showing that optimising these alignment criteria is a tractable problem in practice.

  20. Inflation by alignment

    Energy Technology Data Exchange (ETDEWEB)

    Burgess, C.P. [PH-TH Division, CERN, CH-1211 Genève 23 (Switzerland); Department of Physics & Astronomy, McMaster University, 1280 Main Street West, Hamilton ON (Canada); Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo ON (Canada)]; Roest, Diederik [Van Swinderen Institute for Particle Physics and Gravity, University of Groningen, Nijenborgh 4, 9747 AG Groningen (Netherlands)]

    2015-06-08

    Pseudo-Goldstone bosons (pGBs) can provide technically natural inflatons, as has been comparatively well-explored in the simplest axion examples. Although inflationary success requires trans-Planckian decay constants, f ≳ M_p, several mechanisms have been proposed to obtain this, relying on (mis-)alignments between potential and kinetic energies in multiple-field models. We extend these mechanisms to a broader class of inflationary models, including in particular the exponential potentials that arise for pGB potentials based on noncompact groups (and so which might apply to moduli in an extra-dimensional setting). The resulting potentials provide natural large-field inflationary models and can predict a larger primordial tensor signal than is true for simpler single-field versions of these models. In so doing we provide a unified treatment of several alignment mechanisms, showing how each emerges as a limit of the more general setup.

  1. Alignment of concerns

    DEFF Research Database (Denmark)

    Andersen, Tariq Osman; Bansler, Jørgen P.; Kensing, Finn;

    E-health promises to enable and support active patient participation in chronic care. However, these fairly recent innovations are complicated matters and emphasize significant challenges, such as patients’ and clinicians’ different ways of conceptualizing disease and illness. Informed by insight...... from medical phenomenology and our own empirical work in telemonitoring and medical care of heart patients, we propose a design rationale for e-health systems conceptualized as the ‘alignment of concerns’....

  2. Nuclear reactor alignment plate configuration

    Energy Technology Data Exchange (ETDEWEB)

    Altman, David A; Forsyth, David R; Smith, Richard E; Singleton, Norman R

    2014-01-28

    An alignment plate that is attached to a core barrel of a pressurized water reactor and fits within slots within a top plate of a lower core shroud and upper core plate to maintain lateral alignment of the reactor internals. The alignment plate is connected to the core barrel through two vertically-spaced dowel pins that extend from the outside surface of the core barrel through a reinforcement pad and into corresponding holes in the alignment plate. Additionally, threaded fasteners are inserted around the perimeter of the reinforcement pad and into the alignment plate to further secure the alignment plate to the core barrel. A fillet weld is also deposited around the perimeter of the reinforcement pad. To accommodate thermal growth between the alignment plate and the core barrel, a gap is left above, below and at both sides of one of the dowel pins in the alignment plate holes through which the dowel pins pass.

  3. Orbit IMU alignment: Error analysis

    Science.gov (United States)

    Corson, R. W.

    1980-01-01

    A comprehensive accuracy analysis of orbit inertial measurement unit (IMU) alignments using the shuttle star trackers was completed and the results are presented. Monte Carlo techniques were used in a computer simulation of the IMU alignment hardware and software systems to: (1) determine the expected Space Transportation System 1 Flight (STS-1) manual mode IMU alignment accuracy; (2) investigate the accuracy of alignments in later shuttle flights when the automatic mode of star acquisition may be used; and (3) verify that an analytical model previously used for estimating the alignment error is a valid model. The analysis results do not differ significantly from expectations. The standard deviation in the IMU alignment error for STS-1 alignments was determined to be 68 arc seconds per axis. This corresponds to a 99.7% probability that the magnitude of the total alignment error is less than 258 arc seconds.
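
    The relationship between the per-axis figure and the total-error bound quoted above can be checked with a short sketch. It assumes, as our own simplification and not something the abstract states, that the three axis errors are independent zero-mean Gaussians with the same standard deviation; the function names are illustrative only.

```python
import math
import random

SIGMA_ARCSEC = 68.0   # per-axis standard deviation quoted in the abstract
BOUND_ARCSEC = 258.0  # total-error bound quoted in the abstract


def prob_magnitude_below(bound, sigma):
    """Closed-form P(|e| < bound) for three i.i.d. zero-mean Gaussian axes.

    This is the Maxwell distribution CDF; the i.i.d. assumption is ours.
    """
    x = bound / sigma
    return (math.erf(x / math.sqrt(2.0))
            - math.sqrt(2.0 / math.pi) * x * math.exp(-0.5 * x * x))


def monte_carlo_fraction(bound, sigma, trials=100_000, seed=0):
    """Estimate the same probability by sampling per-axis errors."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        ex = rng.gauss(0.0, sigma)
        ey = rng.gauss(0.0, sigma)
        ez = rng.gauss(0.0, sigma)
        if math.sqrt(ex * ex + ey * ey + ez * ez) < bound:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(prob_magnitude_below(BOUND_ARCSEC, SIGMA_ARCSEC))
    print(monte_carlo_fraction(BOUND_ARCSEC, SIGMA_ARCSEC))
```

    Under these assumptions both estimates land close to the 99.7% figure in the abstract; the actual study simulated the full alignment hardware and software rather than this simplified error model.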

  4. RECAT - Redundant Channel Alignment Technique

    Science.gov (United States)

    2016-06-07

    Supplementary notes: NUWC2015. Abstract: A problem in the analog-to-digital (A/D) conversion of broadband tape recorded...Alignment Technique, is used to align data taken on one pass with data from any other pass. The accuracy of this alignment is a function of the digital... Keywords: Redundant Channel Alignment Technique; analog-to-digital; A/D; Broadband Bearing Time Processing

  5. Method for alignment of microwires

    Energy Technology Data Exchange (ETDEWEB)

    Beardslee, Joseph A.; Lewis, Nathan S.; Sadtler, Bryce

    2017-01-24

    A method of aligning microwires includes modifying the microwires so they are more responsive to a magnetic field. The method also includes using a magnetic field so as to magnetically align the microwires. The method can further include capturing the microwires in a solid support structure that retains the longitudinal alignment of the microwires when the magnetic field is not applied to the microwires.

  6. Genomic libraries: I. Construction and screening of fosmid genomic libraries.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Large insert genome libraries have been a core resource required to sequence genomes, analyze haplotypes, and aid gene discovery. While next generation sequencing technologies are revolutionizing the field of genomics, traditional genome libraries will still be required for accurate genome assembly. Their utility is also being extended to functional studies for understanding DNA regulatory elements. Here, we present a detailed method for constructing genomic fosmid libraries, testing for common contaminants, gridding the library to nylon membranes, then hybridizing the library membranes with a radiolabeled probe to identify corresponding genomic clones. While this chapter focuses on fosmid libraries, many of these steps can also be applied to bacterial artificial chromosome libraries.

  7. Alignment Between Genetic and Physical Maps of Gibberella zeae

    Science.gov (United States)

    We previously published a genetic map of Gibberella zeae (Fusarium graminearum) based on a cross between Kansas strain Z-3639 (lineage 7) and Japanese strain R-5470 (lineage 6). In this study, that genetic map was aligned with the third assembly of the genomic sequence of G. zeae strain PH-1 (linea...

  8. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E. coli, Shigella and S. pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in half the time and with a quarter of the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
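
    To make the "sorted k-mer list" data structure mentioned above concrete, here is a minimal sketch. It is not the authors' Blue Gene/P implementation; the function names are hypothetical, and repeated k-mers are simply paired one-to-one in sorted order, which is a simplification.

```python
def sorted_kmer_list(sequence, k):
    """Return (k-mer, position) pairs sorted by k-mer, then by position."""
    if k <= 0:
        raise ValueError("k must be positive")
    kmers = [(sequence[i:i + k], i) for i in range(len(sequence) - k + 1)]
    kmers.sort()
    return kmers


def shared_kmers(list_a, list_b):
    """Merge-walk two sorted k-mer lists and report k-mers found in both.

    Returns (k-mer, position_in_a, position_in_b) tuples. Candidate
    anchors like these are what a seed-based aligner would extend.
    """
    i = j = 0
    matches = []
    while i < len(list_a) and j < len(list_b):
        kmer_a, pos_a = list_a[i]
        kmer_b, pos_b = list_b[j]
        if kmer_a < kmer_b:
            i += 1
        elif kmer_a > kmer_b:
            j += 1
        else:
            matches.append((kmer_a, pos_a, pos_b))
            i += 1
            j += 1
    return matches


if __name__ == "__main__":
    a = sorted_kmer_list("ACGTAC", 3)
    b = sorted_kmer_list("TTACGA", 3)
    print(shared_kmers(a, b))
```

    Keeping the lists sorted lets shared seeds be found in a single linear merge pass instead of comparing every k-mer against every other.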

  9. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega.

    Science.gov (United States)

    Sievers, Fabian; Wilm, Andreas; Dineen, David; Gibson, Toby J; Karplus, Kevin; Li, Weizhong; Lopez, Rodrigo; McWilliam, Hamish; Remmert, Michael; Söding, Johannes; Thompson, Julie D; Higgins, Desmond G

    2011-10-11

    Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.

  10. Semiautomated improvement of RNA alignments

    DEFF Research Database (Denmark)

    Andersen, Ebbe Sloth; Lind-Thomsen, Allan; Knudsen, Bjarne

    2007-01-01

    We have developed a semiautomated RNA sequence editor (SARSE) that integrates tools for analyzing RNA alignments. The editor highlights different properties of the alignment by color, and its integrated analysis tools prevent the introduction of errors when doing alignment editing. SARSE readily...... connects to external tools to provide a flexible semiautomatic editing environment. A new method, Pcluster, is introduced for dividing the sequences of an RNA alignment into subgroups with secondary structure differences. Pcluster was used to evaluate 574 seed alignments obtained from the Rfam database...... and we identified 71 alignments with significant prediction of inconsistent base pairs and 102 alignments with significant prediction of novel base pairs. Four RNA families were used to illustrate how SARSE can be used to manually or automatically correct the inconsistent base pairs detected by Pcluster...

  11. ATLAS Inner Detector Alignment

    CERN Document Server

    Bocci, A

    2008-01-01

    The ATLAS experiment is a multi-purpose particle detector that will study high-energy particle collisions produced by the Large Hadron Collider at CERN. In order to achieve its physics goals, ATLAS tracking requires that the positions of the silicon detector elements be known to a precision better than 10 μm. Several track-based alignment algorithms have been developed for the Inner Detector. An extensive validation has been performed with simulated events and real data from ATLAS. Results from this validation are reported in this paper.

  12. CELT optics Alignment Procedure

    Science.gov (United States)

    Mast, Terry S.; Nelson, Jerry E.; Chanan, Gary A.; Noethe, Lothar

    2003-01-01

    The California Extremely Large Telescope (CELT) is a project to build a 30-meter diameter telescope for research in astronomy at visible and infrared wavelengths. The current optical design calls for a primary, secondary, and tertiary mirror with Ritchey-Chrétien foci at two Nasmyth platforms. The primary mirror is a mosaic of 1080 actively-stabilized hexagonal segments. This paper summarizes a CELT report that describes a step-by-step procedure for aligning the many degrees of freedom of the CELT optics.

  13. TSGC and JSC Alignment

    Science.gov (United States)

    Sanchez, Humberto

    2013-01-01

    NASA and the SGCs are, by design, intended to work closely together and have synergistic Vision, Mission, and Goals. The TSGC affiliates and JSC have been working together, but not always in a concise, coordinated, nor strategic manner. Today we have a couple of simple ideas to present about how TSGC and JSC have started to work together in a more concise, coordinated, and strategic manner, and how JSC and non-TSG Jurisdiction members have started to collaborate: Idea I: TSGC and JSC Technical Alignment Idea II: Concept of Clusters.

  14. An Exact Mathematical Programming Approach to Multiple RNA Sequence-Structure Alignment

    NARCIS (Netherlands)

    Bauer, M.; Klau, G.W.; Reinert, K.

    2008-01-01

    One of the main tasks in computational biology is the computation of alignments of genomic sequences to reveal their commonalities. In case of DNA or protein sequences, sequence information alone is usually sufficient to compute reliable alignments. RNA molecules, however, build spatial confor

  15. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Full Text Available Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  16. All about alignment

    CERN Multimedia

    2006-01-01

    The ALICE absorbers, iron wall and superstructure have been installed with great precision. The ALICE front absorber, positioned in the centre of the detector, has been installed and aligned. Weighing more than 400 tonnes, the ALICE absorbers and the surrounding support structures have been installed and aligned with a precision of 1-2 mm, hardly an easy task but a very important one. The ALICE absorbers are made of three parts: the front absorber, a 35-tonne cone-shaped structure, and two small-angle absorbers, long straight cylinder sections weighing 18 and 40 tonnes. The three pieces lined up have a total length of about 17 m. In addition to these, ALICE technicians have installed a 300-tonne iron filter wall made of blocks that fit together like large Lego pieces and a surrounding metal support structure to hold the tracking and trigger chambers. The absorbers house the vacuum chamber and are also the reference surface for the positioning of the tracking and trigger chambers. For this reason, the ab...

  17. Testing the tidal alignment model of galaxy intrinsic alignment

    CERN Document Server

    Blazek, Jonathan; Seljak, Uros

    2011-01-01

    Weak gravitational lensing has become a powerful probe of large-scale structure and cosmological parameters. Precision weak lensing measurements require an understanding of the intrinsic alignment of galaxy ellipticities, which can in turn inform models of galaxy formation. It is hypothesized that elliptical galaxies align with the background tidal field and that this alignment mechanism dominates the correlation between ellipticities on cosmological scales (in the absence of lensing). We use recent large-scale structure measurements from the Sloan Digital Sky Survey to test this picture with several statistics: (1) the correlation between ellipticity and galaxy overdensity, w_{g+}; (2) the intrinsic alignment auto-correlation functions; (3) the correlation functions of curl-free, E, and divergence-free, B, modes (the latter of which is zero in the linear tidal alignment theory); (4) the alignment correlation function, w_g(r_p,theta), a recently developed statistic that generalizes the galaxy correlation func...

  18. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we...

  19. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Directory of Open Access Journals (Sweden)

    Bert Ely

    Full Text Available Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons, since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content greater than 60%.

  20. Correction of the Caulobacter crescentus NA1000 genome annotation.

    Science.gov (United States)

    Ely, Bert; Scott, LaTia Etheredge

    2014-01-01

    Bacterial genome annotations are accumulating rapidly in the GenBank database and the use of automated annotation technologies to create these annotations has become the norm. However, these automated methods commonly result in a small but significant percentage of genome annotation errors. To improve accuracy and reliability, we analyzed the Caulobacter crescentus NA1000 genome using the computer programs Artemis and MICheck to manually examine the third codon position GC content, alignment to a third codon position GC frame plot peak, and matches in the GenBank database. We identified 11 new genes, modified the start site of 113 genes, and changed the reading frame of 38 genes that had been incorrectly annotated. Furthermore, our manual method of identifying protein-coding genes allowed us to remove 112 non-coding regions that had been designated as coding regions. The improved NA1000 genome annotation resulted in a reduction in the use of rare codons, since noncoding regions with atypical codon usage were removed from the annotation and 49 new coding regions were added to the annotation. Thus, a more accurate codon usage table was generated as well. These results demonstrate that a comparison of the location of peaks in third codon position GC content to the location of protein coding regions could be used to verify the annotation of any genome that has a GC content greater than 60%.
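
    The third codon position GC content that the authors examined can be illustrated with a minimal sketch. This is not the Artemis or MICheck implementation; the function name is hypothetical, and a real frame plot would compute these values in a sliding window along the genome rather than over a whole sequence.

```python
def gc3_by_frame(sequence):
    """Return the GC fraction at third codon positions for frames 0, 1 and 2.

    In a GC-rich genome, the frame that is actually translated tends to
    show the highest value, which is the signal a GC frame plot exposes.
    """
    seq = sequence.upper()
    fractions = []
    for frame in range(3):
        # Third base of each codon in this frame: offsets frame+2, frame+5, ...
        third_bases = seq[frame + 2::3]
        if not third_bases:
            fractions.append(0.0)
            continue
        gc_count = sum(1 for base in third_bases if base in "GC")
        fractions.append(gc_count / len(third_bases))
    return fractions


if __name__ == "__main__":
    # Toy sequence whose frame-0 codons (AAG, AAC, AAG) all end in G or C.
    print(gc3_by_frame("AAGAACAAG"))
```

    Comparing the three per-frame values against the annotated reading frame is the kind of consistency check the abstract describes for genomes with high overall GC content.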

  1. Pareto optimal pairwise sequence alignment.

    Science.gov (United States)

    DeRonne, Kevin W; Karypis, George

    2013-01-01

    Sequence alignment using evolutionary profiles is a commonly employed tool when investigating a protein. Many profile-profile scoring functions have been developed for use in such alignments, but there has not yet been a comprehensive study of Pareto optimal pairwise alignments for combining multiple such functions. We show that the problem of generating Pareto optimal pairwise alignments has an optimal substructure property, and develop an efficient algorithm for generating Pareto optimal frontiers of pairwise alignments. All possible sets of two, three, and four profile scoring functions are used from a pool of 11 functions and applied to 588 pairs of proteins in the ce_ref data set. The performance of the best objective combinations on ce_ref is also evaluated on an independent set of 913 protein pairs extracted from the BAliBASE RV11 data set. Our dynamic-programming-based heuristic approach produces approximated Pareto optimal frontiers of pairwise alignments that contain comparable alignments to those on the exact frontier, but on average in less than 1/58th the time in the case of four objectives. Our results show that the Pareto frontiers contain alignments whose quality is better than the alignments obtained by single objectives. However, the task of identifying a single high-quality alignment among those in the Pareto frontier remains challenging.
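
    The notion of a Pareto optimal frontier used above can be sketched with a simple dominance filter. This is only the definition applied to a handful of already-scored candidates, not the authors' dynamic-programming algorithm; the candidate names and scores are made up for illustration, and higher scores are assumed to be better.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b on every objective
    and strictly better on at least one (higher is better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the sorted names of candidates not dominated by any other.

    candidates maps a name to a tuple of objective scores.
    """
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


if __name__ == "__main__":
    # Hypothetical alignments scored by two profile-profile functions.
    scored = {"a1": (10, 2), "a2": (8, 5), "a3": (7, 4), "a4": (10, 1)}
    print(pareto_front(scored))
```

    The frontier keeps every alignment that represents a distinct trade-off between objectives, which is why, as the abstract notes, choosing a single best alignment from it remains a separate problem.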

  2. Accurate structural correlations from maximum likelihood superpositions.

    Directory of Open Access Journals (Sweden)

    Douglas L Theobald

    2008-02-01

    Full Text Available The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method ("PCA plots") for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology.

  3. Research on localization and alignment technology for transfer cask

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Jingchuan, E-mail: jchwang@sjtu.edu.cn [Department of Automation, Shanghai Jiao Tong University, Shanghai (China); Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai (China); Yang, Ming; Chen, Weidong [Department of Automation, Shanghai Jiao Tong University, Shanghai (China); Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai (China)

    2015-10-15

    Highlights: • A method for the alignment between TB and HCB based on localizability is proposed. • A localization method based on localizability estimation is proposed to localize the cask accurately and to ensure the transfer cask's accurate docking in front of the window of the Tokamak Building. • The experimental results show that the proposed algorithm works well in the indoor simulation environment. This system will be tested in EAST in China. Abstract: Given the long length of the transfer cask relative to the space between the Tokamak Building (TB) and the Hot Cell Building (HCB), this paper proposes an autonomous localization and alignment method for the transportation and replacement of internal components. A localization method based on localizability estimation is used to achieve accurate localization and navigation of the cask. Once the cask arrives in front of the TB window, the position and attitude measurement system is used to detect, in real time, the relative alignment error between the seal door of the pallet and the TB window. The alignment between the seal door and the TB window can then be achieved based on this offset. A simulation experiment based on a real model is designed according to the real TB situation. The experimental results show that the proposed localization and alignment method can be used for the transfer cask.

  4. PATtyFams: Protein families for the microbial genomes in the PATRIC database

    Directory of Open Access Journals (Sweden)

    James J Davis

    2016-02-01

    Full Text Available The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). This new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.

  5. Computational design and engineering of polymeric orthodontic aligners.

    Science.gov (United States)

    Barone, S; Paoli, A; Razionale, A V; Savignano, R

    2016-10-05

    Transparent and removable aligners represent an effective solution to correct various orthodontic malocclusions through minimally invasive procedures. An aligner-based treatment requires patients to sequentially wear dentition-mating shells obtained by thermoforming polymeric disks on reference dental models. An aligner is shaped by introducing a geometrical mismatch with respect to the actual tooth positions to induce a loading system, which moves the target teeth toward the correct positions. The common practice is to select the aligner features (material, thickness, and auxiliary elements) by considering only the clinician's subjective assessments. In this article, a computational design and engineering methodology has been developed to reconstruct anatomical tissues, to model parametric aligner shapes, to simulate orthodontic movements, and to enhance the aligner design. The proposed approach integrates computer-aided technologies, from tomographic imaging to optical scanning, from parametric modeling to finite element analyses, within a 3-dimensional digital framework. The anatomical modeling provides anatomies, including teeth (roots and crowns), jaw bones, and periodontal ligaments, which are the references for the downstream parametric aligner shaping. The biomechanical interactions between anatomical models and aligner geometries are virtually reproduced using finite element analysis software. The methodology allows numerical simulations of patient-specific conditions and comparative analyses of different aligner configurations. In this article, the digital framework has been used to study the influence of various auxiliary elements on the loading system delivered to a maxillary and a mandibular central incisor during an orthodontic tipping movement. Numerical simulations have shown a high dependency of the orthodontic tooth movement on the auxiliary element configuration, which should then be accurately selected to maximize the aligner

  6. HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    Science.gov (United States)

    O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D

    2015-04-01

    The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples.
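
    The idea of partitioning both the database and the query set can be illustrated with a minimal sketch. It is not HBlast's implementation: the function names `virtual_partitions` and `plan_tasks` are hypothetical, and "virtual" is assumed here to mean that a partition is only a pair of index offsets, so nothing is copied or recompiled. Each (database range, query range) pair would then be one unit of work for a mapper.

```python
from itertools import product


def virtual_partitions(n_items, n_parts):
    """Split range(n_items) into contiguous (start, stop) index ranges.

    Assumption for this sketch: a "virtual" partition is just a pair of
    offsets into the existing data, so no data is copied or rewritten.
    """
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    base, extra = divmod(n_items, n_parts)
    ranges, start = [], 0
    for i in range(n_parts):
        # The first `extra` partitions take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # fewer items than partitions: skip empty ranges
        ranges.append((start, start + size))
        start += size
    return ranges


def plan_tasks(n_db_seqs, n_queries, db_parts, query_parts):
    """Pair every database range with every query range (one task per pair)."""
    db_ranges = virtual_partitions(n_db_seqs, db_parts)
    query_ranges = virtual_partitions(n_queries, query_parts)
    return list(product(db_ranges, query_ranges))


# Hypothetical sizes, chosen only for illustration.
tasks = plan_tasks(n_db_seqs=10, n_queries=4, db_parts=3, query_parts=2)
for db_range, query_range in tasks:
    print("database", db_range, "x queries", query_range)
```

    Keeping the partition sizes within one item of each other is one simple way to balance the work load; the balancing strategy actually used by HBlast is not described in the abstract.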

  7. Onorbit IMU alignment error budget

    Science.gov (United States)

    Corson, R. W.

    1980-01-01

    The Star Tracker, Crew Optical Alignment Sight (COAS), and Inertial Measurement Unit (IMU) form a complex navigation system with a multitude of error sources. A complete list of the system errors is presented. The errors were combined in a rational way to yield an estimate of the IMU alignment accuracy for STS-1. The expected standard deviation in the IMU alignment error for STS-1 type alignments was determined to be 72 arc seconds per axis for star tracker alignments and 188 arc seconds per axis for COAS alignments. These estimates are based on current knowledge of the star tracker, COAS, IMU, and navigation base error specifications, and were partially verified by preliminary Monte Carlo analysis.

  8. Lunar Alignments - Identification and Analysis

    Science.gov (United States)

    González-García, A. César

    Lunar alignments are difficult to establish given the apparent lack of written accounts clearly pointing toward lunar alignments for individual temples. While some individual cases are reviewed and highlighted, the weight of the proof must fall on statistical sampling. Some definitions for the lunar alignments are provided in order to clarify the targets, and thus, some new tools are provided to try to test the lunar hypothesis in several cases, especially in megalithic astronomy.

  9. Apparatus for accurately measuring high temperatures

    Science.gov (United States)

    Smith, D.D.

    The present invention is a thermometer used for measuring furnace temperatures in the range of about 1800° to 2700°C. The thermometer comprises a broadband multicolor thermal radiation sensor positioned to be in optical alignment with the end of a blackbody sight tube extending into the furnace. A valve-shutter arrangement is positioned between the radiation sensor and the sight tube, and a chamber for containing a charge of high pressure gas is positioned between the valve-shutter arrangement and the radiation sensor. A momentary opening of the valve-shutter arrangement allows a pulse of the high pressure gas to purge the sight tube of air-borne thermal radiation contaminants, which permits the radiation sensor to accurately measure the thermal radiation emanating from the end of the sight tube.

  10. GraphAlignment: Bayesian pairwise alignment of biological networks

    Directory of Open Access Journals (Sweden)

    Kolář Michal

    2012-11-01

    Background: With increased experimental availability and accuracy of bio-molecular networks, tools for their comparative and evolutionary analysis are needed. A key component for such studies is the alignment of networks. Results: We introduce the Bioconductor package GraphAlignment for pairwise alignment of bio-molecular networks. The alignment incorporates information both from network vertices and network edges and is based on an explicit evolutionary model, allowing inference of all scoring parameters directly from empirical data. We compare the performance of our algorithm to an alternative algorithm, Græmlin 2.0. On simulated data, GraphAlignment outperforms Græmlin 2.0 in several benchmarks except for computational complexity. When there is little or no noise in the data, GraphAlignment is slower than Græmlin 2.0. It is faster than Græmlin 2.0 when processing noisy data containing spurious vertex associations. Its typical case complexity grows approximately as O(N^2.6). On empirical bacterial protein-protein interaction networks (PIN) and gene co-expression networks, GraphAlignment outperforms Græmlin 2.0 with respect to coverage and specificity, albeit by a small margin. On large eukaryotic PIN, Græmlin 2.0 outperforms GraphAlignment. Conclusions: The GraphAlignment algorithm is robust to spurious vertex associations, correctly resolves paralogs, and shows very good performance in identification of homologous vertices defined by high vertex and/or interaction similarity. The simplicity and generality of GraphAlignment edge scoring makes the algorithm an appropriate choice for global alignment of networks.

  11. MANGO: a new approach to multiple sequence alignment.

    Science.gov (United States)

    Zhang, Zefeng; Lin, Hao; Li, Ming

    2007-01-01

    Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state of the art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, Prob-ConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.
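
    The difference between a spaced seed and a consecutive k-mer can be shown with a short sketch. This is not MANGO's algorithm or its optimized seed set: the function name `spaced_seed_hits`, the seed pattern "1101", and the two toy sequences are assumptions made only for illustration. In the pattern, a '1' marks a position that must match and a '0' marks a position that is ignored.

```python
def spaced_seed_hits(a, b, seed):
    """Return (i, j) pairs where a[i:] and b[j:] agree at every '1' of seed.

    A consecutive k-mer is the special case in which seed is all '1's.
    """
    care = [k for k, c in enumerate(seed) if c == "1"]
    span = len(seed)

    # Index every window of b by the characters at the "care" positions.
    index = {}
    for j in range(len(b) - span + 1):
        key = "".join(b[j + k] for k in care)
        index.setdefault(key, []).append(j)

    # Look up every window of a in that index.
    hits = []
    for i in range(len(a) - span + 1):
        key = "".join(a[i + k] for k in care)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits


# Toy sequences that differ only at position 2 (G versus T).
seq_a = "ACGTAC"
seq_b = "ACTTAC"
print(spaced_seed_hits(seq_a, seq_b, "1101"))  # spaced seed
print(spaced_seed_hits(seq_a, seq_b, "1111"))  # consecutive 4-mer
```

    With these toy inputs the spaced seed still reports a hit at offset (0, 0) because the mismatch falls on the ignored position, while the consecutive 4-mer reports none. This is the intuition behind the sensitivity claim in the abstract, shown on a single pattern rather than on multiple optimized seeds.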

  12. Mask alignment system for semiconductor processing

    Energy Technology Data Exchange (ETDEWEB)

    Webb, Aaron P.; Carlson, Charles T.; Weaver, William T.; Grant, Christopher N.

    2017-02-14

    A mask alignment system for providing precise and repeatable alignment between ion implantation masks and workpieces. The system includes a mask frame having a plurality of ion implantation masks loosely connected thereto. The mask frame is provided with a plurality of frame alignment cavities, and each mask is provided with a plurality of mask alignment cavities. The system further includes a platen for holding workpieces. The platen may be provided with a plurality of mask alignment pins and frame alignment pins configured to engage the mask alignment cavities and frame alignment cavities, respectively. The mask frame can be lowered onto the platen, with the frame alignment cavities moving into registration with the frame alignment pins to provide rough alignment between the masks and workpieces. The mask alignment cavities are then moved into registration with the mask alignment pins, thereby shifting each individual mask into precise alignment with a respective workpiece.

  13. Systematic evaluation of spliced alignment programs for RNA-seq data.

    Science.gov (United States)

    Engström, Pär G; Steijger, Tamara; Sipos, Botond; Grant, Gregory R; Kahles, André; Rätsch, Gunnar; Goldman, Nick; Hubbard, Tim J; Harrow, Jennifer; Guigó, Roderic; Bertone, Paul

    2013-12-01

    High-throughput RNA sequencing is an increasingly accessible method for studying gene structure and activity on a genome-wide scale. A critical step in RNA-seq data analysis is the alignment of partial transcript reads to a reference genome sequence. To assess the performance of current mapping software, we invited developers of RNA-seq aligners to process four large human and mouse RNA-seq data sets. In total, we compared 26 mapping protocols based on 11 programs and pipelines and found major performance differences between methods on numerous benchmarks, including alignment yield, basewise accuracy, mismatch and gap placement, exon junction discovery and suitability of alignments for transcript reconstruction. We observed concordant results on real and simulated RNA-seq data, confirming the relevance of the metrics employed. Future developments in RNA-seq alignment methods would benefit from improved placement of multimapped reads, balanced utilization of existing gene annotation and a reduced false discovery rate for splice junctions.

  14. CATO: The Clone Alignment Tool.

    Directory of Open Access Journals (Sweden)

    Peter V Henstock

    High-throughput cloning efforts produce large numbers of sequences that need to be aligned, edited, compared with reference sequences, and organized as files and selected clones. Different pieces of software are typically required to perform each of these tasks. We have designed a single piece of software, CATO, the Clone Alignment Tool, that allows a user to align, evaluate, edit, and select clone sequences based on comparisons to reference sequences. The input and output are designed to be compatible with standard data formats, and thus suitable for integration into a clone processing pipeline. CATO provides both sequence alignment and visualizations to facilitate the analysis of cloning experiments. The alignment algorithm matches each of the relevant candidate sequences against each reference sequence. The visualization portion displays three levels of matching: 1) a top-level summary of the top candidate sequences aligned to each reference sequence, 2) a focused alignment view with the nucleotides of matched sequences displayed against one reference sequence, and 3) a pair-wise alignment of a single reference and candidate sequence pair. Users can select the minimum matching criteria for valid clones, edit or swap reference sequences, and export the results to a summary file as part of the high-throughput cloning workflow.

  15. CATO: The Clone Alignment Tool.

    Science.gov (United States)

    Henstock, Peter V; LaPan, Peter

    2016-01-01

    High-throughput cloning efforts produce large numbers of sequences that need to be aligned, edited, compared with reference sequences, and organized as files and selected clones. Different pieces of software are typically required to perform each of these tasks. We have designed a single piece of software, CATO, the Clone Alignment Tool, that allows a user to align, evaluate, edit, and select clone sequences based on comparisons to reference sequences. The input and output are designed to be compatible with standard data formats, and thus suitable for integration into a clone processing pipeline. CATO provides both sequence alignment and visualizations to facilitate the analysis of cloning experiments. The alignment algorithm matches each of the relevant candidate sequences against each reference sequence. The visualization portion displays three levels of matching: 1) a top-level summary of the top candidate sequences aligned to each reference sequence, 2) a focused alignment view with the nucleotides of matched sequences displayed against one reference sequence, and 3) a pair-wise alignment of a single reference and candidate sequence pair. Users can select the minimum matching criteria for valid clones, edit or swap reference sequences, and export the results to a summary file as part of the high-throughput cloning workflow.

  16. RNA Structural Alignments, Part I

    DEFF Research Database (Denmark)

    Havgaard, Jakob Hull; Gorodkin, Jan

    2014-01-01

    Simultaneous alignment and secondary structure prediction of RNA sequences is often referred to as "RNA structural alignment." A class of the methods for structural alignment is based on the principles proposed by Sankoff more than 25 years ago. The Sankoff algorithm simultaneously folds and aligns two or more sequences. The advantage of this algorithm over those that separate the folding and alignment steps is that it makes better predictions. The disadvantage is that it is slower and requires more computer memory to run. The amount of computational resources needed to run the Sankoff algorithm... the methods based on the Sankoff algorithm. All the practical implementations of the algorithm use heuristics to make them run in reasonable time and memory. These heuristics are also described in this chapter.

  17. Lexical alignment in triadic communication.

    Science.gov (United States)

    Foltz, Anouschka; Gaspers, Judith; Thiele, Kristina; Stenneken, Prisca; Cimiano, Philipp

    2015-01-01

    Lexical alignment refers to the adoption of one's interlocutor's lexical items. Accounts of the mechanisms underlying such lexical alignment differ (among other aspects) in the role assigned to addressee-centered behavior. In this study, we used a triadic communicative situation to test which factors may modulate the extent to which participants' lexical alignment reflects addressee-centered behavior. Pairs of naïve participants played a picture matching game and received information about the order in which pictures were to be matched from a voice over headphones. On critical trials, participants did or did not hear a name for the picture to be matched next over headphones. Importantly, when the voice over headphones provided a name, it did not match the name that the interlocutor had previously used to describe the object. Participants overwhelmingly used the word that the voice over headphones provided. This result points to non-addressee-centered behavior and is discussed in terms of disrupting alignment with the interlocutor as well as in terms of establishing alignment with the voice over headphones. In addition, the type of picture (line drawing vs. tangram shape) independently modulated lexical alignment, such that participants showed more lexical alignment to their interlocutor for (more ambiguous) tangram shapes compared to line drawings. Overall, the results point to a rather large role for non-addressee-centered behavior during lexical alignment.

  18. Vacuum Alignment with more Flavors

    DEFF Research Database (Denmark)

    Ryttov, Thomas

    2014-01-01

    We study the alignment of the vacuum in gauge theories with $N_f$ Dirac fermions transforming according to a complex representation of the gauge group. The alignment of the vacuum is produced by adding a small mass perturbation to the theory. We study in detail the $N_f=2,3$ and $4$ case. For $N...

  19. Development of a new laser alignment device with Winston-Lutz phantom in radiotherapy

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Young Kyung; Min, Soonk; Jeong, Eun Hee; Jeong, Jong Hwi; Kim, Haksoo; Park, Jeong-Hoon; Shin, DongHo; Lee, Se Byeong [National Cancer Center, Goyang (Korea, Republic of); Choi, Sang Hyoun [Korea Cancer Center Hospital, Seoul (Korea, Republic of); Hwang, Ui-Jung [National Medical Center, Seoul (Korea, Republic of); Kwak, Jung Won [Asan Medical Center, Seoul (Korea, Republic of); Kim, Siyong [Virginia Commonwealth University, Richmond (United States)

    2015-10-15

    The lasers must be aligned precisely to the radiation isocenter. According to the report provided by the American Association of Physicists in Medicine (AAPM) Task Group 142, the localizing lasers should be aligned to within ±2 mm of the radiation isocenter for treatments other than intensity-modulated radiation therapy (IMRT), ±1 mm for IMRT, and less than ±1 mm for stereotactic radiosurgery (SRS) on a monthly basis. In this study, we developed and tested a new laser alignment device adopting an accurate, reproducible and straightforward alignment method in radiotherapy. The device consists of two laser alignment parts: the first is an optical alignment part, and the second is a radiation alignment part. In the radiation alignment part, a Winston-Lutz (W-L) phantom installed in the device was used. Its performance was tested in a conventional medical linac and a simulator. It was revealed that the device could align the patient-setup lasers in the treatment room accurately, precisely, and quickly. We expect the device can be used as a daily and monthly quality assurance tool.
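
    The tolerances quoted in this abstract can be written as a small lookup, sketched below. The names `TOLERANCE_MM`, `STRICT`, and `laser_within_tolerance` are hypothetical, and the sketch assumes that "within ±2 mm" and "±1 mm" include the limit while "less than ±1 mm" excludes it. It illustrates the stated numbers only and is not a quality assurance procedure.

```python
# Monthly laser tolerances in millimetres, as quoted in the abstract above.
TOLERANCE_MM = {"non-IMRT": 2.0, "IMRT": 1.0, "SRS": 1.0}

# Assumption: "less than ±1 mm" for SRS is read as a strict inequality.
STRICT = {"SRS"}


def laser_within_tolerance(offset_mm, technique):
    """Return True if a measured laser offset meets the quoted tolerance."""
    limit = TOLERANCE_MM[technique]
    if technique in STRICT:
        return abs(offset_mm) < limit
    return abs(offset_mm) <= limit


# Hypothetical measured offsets, for illustration only.
for technique, offset in [("non-IMRT", 1.5), ("IMRT", -1.2), ("SRS", 1.0)]:
    print(technique, offset, laser_within_tolerance(offset, technique))
```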

  20. The CMS Silicon Tracker Alignment

    CERN Document Server

    Castello, R

    2008-01-01

    The alignment of the Strip and Pixel Tracker of the Compact Muon Solenoid experiment, with its large number of independent silicon sensors and its excellent spatial resolution, is a complex and challenging task. Besides high precision mounting, survey measurements and the Laser Alignment System, track-based alignment is needed to reach the envisaged precision. Three different algorithms for track-based alignment were successfully tested on a sample of cosmic-ray data collected at the Tracker Integration Facility, where 15% of the Tracker was tested. These results, together with those coming from the CMS global run, will provide the basis for the full-scale alignment of the Tracker, which will be carried out with the first p-p collisions.

  1. Alignment of flexible protein structures.

    Science.gov (United States)

    Shatsky, M; Fligelman, Z Y; Nussinov, R; Wolfson, H J

    2000-01-01

    We present two algorithms which align flexible protein structures. Both apply efficient structural pattern detection and graph theoretic techniques. The FlexProt algorithm simultaneously detects the hinge regions and aligns the rigid subparts of the molecules. It does so by efficiently detecting maximal congruent rigid fragments in both molecules and calculating their optimal arrangement which does not violate the protein sequence order. The FlexMol algorithm is sequence order independent, yet requires as input the hypothesized hinge positions. Due to its sequence order independence it can also be applied to protein-protein interface matching and drug molecule alignment. It aligns the rigid parts of the molecule using the Geometric Hashing method and calculates optimal connectivity among these parts by graph-theoretic techniques. Both algorithms are highly efficient even compared with rigid structure alignment algorithms. Typical running times on a standard desktop PC (400 MHz) are about 7 seconds for FlexProt and about 1 minute for FlexMol.

  2. Alignments in the nobelium isotopes

    Institute of Scientific and Technical Information of China (English)

    ZHENG Shi-Zie; XU Fu-Rong; YUAN Cen-Xi; QI Chong

    2009-01-01

    Total-Routhian-Surface calculations have been performed to investigate the deformation and alignment properties of the No isotopes. It is found that normal deformed and superdeformed states in these nuclei can coexist at low excitation energies. In neutron-deficient No isotopes, the superdeformed shapes can even become the ground states. Moreover, we plotted the kinematic moments of inertia of the No isotopes, which follow the available experimental data very nicely. It is noted that, as the rotational frequency increases, alignments develop at ħω = 0.2-0.3 MeV. Our calculations show that the occupation of the vj orbital plays an important role in the alignments of the No isotopes.

  3. Downlink Interference Alignment

    CERN Document Server

    Suh, Changho; Tse, David

    2010-01-01

    We develop an interference alignment (IA) technique for a downlink cellular system. In the uplink, IA schemes need channel-state-information exchange across base-stations of different cells, but our downlink IA technique requires feedback only within a cell. As a result, the proposed scheme can be implemented with a few changes to an existing cellular system where the feedback mechanism (within a cell) is already being considered for supporting multi-user MIMO. Not only is our proposed scheme implementable with little effort, it can in fact provide substantial gain especially when interference from a dominant interferer (base-station) is significantly stronger than the remaining interference: it is shown that in the two-isolated cell layout, our scheme provides four-fold gain in throughput performance over a standard multi-user MIMO technique. We show through simulations that our technique provides respectable gain under more realistic scenarios: it gives approximately 55% and 20% gain for a linear cell layou...

  4. Interference Alignment for Secrecy

    CERN Document Server

    Koyluoglu, Onur Ozan; Lai, Lifeng; Poor, H Vincent

    2008-01-01

    This paper studies the frequency/time selective $K$-user Gaussian interference channel with secrecy constraints. Two distinct models, namely the interference channel with confidential messages and the one with an external eavesdropper, are analyzed. The key difference between the two models is the lack of channel state information (CSI) about the external eavesdropper. Using interference alignment along with secrecy pre-coding, it is shown that each user can achieve non-zero secure Degrees of Freedom (DoF) for both cases. More precisely, the proposed coding scheme achieves $\frac{K-2}{2K-2}$ secure DoF with probability one per user in the confidential messages model. For the external eavesdropper scenario, on the other hand, it is shown that each user can achieve $\frac{K-2}{2K}$ secure DoF in the ergodic setting. Remarkably, these results establish the positive impact of interference on the secrecy capacity region of wireless networks.
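
    The two per-user secure DoF expressions in this abstract, (K-2)/(2K-2) for the confidential messages model and (K-2)/(2K) for the external eavesdropper model, can be evaluated exactly for small K. The sketch below only tabulates those two formulas; the function names are hypothetical and nothing about the coding scheme itself is implemented.

```python
from fractions import Fraction


def secure_dof_confidential(k):
    """Per-user secure DoF in the confidential messages model: (K-2)/(2K-2)."""
    if k < 2:
        raise ValueError("the expression is only evaluated here for K >= 2")
    return Fraction(k - 2, 2 * k - 2)


def secure_dof_eavesdropper(k):
    """Per-user secure DoF with an external eavesdropper: (K-2)/(2K)."""
    if k < 2:
        raise ValueError("the expression is only evaluated here for K >= 2")
    return Fraction(k - 2, 2 * k)


for k in range(2, 7):
    print(k, secure_dof_confidential(k), secure_dof_eavesdropper(k))
```

    Both expressions are zero at K = 2 and positive for K ≥ 3, and both approach 1/2 as K grows, with the confidential messages value always the larger of the two for K ≥ 3.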

  5. Space Mirror Alignment System

    Science.gov (United States)

    Jau, Bruno M.; McKinney, Colin; Smythe, Robert F.; Palmer, Dean L.

    2011-01-01

    An optical alignment mirror mechanism (AMM) has been developed with angular positioning accuracy of ±0.2 arcsec. This requires the mirror's linear positioning actuators to have positioning resolutions of ±112 nm to enable the mirror to meet the angular tip/tilt accuracy requirement. Demonstrated capabilities are 0.1 arcsec angular mirror positioning accuracy, which translates into linear positioning resolutions at the actuator of 50 nm. The mechanism consists of a structure with sets of cross-directional flexures that enable the mirror's tip and tilt motion, a mirror with its kinematic mount, and two linear actuators. An actuator comprises a brushless DC motor, a linear ball screw, and a piezoelectric brake that holds the mirror's position while the unit is unpowered. An interferometric linear position sensor senses the actuator's position. The AMMs were developed for an Astrometric Beam Combiner (ABC) optical bench, which is part of an interferometer development. Custom electronics were also developed to accommodate the presence of multiple AMMs within the ABC and provide a compact, all-in-one solution to power and control the AMMs.

  6. Accurate pose estimation for forensic identification

    Science.gov (United States)

    Merckx, Gert; Hermans, Jeroen; Vandermeulen, Dirk

    2010-04-01

    In forensic authentication, one aims to identify the perpetrator among a series of suspects or distractors. A fundamental problem in any recognition system that aims for identification of subjects in a natural scene is the lack of constraints on viewing and imaging conditions. In forensic applications, identification proves even more challenging, since most surveillance footage is of abysmal quality. In this context, robust methods for pose estimation are paramount. In this paper we therefore present a new pose estimation strategy for very low quality footage. Our approach uses 3D-2D registration of a textured 3D face model with the surveillance image to obtain accurate far field pose alignment. Starting from an inaccurate initial estimate, the technique uses novel similarity measures based on the monogenic signal to guide a pose optimization process. We illustrate the descriptive strength of the introduced similarity measures by using them directly as a recognition metric. Through validation, using both real and synthetic surveillance footage, our pose estimation method is shown to be accurate, and robust to lighting changes and image degradation.

  7. The measurement of upper body alignment during the golf drive.

    Science.gov (United States)

    Wheat, J S; Vernon, T; Milner, C E

    2007-05-01

    Transverse plane rotations of the upper body are often estimated during the golf swing. The aim of this study was to determine the agreement between upper body alignments measured using markers attached to the thorax and markers on the acromion process during the golf drive. Three-dimensional coordinate data from nine markers were collected (300 Hz) during eight golf drives for 10 participants. The transverse plane alignment of the upper body was calculated using three techniques: inter-acromion vector, thorax vector, and Cardan angles. Agreement between the methods was then assessed using intra-class correlation and 95% limits of agreement. Our results suggested that the thorax vector can be used to provide an accurate estimation of thorax alignment at all stages of the golf swing (R ≥ 0.97, systematic difference < 1.0°, random difference < 3.8°). The inter-acromion vector gave an accurate estimation of thorax alignment at address (R = 0.90, systematic difference = 0.0°, random difference = 4.3°) but it should not be used to estimate thorax alignment at the top of the backswing (R = 0.32, systematic difference = -16.0°, random difference = 8.7°) or impact (R = 0.90, systematic difference = -5.1°, random difference = 8.3°) during the golf drive.
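
    A 95% limits of agreement calculation of the kind named in this abstract can be sketched as follows. The sketch assumes the common Bland-Altman convention, in which the systematic difference is the mean of the paired differences and the random difference is 1.96 times their sample standard deviation; the abstract does not state that the authors computed it exactly this way. The function name and the angle values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (bias, half_width, (lower, upper)) for paired measurements.

    bias       : mean of the paired differences (systematic difference)
    half_width : 1.96 * sample SD of the differences (random difference)
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, half_width, (bias - half_width, bias + half_width)


# Hypothetical alignment angles in degrees from two measurement techniques.
thorax_vector = [10.0, 12.0, 14.0, 16.0]
cardan_angle = [9.0, 13.0, 13.0, 17.0]

bias, half_width, (lower, upper) = limits_of_agreement(thorax_vector, cardan_angle)
print(f"systematic difference = {bias:.2f} deg")
print(f"random difference     = {half_width:.2f} deg")
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} deg")
```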

  8. Refining borders of genome-rearrangements including repetitions

    Directory of Open Access Journals (Sweden)

    JA Arjona-Medina

    2016-10-01

    Background: DNA rearrangement events have been widely studied in comparative genomics for many years. The importance of these events lies not only in the study of relatedness among different species, but also in determining the mechanisms behind evolution. Although there are many methods to identify genome-rearrangements (GR), the refinement of their borders has become a huge challenge. Until now no accepted method exists to achieve accurate fine-tuning: i.e. the notion of breakpoint (BP) is still an open issue, and although repeated regions are vital to understand evolution they are not taken into account in most of the GR detection and refinement methods. Methods and results: We propose a method to refine the borders of GR including repeated regions. Instead of removing these repetitions to facilitate computation, we take advantage of them using a consensus alignment sequence of the repeated region in between two blocks. Using the concept of identity vectors for Synteny Blocks (SB) and repetitions, a Finite State Machine is designed to detect transition points in the difference between such vectors. The method does not force the BP to be a region or a point but depends on the alignment transitions within the SBs and repetitions. Conclusion: The accurate definition of the borders of SB and repeated genomic regions and consequently the detection of BP might help to understand the evolutionary model of species. In this manuscript we present a new proposal for such a refinement. Features of the SB borders and BPs are different and fit with what is expected: SBs show more diversity in annotations, while BPs are short and richer in DNA replication and stress response, which are strongly linked with rearrangements.

  9. Sigma: multiple alignment of weakly-conserved non-coding DNA sequence

    Directory of Open Access Journals (Sweden)

    Siddharthan Rahul

    2006-03-01

    Background: Existing tools for multiple-sequence alignment focus on aligning protein sequence or protein-coding DNA sequence, and are often based on extensions to Needleman-Wunsch-like pairwise alignment methods. We introduce a new tool, Sigma, with a new algorithm and scoring scheme designed specifically for non-coding DNA sequence. This problem acquires importance with the increasing number of published sequences of closely-related species. In particular, studies of gene regulation seek to take advantage of comparative genomics, and recent algorithms for finding regulatory sites in phylogenetically-related intergenic sequence require alignment as a preprocessing step. Much can also be learned about evolution from intergenic DNA, which tends to evolve faster than coding DNA. Sigma uses a strategy of seeking the best possible gapless local alignments (a strategy earlier used by DiAlign), at each step making the best possible alignment consistent with existing alignments, and scores the significance of the alignment based on the lengths of the aligned fragments and a background model which may be supplied or estimated from an auxiliary file of intergenic DNA. Results: Comparative tests of Sigma with five earlier algorithms on synthetic data generated to mimic real data show excellent performance, with Sigma balancing high "sensitivity" (more bases aligned) with effective filtering of "incorrect" alignments. With real data, while "correctness" can't be directly quantified for the alignment, running the PhyloGibbs motif finder on pre-aligned sequence suggests that Sigma's alignments are superior. Conclusion: By taking into account the peculiarities of non-coding DNA, Sigma fills a gap in the toolbox of bioinformatics.

  10. Aligning for Innovation - Alignment Strategy to Drive Innovation

    Science.gov (United States)

    Johnson, Hurel; Teltschik, David; Bussey, Horace, Jr.; Moy, James

    2010-01-01

    With the sudden need for innovation that will help the country achieve its long-term space exploration objectives, the question of whether NASA is aligned effectively to drive the innovation that it so desperately needs to take space exploration to the next level should be entertained. Authors such as Robert Kaplan and David North have noted that companies that use a formal system for implementing strategy consistently outperform their peers. They have outlined a six-stage management systems model for implementing strategy, which includes the aligning of the organization towards its objectives. This involves the alignment of the organization from the top down. This presentation will explore the impacts of existing U.S. industrial policy on technological innovation; assess the current NASA organizational alignment and its impacts on driving technological innovation; and finally suggest an alternative approach that may drive the innovation needed to take the world to the next level of space exploration, with NASA truly leading the way.

  11. Static rearfoot alignment: a comparison of clinical and radiographic measures.

    Science.gov (United States)

    Lamm, Bradley M; Mendicino, Robert W; Catanzariti, Alan R; Hillstrom, Howard J

    2005-01-01

    Foot structure is typically evaluated using static clinical and radiographic measures. To date, the literature is devoid of a correlation between rearfoot frontal plane radiographic parameters and clinical measures of alignment. In a repeated-measures study comparing radiographic and clinical rearfoot alignment in 24 healthy subjects, radiographic angular measurements were made from standard weightbearing anteroposterior, lateral, long leg calcaneal axial, and rearfoot alignment views. Clinical measurements were made using a jig and scanner to assess the malleolar valgus index and a goniometer to evaluate the resting and neutral calcaneal stance positions. There was a significant correlation between frontal plane radiographic angles (long leg calcaneal axial and rearfoot alignment views) (r = 0.814). Similarly, there was a significant correlation between clinical measures (resting calcaneal stance position and malleolar valgus index) (r = 0.714). A multivariate stepwise regression showed that resting calcaneal stance position can be accurately predicted from 3 of the 15 clinical and radiographic measurements collected: malleolar valgus index, rearfoot alignment view, and long leg calcaneal axial view (r = 0.829). In summary, a commonly used clinical measure of static rearfoot alignment, resting calcaneal stance position, was correlated closely with the malleolar valgus index and both frontal plane radiographic parameters.

  12. Validation of the CLIC alignment strategy on short range

    CERN Document Server

    Mainaud Durand, H; Griffet, S; Kemppinen, J; Rude, V; Sosin, M

    2012-01-01

    The pre-alignment of CLIC consists of aligning the components of linacs and beam delivery systems (BDS) in the most accurate possible way, so that a first pilot beam can circulate and allow the implementation of the beam based alignment. Taking into account the precision and accuracy needed (10 µm rms over sliding windows of 200 m), this pre-alignment must be active, and it can be divided into two parts: the determination of a straight reference over 20 km by means of a metrological network, and the determination of the component positions with respect to this reference, together with their adjustment. The second part is the object of the paper, describing the steps of the proposed strategy: firstly, the fiducialisation of the different components of CLIC; secondly, the alignment of these components on common supports; and thirdly, the active alignment of these supports using sensors and actuators. These steps have been validated on a test setup over a length of 4 m, and the obtained results are analysed.

  13. CMS Muon Alignment: System Description and first results

    CERN Document Server

    Sobron, M

    2008-01-01

    The CMS detector has been instrumented with a precise and complex opto-mechanical alignment subsystem that provides a common reference frame between Tracker and Muon detection systems by means of a net of laser beams. The system allows a continuous and accurate monitoring of the muon chambers positions with respect to the Tracker body. Preliminary results of operation during the test of the CMS 4T solenoid magnet, performed in 2006, are presented. These measurements complement the information provided by the use of survey techniques and the results of alignment algorithms based on muon tracks crossing the detector.

  14. Structural alignment of RNA with triple helix structure.

    Science.gov (United States)

    Wong, Thomas K F; Yiu, S M

    2012-04-01

    Structural alignment is useful in identifying members of ncRNAs. Existing tools are all based on the secondary structures of the molecules. There is evidence showing that tertiary interactions (the interaction between a single-stranded nucleotide and a base-pair) in triple helix structures are critical in some functions of ncRNAs. In this article, we address the problem of structural alignment of RNAs with the triple helix. We provide a formal definition to capture a simplified model of a triple helix structure, then develop an algorithm of O(mn^3) time to align a query sequence (of length m) with known triple helix structure with a target sequence (of length n) with an unknown structure. The resulting algorithm is shown to be useful in identifying ncRNA members in a simulated genome.

  15. Optimization of Substitution Matrix for Sequence Alignment of Major Capsid Proteins of Human Herpes Simplex Virus

    Directory of Open Access Journals (Sweden)

    Vipan Kumar Sohpal

    2011-12-01

    Protein sequence alignment has become an informative tool in modern molecular biology research. A number of substitution matrices are readily available for sequence alignments, but it is a challenging task to compute optimal matrices for alignment accuracy. Here, we used a parameter optimization procedure to select the optimal Q of substitution matrices for the major capsid protein of human herpes simplex virus. Results predict that the Blosum matrix is most accurate on alignment benchmarks, and Blosum 60 provides the optimal Q among all substitution matrices. PAM 200 matrices perform slightly below Blosum 60, while VTML matrices are intermediate between PAM and VT matrices under dynamic programming.

  16. Magnetic axis alignment and the Poisson alignment reference system

    Science.gov (United States)

    Griffith, Lee V.; Schenz, Richard F.; Sommargren, Gary E.

    1989-01-01

    Three distinct metrological operations are necessary to align a free-electron laser (FEL): the magnetic axis must be located, a straight line reference (SLR) must be generated, and the magnetic axis must be related to the SLR. This paper begins with a review of the motivation for developing an alignment system that will assure better than 100 micrometer accuracy in the alignment of the magnetic axis throughout an FEL. The paper describes techniques for identifying the magnetic axis of solenoids, quadrupoles, and wiggler poles. Propagation of a laser beam is described to the extent of revealing sources of nonlinearity in the beam. Development and use of the Poisson line, a diffraction effect, is described in detail. Spheres in a large-diameter laser beam create Poisson lines and thus provide a necessary mechanism for gauging between the magnetic axis and the SLR. Procedures for installing FEL components and calibrating alignment fiducials to the magnetic axes of the components are also described. An error budget shows that the Poisson alignment reference system will make it possible to meet the alignment tolerances for an FEL.

  17. DNA Sequence Alignment during Homologous Recombination.

    Science.gov (United States)

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination.

  18. RF Jitter Modulation Alignment Sensing

    Science.gov (United States)

    Ortega, L. F.; Fulda, P.; Diaz-Ortiz, M.; Perez Sanchez, G.; Ciani, G.; Voss, D.; Mueller, G.; Tanner, D. B.

    2017-01-01

    We will present the numerical and experimental results of a new alignment sensing scheme which can reduce the complexity of alignment sensing systems currently used, while maintaining the same shot noise limited sensitivity. This scheme relies on the ability of electro-optic beam deflectors to create angular modulation sidebands in radio frequency, and needs only a single-element photodiode and IQ demodulation to generate error signals for tilt and translation degrees of freedom in one dimension. It distances itself from current techniques by eliminating the need for beam centering servo systems, quadrant photodetectors and Gouy phase telescopes. RF Jitter alignment sensing can be used to reduce the complexity in the alignment systems of many laser optical experiments, including LIGO and the ALPS experiment.

  19. PROMALS3D: multiple protein sequence alignment enhanced with evolutionary and three-dimensional structural information.

    Science.gov (United States)

    Pei, Jimin; Grishin, Nick V

    2014-01-01

    Multiple sequence alignment (MSA) is an essential tool with many applications in bioinformatics and computational biology. Accurate MSA construction for divergent proteins remains a difficult computational task. The constantly increasing protein sequences and structures in public databases could be used to improve alignment quality. PROMALS3D is a tool for protein MSA construction enhanced with additional evolutionary and structural information from database searches. PROMALS3D automatically identifies homologs from sequence and structure databases for input proteins, derives structure-based constraints from alignments of three-dimensional structures, and combines them with sequence-based constraints of profile-profile alignments in a consistency-based framework to construct high-quality multiple sequence alignments. PROMALS3D output is a consensus alignment enriched with sequence and structural information about input proteins and their homologs. PROMALS3D Web server and package are available at http://prodata.swmed.edu/PROMALS3D.

  20. Separating weak lensing and intrinsic alignments using radio observations

    CERN Document Server

    Whittaker, Lee; Battye, Richard A

    2015-01-01

    We discuss methods for performing weak lensing using radio observations to recover information about the intrinsic structural properties of the source galaxies. Radio surveys provide unique information that can benefit weak lensing studies, such as HI emission, which may be used to construct galaxy velocity maps, and polarized synchrotron radiation; both of which provide information about the unlensed galaxy and can be used to reduce galaxy shape noise and the contribution of intrinsic alignments. Using a proxy for the intrinsic position angle of an observed galaxy, we develop techniques for cleanly separating weak gravitational lensing signals from intrinsic alignment contamination in forthcoming radio surveys. Random errors on the intrinsic orientation estimates introduce biases into the shear and intrinsic alignment estimates. However, we show that these biases can be corrected for if the error distribution is accurately known. We demonstrate our methods using simulations, where we reconstruct the shear an...

  1. The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    Science.gov (United States)

    Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

    2015-04-01

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets.

  2. Alignment of the ATLAS Inner Detector

    CERN Document Server

    Marti-Garcia, Salvador; The ATLAS collaboration

    2016-01-01

    The Run-2 of the LHC has presented new challenges to track and vertex reconstruction with higher energies, denser jets and higher rates. In addition, the Insertable B-layer (IBL) is a fourth pixel layer, which has been deployed at the centre of ATLAS during the longshutdown-1 of the LHC. The physics performance of the experiment requires a high resolution and unbiased measurement of all charged particle kinematic parameters. In its turn, the performance of the tracking depends, among many other issues, on the accurate determination of the alignment parameters of the tracking sensors. The offline track based alignment of the ATLAS tracking system has to deal with more than 700,000 degrees of freedom (DoF). This represents a considerable numerical challenge in terms of both CPU time and precision. During Run-2, a mechanical distortion of the IBL staves up to 20um has been observed during data-taking, plus other short time scale movements. The talk will also describe the procedures implemented to detect and remo...

  3. Optical alignment of Centaur's inertial guidance system

    Science.gov (United States)

    Gordan, Andrew L.

    1987-01-01

    During Centaur launch operations the launch azimuth of the inertial platform's U-accelerometer input axis must be accurately established and maintained. This is accomplished by using an optically closed loop system with a long-range autotheodolite whose line of sight was established by a first-order survey. A collimated light beam from the autotheodolite intercepts a reflecting Porro prism mounted on the platform azimuth gimbal. Thus, any deviation of the Porro prism from its predetermined heading is optically detected by the autotheodolite. The error signal produced is used to torque the azimuth gimbal back to its required launch azimuth. The heading of the U-accelerometer input axis is therefore maintained automatically. Previously, the autotheodolite system could not distinguish between vehicle sway and rotational motion of the inertial platform unless at least three prisms were used. One prism was mounted on the inertial platform to maintain azimuth alignment, and two prisms were mounted externally on the vehicle to track sway. For example, the automatic azimuth-laying theodolite (AALT-SV-M2) on the Saturn vehicle used three prisms. The results of testing and modifying the AALT-SV-M2 autotheodolite to simultaneously monitor and maintain alignment of the inertial platform and track the sway of the vehicle from a single Porro prism are presented.

  4. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
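
    The overlap detection described in this abstract can be pictured with a small sketch. It is written in Python rather than R and does not use or reproduce the IRanges or GenomicRanges API; the `Range` tuple layout, the 1-based inclusive coordinates and the function name are assumptions made for illustration, and the quadratic scan stands in for the efficient algorithms the packages provide.

```python
from typing import List, Tuple

# Assumed representation: (chromosome, start, end), 1-based and inclusive.
Range = Tuple[str, int, int]


def find_overlaps(query: List[Range], subject: List[Range]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs of ranges that share a base."""
    hits = []
    for qi, (qchrom, qstart, qend) in enumerate(query):
        for si, (schrom, sstart, send) in enumerate(subject):
            # Two closed intervals overlap when each starts before the other ends.
            if qchrom == schrom and qstart <= send and sstart <= qend:
                hits.append((qi, si))
    return hits


if __name__ == "__main__":
    reads = [("chr1", 100, 200), ("chr2", 50, 60)]
    exons = [("chr1", 150, 300), ("chr1", 201, 250), ("chr2", 1, 50)]
    print(find_overlaps(reads, exons))
```

    For large inputs one would sort the ranges or use an interval tree instead of comparing every pair.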

  5. Anatomically Plausible Surface Alignment and Reconstruction

    DEFF Research Database (Denmark)

    Paulsen, Rasmus R.; Larsen, Rasmus

    2010-01-01

    With the increasing clinical use of 3D surface scanners, there is a need for accurate and reliable algorithms that can produce anatomically plausible surfaces. In this paper, a combined method for surface alignment and reconstruction is proposed. It is based on an implicit surface representation combined with a Markov Random Field regularisation method. Conceptually, the method maintains an implicit ideal description of the sought surface. This implicit surface is iteratively updated by realigning the input point sets and Markov Random Field regularisation. The regularisation is based on a prior energy that has earlier proved to be particularly well suited for human surface scans. The method has been tested on full cranial scans of ten test subjects and on several scans of the outer human ear.

  6. Image denoising using local tangent space alignment

    Science.gov (United States)

    Feng, JianZhou; Song, Li; Huo, Xiaoming; Yang, XiaoKang; Zhang, Wenjun

    2010-07-01

    We propose a novel image denoising approach, which is based on exploring an underlying (nonlinear) low-dimensional manifold. Using local tangent space alignment (LTSA), we 'learn' such a manifold, which approximates the image content effectively. The denoising is performed by minimizing a newly defined objective function, which is a sum of two terms: (a) the difference between the noisy image and the denoised image, (b) the distance from the image patch to the manifold. We extend the LTSA method from manifold learning to denoising. We introduce the local dimension concept that leads to adaptivity to different kinds of image patches, e.g. flat patches having lower dimension. We also plug in a basic denoising stage to estimate the local coordinate more accurately. It is found that the proposed method is competitive: its performance surpasses the K-SVD denoising method.

  7. Sensing Characteristics of A Precision Aligner Using Moire Gratings for Precision Alignment System

    Institute of Scientific and Technical Information of China (English)

    ZHOU Lizhong; Hideo Furuhashi; Yoshiyuki Uchida

    2001-01-01

    Sensing characteristics of a precision aligner using moire gratings for a precision alignment system have been investigated. A differential moire alignment system and a modified alignment system were used. The influence of the setting accuracy of the gap length and inclination of gratings on the alignment accuracy has been studied experimentally and theoretically. Setting accuracy of the gap length less than 2.5μm is required in modified moire alignment. There is no influence of the gap length on the alignment accuracy in the differential alignment system. The inclination affects alignment accuracies in both differential and modified moire alignment systems.

  8. ECR Browser: A Tool For Visualizing And Accessing Data From Comparisons Of Multiple Vertebrate Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Loots, G G; Ovcharenko, I; Stubbs, L; Nobrega, M A

    2004-01-06

    The increasing number of vertebrate genomes being sequenced in draft or finished form provides a unique opportunity to study and decode the language of DNA sequence through comparative genome alignments. However, novel tools and strategies are required to accommodate this increasing volume of genomic information and to facilitate experimental annotation of genome function. Here we present the ECR Browser, a tool that provides easy and dynamic access to whole genome alignments of human, mouse, rat and fish sequences. This web-based tool (http://ecrbrowser.dcode.org) provides the starting point for discovery of novel genes, identification of distant gene regulatory elements and prediction of transcription factor binding sites. The genome alignment portal of the ECR Browser also permits fast and automated alignment of any user-submitted sequence to the genome of choice. The interconnection of the ECR Browser with other DNA sequence analysis tools creates a unique portal for studying and exploring vertebrate genomes.

  9. FadE: whole genome methylation analysis for multiple sequencing platforms.

    Science.gov (United States)

    Souaiaia, Tade; Zhang, Zheng; Chen, Ting

    2013-01-01

    DNA methylation plays a central role in genomic regulation and disease. Sodium bisulfite treatment (SBT) causes unmethylated cytosines to be sequenced as thymine, which allows methylation levels to be reflected in the number of 'C'-'C' alignments covering reference cytosines. Di-base color reads produced by lifetech's SOLiD sequencer provide unreliable results when translated to bases because single sequencing errors affect the downstream sequence. We describe FadE, an algorithm to accurately determine genome-wide methylation rates directly in color or nucleotide space. FadE uses SBT unmethylated and untreated data to determine background error rates and incorporate them into a model which uses Newton-Raphson optimization to estimate the methylation rate and provide a credible interval describing its distribution at every reference cytosine. We sequenced two slides of human fibroblast cell-line bisulfite-converted fragment library with the SOLiD sequencer to investigate genome-wide methylation levels. FadE reported widespread differences in methylation levels across CpG islands and a large number of differentially methylated regions adjacent to genes, which compares favorably to the results of an investigation on the same cell-line using nucleotide-space reads at higher coverage levels, suggesting that FadE is an accurate method to estimate genome-wide methylation with color or nucleotide reads. http://code.google.com/p/fade/.
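
    The principle that methylation is reflected in the number of 'C'-'C' alignments can be sketched as a naive per-site estimate. This is not FadE's model: it ignores the background error rates, the Newton-Raphson fit and the credible interval described in the abstract, and the function name is an assumption. It only counts, at one reference cytosine, how many covering reads still show 'C' (read as methylated) versus 'T' (read as converted, hence unmethylated).

```python
def naive_methylation_rate(c_count, t_count):
    """Fraction of covering reads that kept a 'C' after bisulfite treatment.

    Hypothetical helper; returns None when no read covers the site.
    """
    covered = c_count + t_count
    if covered == 0:
        return None
    return c_count / covered


if __name__ == "__main__":
    # 15 reads show C and 5 show T at this reference cytosine.
    print(naive_methylation_rate(15, 5))
```

    With sequencing errors or incomplete conversion this ratio is biased, which is the kind of effect an explicit error model is meant to address.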

  10. A comprehensive evaluation of alignment algorithms in the context of RNA-seq.

    Science.gov (United States)

    Lindner, Robert; Friedel, Caroline C

    2012-01-01

    Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of such experiments is the mapping of sequencing reads to a transcriptomic origin including the identification of splicing events. In recent years, a large number of such mapping algorithms have been developed, all of which have in common that they require algorithms for aligning a vast number of reads to genomic or transcriptomic sequences. Although the FM-index based aligner Bowtie has become a de facto standard within mapping pipelines, a much larger number of possible alignment algorithms have been developed also including other variants of FM-index based aligners. Accordingly, developers and users of RNA-seq mapping pipelines have the choice among a large number of available alignment algorithms. To provide guidance in the choice of alignment algorithms for these purposes, we evaluated the performance of 14 widely used alignment programs from three different algorithmic classes: algorithms using either hashing of the reference transcriptome, hashing of reads, or a compressed FM-index representation of the genome. Here, special emphasis was placed on both precision and recall and the performance for different read lengths and numbers of mismatches and indels in a read. Our results clearly showed the significant reduction in memory footprint and runtime provided by FM-index based aligners at a precision and recall comparable to the best hash table based aligners. Furthermore, the recently developed Bowtie 2 alignment algorithm shows a remarkable tolerance to both sequencing errors and indels, thus, essentially making hash-based aligners obsolete.
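
    The precision and recall emphasised in this evaluation can be computed for simulated reads roughly as follows. This is a sketch under assumed definitions, not the authors' evaluation code: a read counts as correct only if its reported position equals its simulated origin, precision is the correct fraction of aligned reads, and recall is the correct fraction of all reads. The dictionary layout and function name are hypothetical.

```python
def precision_recall(true_pos, reported_pos):
    """Return (precision, recall) for read alignments.

    true_pos: read id -> simulated origin.
    reported_pos: read id -> reported position, or None if the read was not aligned.
    """
    aligned = {read: pos for read, pos in reported_pos.items() if pos is not None}
    correct = sum(1 for read, pos in aligned.items() if true_pos.get(read) == pos)
    precision = correct / len(aligned) if aligned else 0.0
    recall = correct / len(true_pos) if true_pos else 0.0
    return precision, recall


if __name__ == "__main__":
    truth = {"r1": 100, "r2": 250, "r3": 400, "r4": 975}
    reported = {"r1": 100, "r2": 250, "r3": 410, "r4": None}
    print(precision_recall(truth, reported))
```

    In this example two of three aligned reads are correct, and two of four reads overall, so an aligner that leaves difficult reads unaligned gains precision at the cost of recall.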

  11. A comprehensive evaluation of alignment algorithms in the context of RNA-seq.

    Directory of Open Access Journals (Sweden)

    Robert Lindner

    Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of such experiments is the mapping of sequencing reads to a transcriptomic origin including the identification of splicing events. In recent years, a large number of such mapping algorithms have been developed, all of which have in common that they require algorithms for aligning a vast number of reads to genomic or transcriptomic sequences. Although the FM-index based aligner Bowtie has become a de facto standard within mapping pipelines, a much larger number of possible alignment algorithms have been developed also including other variants of FM-index based aligners. Accordingly, developers and users of RNA-seq mapping pipelines have the choice among a large number of available alignment algorithms. To provide guidance in the choice of alignment algorithms for these purposes, we evaluated the performance of 14 widely used alignment programs from three different algorithmic classes: algorithms using either hashing of the reference transcriptome, hashing of reads, or a compressed FM-index representation of the genome. Here, special emphasis was placed on both precision and recall and the performance for different read lengths and numbers of mismatches and indels in a read. Our results clearly showed the significant reduction in memory footprint and runtime provided by FM-index based aligners at a precision and recall comparable to the best hash table based aligners. Furthermore, the recently developed Bowtie 2 alignment algorithm shows a remarkable tolerance to both sequencing errors and indels, thus, essentially making hash-based aligners obsolete.

  12. Accurate statistics for local sequence alignment with position-dependent scoring by rare-event sampling

    Directory of Open Access Journals (Sweden)

    Rahmann Sven

    2011-02-01

    Background: Molecular database search tools need statistical models to assess the significance of the resulting hits. In the classical approach one asks how probable it is that a certain score is observed by pure chance. Asymptotic theories for such questions are available for two random i.i.d. sequences. Some effort has been made to include effects of finite sequence lengths and to account for specific compositions of the sequences. In many applications, such as a large-scale database homology search for transmembrane proteins, these models are not the most appropriate ones. Search sensitivity and specificity benefit from position-dependent scoring schemes or use of Hidden Markov Models. Additionally, one may wish to go beyond the assumption that the sequences are i.i.d. Despite their practical importance, the statistical properties of these settings have not been well investigated yet. Results: In this paper, we discuss an efficient and general method to compute the score distribution to any desired accuracy. The general approach may be applied to different sequence models and various similarity measures that satisfy a few weak assumptions. We have access to the low-probability region ("tail") of the distribution where scores are larger than expected by pure chance and therefore relevant for practical applications. Our method uses recent ideas from rare-event simulations, combining Markov chain Monte Carlo simulations with importance sampling and generalized ensembles. We present results for the score statistics of fixed and random queries against random sequences. In a second step, we extend the approach to a model of transmembrane proteins, which can hardly be described as i.i.d. sequences. For this case, we compare the statistical properties of a fixed query model as well as a hidden Markov sequence model in connection with a position based scoring scheme against the classical approach. Conclusions: The results illustrate that the sensitivity and specificity strongly depend on the underlying scoring and sequence model. A specific ROC analysis for the case of transmembrane proteins supports our observation.

  13. ECR Browser: a tool for visualizing and accessing data from comparisons of multiple vertebrate genomes

    OpenAIRE

    Ovcharenko, Ivan; Nobrega, Marcelo A.; Loots, Gabriela G.; Stubbs, Lisa

    2004-01-01

    With an increasing number of vertebrate genomes being sequenced in draft or finished form, unique opportunities for decoding the language of DNA sequence through comparative genome alignments have arisen. However, novel tools and strategies are required to accommodate this large volume of genomic information and to facilitate the transfer of predictions generated by comparative sequence alignment to researchers focused on experimental annotation of genome function. Here, we present the ECR Br...

  14. Genometa--a fast and accurate classifier for short metagenomic shotgun reads.

    Directory of Open Access Journals (Sweden)

    Colin F Davenport

    Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer. The Genometa program, a step by step tutorial and Java source code are freely available from http://genomics1.mh-hannover.de/genometa/ and on http://code.google.com/p/genometa/. This program has been tested on Ubuntu Linux and Windows XP/7.

  15. Photosensitive Polymers for Liquid Crystal Alignment

    Science.gov (United States)

    Mahilny, U. V.; Stankevich, A. I.; Trofimova, A. V.; Muravsky, A. A.; Murauski, A. A.

    The peculiarities of the alignment of liquid crystal (LC) materials by layers of photocrosslinkable polymers with side benzaldehyde groups are considered. The mechanism of photostimulated alignment by a rubbed benzaldehyde layer is investigated. Methods for creating multidomain aligning layers on the basis of photostimulated rubbing alignment are described.

  16. Peak alignment using wavelet pattern matching and differential evolution.

    Science.gov (United States)

    Zhang, Zhi-Min; Chen, Shan; Liang, Yi-Zeng

    2011-01-30

    Retention time shifts badly impair the qualitative or quantitative results of chemometric analyses when entire chromatographic data are used. Hence, chromatograms should be aligned before further analysis. To this end, a practical peak alignment method (alignDE) is proposed and implemented for one-way chromatograms. It consists of five steps: (1) equalization of chromatogram lengths using linear interpolation; (2) accurate peak pattern matching by continuous wavelet transform (CWT) with the Mexican Hat and Haar wavelets as its mother wavelets; (3) flexible baseline fitting utilizing penalized least squares; (4) peak clustering when the gap between two peaks is smaller than a certain threshold; (5) peak alignment using differential evolution (DE) to maximize the linear correlation coefficient between the reference signal and the signal to be aligned. The method is demonstrated with both simulated chromatograms and real chromatograms, for example, chromatograms of fungal extracts and Red Peony Root obtained by HPLC-DAD. It is implemented in the R language and available as open source software to a broad range of chromatograph users (http://code.google.com/p/alignde).
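
    A minimal sketch of steps (1) and (5) above, under simplifying assumptions: lengths are equalized by linear interpolation, and the shift that maximizes the Pearson correlation with the reference is found by exhaustive search over integer shifts, not by differential evolution. The names `resample`, `pearson`, `shift` and `best_shift` are illustrative only and are not part of alignDE.

```python
import math


def resample(signal, n):
    """Linearly interpolate `signal` onto n evenly spaced points."""
    if n == 1 or len(signal) == 1:
        return [signal[0]] * n
    scale = (len(signal) - 1) / (n - 1)
    out = []
    for i in range(n):
        pos = i * scale
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def shift(signal, k):
    """Shift right by k samples (left if k < 0), padding with edge values."""
    n = len(signal)
    return [signal[min(max(i - k, 0), n - 1)] for i in range(n)]


def best_shift(reference, signal, max_shift):
    """Integer shift in [-max_shift, max_shift] maximizing correlation."""
    signal = resample(signal, len(reference))
    return max(range(-max_shift, max_shift + 1),
               key=lambda k: pearson(reference, shift(signal, k)))
```

    An exhaustive search is only practical for a single global shift; a population-based optimizer such as DE becomes attractive when each peak cluster is shifted independently and the search space grows.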

  17. Multispectral optical telescope alignment testing for a cryogenic space environment

    Science.gov (United States)

    Newswander, Trent; Hooser, Preston; Champagne, James

    2016-09-01

    Multispectral space telescopes with visible to long wave infrared spectral bands pose difficult alignment challenges. The visible channels require precision in alignment and stability to provide good image quality at short wavelengths. This is most often accomplished by choosing near-zero thermal expansion glass or ceramic mirrors metered with carbon fiber reinforced polymer (CFRP) designed to have a matching thermal expansion. The IR channels are less sensitive to alignment but they often require cryogenic cooling for improved sensitivity with the reduced radiometric background. Efficient solutions to this difficult problem of maintaining good visible image quality at cryogenic temperatures have been explored by building and testing a telescope simulator. The telescope simulator is an on-axis set of optics with ZERODUR® mirrors and CFRP metering. Testing has been completed to accurately measure telescope optical element alignment and mirror figure changes in a cryogenic space simulated environment. Measured alignment error and mirror figure error test results are reported with a discussion of their impact on system optical performance.

  18. Alignment of the Muon System at the CMS Experiment

    Science.gov (United States)

    Mueller, Ryan; Perniè, Luca; Pakhotin, Yuriy; Kamon, Teruki; Safonov, Alexei; Brown, Malachi

    2017-01-01

    The muon detectors of the CMS experiment provide fast trigger decisions, muon identification and muon track measurements. Alignment of the muon detectors is crucial for accurate reconstruction of events with high pT muons that are present in signatures for many new physics scenarios. The muon detectors' relative positions and orientations with respect to the inner silicon tracker may be precisely measured using reconstructed tracks propagating from the interaction point. This track-based alignment procedure is capable of aligning individual muon detectors to within 100 microns along sensitive modes. However, weak (insensitive) modes may not be well measured due to the system's design and cause systematic mismeasurements. In this report, we present a new track-based procedure which enables measurement of all 6 alignment parameters - 3 positions and 3 rotations - for each individual muon detector. The improved algorithm allows for measurement of weak modes and considerably reduces the related systematic uncertainties. We describe results of the alignment procedure obtained with 2016 data.

  19. Active network alignment: a matching-based approach

    CERN Document Server

    Malmi, Eric; Gionis, Aristides

    2016-01-01

    Network alignment is the problem of matching the nodes of two graphs, maximizing the similarity of the matched nodes and the edges between them. This problem is encountered in a wide array of applications - from biological networks to social networks to ontologies - where multiple networked data sources need to be integrated. Due to the difficulty of the task, an accurate alignment can rarely be found without human assistance. Thus, it is of great practical importance to develop network alignment algorithms that can optimally leverage experts who are able to provide the correct alignment for a small number of nodes. Yet, only a handful of existing works address this active network alignment setting. The majority of the existing active methods focus on absolute queries ("are nodes $a$ and $b$ the same or not?"), whereas we argue that it is generally easier for a human expert to answer relative queries ("which node in the set $\{b_1, \ldots, b_n\}$ is the most similar to node $a$?"). This paper introduces a nov...

  20. Phylo: a citizen science approach for improving multiple sequence alignment.

    Directory of Open Access Journals (Sweden)

    Alexander Kawrykow

    Full Text Available BACKGROUND: Comparative genomics, or the study of the relationships of genome structure and function across different species, offers a powerful tool for studying evolution, annotating genomes, and understanding the causes of various genetic disorders. However, aligning multiple sequences of DNA, an essential intermediate step for most types of analyses, is a difficult computational task. In parallel, citizen science, an approach that takes advantage of the fact that the human brain is exquisitely tuned to solving specific types of problems, is becoming increasingly popular. There, instances of hard computational problems are dispatched to a crowd of non-expert human game players and solutions are sent back to a central server. METHODOLOGY/PRINCIPAL FINDINGS: We introduce Phylo, a human-based computing framework applying "crowd sourcing" techniques to solve the Multiple Sequence Alignment (MSA) problem. The key idea of Phylo is to convert the MSA problem into a casual game that can be played by ordinary web users with minimal prior knowledge of the biological context. We applied this strategy to improve the alignment of the promoters of disease-related genes from up to 44 vertebrate species. Since the launch in November 2010, we have received more than 350,000 solutions submitted by more than 12,000 registered users. Our results show that the submitted solutions contributed to improving the accuracy of up to 70% of the alignment blocks considered. CONCLUSIONS/SIGNIFICANCE: We demonstrate that, combined with classical algorithms, crowd computing techniques can be successfully used to help improve the accuracy of MSA. More importantly, we show that an NP-hard computational problem can be embedded in a casual game that can be easily played by people without significant scientific training. This suggests that citizen science approaches can be used to exploit the billions of "human-brain peta-flops" of computation that are spent every day playing games.

  1. Alignment method for parabolic trough solar concentrators

    Science.gov (United States)

    Diver, Richard B.

    2010-02-23

    A Theoretical Overlay Photographic (TOP) alignment method uses the overlay of a theoretical projected image of a perfectly aligned concentrator on a photographic image of the concentrator to align the mirror facets of a parabolic trough solar concentrator. The alignment method is practical and straightforward, and inherently aligns the mirror facets to the receiver. When integrated with clinometer measurements for which gravity and mechanical drag effects have been accounted for and which are made in a manner and location consistent with the alignment method, all of the mirrors on a common drive can be aligned and optimized for any concentrator orientation.

  2. Genome size variation in the genus Avena.

    Science.gov (United States)

    Yan, Honghai; Martin, Sara L; Bekele, Wubishet A; Latta, Robert G; Diederichsen, Axel; Peng, Yuanying; Tinker, Nicholas A

    2016-03-01

    Genome size is an indicator of evolutionary distance and a metric for genome characterization. Here, we report accurate estimates of genome size in 99 accessions from 26 species of Avena. We demonstrate that the average genome size of C genome diploid species (2C = 10.26 pg) is 15% larger than that of A genome species (2C = 8.95 pg), and that this difference likely accounts for a progression of size among tetraploid species, where AB genome configuration had similar genome sizes (average 2C = 25.74 pg). Genome size was mostly consistent within species and in general agreement with current information about evolutionary distance among species. Results also suggest that most of the polyploid species in Avena have experienced genome downsizing in relation to their diploid progenitors. Genome size measurements could provide additional quality control for species identification in germplasm collections, especially in cases where diploid and polyploid species have similar morphology.

  3. Comparative genomics in cyprinids: Common carp EST's help the annotation of the zebrafish genome

    NARCIS (Netherlands)

    Christoffels, A.; Bartfai, R.; Srinivasan, H.; Komen, J.

    2006-01-01

    Background - Automatic annotation of sequenced eukaryotic genomes integrates a combination of methodologies such as ab-initio methods and alignment of homologous genes and/or proteins. For example, annotation of the zebrafish genome within Ensembl relies heavily on available cDNA and protein sequenc

  4. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed Affan

    2012-01-26

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.
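
    The partitioning step described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed method: the splitting ratio is interpreted here as the fraction of sequences (by count) sent to the CPU, and the function name `split_database` is hypothetical.

```python
def split_database(sequences, cpu_ratio):
    """Split sequences into (cpu_part, gpu_part).

    The shortest sequences go to the CPU portion and the longest to the
    GPU portion. `cpu_ratio` is assumed to be the fraction of sequences,
    by count, assigned to the CPU.
    """
    if not 0.0 <= cpu_ratio <= 1.0:
        raise ValueError("cpu_ratio must be in [0, 1]")
    ordered = sorted(sequences, key=len)      # shortest first
    cut = int(cpu_ratio * len(ordered))       # number of CPU sequences
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    db = ["ACGTACGT", "AC", "ACGT", "A"]
    cpu_part, gpu_part = split_database(db, 0.5)
    print(cpu_part, gpu_part)
```

    Each portion would then be scored against the query on its own device; a ratio based on total residues instead of sequence count is an equally plausible reading of the abstract.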

  5. Laser shaft alignment measurement model

    Science.gov (United States)

    Mo, Chang-tao; Chen, Changzheng; Hou, Xiang-lin; Zhang, Guoyu

    2007-12-01

    When the driving shaft and the driven shaft rotate with the same angular velocity and direction, the laser beam traces a closed curve on the photosensitive surface of the receiver. The coordinates of any point on this curve are determined by the relative position of the two shafts. On this basis, a mathematical model of laser alignment is set up. Using a data acquisition system and a data processing model for a laser alignment meter with a single laser beam and one detector, together with the installation parameters supplied to the computer, the state parameters between the two shafts can be obtained by further calculation and correction. The corrections for the four base supports of the machine being adjusted, in both the horizontal and vertical planes, can then be calculated. These indicate how to move the machine to align the shafts.

  6. Genomic instability and cancer: an introduction

    Institute of Scientific and Technical Information of China (English)

    Zhiyuan Shen

    2011-01-01

    Genomic instability as a major driving force of tumorigenesis. The ultimate goal of cell division for most non-cancerous somatic cells is to accurately duplicate the genome and then evenly divide the duplicated genome into the two daughter cells. This ensures that the daughter cells will have exactly the same genetic material as their parent cell.

  7. B-MIC: An Ultrafast Three-Level Parallel Sequence Aligner Using MIC.

    Science.gov (United States)

    Cui, Yingbo; Liao, Xiangke; Zhu, Xiaoqian; Wang, Bingqiang; Peng, Shaoliang

    2016-03-01

    Sequence alignment, which maps raw sequencing data to a reference genome, is the central process of sequence analysis. The large amount of data generated by NGS is far beyond the processing capabilities of existing alignment tools. Consequently, sequence alignment becomes the bottleneck of sequence analysis. Intensive computing power is required to address this challenge. Intel recently announced the MIC coprocessor, which can provide massive computing power. The Tianhe-2 is the world's fastest supercomputer, now equipped with three MIC coprocessors per compute node. A key feature of sequence alignment is that different reads are independent. Considering this property, we proposed a MIC-oriented three-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and developed our ultrafast parallel sequence aligner: B-MIC. B-MIC contains three levels of parallelization: firstly, parallelization of data IO and read alignment by a three-stage parallel pipeline; secondly, parallelization enabled by MIC coprocessor technology; thirdly, inter-node parallelization implemented by MPI. In this paper, we demonstrate that B-MIC outperforms BWA by a combination of those techniques using an Inspur NF5280M server and the Tianhe-2 supercomputer. To the best of our knowledge, B-MIC is the first sequence alignment tool to run on Intel MIC, and it can achieve more than fivefold speedup over the original BWA while maintaining the alignment precision.

  8. The alignment-distribution graph

    Science.gov (United States)

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert

    1993-01-01

    Implementing a data-parallel language such as Fortran 90 on a distributed-memory parallel computer requires distributing aggregate data objects (such as arrays) among the memory modules attached to the processors. The mapping of objects to the machine determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. We present a program representation called the alignment distribution graph that makes these communication requirements explicit. We describe the details of the representation, show how to model communication cost in this framework, and outline several algorithms for determining object mappings that approximately minimize residual communication.

  9. XUV ionization of aligned molecules

    Energy Technology Data Exchange (ETDEWEB)

    Kelkensberg, F.; Siu, W.; Gademann, G. [FOM Institute AMOLF, Science Park 104, NL-1098 XG Amsterdam (Netherlands); Rouzee, A.; Vrakking, M. J. J. [FOM Institute AMOLF, Science Park 104, NL-1098 XG Amsterdam (Netherlands); Max-Born-Institut, Max-Born Strasse 2A, D-12489 Berlin (Germany); Johnsson, P. [FOM Institute AMOLF, Science Park 104, NL-1098 XG Amsterdam (Netherlands); Department of Physics, Lund University, Post Office Box 118, SE-221 00 Lund (Sweden); Lucchini, M. [Department of Physics, Politecnico di Milano, Istituto di Fotonica e Nanotecnologie CNR-IFN, Piazza Leonardo da Vinci 32, 20133 Milano (Italy); Lucchese, R. R. [Department of Chemistry, Texas A and M University, College Station, Texas 77843-3255 (United States)

    2011-11-15

    New extreme-ultraviolet (XUV) light sources such as high-order-harmonic generation (HHG) and free-electron lasers (FELs), combined with laser-induced alignment techniques, enable novel methods for making molecular movies based on measuring molecular frame photoelectron angular distributions. Experiments are presented where CO2 molecules were impulsively aligned using a near-infrared laser and ionized using femtosecond XUV pulses obtained by HHG. Measured electron angular distributions reveal contributions from four orbitals and the onset of the influence of the molecular structure.

  10. Fast pairwise structural RNA alignments by pruning of the dynamical programming matrix

    DEFF Research Database (Denmark)

    Havgaard, Jakob Hull; Torarinsson, Elfar; Gorodkin, Jan

    2007-01-01

    genomes. One main problem with these methods is their computational complexity, and heuristics are therefore employed. Two heuristics are currently very popular: pre-folding and pre-aligning. However, these heuristics are not ideal, as pre-aligning is dependent on sequence similarity that may...... the advantage of providing the constraints dynamically. This has been included in a new implementation of the FOLDALIGN algorithm for pairwise local or global structural alignment of RNA sequences. It is shown that time and memory requirements are dramatically lowered while overall performance is maintained....... Furthermore, a new divide and conquer method is introduced to limit the memory requirement during global alignment and backtrack of local alignment. All branch points in the computed RNA structure are found and used to divide the structure into smaller unbranched segments. Each segment is then realigned...

  11. Systematic evaluation of spliced alignment programs for RNA-seq data

    OpenAIRE

    Engström, Pär G; Steijger, Tamara; Sipos, Botond; Grant, Gregory R; Kahles, André; RGASP Consortium; Rätsch, Gunnar; Goldman, Nick; Hubbard, Tim J.; Harrow, Jennifer; Guigó Serra, Roderic; Bertone, Paul

    2013-01-01

    High-throughput RNA sequencing is an increasingly accessible method for studying gene structure and activity on a genome-wide scale. A critical step in RNA-seq data analysis is the alignment of partial transcript reads to a reference genome sequence. To assess the performance of current mapping software, we invited developers of RNA-seq aligners to process four large human and mouse RNA-seq data sets. In total, we compared 26 mapping protocols based on 11 programs and pipelines and found majo...

  12. Detecting the limits of regulatory element conservation and divergence estimation using pairwise and multiple alignments

    Energy Technology Data Exchange (ETDEWEB)

    Pollard, Daniel A.; Moses, Alan M.; Iyer, Venky N.; Eisen, Michael B.

    2006-08-14

    Background: Molecular evolutionary studies of noncoding sequences rely on multiple alignments. Yet how multiple alignment accuracy varies across sequence types, tree topologies, divergences and tools, and further how this variation impacts specific inferences, remains unclear. Results: Here we develop a molecular evolution simulation platform, CisEvolver, with models of background noncoding and transcription factor binding site evolution, and use simulated alignments to systematically examine multiple alignment accuracy and its impact on two key molecular evolutionary inferences: transcription factor binding site conservation and divergence estimation. We find that the accuracy of multiple alignments is determined almost exclusively by the pairwise divergence distance of the two most diverged species and that additional species have a negligible influence on alignment accuracy. Conserved transcription factor binding sites align better than surrounding noncoding DNA yet are often found to be misaligned at relatively short divergence distances, such that studies of binding site gain and loss could easily be confounded by alignment error. Divergence estimates from multiple alignments tend to be overestimated at short divergence distances but reach a tool specific divergence at which they cease to increase, leading to underestimation at long divergences. Our most striking finding was that overall alignment accuracy, binding site alignment accuracy and divergence estimation accuracy vary greatly across branches in a tree and are most accurate for terminal branches connecting sister taxa and least accurate for internal branches connecting sub-alignments. Conclusions: Our results suggest that variation in alignment accuracy can lead to errors in molecular evolutionary inferences that could be construed as biological variation. These findings have implications for which species to choose for analyses, what kind of errors would be expected for a given set of species and how multiple alignment tools and

  13. DIALIGN P: Fast pair-wise and multiple sequence alignment using parallel processors

    Directory of Open Access Journals (Sweden)

    Kaufmann Michael

    2004-09-01

    Full Text Available Abstract Background: Parallel computing is frequently used to speed up computationally expensive tasks in bioinformatics. Results: Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. Conclusions: By distributing sub-routines to multiple processors, the running time of DIALIGN can be substantially reduced. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
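
    Approach (a) can be illustrated with a short sketch. It is not DIALIGN code: a plain Needleman-Wunsch global score stands in for DIALIGN's segment-based scoring, the names `global_score` and `all_pairs_scores` are hypothetical, and a thread pool is used only to keep the example self-contained (CPU-bound work in CPython would normally go to a process pool or to separate nodes).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with linear gap cost."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def all_pairs_scores(seqs, workers=4):
    """Score every sequence pair; each pair is an independent task."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(
            lambda p: global_score(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, scores))


if __name__ == "__main__":
    print(all_pairs_scores(["ACGT", "ACGT", "AGT"]))
```

    Because no task reads another task's result, the output is the same for any number of workers, which matches the abstract's point that distribution has no effect on the resulting alignments.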

  14. ChromAlign: A two-step algorithmic procedure for time alignment of three-dimensional LC-MS chromatographic surfaces.

    Science.gov (United States)

    Sadygov, Rovshan G; Maroto, Fernando Martin; Hühmer, Andreas F R

    2006-12-15

    We present an algorithmic approach to align three-dimensional chromatographic surfaces of LC-MS data of complex mixture samples. The approach consists of two steps. In the first step, we prealign chromatographic profiles: two-dimensional projections of chromatographic surfaces. This is accomplished by correlation analysis using fast Fourier transforms. In this step, a temporal offset that maximizes the overlap and dot product between two chromatographic profiles is determined. In the second step, the algorithm generates correlation matrix elements between full mass scans of the reference and sample chromatographic surfaces. The temporal offset from the first step indicates a range of mass scans that are possibly correlated; the correlation matrix is then calculated only for these mass scans. The correlation matrix carries information on highly correlated scans, but it does not itself determine the scan or time alignment. Alignment is determined as a path in the correlation matrix that maximizes the sum of the correlation matrix elements. The computational complexity of the optimal path generation problem is reduced by the use of dynamic programming. The program produces time-aligned surfaces. Using the temporal offset from the first step in the second step reduces the computation time for generating the correlation matrix and speeds up the process. The algorithm has been implemented in a program, ChromAlign, developed in C++ for the .NET2 environment in WINDOWS XP. In this work, we demonstrate the application of ChromAlign to the alignment of LC-MS surfaces of several datasets: a mixture of known proteins, samples from digests of surface proteins of T-cells, and samples prepared from digests of cerebrospinal fluid. ChromAlign accurately aligns the LC-MS surfaces we studied. In these examples, we discuss various aspects of the alignment by ChromAlign, such as constant time axis shifts and warping of chromatographic surfaces.
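
    The second step, finding a path through the correlation matrix that maximizes the summed elements, can be sketched with a small dynamic program. The details below are assumptions made for illustration and are not taken from ChromAlign: the path is monotone, runs from the first to the last cell, and may move down, right, or diagonally; `best_path` is a hypothetical name.

```python
def best_path(corr):
    """Return (total, path) for the monotone path from corr[0][0] to
    corr[-1][-1] that maximizes the sum of visited elements."""
    n, m = len(corr), len(corr[0])
    total = [[float("-inf")] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    total[0][0] = corr[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((total[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((total[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((total[i][j - 1], (i, j - 1)))
            best, prev = max(candidates, key=lambda c: c[0])
            total[i][j] = best + corr[i][j]
            back[i][j] = prev
    # Trace back from the last cell to recover the scan-to-scan mapping.
    path, cell = [], (n - 1, m - 1)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    path.reverse()
    return total[n - 1][m - 1], path
```

    With all-positive scores this formulation rewards longer paths, so the sketch assumes the matrix has been centered (or that a step penalty is subtracted) so that poorly correlated cells contribute negative values.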

  15. The Rigors of Aligning Performance

    Science.gov (United States)

    2015-06-01

    organization must consider and work closely with its many stakeholders so as to guarantee satisfaction; this idea is especially important as there is no...define success. Methodology includes a literature review, employee and customer surveys and a Strength, Weaknesses, Opportunities, Threats...bearing in mind customer perceptions. Recommendations include employee training centered on goal alignment, which is vital to highlight the

  16. Aligning Assessments for COSMA Accreditation

    Science.gov (United States)

    Laird, Curt; Johnson, Dennis A.; Alderman, Heather

    2015-01-01

    Many higher education sport management programs are currently in the process of seeking accreditation from the Commission on Sport Management Accreditation (COSMA). This article provides a best-practice method for aligning student learning outcomes with a sport management program's mission and goals. Formative and summative assessment procedures…

  17. Aligned natural inflation with modulations

    Directory of Open Access Journals (Sweden)

    Kiwoon Choi

    2016-08-01

    Full Text Available The weak gravity conjecture applied to aligned natural inflation indicates that generically there can be a modulation of the inflaton potential, with a period determined by a sub-Planckian axion scale. We study the oscillations in the primordial power spectrum induced by such a modulation, and discuss the resulting observational constraints on the model.

  18. Theoretical and practical feasibility demonstration of a micrometric remotely controlled pre-alignment system for the CLIC linear collider

    CERN Document Server

    Mainaud Durand, H; Chritin, N; Griffet, S; Kemppinen, J; Sosin, M; Touze, T

    2011-01-01

    The active pre-alignment of the Compact Linear Collider (CLIC) is one of the key points of the project: the components must be pre-aligned w.r.t. a straight line within a few microns over a sliding window of 200 m, along the two linacs of 20 km each. The proposed solution consists of stretched wires of more than 200 m, overlapping over half of their length, which will be the reference of alignment. Wire Positioning Sensors (WPS), coupled to the supports to be pre-aligned, will perform precise and accurate measurements within a few microns w.r.t. these wires. A micrometric fiducialisation of the components and a micrometric alignment of the components on common supports will make the strategy of pre-alignment complete. In this paper, the global strategy of active pre-alignment is detailed and illustrated by the latest results demonstrating the feasibility of the proposed solution.

  19. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    Directory of Open Access Journals (Sweden)

    Claros M Gonzalo

    2010-06-01

    Full Text Available Abstract Background: Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results: AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, the choice of which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". Conclusions: AlignMiner can be used
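
    To make the idea of an entropy-based divergence score concrete, the sketch below computes the plain Shannon entropy of each alignment column and flags columns above a threshold. This is a generic textbook measure and is only assumed to resemble AlignMiner's Entropy method; the function names and the threshold value are hypothetical.

```python
import math
from collections import Counter


def column_entropy(alignment):
    """Shannon entropy in bits of each column; gaps count as a symbol."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows must have equal length")
    scores = []
    for col in zip(*alignment):
        n = len(col)
        counts = Counter(col)
        scores.append(-sum(c / n * math.log2(c / n)
                           for c in counts.values()))
    return scores


def divergent_columns(alignment, threshold=1.0):
    """Indices of columns whose entropy is at least `threshold` bits."""
    return [i for i, h in enumerate(column_entropy(alignment))
            if h >= threshold]


if __name__ == "__main__":
    aln = ["ACGT", "ACGA", "ACCA", "ACTT"]
    print(column_entropy(aln))
    print(divergent_columns(aln))
```

    Fully conserved columns score 0 bits, and runs of adjacent high-entropy columns would be the candidate divergent regions for primer or epitope design.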

  20. Progressive multiple sequence alignments from triplets

    Directory of Open Access Journals (Sweden)

    Stadler Peter F

    2007-07-01

    Full Text Available Abstract Background: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps, since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors. Research: Here we present a modified variant of progressive sequence alignments that addresses both issues. Instead of pairwise alignments we use exact dynamic programming to align sequence or profile triples. This avoids a large fraction of the ambiguities arising in pairwise alignments. In the subsequent aggregation steps we follow the logic of the Neighbor-Net algorithm, which constructs a phylogenetic network by step-wise replacing triples by pairs instead of combining pairs to singletons. To this end the three-way alignments are subdivided into two partial alignments, at which stage all-gap columns are naturally removed. This alleviates the "once a gap, always a gap" problem of progressive alignment procedures. Conclusion: The three-way Neighbor-Net based alignment program aln3nn is shown to compare favorably on both protein sequences and nucleic acid sequences to other progressive alignment tools. In the latter case one can easily include scoring terms that consider secondary structure features. Overall, the quality of the resulting alignments in general exceeds that of clustalw or other multiple alignment tools even though our software does not include heuristics for context dependent (mis)match scores.

  1. In-Flight Self-Alignment Method Aided by Geomagnetism for Moving Basement of Guided Munitions

    Directory of Open Access Journals (Sweden)

    Shuang-biao Zhang

    2015-01-01

    Full Text Available Due to the power-after-launch mode of guided munitions with high rolling speed, the initial attitude of the munitions cannot be determined accurately, which makes it difficult for the navigation and control system to work effectively. An in-flight self-alignment method aided by geomagnetism, comprising a fast in-flight coarse alignment method and an in-flight alignment model based on Kalman theory, is proposed in this paper. Firstly, a fast in-flight coarse alignment method is developed using gyros, magnetic sensors, and trajectory angles. Then, an in-flight alignment model is derived by investigating the measurement errors and attitude errors; it regards attitude errors as state variables and geomagnetic components in the navigation frame as observed variables. Finally, flight data of a spinning projectile are used to verify the performance of the in-flight self-alignment method. The satisfying results show that (1) the precision of coarse alignment can reach below 5°; (2) the attitude errors of the in-flight alignment model converge to 24′ early in the latter half of the flight; (3) the in-flight alignment model based on Kalman theory has better adaptability and shows satisfying performance.

  2. Aligned Layers of Silver Nano-Fibers

    Directory of Open Access Journals (Sweden)

    Andrii B. Golovin

    2012-02-01

    Full Text Available We describe new dichroic polarizers made by ordering silver nano-fibers into aligned layers. The aligned layers consist of nano-fibers and self-assembled molecular aggregates of lyotropic liquid crystals. Unidirectional alignment of the layers is achieved by means of mechanical shearing. Aligned layers of silver nano-fibers are partially transparent to linearly polarized electromagnetic radiation. The unidirectional alignment and density of the silver nano-fibers determine the degree of polarization of the transmitted light. The aligned layers of silver nano-fibers might be used in optics, microwave applications, and organic electronics.

  3. Image-based quantification of fiber alignment within electrospun tissue engineering scaffolds is related to mechanical anisotropy.

    Science.gov (United States)

    Fee, Timothy; Downs, Crawford; Eberhardt, Alan; Zhou, Yong; Berry, Joel

    2016-07-01

    It is well documented that electrospun tissue engineering scaffolds can be fabricated with variable degrees of fiber alignment to produce scaffolds with anisotropic mechanical properties. Several attempts have been made to quantify the degree of fiber alignment within an electrospun scaffold using image-based methods. However, these methods are limited by the inability to produce a quantitative measure of alignment that can be used to make comparisons across publications. Therefore, we have developed a new approach to quantifying the alignment present within a scaffold from scanning electron microscopic (SEM) images. The alignment is determined by using the Sobel approximation of the image gradient to determine the distribution of gradient angles within an image. These data were fit to a von Mises distribution to find the dispersion parameter κ, which was used as a quantitative measure of fiber alignment. We fabricated four groups of electrospun polycaprolactone (PCL) + Gelatin scaffolds with alignments ranging from κ = 1.9 (aligned) to κ = 0.25 (random) and tested our alignment quantification method on these scaffolds. It was found that our alignment quantification method could distinguish between scaffolds of different alignments more accurately than two other published methods. Additionally, the alignment parameter κ was found to be a good predictor of the mechanical anisotropy of our electrospun scaffolds. The ability to quantify fiber alignment within and make direct comparisons of scaffold fiber alignment across publications can reduce ambiguity between published results where cells are cultured on "highly aligned" fibrous scaffolds. This could have important implications for characterizing mechanics and cellular behavior on aligned tissue engineering scaffolds. © 2016 Wiley Periodicals, Inc. J Biomed Mater Res Part A: 104A: 1680-1686, 2016.
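
    The pipeline described in this abstract (Sobel gradients, then a von Mises dispersion parameter) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the gradient-magnitude weighting and the angle doubling (to treat orientations as axial data) are choices made here, and κ is estimated from the mean resultant length with a common closed-form approximation rather than a full distribution fit, so the values need not match the κ convention used in the paper.

```python
import math


def sobel_gradients(img):
    """Yield (gx, gy) at each interior pixel of a 2D list of grey values."""
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            yield gx, gy


def mean_resultant_length(img):
    """Magnitude-weighted mean resultant length of doubled gradient angles.

    Angles are doubled because an orientation theta and theta + pi describe
    the same fiber direction (assumption made for this sketch).
    """
    c = s = total = 0.0
    for gx, gy in sobel_gradients(img):
        weight = math.hypot(gx, gy)
        if weight == 0.0:
            continue  # flat region: no defined orientation
        doubled = 2.0 * math.atan2(gy, gx)
        c += weight * math.cos(doubled)
        s += weight * math.sin(doubled)
        total += weight
    return math.hypot(c, s) / total if total else 0.0


def kappa_from_resultant(r):
    """Piecewise approximation to the von Mises concentration for resultant r."""
    if r >= 1.0:
        return math.inf  # perfectly aligned: concentration is unbounded
    if r < 0.53:
        return 2 * r + r ** 3 + 5 * r ** 5 / 6
    if r < 0.85:
        return -0.4 + 1.39 * r + 0.43 / (1 - r)
    return 1 / (r ** 3 - 4 * r ** 2 + 3 * r)


def alignment_kappa(img):
    """Estimate a dispersion parameter for the gradient orientations in img."""
    return kappa_from_resultant(mean_resultant_length(img))
```

    In this sketch a larger value of `alignment_kappa` indicates more tightly clustered gradient orientations; an image whose gradients point equally in all directions gives a value near zero.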

  4. RNASequel: accurate and repeat tolerant realignment of RNA-seq reads.

    Science.gov (United States)

    Wilson, Gavin W; Stein, Lincoln D

    2015-10-15

    RNA-seq is a key technology for understanding the biology of the cell because of its ability to profile transcriptional and post-transcriptional regulation at single-nucleotide resolution. Compared to DNA sequencing alignment algorithms, RNA-seq alignment algorithms have a diminished ability to accurately detect and map base pair substitutions, gaps, discordant pairs and repetitive regions. These shortcomings adversely affect experiments that require a high degree of accuracy, notably the ability to detect RNA editing. We have developed RNASequel, a software package that runs as a post-processing step in conjunction with an RNA-seq aligner and systematically corrects common alignment artifacts. Its key innovations are a two-pass splice junction alignment system that includes de novo splice junctions and the use of an empirically determined estimate of the fragment size distribution when resolving read pairs. We demonstrate that RNASequel produces improved alignments when used in conjunction with STAR or Tophat2 using two simulated datasets. We then show that RNASequel improves the identification of adenosine to inosine RNA editing sites on biological datasets. This software will be useful in applications requiring the accurate identification of variants in RNA sequencing data, the discovery of RNA editing sites and the analysis of alternative splicing.

  5. Library preparation for highly accurate population sequencing of RNA viruses

    Science.gov (United States)

    Acevedo, Ashley; Andino, Raul

    2015-01-01

    Circular resequencing (CirSeq) is a novel technique for efficient and highly accurate next-generation sequencing (NGS) of RNA virus populations. The foundation of this approach is the circularization of fragmented viral RNAs, which are then redundantly encoded into tandem repeats by ‘rolling-circle’ reverse transcription. When sequenced, the redundant copies within each read are aligned to derive a consensus sequence of their initial RNA template. This process yields sequencing data with error rates far below the variant frequencies observed for RNA viruses, facilitating ultra-rare variant detection and accurate measurement of low-frequency variants. Although library preparation takes ~5 d, the high-quality data generated by CirSeq simplifies downstream data analysis, making this approach substantially more tractable for experimentalists. PMID:24967624
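
    The consensus step mentioned above (aligning the redundant copies within one read and voting per position) might look roughly like the following. It is a simplified sketch, not the CirSeq software: it assumes the repeat length is already known, that the copies are in register with no insertions or deletions, and the function name and the `min_copies` parameter are hypothetical.

```python
from collections import Counter


def repeat_consensus(read, repeat_len, min_copies=3):
    """Majority-vote consensus over tandem copies contained in a single read.

    Returns None when the read holds fewer than min_copies complete copies.
    Positions without a strict majority are reported as 'N'.
    """
    # Cut the read into consecutive, complete copies of the template.
    copies = [read[i:i + repeat_len]
              for i in range(0, len(read) - repeat_len + 1, repeat_len)]
    if len(copies) < min_copies:
        return None
    consensus = []
    for column in zip(*copies):  # one column per template position
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count > len(copies) / 2 else "N")
    return "".join(consensus)
```

    With three copies, a sequencing error that appears in only one copy is outvoted by the other two, which is the intuition behind the low error rates reported in the abstract.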

  6. A physical map of the human genome

    Energy Technology Data Exchange (ETDEWEB)

    McPherson, J.D.; Marra, M.; Hillier, L.; Waterston, R.H.; Chinwalla, A.; Wallis, J.; Sekhon, M.; Wylie, K.; Mardis, E.R.; Wilson, R.K.; Fulton, R.; Kucaba, T.A.; Wagner-McPherson, C.; Barbazuk, W.B.; Gregory, S.G.; Humphray, S.J.; French, L.; Evans, R.S.; Bethel, G.; Whittaker, A.; Holden, J.L.; McCann, O.T.; Dunham, A.; Soderlund, C.; Scott, C.E.; Bentley, D.R.; Schuler, G.; Chen, H.-C.; Jang, W.; Green, E.D.; Idol, J.R.; Maduro, V.V. Braden; Montgomery, K.T.; Lee, E.; Miller, A.; Emerling, S.; Kucherlapati; Gibbs, R.; Scherer, S.; Gorrell, J.H.; Sodergren, E.; Clerc-Blankenburg, K.; Tabor, P.; Naylor, S.; Garcia, D.; de Jong, P.J.; Catanese, J.J.; Nowak, N.; Osoegawa, K.; Qin, S.; Rowen, L.; Madan, A.; Dors, M.; Hood, L.; Trask, B.; Friedman, C.; Massa, H.; Cheung, V.G.; Kirsch, I.R.; Reid, T.; Yonescu, R.; Weissenbach, J.; Bruls, T.; Heilig, R.; Branscomb, E.; Olsen, A.; Doggett, N.; Cheng, J.F.; Hawkins, T.; Myers, R.M.; Shang, J.; Ramirez, L.; Schmutz, J.; Velasquez, O.; Dixon, K.; Stone, N.E.; Cox, D.R.; Haussler, D.; Kent, W.J.; Furey, T.; Rogic, S.; Kennedy, S.; Jones, S.; Rosenthal, A.; Wen, G.; Schilhabel, M.; Gloeckner, G.; Nyakatura, G.; Siebert, R.; Schlegelberger, B.; Korenberg, J.; Chen, X.N.; Fujiyama, A.; Hattori, M.; Toyoda, A.; Yada, T.; Park, H.S.; Sakaki, Y.; Shimizu, N.; Asakawa, S.; Kawasaki, K.; Sasaki, T.; Shintani, A.; Shimizu, A.; Shibuya, K.; Kudoh, J.; Minoshima, S.; Ramser, J.; Seranski, P.; Hoff, C.; Poustka, A.; Reinhardt, R.; Lehrach, H.

    2001-01-01

    The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.

  7. Alignment of multiple-off-axis-beam imaging/interference systems.

    Science.gov (United States)

    Vadivel, Shruthi K; Leibovici, Matthieu C R; Gaylord, Thomas K

    2016-04-20

    The alignment of components in complex multibeam arrangements is typically prone to errors that limit the performance of the system. A systematic procedure for aligning such systems is presented here. The method facilitates the precision alignment of the optical elements to achieve the accurate projection of multiple on- and off-axis images and the simultaneous interference of the multiple beams. In addition to the multibeam imaging/interference system presented, the procedure can be employed in other multibeam imaging and/or interfering configurations.

  8. A DNA sequence alignment algorithm using quality information and a fuzzy inference method

    Institute of Scientific and Technical Information of China (English)

    Kwangbaek Kim; Minhwan Kim; Youngwoon Woo

    2008-01-01

    DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods. In this paper, we propose a DNA sequence alignment method that uses quality information and a fuzzy inference method, developed based on the characteristics of DNA fragments and a fuzzy logic system, in order to improve conventional DNA sequence alignment methods that use DNA sequence quality information. In conventional algorithms, DNA sequence alignment scores are calculated by the global sequence alignment algorithm proposed by Needleman and Wunsch, which is established by using quality information of each DNA fragment. However, there may be errors in the process of calculating DNA sequence alignment scores when the quality of DNA fragment tips is low, because only the overall DNA sequence quality information is used. In our proposed method, an exact DNA sequence alignment can be achieved in spite of the low quality of DNA fragment tips by improving conventional algorithms that use quality information. Mapping score parameters used to calculate DNA sequence alignment scores are dynamically adjusted by the fuzzy logic system utilizing lengths of DNA fragments and frequencies of low quality DNA bases in the fragments. From experiments applying real genome data from the National Center for Biotechnology Information, we could see that the proposed method is more efficient than conventional algorithms.
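
    For readers unfamiliar with the Needleman-Wunsch global alignment score that this abstract builds on, a minimal sketch follows. It shows only the plain dynamic-programming score with fixed parameters; the quality weighting and fuzzy adjustment of the scoring parameters described in the paper are not reproduced, and the default values for `match`, `mismatch` and `gap` are illustrative assumptions.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the DP table.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

    In a scheme like the one described above, the `match`, `mismatch` and `gap` values would presumably be the quantities that are adjusted per fragment rather than held constant.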

  9. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits

    Science.gov (United States)

    Pecetti, Luciano; Brummer, E. Charles; Palmonari, Alberto; Tava, Aldo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3–0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  10. Efficient and accurate fragmentation methods.

    Science.gov (United States)

    Pruitt, Spencer R; Bertoni, Colleen; Brorsen, Kurt R; Gordon, Mark S

    2014-09-16

    Conspectus Three novel fragmentation methods that are available in the electronic structure program GAMESS (general atomic and molecular electronic structure system) are discussed in this Account. The fragment molecular orbital (FMO) method can be combined with any electronic structure method to perform accurate calculations on large molecular species with no reliance on capping atoms or empirical parameters. The FMO method is highly scalable and can take advantage of massively parallel computer systems. For example, the method has been shown to scale nearly linearly on up to 131 000 processor cores for calculations on large water clusters. There have been many applications of the FMO method to large molecular clusters, to biomolecules (e.g., proteins), and to materials that are used as heterogeneous catalysts. The effective fragment potential (EFP) method is a model potential approach that is fully derived from first principles and has no empirically fitted parameters. Consequently, an EFP can be generated for any molecule by a simple preparatory GAMESS calculation. The EFP method provides accurate descriptions of all types of intermolecular interactions, including Coulombic interactions, polarization/induction, exchange repulsion, dispersion, and charge transfer. The EFP method has been applied successfully to the study of liquid water, π-stacking in substituted benzenes and in DNA base pairs, solvent effects on positive and negative ions, electronic spectra and dynamics, non-adiabatic phenomena in electronic excited states, and nonlinear excited state properties. The effective fragment molecular orbital (EFMO) method is a merger of the FMO and EFP methods, in which interfragment interactions are described by the EFP potential, rather than the less accurate electrostatic potential. The use of EFP in this manner facilitates the use of a smaller value for the distance cut-off (Rcut). Rcut determines the distance at which EFP interactions replace fully quantum

  11. Accurate determination of antenna directivity

    DEFF Research Database (Denmark)

    Dich, Mikael

    1997-01-01

    The derivation of a formula for accurate estimation of the total radiated power from a transmitting antenna for which the radiated power density is known in a finite number of points on the far-field sphere is presented. The main application of the formula is determination of directivity from power-pattern measurements. The derivation is based on the theory of spherical wave expansion of electromagnetic fields, which also establishes a simple criterion for the required number of samples of the power density. An array antenna consisting of Hertzian dipoles is used to test the accuracy and rate of convergence...
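
    For orientation, the standard textbook relation between total radiated power and directivity can be written as below. This is not the sampling formula derived in the paper; the symbols $U$ (radiation intensity), $P_{\mathrm{rad}}$ and $D$ are introduced here only for illustration.

```latex
P_{\mathrm{rad}} = \int_{0}^{2\pi}\!\!\int_{0}^{\pi} U(\theta,\phi)\,\sin\theta\,\mathrm{d}\theta\,\mathrm{d}\phi,
\qquad
D = \frac{4\pi\,U_{\max}}{P_{\mathrm{rad}}}
```

    Estimating $P_{\mathrm{rad}}$ from a finite set of far-field samples is therefore the step that limits the accuracy of a measured directivity.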

  12. The UCSC Genome Browser database: 2016 update.

    Science.gov (United States)

    Speir, Matthew L; Zweig, Ann S; Rosenbloom, Kate R; Raney, Brian J; Paten, Benedict; Nejad, Parisa; Lee, Brian T; Learned, Katrina; Karolchik, Donna; Hinrichs, Angie S; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Fujita, Pauline A; Eisenhart, Christopher; Diekhans, Mark; Clawson, Hiram; Casper, Jonathan; Barber, Galt P; Haussler, David; Kuhn, Robert M; Kent, W James

    2016-01-01

    For the past 15 years, the UCSC Genome Browser (http://genome.ucsc.edu/) has served the international research community by offering an integrated platform for viewing and analyzing information from a large database of genome assemblies and their associated annotations. The UCSC Genome Browser has been under continuous development since its inception with new data sets and software features added frequently. Some release highlights of this year include new and updated genome browsers for various assemblies, including bonobo and zebrafish; new gene annotation sets; improvements to track and assembly hub support; and a new interactive tool, the "Data Integrator", for intersecting data from multiple tracks. We have greatly expanded the data sets available on the most recent human assembly, hg38/GRCh38, to include updated gene prediction sets from GENCODE, more phenotype- and disease-associated variants from ClinVar and ClinGen, more genomic regulatory data, and a new multiple genome alignment.

  13. Hohlraum Target Alignment from X-ray Detector Images using Starburst Design Patterns

    Energy Technology Data Exchange (ETDEWEB)

    Leach, R R; Conder, A; Edwards, O; Kroll, J; Kozioziemski, B; Mapoles, E; McGuigan, D; Wilhelmsen, K

    2010-12-14

    National Ignition Facility (NIF) is a high-energy laser facility comprised of 192 laser beams focused with enough power and precision on a hydrogen-filled spherical, cryogenic target to initiate a fusion reaction. The target container, or hohlraum, must be accurately aligned to an x-ray imaging system to allow careful monitoring of the frozen fuel layer in the target. To achieve alignment, x-ray images are acquired through starburst-shaped windows cut into opposite sides of the hohlraum. When the hohlraum is in alignment, the starburst pattern pairs match nearly exactly and allow a clear view of the ice layer formation on the edge of the target capsule. During the alignment process, x-ray image analysis is applied to determine the direction and magnitude of adjustment required. X-ray detector and source are moved in concert during the alignment process. The automated pointing alignment system described here is both accurate and efficient. In this paper, we describe the control and associated image processing that enables automation of the starburst pointing alignment.

  14. Aligning seminars with Bologna requirements

    DEFF Research Database (Denmark)

    Lueg, Klarissa; Lueg, Rainer; Lauridsen, Ole

    2016-01-01

    Changes in public policy, such as the Bologna Process, require students to be equipped with multifunctional competencies to master relevant tasks in unfamiliar situations. Achieving this goal might imply a change in many curricula toward deeper learning. As a didactical means to achieve deep learning results, the authors suggest reciprocal peer tutoring (RPT); as a conceptual framework the authors suggest the SOLO (Structure of Observed Learning Outcomes) taxonomy and constructive alignment as suggested by Biggs and Tang. Our study presents results from the introduction of RPT in a large course. The authors find that RPT produces satisfying learning outcomes, active students, and ideal constructive alignments of the seminar content with the exam, the intended learning outcomes, and the requirements of the Bologna Process. Our data, which comprise surveys and evaluations from both faculty...

  15. Prism Window for Optical Alignment

    Science.gov (United States)

    Tang, Hong

    2008-01-01

    A prism window has been devised for use, with an autocollimator, in aligning optical components that are (1) required to be oriented parallel to each other and/or at a specified angle of incidence with respect to a common optical path and (2) mounted at different positions along the common optical path. The prism window can also be used to align a single optical component at a specified angle of incidence. Prism windows could be generally useful for orienting optical components in manufacture of optical instruments. "Prism window" denotes an application-specific unit comprising two beam-splitter windows that are bonded together at an angle chosen to obtain the specified angle of incidence.

  16. Aligned mesoporous architectures and devices.

    Energy Technology Data Exchange (ETDEWEB)

    Brinker, C. Jeffrey; Lu, Yunfeng (University of California Los Angeles, Los Angeles, CA)

    2011-03-01

    This is the final report for the Presidential Early Career Award for Science and Engineering - PECASE (LDRD projects 93369 and 118841) awarded to Professor Yunfeng Lu (Tulane University and University of California-Los Angeles). During the last decade, mesoporous materials with tunable periodic pores have been synthesized using surfactant liquid crystals as templates, opening a new avenue for a wide spectrum of applications. However, the applications are somewhat limited by the unfavorable pore orientation of these materials. Although substantial effort has been devoted to aligning the pore channels, fabrication of mesoporous materials with perpendicular pore channels remains challenging. This project focused on fabrication of mesoporous materials with perpendicularly aligned pore channels. We demonstrated structures for use in water purification, separation, sensors, templated synthesis, microelectronics, optics, controlled release, and highly selective catalysts.

  17. The Cluster Substructure - Alignment Connection

    OpenAIRE

    Plionis, Manolis

    2001-01-01

    Using the APM cluster data we investigate whether the dynamical status of clusters is related to the large-scale structure of the Universe. We find that cluster substructure is strongly correlated with the tendency of clusters to be aligned with their nearest neighbour and in general with the nearby clusters that belong to the same supercluster. Furthermore, dynamically young clusters are more clustered than the overall cluster population. These are strong indications that clusters develop in ...

  18. Grain alignment in starless cores

    Energy Technology Data Exchange (ETDEWEB)

    Jones, T. J.; Bagley, M. [Minnesota Institute for Astrophysics, University of Minnesota, Minneapolis, MN 55455 (United States); Krejny, M. [Cree Inc., 4600 Silicon Dr., Durham, NC (United States); Andersson, B.-G. [SOFIA Science Center, USRA, Moffett Field, CA (United States); Bastien, P., E-mail: tjj@astro.umn.edu [Centre de recherche en astrophysique du Québec and Département de Physique, Université de Montréal, Montréal (Canada)

    2015-01-01

    We present near-IR polarimetry data of background stars shining through a selection of starless cores taken in the K band, probing visual extinctions up to A_V ∼ 48. We find that P_K/τ_K continues to decline with increasing A_V with a power law slope of roughly −0.5. Examination of published submillimeter (submm) polarimetry of starless cores suggests that by A_V ≳ 20 the slope for P versus τ becomes ∼ −1, indicating no grain alignment at greater optical depths. Combining these two data sets, we find good evidence that, in the absence of a central illuminating source, the dust grains in dense molecular cloud cores with no internal radiation source cease to become aligned with the local magnetic field at optical depths greater than A_V ∼ 20. A simple model relating the alignment efficiency to the optical depth into the cloud reproduces the observations well.

  19. From Word Alignment to Word Senses, via Multilingual Wordnets

    Directory of Open Access Journals (Sweden)

    Dan Tufis

    2006-05-01

    Full Text Available Most of the successful commercial applications in language processing (text and/or speech) dispense with any explicit concern for semantics, with the usual motivations stemming from the high computational costs required for dealing with semantics in the case of large volumes of data. With recent advances in corpus linguistics and statistical-based methods in NLP, revealing useful semantic features of linguistic data is becoming cheaper and cheaper, and the accuracy of this process is steadily improving. Lately, there seems to be a growing acceptance of the idea that multilingual lexical ontologies might be the key towards aligning different views on the semantic atomic units to be used in characterizing the general meaning of various and multilingual documents. Depending on the granularity at which semantic distinctions are necessary, the accuracy of the basic semantic processing (such as word sense disambiguation) can be very high with relatively low-complexity computing. The paper substantiates this statement by presenting a statistical-based system for word alignment and word sense disambiguation in parallel corpora. We describe a word alignment platform which ensures text pre-processing (tokenization, POS-tagging, lemmatization, chunking, sentence and word alignment) as required by an accurate word sense disambiguation.

  20. Hardware Acceleration of Bioinformatics Sequence Alignment Applications

    NARCIS (Netherlands)

    Hasan, L.

    2011-01-01

    Biological sequence alignment is an important and challenging task in bioinformatics. Alignment may be defined as an arrangement of two or more DNA or protein sequences to highlight the regions of their similarity. Sequence alignment is used to infer the evolutionary relationship between a set of pr

  1. Physician-Hospital Alignment in Orthopedic Surgery.

    Science.gov (United States)

    Bushnell, Brandon D

    2015-09-01

    The concept of "alignment" between physicians and hospitals is a popular buzzword in the age of health care reform. Despite their often tumultuous histories, physicians and hospitals find themselves under increasing pressures to work together toward common goals. However, effective alignment is more than just simple cooperation between parties. The process of achieving alignment does not have simple, universal steps. Alignment will differ based on individual situational factors and the type of specialty involved. Ultimately, however, there are principles that underlie the concept of alignment and should be a part of any physician-hospital alignment efforts. In orthopedic surgery, alignment involves the clinical, administrative, financial, and even personal aspects of a surgeon's practice. It must be based on the principles of financial interest, clinical authority, administrative participation, transparency, focus on the patient, and mutual necessity. Alignment can take on various forms as well, with popular models consisting of shared governance and comanagement, gainsharing, bundled payments, accountable care organizations, and other methods. As regulatory and financial pressures continue to motivate physicians and hospitals to develop alignment relationships, new and innovative methods of alignment will also appear. Existing models will mature and evolve, with individual variability based on local factors. However, certain trends seem to be appearing as time progresses and alignment relationships deepen, including regional and national collaboration, population management, and changes in the legal system. This article explores the history, principles, and specific methods of physician-hospital alignment and its critical importance for the future of health care delivery.

  2. An Overview of Multiple Sequence Alignment Systems

    CERN Document Server

    Saeed, Fahad

    2009-01-01

    An overview of current multiple alignment systems is presented. The useful algorithms, the procedures adopted and their limitations are described. We also present the quality of the alignments obtained and the cases (kind of alignments, kind of sequences, etc.) in which the particular systems are useful.

  3. Inferring comprehensible business/ICT alignment rules

    NARCIS (Netherlands)

    Cumps, B.; Martens, D.; De Backer, M.; Haesen, R.; Viaene, S.; Dedene, G.; Baesens, B.; Snoeck, M.

    2009-01-01

    We inferred business rules for business/ICT alignment by applying a novel rule induction algorithm on a data set containing rich alignment information polled from 641 organisations in 7 European countries. The alignment rule set was created using AntMiner+, a rule induction technique with a reputati

  4. Shift dynamics of capillary self-alignment

    NARCIS (Netherlands)

    Arutinov, G.; Mastrangeli, M.; Smits, E.C.P.; Heck, G.V.; Schoo, H.F.M.; Toonder, J.J.M. den; Dietzel, A.H.

    2014-01-01

    This paper describes the dynamics of capillary self-alignment of components with initial shift offsets from matching receptor sites. The analysis of the full uniaxial self-alignment dynamics of foil-based mesoscopic dies from pre-alignment to final settling evidenced three distinct, sequential regim

  5. Alignment of lower-limb prostheses.

    Science.gov (United States)

    Zahedi, M S; Spence, W D; Solomonidis, S E; Paul, J P

    1986-04-01

    Alignment of a prosthesis is defined as the position of the socket relative to the other prosthetic components of the limb. During dynamic alignment the prosthetist, using subjective judgment and feedback from the patient, aims to achieve the most suitable limb geometry for best function and comfort. Until recently it was generally believed that a patient could only be satisfied with a unique "optimum alignment." The purpose of this systematic study of lower-limb alignment parameters was to gain an understanding of the factors that make a limb configuration, or optimum alignment, acceptable to the patient, and to obtain a measure of the variation of this alignment that would be acceptable to the amputee. In this paper, the acceptable ranges of alignment for 10 below- and 10 above-knee amputees are established. Three prosthetists were involved in the majority of the 183 below-knee and 100 above-knee fittings, although several other prosthetists were also involved. The effects of each different prosthetist on the established range of alignment for each patient are reported to be significant. It is now established that an amputee can tolerate several alignments ranging in some parameters by as much as 148 mm in shifts and 17 degrees in tilts. This paper describes the method of defining and measuring the alignment of lower-limb prostheses. It presents quantitatively established values for bench alignment position and the range of adjustment required for incorporation into the design of new alignment units.

  6. Aligning Projection Images from Binary Volumes

    NARCIS (Netherlands)

    Bleichrodt, F.; Beenhouwer, J. de; Sijbers, J.; Batenburg, K.J.

    2014-01-01

    In tomography, slight differences between the geometry of the scanner hardware and the geometric model used in the reconstruction lead to alignment artifacts. To exploit high-resolution detectors used in many applications of tomography, alignment of the projection data is essential. Markerless align

  7. Vertically aligned nanostructure scanning probe microscope tips

    Science.gov (United States)

    Guillorn, Michael A.; Ilic, Bojan; Melechko, Anatoli V.; Merkulov, Vladimir I.; Lowndes, Douglas H.; Simpson, Michael L.

    2006-12-19

    Methods and apparatus are described for cantilever structures that include a vertically aligned nanostructure, especially vertically aligned carbon nanofiber scanning probe microscope tips. An apparatus includes a cantilever structure including a substrate including a cantilever body, that optionally includes a doped layer, and a vertically aligned nanostructure coupled to the cantilever body.

  8. Strategic Alignment and New Product Development

    DEFF Research Database (Denmark)

    Acur, Nuran; Kandemir, Destan; Boer, Harry

    2012-01-01

    Strategic alignment is widely accepted as a prerequisite for a firm’s success, but insight into the role of alignment in, and its impact on, the new product development (NPD) process and its performance is less well developed. Most publications on this topic either focus on one form of alignment o...

  9. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  10. The Oryza Map Alignment Project (OMAP) introgression lines for allelic diversity and new germplasm development

    Science.gov (United States)

    The Oryza Map Alignment Project (OMAP) has developed a genus wide model system for the study of rice that will ultimately provide a complete understanding of the genus. The purpose of this project is to capitalize on the strengths of the Arizona Genomics Institute (AGI), OMAP participants and the r...

  11. Hydropathy profile alignment : a tool to search for structural homologues of membrane proteins

    NARCIS (Netherlands)

    Lolkema, JS; Slotboom, DJ

    1998-01-01

    Hydropathy profile alignment is introduced as a tool in functional genomics. The architecture of membrane proteins is reflected in the hydropathy profile of the amino acid sequence. Both secondary and tertiary structural elements determine the profile which provides enough sensitivity to detect evol

  12. Accurate Modeling of Advanced Reflectarrays

    DEFF Research Database (Denmark)

    Zhou, Min

    Analysis and optimization methods for the design of advanced printed reflectarrays have been investigated, and the study is focused on developing an accurate and efficient simulation tool. For the analysis, a good compromise between accuracy and efficiency can be obtained using the spectral domain...... to the POT. The GDOT can optimize for the size as well as the orientation and position of arbitrarily shaped array elements. Both co- and cross-polar radiation can be optimized for multiple frequencies, dual polarization, and several feed illuminations. Several contoured beam reflectarrays have been designed...... using the GDOT to demonstrate its capabilities. To verify the accuracy of the GDOT, two offset contoured beam reflectarrays that radiate a high-gain beam on a European coverage have been designed and manufactured, and subsequently measured at the DTU-ESA Spherical Near-Field Antenna Test Facility...

  13. The Accurate Particle Tracer Code

    CERN Document Server

    Wang, Yulei; Qin, Hong; Yu, Zhi

    2016-01-01

    The Accurate Particle Tracer (APT) code is designed for large-scale particle simulations on dynamical systems. Based on a large variety of advanced geometric algorithms, APT possesses long-term numerical accuracy and stability, which are critical for solving multi-scale and non-linear problems. Under the well-designed integrated and modularized framework, APT serves as a universal platform for researchers from different fields, such as plasma physics, accelerator physics, space science, fusion energy research, computational mathematics, software engineering, and high-performance computation. The APT code consists of seven main modules, including the I/O module, the initialization module, the particle pusher module, the parallelization module, the field configuration module, the external force-field module, and the extendible module. The I/O module, supported by Lua and Hdf5 projects, provides a user-friendly interface for both numerical simulation and data analysis. A series of new geometric numerical methods...

  14. Accurate ab initio spin densities

    CERN Document Server

    Boguslawski, Katharina; Legeza, Örs; Reiher, Markus

    2012-01-01

    We present an approach for the calculation of spin density distributions for molecules that require very large active spaces for a qualitatively correct description of their electronic structure. Our approach is based on the density-matrix renormalization group (DMRG) algorithm to calculate the spin density matrix elements as basic quantity for the spatially resolved spin density distribution. The spin density matrix elements are directly determined from the second-quantized elementary operators optimized by the DMRG algorithm. As an analytic convergence criterion for the spin density distribution, we employ our recently developed sampling-reconstruction scheme [J. Chem. Phys. 2011, 134, 224101] to build an accurate complete-active-space configuration-interaction (CASCI) wave function from the optimized matrix product states. The spin density matrix elements can then also be determined as an expectation value employing the reconstructed wave function expansion. Furthermore, the explicit reconstruction of a CA...

  15. Accurate thickness measurement of graphene.

    Science.gov (United States)

    Shearer, Cameron J; Slattery, Ashley D; Stapleton, Andrew J; Shapter, Joseph G; Gibson, Christopher T

    2016-03-29

    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  16. Accurate thickness measurement of graphene

    Science.gov (United States)

    Shearer, Cameron J.; Slattery, Ashley D.; Stapleton, Andrew J.; Shapter, Joseph G.; Gibson, Christopher T.

    2016-03-01

    Graphene has emerged as a material with a vast variety of applications. The electronic, optical and mechanical properties of graphene are strongly influenced by the number of layers present in a sample. As a result, the dimensional characterization of graphene films is crucial, especially with the continued development of new synthesis methods and applications. A number of techniques exist to determine the thickness of graphene films including optical contrast, Raman scattering and scanning probe microscopy techniques. Atomic force microscopy (AFM), in particular, is used extensively since it provides three-dimensional images that enable the measurement of the lateral dimensions of graphene films as well as the thickness, and by extension the number of layers present. However, in the literature AFM has proven to be inaccurate with a wide range of measured values for single layer graphene thickness reported (between 0.4 and 1.7 nm). This discrepancy has been attributed to tip-surface interactions, image feedback settings and surface chemistry. In this work, we use standard and carbon nanotube modified AFM probes and a relatively new AFM imaging mode known as PeakForce tapping mode to establish a protocol that will allow users to accurately determine the thickness of graphene films. In particular, the error in measuring the first layer is reduced from 0.1-1.3 nm to 0.1-0.3 nm. Furthermore, in the process we establish that the graphene-substrate adsorbate layer and imaging force, in particular the pressure the tip exerts on the surface, are crucial components in the accurate measurement of graphene using AFM. These findings can be applied to other 2D materials.

  17. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
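
    The abstract reports a haplotype N50 of 484 kb. As a reminder of what that statistic means, the sketch below computes N50 under the usual definition: the length L such that blocks of length at least L cover at least half of the total. The function name and the toy lengths are illustrative assumptions, not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or haplotype-block lengths.

    Sort the lengths from longest to shortest and return the length at which
    the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy block lengths (arbitrary units), not values from the study.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

    A larger N50 indicates that more of the assembly sits in long contiguous blocks, which is why it is commonly quoted alongside total assembly size.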

  18. A Vondrak low pass filter for IMU sensor initial alignment on a disturbed base.

    Science.gov (United States)

    Li, Zengke; Wang, Jian; Gao, Jingxiang; Li, Binghao; Zhou, Feng

    2014-12-10

    The initial alignment of the Inertial Measurement Unit (IMU) is an important process of INS to determine the coordinate transformation matrix which is used in the integration of Global Positioning Systems (GPS) with Inertial Navigation Systems (INS). In this paper a novel alignment method for a disturbed base, such as a vehicle disturbed by wind outdoors, implemented with the aid of a Vondrak low pass filter, is proposed. The basic principle of initial alignment including coarse alignment and fine alignment is introduced first. The spectral analysis is processed to compare the differences between the characteristic error of INS force observation on a stationary base and on disturbed bases. In order to reduce the high frequency noise in the force observation more accurately and more easily, a Vondrak low pass filter is constructed based on the spectral analysis result. The genetic algorithms method is introduced to choose the smoothing factor in the Vondrak filter and the corresponding objective condition is built. The architecture of the proposed alignment method with the Vondrak low pass filter is shown. Furthermore, simulated experiments and actual experiments were performed to validate the new algorithm. The results indicate that, compared with the conventional alignment method, the Vondrak filter could eliminate the high frequency noise in the force observation and the proposed alignment method could improve the attitude accuracy. At the same time, only one parameter needs to be set, which makes the proposed method easier to implement than other low-pass filter methods.

  19. A Vondrak Low Pass Filter for IMU Sensor Initial Alignment on a Disturbed Base

    Directory of Open Access Journals (Sweden)

    Zengke Li

    2014-12-01

    The initial alignment of the Inertial Measurement Unit (IMU) is an important process of INS to determine the coordinate transformation matrix which is used in the integration of Global Positioning Systems (GPS) with Inertial Navigation Systems (INS). In this paper a novel alignment method for a disturbed base, such as a vehicle disturbed by wind outdoors, implemented with the aid of a Vondrak low pass filter, is proposed. The basic principle of initial alignment including coarse alignment and fine alignment is introduced first. The spectral analysis is processed to compare the differences between the characteristic error of INS force observation on a stationary base and on disturbed bases. In order to reduce the high frequency noise in the force observation more accurately and more easily, a Vondrak low pass filter is constructed based on the spectral analysis result. The genetic algorithms method is introduced to choose the smoothing factor in the Vondrak filter and the corresponding objective condition is built. The architecture of the proposed alignment method with the Vondrak low pass filter is shown. Furthermore, simulated experiments and actual experiments were performed to validate the new algorithm. The results indicate that, compared with the conventional alignment method, the Vondrak filter could eliminate the high frequency noise in the force observation and the proposed alignment method could improve the attitude accuracy. At the same time, only one parameter needs to be set, which makes the proposed method easier to implement than other low-pass filter methods.

  20. Measures of frontal plane lower limb alignment obtained from static radiographs and dynamic gait analysis.

    Science.gov (United States)

    Hunt, Michael A; Birmingham, Trevor B; Jenkyn, Thomas R; Giffin, J Robert; Jones, Ian C

    2008-05-01

    Currently, lower limb alignment is measured statically from radiographs that may not accurately represent the condition of the limb when moving and weight-bearing. Thus, the purpose of the present study was to introduce and examine a novel measure of dynamic lower limb alignment obtained during walking in patients with knee OA. In this cross-sectional study, standing, full-length lower limb radiographs were acquired from 80 individuals with confirmed knee OA, who also underwent three-dimensional gait analyses with reflective markers placed on the segments of the lower limb. Frontal plane lower limb alignment was measured using the static radiographs (mechanical axis) and gait analyses (marker-based alignment) by identifying the centres of the hip, knee, and ankle from both methods. Simple linear regression indicated these measures were highly correlated (r=0.84), however, 30% of the variance in the marker-based measure of lower limb alignment was not explained by the mechanical axis despite using the same anatomical landmarks. Results from this study suggest that a valid measure of dynamic lower limb alignment can be obtained from a standard quantitative gait analysis and highlight the differences in measures of lower limb alignment obtained in static and dynamic situations. Future research into the clinical utility of measures of dynamic alignment in the treatment of OA may aid in the development of interventions specifically tailored to one's dynamic lower limb biomechanics during gait.
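
    The link between the reported correlation and the unexplained variance follows from the coefficient of determination in simple linear regression. The worked step below uses only the r value quoted in the abstract; the rounding is ours.

```latex
r^2 = (0.84)^2 \approx 0.706, \qquad 1 - r^2 \approx 0.294
```

    That is, the mechanical axis explains about 71% of the variance in the marker-based measure, which leaves roughly the 30% quoted above unexplained.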

  1. Automatic spreader-container alignment system using infrared structured lights.

    Science.gov (United States)

    Liu, Yu; Wang, Yibo; Lv, Jimin; Zhang, Maojun

    2012-06-01

    This paper presents a computer-vision system to assist reach stackers to automatically align the spreader with the target container. By analyzing infrared lines on the top of the container, the proposed system is able to calculate the relative position between the spreader and the container. The invisible structured lights are equipped in this system to enable all-weather operation, which can avoid environmental factors such as shadows and differences in climate. Additionally, the lateral inclination of the spreader is taken into consideration to offer a more accurate alignment than other competing systems. Estimation errors are reduced through approaches including power series and linear regression. The accuracy can be controlled within 2 cm or 2 deg, which meets the requirements of reach stackers' operation.

  2. Alignment of wave functions for angular momentum projection

    CERN Document Server

    Taniguchi, Yasutaka

    2016-01-01

    Angular momentum projection is used to obtain eigenstates of angular momentum from general wave functions. Multi-configuration mixing calculation with angular momentum projection is an important microscopic method in nuclear physics. For accurate multi-configuration mixing calculation with angular momentum projection, concentrated distribution of $z$ components $K$ of angular momentum in the body-fixed frame ($K$-distribution) is favored. Orientation of wave functions strongly affects $K$-distribution. Minimization of variance of $\hat{J}_z$ is proposed as an alignment method to obtain wave functions that have concentrated $K$-distribution. Benchmark calculations are performed for $\alpha$-$^{24}$Mg cluster structure, triaxially superdeformed states in $^{40}$Ar, and Hartree-Fock states of some nuclei. The proposed alignment method is useful and works well for various wave functions to obtain concentrated $K$-distribution.

  3. Aligning molecules with intense nonresonant laser fields

    DEFF Research Database (Denmark)

    Larsen, J.J.; Safvan, C.P.; Sakai, H.;

    1999-01-01

    Molecules in a seeded supersonic beam are aligned by the interaction between an intense nonresonant linearly polarized laser field and the molecular polarizability. We demonstrate the general applicability of the scheme by aligning I2, ICl, CS2, CH3I, and C6H5I molecules. The alignment is probed...... by mass selective two dimensional imaging of the photofragment ions produced by femtosecond laser pulses. Calculations on the degree of alignment of I2 are in good agreement with the experiments. We discuss some future applications of laser aligned molecules....

  4. Subsonic Mechanical Alignment of Irregular Grains

    CERN Document Server

    Lazarian, Alex

    2007-01-01

    We show that grains can be efficiently aligned by interacting with a subsonic gaseous flow. The alignment arises from grains having irregularities that scatter atoms with different efficiency in the right and left directions. The grains tend to align with long axes perpendicular to magnetic field, which corresponds to Davis-Greenstein predictions, but does not involve magnetic field. For rather conservative factors characterizing the grain helicity and scattering efficiency of impinging atoms, the alignment of helical grains is much more efficient than the Gold-type alignment processes.

  5. Alignment of Short Reads: A Crucial Step for Application of Next-Generation Sequencing Data in Precision Medicine

    Directory of Open Access Journals (Sweden)

    Hao Ye

    2015-11-01

    Precision medicine or personalized medicine has been proposed as a modernized and promising medical strategy. Genetic variants of patients are the key information for implementation of precision medicine. Next-generation sequencing (NGS) is an emerging technology for deciphering genetic variants. Alignment of raw reads to a reference genome is one of the key steps in NGS data analysis. Many algorithms have been developed for alignment of short read sequences since 2008. Users have to make a decision on which alignment algorithm to use in their studies. Selection of the right alignment algorithm determines not only the alignment algorithm but also the set of suitable parameters to be used by the algorithm. Understanding these algorithms helps in selecting the appropriate alignment algorithm for different applications in precision medicine. Here, we review currently available algorithms and their major strategies such as seed-and-extend and q-gram filter. We also discuss the challenges in current alignment algorithms, including alignment in multiple repeated regions, long reads alignment and alignment facilitated with known genetic variants.
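
    The seed-and-extend strategy named in this abstract can be illustrated with a deliberately small sketch. It is not any published aligner: the hash index of fixed-length seeds, the vote counting over implied start positions, and the mismatch-only (Hamming) verification step are simplifying assumptions, and all function names and sequences are hypothetical.

```python
from collections import Counter, defaultdict


def build_seed_index(reference, seed_len):
    """Map every substring of length seed_len to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)
    return index


def candidate_starts(read, index, seed_len):
    """Seed step: each seed hit votes for the read start position it implies."""
    votes = Counter()
    for offset in range(len(read) - seed_len + 1):
        for pos in index.get(read[offset:offset + seed_len], ()):
            votes[pos - offset] += 1
    return votes


def align(read, reference, seed_len=4, max_mismatches=1):
    """Extend step: verify candidates by counting mismatches (no indels).

    Returns (start, mismatches) for the best candidate, or None if no
    candidate passes the mismatch threshold.
    """
    index = build_seed_index(reference, seed_len)
    best = None
    for start, _ in candidate_starts(read, index, seed_len).most_common():
        if start < 0 or start + len(read) > len(reference):
            continue
        window = reference[start:start + len(read)]
        mismatches = sum(x != y for x, y in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


# Hypothetical toy sequences for illustration only.
reference = "TTGACCGATTACAGGCATG"
print(align("GATTACA", reference))
print(align("GATTCCA", reference))
```

    Real aligners replace the verification step with a gapped dynamic-programming extension and use far more compact indexes; the point here is only the division of labour between cheap seed lookup and costlier candidate checking.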

  6. Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method

    Directory of Open Access Journals (Sweden)

    Lund Ole

    2007-07-01

    Background: Antigen presenting cells (APCs) sample the extracellular space and present peptides from here to T helper cells, which can be activated if the peptides are of foreign origin. The peptides are presented on the surface of the cells in complex with major histocompatibility class II (MHC II) molecules. Identification of peptides that bind MHC II molecules is thus a key step in rational vaccine design and developing methods for accurate prediction of the peptide:MHC interactions play a central role in epitope discovery. The MHC class II binding groove is open at both ends making the correct alignment of a peptide in the binding groove a crucial part of identifying the core of an MHC class II binding motif. Here, we present a novel stabilization matrix alignment method, SMM-align, that allows for direct prediction of peptide:MHC binding affinities. The predictive performance of the method is validated on a large MHC class II benchmark data set covering 14 HLA-DR (human MHC) and three mouse H2-IA alleles. Results: The predictive performance of the SMM-align method was demonstrated to be superior to that of the Gibbs sampler, TEPITOPE, SVRMHC, and MHCpred methods. Cross validation between peptide data sets obtained from different sources demonstrated that direct incorporation of peptide length potentially results in over-fitting of the binding prediction method. Focusing on amino terminal peptide flanking residues (PFR), we demonstrate a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. Visualizing the binding motif as obtained by the SMM-align and TEPITOPE methods highlights a series of fundamental discrepancies between the two predicted motifs. For the DRB1*1302 allele for instance, the TEPITOPE method favors basic amino acids at most anchor positions, whereas the SMM-align method identifies a preference for hydrophobic or neutral amino acids at the anchors.
Conclusion

  7. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed the human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that we could classify the functional regions of genome based on sequence feature and discriminant analysis.
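
    One of the alignment-free features listed above, dinucleotide relative abundance, is easy to make concrete. The sketch below assumes the commonly used odds-ratio definition rho_XY = f_XY / (f_X * f_Y); the abstract does not state which variant the authors used (for example, whether strands are symmetrized), so treat the formula and the function name as assumptions.

```python
from collections import Counter


def dinucleotide_relative_abundance(seq):
    """Return {dinucleotide: rho} with rho_XY = f_XY / (f_X * f_Y).

    f_X is the frequency of base X and f_XY the frequency of the overlapping
    dinucleotide XY. Values near 1 mean the pair occurs about as often as the
    base composition alone would predict.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = len(seq)
    n_di = len(seq) - 1
    rho = {}
    for pair, count in di.items():
        f_x = mono[pair[0]] / n_mono
        f_y = mono[pair[1]] / n_mono
        rho[pair] = (count / n_di) / (f_x * f_y)
    return rho


# Hypothetical toy sequence for illustration only.
for pair, value in sorted(dinucleotide_relative_abundance("ACGCGCGTAT").items()):
    print(pair, round(value, 3))
```

    A vector of these values per region is the kind of feature that can then be fed to principal component or discriminant analysis.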

  8. Galaxy alignments: Theory, modelling and simulations

    CERN Document Server

    Kiessling, Alina; Joachimi, Benjamin; Kirk, Donnacha; Kitching, Thomas D; Leonard, Adrienne; Mandelbaum, Rachel; Schäfer, Björn Malte; Sifón, Cristóbal; Brown, Michael L; Rassat, Anais

    2015-01-01

    The shapes of galaxies are not randomly oriented on the sky. During the galaxy formation and evolution process, environment has a strong influence, as tidal gravitational fields in large-scale structure tend to align the shapes and angular momenta of nearby galaxies. Additionally, events such as galaxy mergers affect the relative alignments of galaxies throughout their history. These "intrinsic galaxy alignments" are known to exist, but are still poorly understood. This review will offer a pedagogical introduction to the current theories that describe intrinsic galaxy alignments, including the apparent difference in intrinsic alignment between early- and late-type galaxies and the latest efforts to model them analytically. It will then describe the ongoing efforts to simulate intrinsic alignments using both $N$-body and hydrodynamic simulations. Due to the relative youth of this field, there is still much to be done to understand intrinsic galaxy alignments and this review summarises the current state of the ...

  9. FOGSAA: Fast Optimal Global Sequence Alignment Algorithm

    Science.gov (United States)

    Chakraborty, Angana; Bandyopadhyay, Sanghamitra

    2013-04-01

    In this article we propose a Fast Optimal Global Sequence Alignment Algorithm, FOGSAA, which aligns a pair of nucleotide/protein sequences faster than any optimal global alignment method including the widely used Needleman-Wunsch (NW) algorithm. FOGSAA is applicable for all types of sequences, with any scoring scheme, and with or without affine gap penalty. Compared to NW, FOGSAA achieves a time gain of (70-90)% for highly similar nucleotide sequences (> 80% similarity), and (54-70)% for sequences having (30-80)% similarity. For other sequences, it terminates with an approximate score. For protein sequences, the average time gain is between (25-40)%. Compared to three heuristic global alignment methods, the quality of alignment is improved by about 23%-53%. FOGSAA is, in general, suitable for aligning any two sequences defined over a finite alphabet set, where the quality of the global alignment is of supreme importance.
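
    For context on the baseline that FOGSAA is compared against, here is a minimal sketch of the Needleman-Wunsch dynamic program. It computes only the optimal global score, uses a linear gap penalty, and the scoring values are arbitrary assumptions; it does not implement FOGSAA's search strategy.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty.

    Fills the dynamic-programming table row by row, keeping only the previous
    row, so memory is O(len(b)) while time stays O(len(a) * len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diagonal, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


# Hypothetical toy sequences for illustration only.
print(needleman_wunsch_score("ACGT", "ACGT"))
print(needleman_wunsch_score("ACGT", "AGT"))
```

    Because every cell of the table is always filled, the running time does not depend on how similar the sequences are, which is the cost that the time gains reported in the abstract are measured against.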

  10. Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space.

    Science.gov (United States)

    Wilton, Richard; Budavari, Tamas; Langmead, Ben; Wheelan, Sarah J; Salzberg, Steven L; Szalay, Alexander S

    2015-01-01

    When computing alignments of DNA sequences to a large genome, a key element in achieving high processing throughput is to prioritize locations in the genome where high-scoring mappings might be expected. We formulated this task as a series of list-processing operations that can be efficiently performed on graphics processing unit (GPU) hardware. We followed this approach in implementing a read aligner called Arioc that uses GPU-based parallel sort and reduction techniques to identify high-priority locations where potential alignments may be found. We then carried out a read-by-read comparison of Arioc's reported alignments with the alignments found by several leading read aligners. With simulated reads, Arioc has comparable or better accuracy than the other read aligners we tested. With human sequencing reads, Arioc demonstrates significantly greater throughput than the other aligners we evaluated across a wide range of sensitivity settings. The Arioc software is available at https://github.com/RWilton/Arioc. It is released under a BSD open-source license.

  11. Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space

    Directory of Open Access Journals (Sweden)

    Richard Wilton

    2015-03-01

    When computing alignments of DNA sequences to a large genome, a key element in achieving high processing throughput is to prioritize locations in the genome where high-scoring mappings might be expected. We formulated this task as a series of list-processing operations that can be efficiently performed on graphics processing unit (GPU) hardware. We followed this approach in implementing a read aligner called Arioc that uses GPU-based parallel sort and reduction techniques to identify high-priority locations where potential alignments may be found. We then carried out a read-by-read comparison of Arioc’s reported alignments with the alignments found by several leading read aligners. With simulated reads, Arioc has comparable or better accuracy than the other read aligners we tested. With human sequencing reads, Arioc demonstrates significantly greater throughput than the other aligners we evaluated across a wide range of sensitivity settings. The Arioc software is available at https://github.com/RWilton/Arioc. It is released under a BSD open-source license.

  12. Pupil Alignment Measuring Technique and Alignment Reference for Instruments or Optical Systems

    Science.gov (United States)

    Hagopian, John G.

    2010-01-01

    A technique was created to measure the pupil alignment of instruments in situ by measuring calibrated pupil alignment references (PARs) in instruments. The PAR can also be measured using an alignment telescope or an imaging system. PAR allows the verification of the science instrument (SI) pupil alignment at the integrated science instrument module (ISIM) level of assembly at ambient and cryogenic operating temperature. This will allow verification of the ISIM+SI alignment, and provide feedback to realign the SI if necessary.

  13. SOAP3: ultra-fast GPU-based parallel alignment tool for short reads.

    Science.gov (United States)

    Liu, Chi-Man; Wong, Thomas; Wu, Edward; Luo, Ruibang; Yiu, Siu-Ming; Li, Yingrui; Wang, Bingqiang; Yu, Chang; Chu, Xiaowen; Zhao, Kaiyong; Li, Ruiqiang; Lam, Tak-Wah

    2012-03-15

    SOAP3 is the first short read alignment tool that leverages the multi-processors in a graphic processing unit (GPU) to achieve a drastic improvement in speed. We adapted the compressed full-text index (BWT) used by SOAP2 in view of the advantages and disadvantages of GPU. When tested with millions of Illumina Hiseq 2000 length-100 bp reads, SOAP3 takes < 30 s to align a million read pairs onto the human reference genome and is at least 7.5 and 20 times faster than BWA and Bowtie, respectively. For aligning reads with up to four mismatches, SOAP3 aligns slightly more reads than BWA and Bowtie; this is because SOAP3, unlike BWA and Bowtie, is not heuristic-based and always reports all answers.

  14. Aligned interactions in cosmic rays

    Energy Technology Data Exchange (ETDEWEB)

    Kempa, J., E-mail: kempa@pw.plock.pl [Warsaw University of Technology Branch Plock (Poland)

    2015-12-15

    The first clean Centauro was found in cosmic rays many years ago at the Mt Chacaltaya experiment. Since that time, many people have tried to find this type of interaction, both in cosmic rays and at accelerators, but no one has found a clean case of this type of interaction. It happened finally in the last exposure of emulsion at Mt Chacaltaya, where the second clean Centauro has been found. The experimental data for both the Centauros and STRANA will be presented and discussed in this paper. We also present our comments on the intriguing question of the existence of a type of nuclear interaction at high energy with alignment.

  15. MEANS FOR DETERMINING CENTRIFUGE ALIGNMENT

    Science.gov (United States)

    Smith, W.Q.

    1958-08-26

    An apparatus is presented for remotely determining the alignment of a centrifuge. The centrifuge shaft is provided with a shoulder, upon which two followers ride, one for detecting radial movements, and one upon the shoulder face for determining the axial motion. The followers are attached to separate liquid filled bellows, and a tube connects each bellows to its respective indicating gage at a remote location. Vibrations produced by misalignment of the centrifuge shaft are transmitted to the bellows, and thence through the tubing to the indicator gage. This apparatus is particularly useful for operation in a hot cell where the materials handled are dangerous to the operating personnel.

  16. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  17. Genomic taxonomy of vibrios

    Directory of Open Access Journals (Sweden)

    Iida Tetsuya

    2009-10-01

    Full Text Available Abstract Background: Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera) from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA), supertrees, Average Amino Acid Identity (AAI), genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results: We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analysis of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.). A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion: The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in
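
    The species definition quoted in this abstract is a conjunction of four thresholds, which can be written down as a simple predicate. The sketch below is only a restatement of those numbers; the class `PairwiseComparison`, its field names, and the example values are hypothetical, and the abstract does not say how each measure is computed or in what units genome signature dissimilarity is expressed.

```python
from dataclasses import dataclass


@dataclass
class PairwiseComparison:
    """Hypothetical container for the four measures named in the abstract."""
    mlsa_identity_pct: float        # DNA identity in MLSA / supertree analysis
    aai_pct: float                  # Average Amino Acid Identity
    signature_dissimilarity: float  # genome signature dissimilarity
    proteome_identity_pct: float    # proteome identity


def same_vibrio_species(c: PairwiseComparison) -> bool:
    """Apply the thresholds stated in the abstract; all four must hold."""
    return (
        c.mlsa_identity_pct > 95
        and c.aai_pct > 96
        and c.signature_dissimilarity <= 10
        and c.proteome_identity_pct > 61
    )


# Made-up values for illustration, not data from the study.
close_pair = PairwiseComparison(98.0, 97.5, 4.0, 80.0)
borderline_pair = PairwiseComparison(98.0, 96.0, 4.0, 80.0)
```

    Note that three of the thresholds are strict inequalities, so a pair sitting exactly on 96% AAI does not satisfy the definition as stated.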

  18. Automated quantification of aligned collagen for human breast carcinoma prognosis

    Directory of Open Access Journals (Sweden)

    Jeremy S Bredfeldt

    2014-01-01

    Full Text Available Background: Mortality in cancer patients is directly attributable to the ability of cancer cells to metastasize to distant sites from the primary tumor. This migration of tumor cells begins with a remodeling of the local tumor microenvironment, including changes to the extracellular matrix and the recruitment of stromal cells, both of which facilitate invasion of tumor cells into the bloodstream. In breast cancer, it has been proposed that the alignment of collagen fibers surrounding tumor epithelial cells can serve as a quantitative image-based biomarker for survival of invasive ductal carcinoma patients. Specific types of collagen alignment have been identified for their prognostic value and now these tumor associated collagen signatures (TACS) are central to several clinical specimen imaging trials. Here, we implement the semi-automated acquisition and analysis of this TACS candidate biomarker and demonstrate a protocol that will allow consistent scoring to be performed throughout large patient cohorts. Methods: Using large field of view high resolution microscopy techniques, image processing and supervised learning methods, we are able to quantify and score features of collagen fiber alignment with respect to adjacent tumor-stromal boundaries. Results: Our semi-automated technique produced scores that have statistically significant correlation with scores generated by a panel of three human observers. In addition, our system generated classification scores that accurately predicted survival in a cohort of 196 breast cancer patients. Feature rank analysis reveals that TACS positive fibers are better aligned with each other, are of generally lower density, and terminate within or near groups of epithelial cells at larger angles of interaction. Conclusion: These results demonstrate the utility of a supervised learning protocol for streamlining the analysis of collagen alignment with respect to tumor stromal boundaries.

  19. Automated alignment-based curation of gene models in filamentous fungi

    OpenAIRE

    2014-01-01

    Background: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations. Results: We prov...

  20. A More Accurate Fourier Transform

    CERN Document Server

    Courtney, Elya

    2015-01-01

    Fourier transform methods are used to analyze functions and data sets to provide frequencies, amplitudes, and phases of underlying oscillatory components. Fast Fourier transform (FFT) methods offer speed advantages over evaluation of explicit integrals (EI) that define Fourier transforms. This paper compares frequency, amplitude, and phase accuracy of the two methods for well resolved peaks over a wide array of data sets including cosine series with and without random noise and a variety of physical data sets, including atmospheric $\mathrm{CO_2}$ concentrations, tides, temperatures, sound waveforms, and atomic spectra. The FFT uses MIT's FFTW3 library. The EI method uses the rectangle method to compute the areas under the curve via complex math. Results support the hypothesis that EI methods are more accurate than FFT methods. Errors range from 5 to 10 times higher when determining peak frequency by FFT, 1.4 to 60 times higher for peak amplitude, and 6 to 10 times higher for phase under a peak. The ability t...
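
    One way to see why an explicit integral can locate a peak more finely than a discrete transform is that the rectangle-rule sum can be evaluated at any frequency, whereas a discrete Fourier transform of N samples only returns values at multiples of 1/(N·dt). The sketch below is an illustration under that assumption and does not reproduce the paper's method or error figures; it uses a plain direct sum instead of FFTW3, and the function names, the test signal, and the frequency grids are all hypothetical.

```python
import cmath
import math


def fourier_rectangle(samples, dt, freq):
    """Rectangle-rule estimate of the Fourier integral at `freq` (Hz)."""
    return dt * sum(
        x * cmath.exp(-2j * math.pi * freq * n * dt)
        for n, x in enumerate(samples)
    )


def peak_frequency(samples, dt, freqs):
    """Return the candidate frequency with the largest spectral magnitude."""
    return max(freqs, key=lambda f: abs(fourier_rectangle(samples, dt, f)))


# Hypothetical noise-free cosine; 2.34 Hz deliberately falls between DFT bins.
dt, n_samples, true_freq = 0.01, 1000, 2.34
samples = [math.cos(2 * math.pi * true_freq * n * dt) for n in range(n_samples)]

# Frequencies a length-N discrete transform would report (spacing 0.1 Hz),
# restricted to 0-5 Hz to keep the direct sum cheap.
bin_freqs = [k / (n_samples * dt) for k in range(51)]

# A finer, freely chosen grid around the peak for the explicit integral.
fine_freqs = [2.0 + 0.005 * i for i in range(141)]

peak_on_bins = peak_frequency(samples, dt, bin_freqs)
peak_on_fine_grid = peak_frequency(samples, dt, fine_freqs)
```

    In this toy case the bin-limited search can do no better than the nearest bin, while the finer grid lands close to the true frequency. Zero-padding or peak interpolation would narrow that gap for an FFT, which this sketch does not attempt.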

  1. SeqMule: automated pipeline for analysis of human exome/genome sequencing data.

    Science.gov (United States)

    Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J; Wang, Kai

    2015-09-18

    Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and no access to high-performance computing facility. Discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate consensus set with high confidence. SeqMule integrates 5 alignment tools, 5 variant calling algorithms and accepts various combinations all by one-line command, therefore allowing highly flexible yet fully automated variant calling. In a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers turn-key solution for deployment on Amazon Web Services, allows quality check, Mendelian error check, consistency evaluation, HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org.

  2. Velocity-aligned Doppler spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Z.; Koplitz, B.; Wittig, C.

    1989-03-01

    The technique of velocity-aligned Doppler spectroscopy (VADS) is presented and discussed. For photolysis/probe experiments with pulsed initiation, VADS can yield Doppler profiles for nascent photofragments that allow detailed center-of-mass (c.m.) kinetic energy distributions to be extracted. When compared with traditional forms of Doppler spectroscopy, the improvement in kinetic energy resolution is dramatic. Changes in the measured profiles are a consequence of spatial discrimination (i.e., focused and overlapping photolysis and probe beams) and delayed observation. These factors result in the selective detection of species whose velocities are aligned with the wave vector of the probe radiation k_pr, thus revealing the speed distribution along k_pr rather than the distribution of nascent velocity components projected upon this direction. Mathematical details of the procedure used to model VADS are given, and experimental illustrations for HI, H2S, and NH3 photodissociation are presented. In these examples, pulsed photodissociation produces H atoms that are detected by sequential two-photon, two-frequency ionization via Lyman-α with a pulsed laser (121.6+364.7 nm), and measuring the Lyman-α Doppler profile as a function of probe delay reveals both internal and c.m. kinetic energy distributions for the photofragments. Strengths and weaknesses of VADS as a tool for investigating photofragmentation phenomena are also discussed.

  3. BrucellaBase: Genome information resource.

    Science.gov (United States)

    Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Khader, L K M Abdul; Sridhar, Jayavel; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2016-09-01

    Brucella sp. causes a major zoonotic disease, brucellosis. Brucella belongs to the family Brucellaceae under the order Rhizobiales of Alphaproteobacteria. We present BrucellaBase, a web-based platform, providing features of a genome database together with unique analysis tools. We have developed a web version of the multilocus sequence typing (MLST) (Whatmore et al., 2007) and phylogenetic analysis of Brucella spp. BrucellaBase currently contains genome data of 510 Brucella strains along with the user interfaces for BLAST, VFDB, CARD, pairwise genome alignment and MLST typing. Availability of these tools will enable the researchers interested in Brucella to get meaningful information from Brucella genome sequences. BrucellaBase will regularly be updated with new genome sequences, new features along with improvements in genome annotations. BrucellaBase is available online at http://www.dbtbrucellosis.in/brucellabase.html or http://59.99.226.203/brucellabase/homepage.html.

  4. Galaxy alignment on large and small scales

    Science.gov (United States)

    Kang, X.; Lin, W. P.; Dong, X.; Wang, Y. O.; Dutton, A.; Macciò, A.

    2016-10-01

    Galaxies are not randomly distributed across the universe but show different kinds of alignment on different scales. On small scales, satellite galaxies tend to be distributed along the major axis of the central galaxy, with a dependence on galaxy properties: both red satellites and red centrals show stronger alignment than their blue counterparts. On large scales, it is found that the major axes of Luminous Red Galaxies (LRGs) are correlated up to 30 Mpc/h. Using hydro-dynamical simulation with star formation, we investigate the origin of galaxy alignment on different scales. It is found that most red satellite galaxies stay in the inner region of the dark matter halo, inside which the shape of the central galaxy is well aligned with the dark matter distribution. Red centrals have stronger alignment than blue ones as they live in massive haloes and the central galaxy-halo alignment increases with halo mass. On large scales, the alignment of LRGs also arises from the galaxy-halo shape correlation, but with some degree of misalignment. The massive haloes have stronger alignment than haloes in the filaments that connect massive haloes. This is contrary to the naive expectation that the cosmic filament is the cause of halo alignment.

  5. MSOAR 2.0: Incorporating tandem duplications into ortholog assignment based on genome rearrangement

    Directory of Open Access Journals (Sweden)

    Zhang Liqing

    2010-01-01

    Full Text Available Abstract Background: Ortholog assignment is a critical and fundamental problem in comparative genomics, since orthologs are considered to be functional counterparts in different species and can be used to infer molecular functions of one species from those of other species. MSOAR is a recently developed high-throughput system for assigning one-to-one orthologs between closely related species on a genome scale. It attempts to reconstruct the evolutionary history of input genomes in terms of genome rearrangement and gene duplication events. It assumes that a gene duplication event inserts a duplicated gene into the genome of interest at a random location (i.e., the random duplication model). However, in practice, biologists believe that genes are often duplicated by tandem duplications, where a duplicated gene is located next to the original copy (i.e., the tandem duplication model). Results: In this paper, we develop MSOAR 2.0, an improved system for one-to-one ortholog assignment. For a pair of input genomes, the system first focuses on the tandemly duplicated genes of each genome and tries to identify among them those that were duplicated after the speciation (i.e., the so-called inparalogs), using a simple phylogenetic tree reconciliation method. For each such set of tandemly duplicated inparalogs, all but one gene will be deleted from the concerned genome (because they cannot possibly appear in any one-to-one ortholog pairs), and MSOAR is invoked. Using both simulated and real data experiments, we show that MSOAR 2.0 is able to achieve a better sensitivity and specificity than MSOAR. In comparison with the well-known genome-scale ortholog assignment tool InParanoid, Ensembl ortholog database, and the orthology information extracted from the well-known whole-genome multiple alignment program MultiZ, MSOAR 2.0 shows the highest sensitivity. Although the specificity of MSOAR 2.0 is slightly worse than that of InParanoid in the real data experiments

  6. Meeting Report: Hackathon-Workshop on Darwin Core and MIxS Standards Alignment (February 2012).

    Science.gov (United States)

    Tuama, Eamonn Ó; Deck, John; Dröge, Gabriel; Döring, Markus; Field, Dawn; Kottmann, Renzo; Ma, Juncai; Mori, Hiroshi; Morrison, Norman; Sterk, Peter; Sugawara, Hideaki; Wieczorek, John; Wu, Linhuan; Yilmaz, Pelin

    2012-10-10

    The Global Biodiversity Information Facility and the Genomic Standards Consortium convened a joint workshop at the University of Oxford, 27-29 February 2012, with a small group of experts from Europe, USA, China and Japan, to continue the alignment of the Darwin Core with the MIxS and related genomics standards. Several reference mappings were produced as well as test expressions of MIxS in RDF. The use and management of controlled vocabulary terms was considered in relation to both GBIF and the GSC, and tools for working with terms were reviewed. Extensions for publishing genomic biodiversity data to the GBIF network via a Darwin Core Archive were prototyped and work begun on preparing translations of the Darwin Core to Japanese and Chinese. Five genomic repositories were identified for engagement to begin the process of testing the publishing of genomic data to the GBIF network commencing with the SILVA rRNA database.

  7. Redshift and luminosity evolution of the intrinsic alignments of galaxies in Horizon-AGN

    Science.gov (United States)

    Chisari, N.; Laigle, C.; Codis, S.; Dubois, Y.; Devriendt, J.; Miller, L.; Benabed, K.; Slyz, A.; Gavazzi, R.; Pichon, C.

    2016-09-01

    Intrinsic galaxy shape and angular momentum alignments can arise in cosmological large-scale structure due to tidal interactions or galaxy formation processes. Cosmological hydrodynamical simulations have recently come of age as a tool to study these alignments and their contamination to weak gravitational lensing. We probe the redshift and luminosity evolution of intrinsic alignments in Horizon-AGN between z = 0 and 3 for galaxies with an r-band absolute magnitude of Mr ≤ -20. Alignments transition from being radial at low redshifts and high luminosities, dominated by the contribution of ellipticals, to being tangential at high redshift and low luminosities, where discs dominate the signal. This cannot be explained by the evolution of the fraction of ellipticals and discs alone: intrinsic evolution in the amplitude of alignments is necessary. The alignment amplitude of elliptical galaxies alone is smaller in amplitude by a factor of ≃2, but has similar luminosity and redshift evolution as in current observations and in the non-linear tidal alignment model at projected separations of ≳1 Mpc. Alignments of discs are null in projection and consistent with current low-redshift observations. The combination of the two populations yields an overall amplitude a factor of ≃4 lower than observed alignments of luminous red galaxies with a steeper luminosity dependence. The restriction on accurate galaxy shapes implies that the galaxy population in the simulation is complete only to Mr ≤ -20. Higher resolution simulations will be necessary to avoid extrapolation of the intrinsic alignment predictions to the range of luminosities probed by future surveys.

  8. MACSIMS : multiple alignment of complete sequences information management system

    Directory of Open Access Journals (Sweden)

    Plewniak Frédéric

    2006-06-01

    Full Text Available Abstract Background: In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. Results: MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIMS results, while a graphical display using the JalView program allows manual analysis. Conclusion: MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at http://bips.u-strasbg.fr/MACSIMS/.

  9. Mitochondrial genome sequences illuminate maternal lineages of conservation concern in a rare carnivore

    Directory of Open Access Journals (Sweden)

    Pilgrim Kristine

    2011-04-01

    Full Text Available Abstract Background: Science-based wildlife management relies on genetic information to infer population connectivity and identify conservation units. The most commonly used genetic marker for characterizing animal biodiversity and identifying maternal lineages is the mitochondrial genome. Mitochondrial genotyping figures prominently in conservation and management plans, with much of the attention focused on the non-coding displacement ("D") loop. We used massively parallel multiplexed sequencing to sequence complete mitochondrial genomes from 40 fishers, a threatened carnivore that possesses low mitogenomic diversity. This allowed us to test a key assumption of conservation genetics, specifically, that the D-loop accurately reflects genealogical relationships and variation of the larger mitochondrial genome. Results: Overall mitogenomic divergence in fishers is exceedingly low, with 66 segregating sites and an average pairwise distance between genomes of 0.00088 across their aligned length (16,290 bp). Estimates of variation and genealogical relationships from the displacement (D) loop region (299 bp) are contradicted by the complete mitochondrial genome, as well as the protein coding fraction of the mitochondrial genome. The sources of this contradiction trace primarily to the near-absence of mutations marking the D-loop region of one of the most divergent lineages, and secondarily to independent (recurrent) mutations at two nucleotide positions in the D-loop amplicon. Conclusions: Our study has two important implications. First, inferred genealogical reconstructions based on the fisher D-loop region contradict inferences based on the entire mitogenome to the point that the populations of greatest conservation concern cannot be accurately resolved. Whole-genome analysis identifies Californian haplotypes from the northern-most populations as highly distinctive, with a significant excess of amino acid changes that may be indicative of molecular
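
    The two summary statistics reported in this abstract, the number of segregating sites and the average pairwise distance across the aligned length, can be illustrated on a toy alignment. The sketch below assumes equal-length, gap-free sequences and an uncorrected proportion of differing sites (p-distance); the abstract does not state which distance measure the authors used, and the function names and sequences are hypothetical.

```python
from itertools import combinations


def segregating_sites(alignment):
    """Count alignment columns in which more than one nucleotide appears."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def mean_pairwise_distance(alignment):
    """Average proportion of differing sites over all pairs of sequences."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Made-up 8 bp alignment of three sequences, not fisher mitogenome data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
n_segregating = segregating_sites(toy_alignment)
mean_distance = mean_pairwise_distance(toy_alignment)
```

    Here two columns vary and the three pairs differ at 1, 1 and 2 sites, so the mean distance is 4 differences over 24 compared positions.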

  10. The UCSC Archaeal Genome Browser: 2012 update.

    Science.gov (United States)

    Chan, Patricia P; Holmes, Andrew D; Smith, Andrew M; Tran, Danny; Lowe, Todd M

    2012-01-01

    The UCSC Archaeal Genome Browser (http://archaea.ucsc.edu) offers a graphical web-based resource for exploration and discovery within archaeal and other selected microbial genomes. By bringing together existing gene annotations, gene expression data, multiple-genome alignments, pre-computed sequence comparisons and other specialized analysis tracks, the genome browser is a powerful aggregator of varied genomic information. The genome browser environment maintains the current look-and-feel of the vertebrate UCSC Genome Browser, but also integrates archaeal and bacterial-specific tracks with a few graphic display enhancements. The browser currently contains 115 archaeal genomes, plus 31 genomes of viruses known to infect archaea. Some of the recently developed or enhanced tracks visualize data from published high-throughput RNA-sequencing studies, the NCBI Conserved Domain Database, sequences from pre-genome sequencing studies, predicted gene boundaries from three different protein gene prediction algorithms, tRNAscan-SE gene predictions with RNA secondary structures and CRISPR locus predictions. We have also developed a companion resource, the Archaeal COG Browser, to provide better search and display of arCOG gene function classifications, including their phylogenetic distribution among available archaeal genomes.

  11. Galaxy alignments: Observations and impact on cosmology

    CERN Document Server

    Kirk, Donnacha; Hoekstra, Henk; Joachimi, Benjamin; Kitching, Thomas D; Mandelbaum, Rachel; Sifón, Cristóbal; Cacciato, Marcello; Choi, Ami; Kiessling, Alina; Leonard, Adrienne; Rassat, Anais; Schäfer, Björn Malte

    2015-01-01

    Galaxy shapes are not randomly oriented, rather they are statistically aligned in a way that can depend on formation environment, history and galaxy type. Studying the alignment of galaxies can therefore deliver important information about the astrophysics of galaxy formation and evolution as well as the growth of structure in the Universe. In this review paper we summarise key measurements of intrinsic alignments, divided by galaxy type, scale and environment. We also cover the statistics and formalism necessary to understand the observations in the literature. With the emergence of weak gravitational lensing as a precision probe of cosmology, galaxy alignments took on an added importance because they can mimic cosmic shear, the effect of gravitational lensing by large-scale structure on observed galaxy shapes. This makes intrinsic alignments an important systematic effect in weak lensing studies. We quantify the impact of intrinsic alignments on cosmic shear surveys and finish by reviewing practical mitigat...

  12. Magnetic alignment and patterning of cellulose fibers

    Directory of Open Access Journals (Sweden)

    Fumiko Kimura and Tsunehisa Kimura

    2008-01-01

    Full Text Available The alignment and patterning of cellulose fibers under magnetic fields are reported. Static and rotating magnetic fields were used to align cellulose fibers with sizes ranging from millimeter to nanometer sizes. Cellulose fibers of the millimeter order, which were prepared for papermaking, and much smaller fibers with micrometer to nanometer sizes prepared by the acid hydrolysis of larger ones underwent magnetic alignment. Under a rotating field, a uniaxial alignment of fibers was achieved. The alignment was successfully fixed by the photopolymerization of a UV-curable resin precursor used as matrix. A monodomain chiral nematic film was prepared from an aqueous suspension of nanofibers. Using a field modulator inserted in a homogeneous magnetic field, simultaneous alignment and patterning were achieved.

  13. Magnetic alignment and patterning of cellulose fibers

    Energy Technology Data Exchange (ETDEWEB)

    Kimura, Fumiko; Kimura, Tsunehisa [Division of Forest and Biomaterials Science, Graduate School of Agriculture, Kyoto University, Kitashirakawa, Sakyo-ku, Kyoto 606-8502 (Japan)], E-mail: tkimura@kais.kyoto-u.ac.jp

    2008-04-01

    The alignment and patterning of cellulose fibers under magnetic fields are reported. Static and rotating magnetic fields were used to align cellulose fibers with sizes ranging from millimeter to nanometer sizes. Cellulose fibers of the millimeter order, which were prepared for papermaking, and much smaller fibers with micrometer to nanometer sizes prepared by the acid hydrolysis of larger ones underwent magnetic alignment. Under a rotating field, a uniaxial alignment of fibers was achieved. The alignment was successfully fixed by the photopolymerization of a UV-curable resin precursor used as matrix. A monodomain chiral nematic film was prepared from an aqueous suspension of nanofibers. Using a field modulator inserted in a homogeneous magnetic field, simultaneous alignment and patterning were achieved.

  14. Alignment of in-vessel components by metrology defined adaptive machining

    Energy Technology Data Exchange (ETDEWEB)

    Wilson, David [ITER Organization, Route de Vinon sur Verdon, CS90 046, St Paul-lez-Durance (France)]; Bernard, Nathanaël [G2Métric, Launaguet 31140 (France)]; Mariani, Antony [Spatial Alignment Ltd., Witney (United Kingdom)]

    2015-10-15

    Highlights: • Advanced metrology techniques developed for large volume high density in-vessel surveys. • Virtual alignment process employed to optimize the alignment of 440 blanket modules. • Auto-geometry construct, from survey data, using CAD proximity detection and orientation logic. • HMI developed to relocate blanket modules if customization limits on interfaces are exceeded. • Data export format derived for Catia parametric models, defining customization requirements. - Abstract: The assembly of ITER will involve the precise and accurate alignment of a large number of components and assemblies in areas where access will often be severely constrained and where process efficiency will be critical. One such area is the inside of the vacuum vessel where several thousand components shall be custom machined to provide the alignment references for in-vessel systems. The paper gives an overview of the process that will be employed to survey the interfaces for approximately 3500 components and then define and execute the customization process.

  15. Program for PET image alignment: Effects on calculated differences in cerebral metabolic rates for glucose

    Energy Technology Data Exchange (ETDEWEB)

    Phillips, R.L.; London, E.D.; Links, J.M.; Cascella, N.G. (NIDA Addiction Research Center, Baltimore, MD (USA))

    1990-12-01

    A program was developed to align positron emission tomography images from multiple studies on the same subject. The program allowed alignment of two images with a fineness of one-tenth the width of a pixel. The indications and effects of misalignment were assessed in eight subjects from a placebo-controlled double-blind crossover study on the effects of cocaine on regional cerebral metabolic rates for glucose. Visual examination of a difference image provided a sensitive and accurate tool for assessing image alignment. Image alignment within 2.8 mm was essential to reduce variability of measured cerebral metabolic rates for glucose. Misalignment by this amount introduced errors on the order of 20% in the computed metabolic rate for glucose. These errors propagate to the difference between metabolic rates for a subject measured in basal versus perturbed states.

  16. Planar self-aligned imprint lithography for coplanar plasmonic nanostructures fabrication

    KAUST Repository

    Wan, Weiwei

    2014-03-01

    Nanoimprint lithography (NIL) is a cost-efficient nanopatterning technology because of its promising advantages of high throughput and high resolution. However, the accurate multilevel overlay capability of NIL required for integrated circuit manufacturing remains a challenge due to the high cost of achieving mechanical alignment precision. Although self-aligned imprint lithography was developed to avoid the need for alignment in vertically layered structures, it has limited use in the manufacture of coplanar structures, such as integrated plasmonic devices. In this paper, we develop a new process of planar self-alignment imprint lithography (P-SAIL) to fabricate metallic and dielectric structures on the same plane. P-SAIL reduces the multilevel imprint processes to a single-imprint process, which offers higher efficiency and lower cost than existing manufacturing methods. This concept is demonstrated by fabricating planar plasmonic structures consisting of different materials. © 2014 Springer-Verlag Berlin Heidelberg.

  17. Velocity-aligned Doppler spectroscopy

    Science.gov (United States)

    Xu, Z.; Koplitz, B.; Wittig, C.

    1989-03-01

    The use of velocity-aligned Doppler spectroscopy (VADS) to measure center-of-mass kinetic-energy distributions of nascent photofragments produced in pulsed-initiation photolysis/probe experiments is described and demonstrated. In VADS, pulsed photolysis and probe laser beams counterpropagate through the ionization region of a time-of-flight mass spectrometer. The theoretical principles of VADS and the mathematical interpretation of VADS data are explained and illustrated with diagrams; the experimental setup is described; and results for the photodissociation of HI, H2S, and NH3 are presented in graphs and characterized in detail. VADS is shown to give much higher kinetic-energy resolution than conventional Doppler spectroscopy.

  18. Aligned carbon nanotubes for nanoelectronics

    Science.gov (United States)

    Choi, Won Bong; Bae, Eunju; Kang, Donghun; Chae, Soodoo; Cheong, Byung-ho; Ko, Ju-hye; Lee, Eungmin; Park, Wanjun

    2004-10-01

    We discuss the central issues to be addressed for realizing carbon nanotube (CNT) nanoelectronics. We focus on selective growth, electron energy bandgap engineering and device integration. We have introduced a nanotemplate to control the selective growth, length and diameter of CNTs. Vertically aligned CNTs are synthesized for developing a vertical CNT-field effect transistor (FET). The ohmic contact of the CNT/metal interface is formed by rapid thermal annealing. Diameter control, synthesis of Y-shaped CNTs and surface modification of CNTs open up the possibility for energy bandgap modulation. The concepts of an ultra-high density transistor based on the vertical-CNT array and a nonvolatile memory based on the top gate structure with an oxide-nitride-oxide charge trap are also presented. We suggest that the deposited memory film can be used for the quantum dot storage due to the localized electric field created by a nano scale CNT-electron channel.

  19. Microwave Emission from Aligned Dust

    CERN Document Server

    Lazarian, A

    2003-01-01

    Polarized microwave emission from dust is an important foreground that may contaminate polarized CMB studies unless carefully accounted for. We discuss potential difficulties associated with this foreground, namely, the existence of different grain populations with very different emission/polarization properties and variations of the polarization yield with grain temperature. In particular, we discuss observational evidence in favor of rotational emission from tiny PAH particles with dipole moments, i.e. "spinning dust", and also consider magneto-dipole emission from strongly magnetized grains. We argue that in terms of polarization, the magneto-dipole emission may dominate even if its contribution to total emissivity is subdominant. Addressing polarized emission at frequencies larger than approximately 100 GHz, we discuss the complications arising from the existence of dust components with different temperatures and possibly different alignment properties.

  20. Multilingual alignments by monolingual string differences

    OpenAIRE

    Lardilleux, Adrien; Lepage, Yves

    2008-01-01

    International audience; We propose a method to obtain subsentential alignments from several languages simultaneously. The method handles several languages at once, and avoids the complexity explosion due to the usual pair-by-pair processing. It can be used for different units (characters, morphemes, words, chunks). An evaluation of word alignments with a trilingual machine translation corpus has been conducted. A comparison of the results with those obtained by state of the art alignment soft...

  1. Distributed Interference Alignment with Low Overhead

    CERN Document Server

    Ma, Yanjun; Chen, Rui

    2011-01-01

    Based on closed-form interference alignment (IA) solutions, a low overhead distributed interference alignment (LOIA) scheme is proposed in this paper for the $K$-user SISO interference channel, and extension to multiple antenna scenario is also considered. Compared with the iterative interference alignment (IIA) algorithm proposed by Gomadam et al., the overhead is greatly reduced. Simulation results show that the IIA algorithm is strictly suboptimal compared with our LOIA algorithm in the overhead-limited scenario.

  2. COS to FGS Alignment {NUV}

    Science.gov (United States)

    Hartig, George

    2009-07-01

    DESCRIPTION: In order to determine the location of the COS reference frame with respect to the FGS reference frames, NUV MIRRORA images will be obtained of an astrometric target and field. Astrometric guide stars and targets must be employed for this activity in order to facilitate the alignment with the FGS. Images will be obtained at the initial pointing and at positions offset in V2 and in V3. Starting with the original blind pointing, obtain MIRRORA image exposures in a 5x5 POS-TARG grid centered on initial pointing; repeat the image sequence at two bracketing focus positions in same visit. Following completion of third pattern, return to nominal focus and perform 5x5 ACQ/SEARCH target acquisition and obtain one TIME-TAG MIRRORA image and one ACCUM verification exposure. Next perform an ACQ/IMAGE target acquisition followed by an ACCUM verification exposure. Also obtain ACCUM verification exposure for each of the two alternate focus positions used previously. Using MIRRORB obtain ACCUM confirmation image at nominal focus and ACCUM images at alternate focus positions and then perform an ACQ/IMAGE and confirming image at nominal focus. Analyze imagery, uplink pointing offset as offset 11469A and adjust nominal focus via patchable constant uplinked with subsequent visit of this program; update aperture locations via modified SIAF file uplinked with subsequent SMS. Use updated focus and offset pointing as input for COS 09 {program 11469 - NUV Optics Alignment and Focus} {note the SIAF update is not a prerequisite for COS 09 to proceed, but the pointing offset and focus update are}.

  3. Genome-wide association and genomic selection in animal breeding.

    Science.gov (United States)

    Hayes, Ben; Goddard, Mike

    2010-11-01

    Results from genome-wide association studies in livestock, and humans, have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection. Genomic selection uses a genome-wide panel of dense markers so that all QTL are likely to be in linkage disequilibrium with at least one SNP. The genomic breeding values are predicted to be the sum of the effect of these SNPs across the entire genome. In dairy cattle breeding, the accuracy of genomic estimated breeding values (GEBV) that can be achieved and the fact that these are available early in life have led to rapid adoption of the technology. Here, we discuss the design of experiments necessary to achieve accurate prediction of GEBV in future generations in terms of the number of markers necessary and the size of the reference population where marker effects are estimated. We also present a simple method for implementing genomic selection using a genomic relationship matrix. Future challenges discussed include using whole genome sequence data to improve the accuracy of genomic selection and management of inbreeding through genomic relationships.
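    The statement that a genomic breeding value is the sum of SNP effects across the genome can be illustrated with a minimal sketch. The 0/1/2 allele-count coding, the function names and the effect values below are illustrative assumptions, not taken from the paper, and the sketch does not implement the genomic relationship matrix method the abstract mentions.

```python
def gebv(genotypes, snp_effects):
    """Genomic estimated breeding value for one animal.

    genotypes:   allele counts per SNP (assumed coding: 0, 1 or 2)
    snp_effects: estimated effect per SNP, in trait units (hypothetical values)
    """
    if len(genotypes) != len(snp_effects):
        raise ValueError("one effect is needed per genotyped SNP")
    # The breeding value is the sum of SNP effects weighted by allele count.
    return sum(g * e for g, e in zip(genotypes, snp_effects))


def rank_candidates(genotype_table, snp_effects):
    """Return animal identifiers sorted from highest to lowest GEBV."""
    scores = {animal: gebv(g, snp_effects) for animal, g in genotype_table.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

    In practice the SNP effects would first be estimated in a reference population; here they are simply passed in as given.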

  4. Choosing the best heuristic for seeded alignment of DNA sequences

    Directory of Open Access Journals (Sweden)

    Buhler Jeremy

    2006-03-01

    Full Text Available Abstract Background: Seeded alignment is an important component of algorithms for fast, large-scale DNA similarity search. A good seed matching heuristic can reduce the execution time of genomic-scale sequence comparison without degrading sensitivity. Recently, many types of seed have been proposed to improve on the performance of traditional contiguous seeds as used in, e.g., NCBI BLASTN. Choosing among these seed types, particularly those that use information besides the presence or absence of matching residue pairs, requires practical guidance based on a rigorous comparison, including assessment of sensitivity, specificity, and computational efficiency. This work performs such a comparison, focusing on alignments in DNA outside widely studied coding regions. Results: We compare seeds of several types, including those allowing transition mutations rather than matches at fixed positions, those allowing transitions at arbitrary positions ("BLASTZ" seeds), and those using a more general scoring matrix. For each seed type, we use an extended version of our Mandala seed design software to choose seeds with optimized sensitivity for various levels of specificity. Our results show that, on a test set biased toward alignments of noncoding DNA, transition information significantly improves seed performance, while finer distinctions between different types of mismatches do not. BLASTZ seeds perform especially well. These results depend on properties of our test set that are not shared by EST-based test sets with a strong bias toward coding DNA. Conclusion: Practical seed design requires careful attention to the properties of the alignments being sought. For noncoding DNA sequences, seeds that use transition information, especially BLASTZ-style seeds, are particularly useful. The Mandala seed design software can be found at http://www.cse.wustl.edu/~yanni/mandala/.
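    A seed that tolerates transitions at fixed positions can be sketched as follows. The pattern alphabet used here ('1' for a required match, 'T' for a match or a transition, '0' for a position that is ignored) is an assumption made for illustration and is not the input format of Mandala or BLASTZ. The hit finder compares every pair of windows, which is far slower than the indexed lookups real tools use.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hit(pattern, s, t):
    """Return True if equal-length windows s and t satisfy the seed pattern."""
    if not (len(pattern) == len(s) == len(t)):
        raise ValueError("pattern and both windows must have the same length")
    for p, a, b in zip(pattern, s, t):
        if p == "1" and a != b:
            return False  # exact match required here
        if p == "T" and a != b and (a, b) not in TRANSITIONS:
            return False  # only a match or a transition is tolerated here
        # p == "0": position is ignored
    return True


def find_seed_hits(pattern, x, y):
    """Naive all-pairs scan; returns (i, j) offsets where the seed hits."""
    k = len(pattern)
    return [
        (i, j)
        for i in range(len(x) - k + 1)
        for j in range(len(y) - k + 1)
        if seed_hit(pattern, x[i:i + k], y[j:j + k])
    ]
```

    Each hit would normally be handed to an extension step that decides whether a full alignment is worth computing; that step is omitted here.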

  5. Self-transport and self-alignment of microchips using microscopic rain

    Science.gov (United States)

    Chang, Bo; Shah, Ali; Zhou, Quan; Ras, Robin H. A.; Hjort, Klas

    2015-10-01

    Alignment of microchips with receptors is an important process step in the construction of integrated micro- and nanosystems for emerging technologies, and facilitating alignment by spontaneous self-assembly processes is highly desired. Previously, capillary self-alignment of microchips driven by surface tension effects on patterned surfaces has been reported, where it was essential for microchips to have sufficient overlap with receptor sites. Here we demonstrate for the first time capillary self-transport and self-alignment of microchips, where microchips are initially placed outside the corresponding receptor sites and can be self-transported by capillary force to the receptor sites followed by self-alignment. The surface consists of hydrophilic silicon receptor sites surrounded by superhydrophobic black silicon. Rain-induced microscopic droplets are used to form the meniscus for the self-transport and self-alignment. The boundary conditions for the self-transport have been explored by modeling and confirmed experimentally. The maximum permitted gap between a microchip and a receptor site is determined by the volume of the liquid and by the wetting contrast between receptor site and substrate. Microscopic rain applied on hydrophilic-superhydrophobic patterned surfaces greatly improves the capability, reliability and error-tolerance of the process, avoiding the need for accurate initial placement of microchips, and thereby greatly simplifying the alignment process.

  6. Self-transport and self-alignment of microchips using microscopic rain.

    Science.gov (United States)

    Chang, Bo; Shah, Ali; Zhou, Quan; Ras, Robin H A; Hjort, Klas

    2015-10-09

    Alignment of microchips with receptors is an important process step in the construction of integrated micro- and nanosystems for emerging technologies, and facilitating alignment by spontaneous self-assembly processes is highly desired. Previously, capillary self-alignment of microchips driven by surface tension effects on patterned surfaces has been reported, where it was essential for microchips to have sufficient overlap with receptor sites. Here we demonstrate for the first time capillary self-transport and self-alignment of microchips, where microchips are initially placed outside the corresponding receptor sites and can be self-transported by capillary force to the receptor sites followed by self-alignment. The surface consists of hydrophilic silicon receptor sites surrounded by superhydrophobic black silicon. Rain-induced microscopic droplets are used to form the meniscus for the self-transport and self-alignment. The boundary conditions for the self-transport have been explored by modeling and confirmed experimentally. The maximum permitted gap between a microchip and a receptor site is determined by the volume of the liquid and by the wetting contrast between receptor site and substrate. Microscopic rain applied on hydrophilic-superhydrophobic patterned surfaces greatly improves the capability, reliability and error-tolerance of the process, avoiding the need for accurate initial placement of microchips, and thereby greatly simplifying the alignment process.

  7. Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer; Rundsten, Carsten Friis; Ussery, David

    2012-01-01

    more biologically relevant, especially considering that many of these genome sequences are draft quality. The E. coli pan-genome for this set of isolates contains 16,373 gene clusters. A core-gene tree, based on alignment and a pan-genome tree based on gene presence/absence, maps the relatedness...

  8. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  9. Some aspects of SR beamline alignment

    Energy Technology Data Exchange (ETDEWEB)

    Gaponov, Yu.A., E-mail: Yury.Gaponov@maxlab.lu.se [MAX-lab, Lund University, P.O.B. 118, SE-221 00 Lund (Sweden); Cerenius, Y. [MAX-lab, Lund University, P.O.B. 118, SE-221 00 Lund (Sweden); Nygaard, J. [Faculty of Life Sciences, University of Copenhagen, DK-1871 Frederiksberg C (Denmark); Ursby, T.; Larsson, K. [MAX-lab, Lund University, P.O.B. 118, SE-221 00 Lund (Sweden)

    2011-09-01

    Based on element-by-element alignment of the Synchrotron Radiation (SR) beamline optics and analysis of the alignment results, an optimized beamline alignment algorithm has been designed and developed. The alignment procedures have been designed and developed for the MAX-lab I911-4 fixed energy beamline. It has been shown that the intermediate information received during the monochromator alignment stage can be used for the correction of both monochromator and mirror without the next stages of alignment of mirror, slits, sample holder, etc. Such an optimization of the beamline alignment procedures decreases the time necessary for the alignment and is useful in the case of any instability of the beamline optical elements, storage ring electron orbit or the wiggler insertion device, which could result in the instability of angular and positional parameters of the SR beam. A general purpose software package for manual, semi-automatic and automatic SR beamline alignment has been designed and developed using the developed algorithm. The TANGO control system is used as the middleware between the stand-alone beamline control applications BLTools, BPMonitor and the beamline equipment.

  10. The art of editing RNA structural alignments

    DEFF Research Database (Denmark)

    Andersen, Ebbe Sloth

    2014-01-01

    Manual editing of RNA structural alignments may be considered more art than science, since it still requires an expert biologist to take multiple levels of information into account and be slightly creative when constructing high-quality alignments. Even though the task is rather tedious, it is re...

  11. Triangular Alignment (TAME). A Tensor-based Approach for Higher-order Network Alignment

    Energy Technology Data Exchange (ETDEWEB)

    Mohammadi, Shahin [Purdue Univ., West Lafayette, IN (United States); Gleich, David F. [Purdue Univ., West Lafayette, IN (United States); Kolda, Tamara G. [Sandia National Laboratories (SNL-CA), Livermore, CA (United States); Grama, Ananth [Purdue Univ., West Lafayette, IN (United States)

    2015-11-01

    Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we approximate this objective function as a surrogate function whose maximization results in a tensor eigenvalue problem. Based on this formulation, we present an algorithm called Triangular AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on alignment of triangles because of their enrichment in complex networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning yeast and human interactomes. Our results indicate that TAME outperforms the state-of-the-art alignment methods both in terms of biological and topological quality of the alignments.
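    The quantity TAME tries to maximize, the number of triangles in one network that map onto triangles in the other, can be evaluated for a given node mapping with a short sketch. This only scores a candidate alignment; it does not implement the tensor eigenvalue relaxation described in the abstract, and the function names and edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def triangles(edges):
    """Return the set of triangles (as frozensets of 3 nodes) in an undirected graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot be part of a triangle
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = set()
    for u, nbrs in adj.items():
        # Two neighbours of u that are also adjacent to each other close a triangle.
        for v, w in combinations(nbrs, 2):
            if w in adj[v]:
                found.add(frozenset((u, v, w)))
    return found


def aligned_triangles(edges_g, edges_h, mapping):
    """Count triangles of G whose image under `mapping` is a triangle of H."""
    tri_h = triangles(edges_h)
    count = 0
    for tri in triangles(edges_g):
        if all(node in mapping for node in tri):
            image = frozenset(mapping[node] for node in tri)
            if len(image) == 3 and image in tri_h:
                count += 1
    return count
```

    Comparing this count across candidate mappings gives a direct, if brute-force, way to see which alignment conserves more higher-order structure.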

  12. The effects of alignment error and alignment filtering on the sitewise detection of positive selection.

    Science.gov (United States)

    Jordan, Gregory; Goldman, Nick

    2012-04-01

    When detecting positive selection in proteins, the prevalence of errors resulting from misalignment and the ability of alignment filters to mitigate such errors are not well understood, but filters are commonly applied to try to avoid false positive results. Focusing on the sitewise detection of positive selection across a wide range of divergence levels and indel rates, we performed simulation experiments to quantify the false positives and false negatives introduced by alignment error and the ability of alignment filters to improve performance. We found that some aligners led to many false positives, whereas others resulted in very few. False negatives were a problem for all aligners, increasing with sequence divergence. Of the aligners tested, PRANK's codon-based alignments consistently performed the best and ClustalW performed the worst. Of the filters tested, GUIDANCE performed the best and Gblocks performed the worst. Although some filters showed good ability to reduce the error rates from ClustalW and MAFFT alignments, none were found to substantially improve the performance of PRANK alignments under most conditions. Our results revealed distinct trends in error rates and power levels for aligners and filters within a biologically plausible parameter space. With the best aligner, a low false positive rate was maintained even with extremely divergent indel-prone sequences. Controls using the true alignment and an optimal filtering method suggested that performance improvements could be gained by improving aligners or filters to reduce the prevalence of false negatives, especially at higher divergence levels and indel rates.

  13. Reconstructing the genomic content of microbiome taxa through shotgun metagenomic deconvolution.

    Science.gov (United States)

    Carr, Rogan; Shen-Orr, Shai S; Borenstein, Elhanan

    2013-01-01

    Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties. With the accumulation of metagenomic

  14. Multimodal Integration (Image and Text) Using Ontology Alignment

    Directory of Open Access Journals (Sweden)

    Ahmad A.A. Shareha

    2009-01-01

    Full Text Available Problem statement: This study proposed a multimodal integration method at the concept level to investigate information from multiple modalities. The multimodal data was represented as two separate lists of concepts, which were extracted from images and their related text. The concepts extracted from image analysis are often ambiguous, while the concepts extracted from text processing could be sense-ambiguous. The major problems facing the integration of the underlying modalities (image and text) were the difference in coverage and the difference in granularity level. Approach: This study proposed a novel application using ontology alignment to unify the underlying ontologies. The two lists of concepts were represented in a structured form within the corresponding ontologies; the two structured lists were then enriched and matched based on the alignment, and this matching represents the final knowledge. Results: The difference in coverage was solved in this study using the alignment process, and the difference in granularity level was solved using the enrichment process. Thus, the proposed integration produced accurate integrated results. Conclusion: Integration of these concepts allows the totality of the knowledge to be expressed more precisely.

  15. CCD Camera Lens Interface for Real-Time Theodolite Alignment

    Science.gov (United States)

    Wake, Shane; Scott, V. Stanley, III

    2012-01-01

    Theodolites are a common instrument in the testing, alignment, and building of various systems ranging from a single optical component to an entire instrument. They provide a precise way to measure horizontal and vertical angles. They can be used to align multiple objects in a desired way at specific angles. They can also be used to reference a specific location or orientation of an object that has moved. Some systems may require a small margin of error in position of components. A theodolite can assist with accurately measuring and/or minimizing that error. The technology is an adapter for a CCD camera with lens to attach to a Leica Wild T3000 Theodolite eyepiece that enables viewing on a connected monitor, and thus can be utilized with multiple theodolites simultaneously. This technology removes a substantial part of human error by relying on the CCD camera and monitors. It also allows image recording of the alignment, and therefore provides a quantitative means to measure such error.

  16. 38 CFR 4.46 - Accurate measurement.

    Science.gov (United States)

    2010-07-01

    ... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Accurate measurement. 4... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate measurement of the length of stumps, excursion of joints, dimensions and location of scars with respect...

  17. Dynamics of genome rearrangement in bacterial populations.

    Directory of Open Access Journals (Sweden)

    Aaron E Darling

    Full Text Available Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unprecedented opportunity to study the evolution of genome structure and arrangement. We introduce a set of statistical methods to study patterns of rearrangement in circular chromosomes and apply them to the Yersinia. We constructed a multiple alignment of eight Yersinia genomes using Mauve software to identify 78 conserved segments that are internally free from genome rearrangement. Based on the alignment, we applied Bayesian statistical methods to infer the phylogenetic inversion history of Yersinia. The sampling of genome arrangement reconstructions contains seven parsimonious tree topologies, each having different histories of 79 inversions. Topologies with a greater number of inversions also exist, but were sampled less frequently. The inversion phylogenies agree with results suggested by SNP patterns. We then analyzed reconstructed inversion histories to identify patterns of rearrangement. We confirm an over-representation of "symmetric inversions"-inversions with endpoints that are equally distant from the origin of chromosomal replication. Ancestral genome arrangements demonstrate moderate preference for replichore balance in Yersinia. We found that all inversions are shorter than expected under a neutral model, whereas inversions acting within a single replichore are much shorter than expected. We also found evidence for a canonical configuration of the origin and terminus of replication. Finally, breakpoint reuse analysis reveals that inversions with endpoints proximal to the origin of DNA replication are nearly three times more frequent. Our findings
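    The notions of an inversion and of a "symmetric inversion" used above can be made concrete with a small sketch. Representing a chromosome as a list of signed block identifiers is a common convention assumed here; it is not Mauve's output format or the authors' statistical method, and the function names and coordinates are hypothetical.

```python
def invert(blocks, i, j):
    """Invert blocks[i..j] (inclusive): reverse their order and flip each orientation."""
    return blocks[:i] + [-b for b in reversed(blocks[i:j + 1])] + blocks[j + 1:]


def origin_distance(pos, origin, length):
    """Shortest distance from pos to the replication origin on a circular chromosome."""
    d = abs(pos - origin) % length
    return min(d, length - d)


def inversion_asymmetry(a, b, origin, length):
    """Difference between the endpoints' distances to the origin.

    A value of 0 means both endpoints are equally distant from the origin,
    which is the distance condition for a symmetric inversion.
    """
    return abs(origin_distance(a, origin, length) - origin_distance(b, origin, length))
```

    The asymmetry measure checks only the distance condition; it does not verify that the two endpoints lie on opposite sides of the origin.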

  18. VISTA - computational tools for comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Frazer, Kelly A.; Pachter, Lior; Poliakov, Alexander; Rubin, Edward M.; Dubchak, Inna

    2004-01-01

    Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/VISTA/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kilobase (kb) interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.

  19. HIPPI: highly accurate protein family classification with ensembles of HMMs

    Directory of Open Access Journals (Sweden)

    Nam-phuong Nguyen

    2016-11-01

    Full Text Available Abstract Background: Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences in public databases or that are fragmentary remains one of the more difficult analytical problems in bioinformatics. Results: We present a new technique for family identification called HIPPI (Hierarchical Profile Hidden Markov Models for Protein family Identification). HIPPI uses a novel technique to represent a multiple sequence alignment for a given protein family or superfamily by an ensemble of profile hidden Markov models computed using HMMER. An evaluation of HIPPI on the Pfam database shows that HIPPI has better overall precision and recall than blastp, HMMER, and pipelines based on HHsearch, and maintains good accuracy even for fragmentary query sequences and for protein families with low average pairwise sequence identity, both conditions where other methods degrade in accuracy. Conclusion: HIPPI provides accurate protein family identification and is robust to difficult model conditions. Our results, combined with observations from previous studies, show that ensembles of profile Hidden Markov models can better represent multiple sequence alignments than a single profile Hidden Markov model, and thus can improve downstream analyses for various bioinformatic tasks. Further research is needed to determine the best practices for building the ensemble of profile Hidden Markov models. HIPPI is available on GitHub at https://github.com/smirarab/sepp .

  20. Bayesian coestimation of phylogeny and sequence alignment

    Directory of Open Access Journals (Sweden)

    Jensen Jens

    2005-04-01

    Background: Two central problems in computational biology are the determination of the alignment and phylogeny of a set of biological sequences. The traditional approach to this problem is to first build a multiple alignment of these sequences, followed by a phylogenetic reconstruction step based on this multiple alignment. However, alignment and phylogenetic inference are fundamentally interdependent, and ignoring this fact leads to biased and overconfident estimations. Whether the main interest be in sequence alignment or phylogeny, a major goal of computational biology is the co-estimation of both. Results: We developed a fully Bayesian Markov chain Monte Carlo method for coestimating phylogeny and sequence alignment, under the Thorne-Kishino-Felsenstein model of substitution and single nucleotide insertion-deletion (indel) events. In our earlier work, we introduced a novel and efficient algorithm, termed the "indel peeling algorithm", which includes indels as phylogenetically informative evolutionary events, and resembles Felsenstein's peeling algorithm for substitutions on a phylogenetic tree. For a fixed alignment, our extension analytically integrates out both substitution and indel events within a proper statistical model, without the need for data augmentation at internal tree nodes, allowing for efficient sampling of tree topologies and edge lengths. To additionally sample multiple alignments, we here introduce an efficient partial Metropolized independence sampler for alignments, and combine these two algorithms into a fully Bayesian co-estimation procedure for the alignment and phylogeny problem. Our approach results in estimates for the posterior distribution of evolutionary rate parameters, for the maximum a-posteriori (MAP) phylogenetic tree, and for the posterior decoding alignment. Estimates for the evolutionary tree and multiple alignment are augmented with confidence estimates for each node height and alignment column

  1. Alignments between galaxies, satellite systems and haloes

    Science.gov (United States)

    Shao, Shi; Cautun, Marius; Frenk, Carlos S.; Gao, Liang; Crain, Robert A.; Schaller, Matthieu; Schaye, Joop; Theuns, Tom

    2016-08-01

    The spatial distributions of the satellite populations of the Milky Way and Andromeda are puzzling in that they are nearly perpendicular to the discs of their central galaxies. To understand the origin of such configurations we study the alignment of the central galaxy, satellite system and dark matter halo in the largest of the 'Evolution and Assembly of GaLaxies and their Environments' (EAGLE) simulations. We find that centrals and their satellite systems tend to be well aligned with their haloes, with a median misalignment angle of 33° in both cases. While the centrals are better aligned with the inner 10 kpc halo, the satellite systems are better aligned with the entire halo, indicating that satellites preferentially trace the outer halo. The central-satellite alignment is weak (median misalignment angle of 52°) and we find that around 20 per cent of systems have a misalignment angle larger than 78°, which is the value for the Milky Way. The central-satellite alignment is a consequence of the tendency of both components to align with the dark matter halo. As a consequence, when the central is parallel to the satellite system, it also tends to be parallel to the halo. In contrast, if the central is perpendicular to the satellite system, as in the case of the Milky Way and Andromeda, then the central-halo alignment is much weaker. Dispersion-dominated (spheroidal) centrals have a stronger alignment with both their halo and their satellites than rotation-dominated (disc) centrals. We also found that the halo, the central galaxy and the satellite system tend to be aligned with the surrounding large-scale distribution of matter, with the halo being the better aligned of the three.
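
    The misalignment angles quoted above can be illustrated with a small calculation. The sketch below assumes each component (central, satellite system, halo) is summarised by a single axis vector; how those axes are actually measured is not stated in the abstract, and the function name is hypothetical. Because an axis has no preferred direction, the angle is folded into the range 0° to 90°.

```python
import math

def misalignment_angle(axis_a, axis_b):
    """Angle in degrees (0 to 90) between two undirected axes given as 3-vectors."""
    dot = sum(x * y for x, y in zip(axis_a, axis_b))
    norm_a = math.sqrt(sum(x * x for x in axis_a))
    norm_b = math.sqrt(sum(y * y for y in axis_b))
    # Use |cos| because an axis and its negative describe the same orientation;
    # clamp to 1.0 to guard against floating-point overshoot.
    cos_theta = min(1.0, abs(dot) / (norm_a * norm_b))
    return math.degrees(math.acos(cos_theta))
```

    For a sample of systems, the median of these per-system angles would correspond to the kind of summary statistic reported in the abstract.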

  2. Multiscale Point Correspondence Using Feature Distribution and Frequency Domain Alignment

    Directory of Open Access Journals (Sweden)

    Zeng-Shun Zhao

    2012-01-01

    In this paper, a hybrid scheme is proposed to find reliable point correspondences between two images. It combines the distribution of invariant spatial feature descriptions with frequency domain alignment in a two-stage coarse-to-fine refinement strategy. Firstly, the source and the target images are both down-sampled by the image pyramid algorithm in a hierarchical multi-scale way. The Fourier-Mellin transform is applied to obtain the transformation parameters at the coarse level between the image pairs; then, the parameters serve as the initial coarse guess to guide the following feature matching step at the original scale, where the correspondences are restricted to a search window determined by the deformation between the reference image and the current image; finally, a novel matching strategy is developed to reject false matches by validating geometrical relationships between candidate matching points. By doing so, the alignment parameters are refined, which is more accurate and more flexible than a robust fitting technique. This in turn can provide a more accurate result for feature correspondence. Experiments on real and synthetic image pairs show that our approach provides satisfactory feature matching performance.

  3. Comparison of genomic data via statistical distribution.

    Science.gov (United States)

    Amiri, Saeid; Dinov, Ivo D

    2016-10-21

    Sequence comparison has become an essential tool in bioinformatics, because highly homologous sequences usually imply significant functional or structural similarity. Traditional sequence analysis techniques are based on preprocessing and alignment, which facilitate measuring and quantitative characterization of genetic differences, variability and complexity. However, recent developments of next generation and whole genome sequencing technologies give rise to new challenges that are related to measuring similarity and capturing rearrangements of large segments contained in the genome. This work is devoted to illustrating different methods recently introduced for quantifying sequence distances and variability. Most of the alignment-free methods rely on counting words, which are small contiguous fragments of the genome. Our approach considers the locations of nucleotides in the sequences and relies more on appropriate statistical distributions. The results of this technique for comparing sequences, by extracting information and comparing matching fidelity and location regularization information, are very encouraging, specifically to classify mutation sequences.
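
    The word-counting idea mentioned above can be sketched as follows. This is a minimal illustration of the generic alignment-free baseline (k-mer frequency profiles compared by Euclidean distance), not the authors' location-based statistical method; the function names and the default word length are assumptions made for the example.

```python
from collections import Counter
import math

def word_counts(seq, k):
    """Count every contiguous word of length k in the sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def word_distance(seq_a, seq_b, k=3):
    """Euclidean distance between the k-mer frequency profiles of two sequences."""
    counts_a, counts_b = word_counts(seq_a, k), word_counts(seq_b, k)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each sequence must contain at least k characters")
    words = set(counts_a) | set(counts_b)
    # Counter returns 0 for words absent from one of the sequences.
    return math.sqrt(sum((counts_a[w] / total_a - counts_b[w] / total_b) ** 2
                         for w in words))
```

    No alignment is computed at any point, which is why such measures tolerate rearrangements of large segments better than alignment-based scores.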

  4. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing.

  5. Aligning Optical Fibers by Means of Actuated MEMS Wedges

    Science.gov (United States)

    Morgan, Brian; Ghodssi, Reza

    2007-01-01

    Microelectromechanical systems (MEMS) of a proposed type would be designed and fabricated to effect lateral and vertical alignment of optical fibers with respect to optical, electro-optical, optoelectronic, and/or photonic devices on integrated circuit chips and similar monolithic device structures. A MEMS device of this type would consist of a pair of oppositely sloped alignment wedges attached to linear actuators that would translate the wedges in the plane of a substrate, causing an optical fiber in contact with the sloping wedge surfaces to undergo various displacements parallel and perpendicular to the plane. In making it possible to accurately align optical fibers individually during the packaging stages of fabrication of the affected devices, this MEMS device would also make it possible to relax tolerances in other stages of fabrication, thereby potentially reducing costs and increasing yields. In a typical system according to the proposal (see Figure 1), one or more pair(s) of alignment wedges would be positioned to create a V groove in which an optical fiber would rest. The fiber would be clamped at a suitable distance from the wedges to create a cantilever with a slight bend to push the free end of the fiber gently to the bottom of the V groove. The wedges would be translated in the substrate plane by amounts Dx1 and Dx2, respectively, which would be chosen to move the fiber parallel to the plane by a desired amount Dx and perpendicular to the plane by a desired amount Dy. The actuators used to translate the wedges could be variants of electrostatic or thermal actuators that are common in MEMS.

  6. Protein sequence alignment analysis by local covariation: coevolution statistics detect benchmark alignment errors.

    Directory of Open Access Journals (Sweden)

    Russell J Dickson

    The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criteria: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files/.
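
    To make the notion of covariation between alignment columns concrete, the sketch below computes the mutual information between two columns. Mutual information is used here only as a familiar stand-in; the abstract does not say which covariation statistic LoCo uses, so this should not be read as LoCo's scoring method. The alignment is assumed to be a list of equal-length strings.

```python
from collections import Counter
import math

def mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j of an alignment."""
    n = len(alignment)
    pair_counts = Counter((row[i], row[j]) for row in alignment)
    counts_i = Counter(row[i] for row in alignment)
    counts_j = Counter(row[j] for row in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts to avoid extra divisions.
        mi += p_ab * math.log2(count * n / (counts_i[a] * counts_j[b]))
    return mi
```

    A fully conserved column yields zero mutual information with every other column, which matches the abstract's point that covariation carries information independent of conservation.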

  7. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    The simple fact that proteins are built from 20 amino acids while DNA only contains four different bases, means that the 'signal-to-noise ratio' in protein sequence alignments is much better than in alignments of DNA. Besides this information-theoretical advantage, protein alignments also benefit...
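
    The core idea in the title, building a DNA alignment from an existing protein alignment, can be sketched as threading each sequence's codons onto its gapped protein row. This is a simplified illustration under stated assumptions (the DNA is in frame, ungapped, and has exactly three bases per aligned residue); it is not the RevTrans implementation, and the function name is hypothetical.

```python
def thread_codons(aligned_protein, dna, gap="-"):
    """Map ungapped coding DNA onto one gapped row of a protein alignment."""
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(dna) < 3 * residues:
        raise ValueError("DNA is shorter than three bases per residue")
    codons, pos = [], 0
    for aa in aligned_protein:
        if aa == gap:
            # A gap in the protein row becomes a three-base gap in the DNA row.
            codons.append(gap * 3)
        else:
            codons.append(dna[pos:pos + 3])
            pos += 3
    return "".join(codons)
```

    Applying the function to every row of the protein alignment gives a DNA alignment in which gaps always fall on codon boundaries.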

  8. Business and IT alignment in context

    NARCIS (Netherlands)

    Silvius, A.J.G.

    2013-01-01

    Already for more than two decades, the necessity and desirability of aligning business needs and information technology (IT) capabilities is considered to be one of the key issues in IT management. However, several studies report quite low scores on business and IT alignment (BIA). The question “Why

  9. Compositions for directed alignment of conjugated polymers

    Science.gov (United States)

    Kim, Jinsang; Kim, Bong-Gi; Jeong, Eun Jeong

    2016-04-19

    Conjugated polymers (CPs) achieve directed alignment along an applied flow field and a dichroic ratio of as high as 16.67 in emission from well-aligned thin films and fully realized anisotropic optoelectronic properties of CPs in field-effect transistor (FET).

  10. Achieving Organisational Change through Values Alignment

    Science.gov (United States)

    Branson, Christopher M.

    2008-01-01

    Purpose: The purpose of this paper is to, first, establish the interdependency between the successful achievement of organisational change and the attainment of values alignment within an organisation's culture and then, second, to describe an effective means for attaining such values alignment. Design/methodology/approach: Literature from the…

  11. Probabilistic sequence alignment of stratigraphic records

    Science.gov (United States)

    Lin, Luan; Khider, Deborah; Lisiecki, Lorraine E.; Lawrence, Charles E.

    2014-10-01

    The assessment of age uncertainty in stratigraphically aligned records is a pressing need in paleoceanographic research. The alignment of ocean sediment cores is used to develop mutually consistent age models for climate proxies and is often based on the δ18O of calcite from benthic foraminifera, which records a global ice volume and deep water temperature signal. To date, δ18O alignment has been performed by manual, qualitative comparison or by deterministic algorithms. Here we present a hidden Markov model (HMM) probabilistic algorithm to find 95% confidence bands for δ18O alignment. This model considers the probability of every possible alignment based on its fit to the δ18O data and transition probabilities for sedimentation rate changes obtained from radiocarbon-based estimates for 37 cores. Uncertainty is assessed using a stochastic back trace recursion to sample alignments in exact proportion to their probability. We applied the algorithm to align 35 late Pleistocene records to a global benthic δ18O stack and found that the mean width of 95% confidence intervals varies between 3 and 23 kyr depending on the resolution and noisiness of the record's δ18O signal. Confidence bands within individual cores also vary greatly, ranging from ~0 to >40 kyr. These alignment uncertainty estimates will allow researchers to examine the robustness of their conclusions, including the statistical evaluation of lead-lag relationships between events observed in different cores.

  12. Sambamba: Fast processing of NGS alignment formats

    NARCIS (Netherlands)

    Tarasov, Artem; Vilella, Albert J.; Cuppen, Edwin; Nijman, Isaac J.; Prins, Pjotr

    2015-01-01

    Summary: Sambamba is a high-performance robust tool and library for working with SAM, BAM and CRAM sequence alignment files; the most common file formats for aligned next generation sequencing data. Sambamba is a faster alternative to samtools that exploits multi-core processing and dramatically red

  13. Instructional Alignment under No Child Left Behind

    Science.gov (United States)

    Polikoff, Morgan S.

    2012-01-01

    The alignment of instruction with the content of standards and assessments is the key mediating variable separating the policy of standards-based reform (SBR) from the outcome of improved student achievement. Few studies have investigated SBR's effects on instructional alignment, and most have serious methodological limitations. This research uses…

  14. Vacuum alignment with and without elementary scalars

    DEFF Research Database (Denmark)

    Alanne, Tommi; Gertov, Helene; Meroni, Aurora;

    2016-01-01

    We systematically elucidate differences and similarities of the vacuum alignment issue in composite and renormalizable elementary extensions of the Standard Model featuring a pseudo-Goldstone Higgs. We also provide general conditions for the stability of the vacuum in the elementary framework, thereby extending previous studies of the vacuum alignment.

  15. Optical packet switching without packet alignment

    DEFF Research Database (Denmark)

    Hansen, Peter Bukhave; Danielsen, Søren Lykke; Stubkjær, Kristian

    1998-01-01

    Operation without packet alignment of an all-optical packet switch is proposed and predicted feasible through a detailed traffic analysis. Packet alignment units are eliminated resulting in a simple switch architecture while optimal traffic performance is maintained through the flexibility provided...

  16. Evaluating Alignment between Curriculum, Assessment, and Instruction

    Science.gov (United States)

    Martone, Andrea; Sireci, Stephen G.

    2009-01-01

    The authors (a) discuss the importance of alignment for facilitating proper assessment and instruction, (b) describe the three most common methods for evaluating the alignment between state content standards and assessments, (c) discuss the relative strengths and limitations of these methods, and (d) discuss examples of applications of each…

  17. Partial Automated Alignment and Integration System

    Science.gov (United States)

    Kelley, Gary Wayne (Inventor)

    2014-01-01

    The present invention is a Partial Automated Alignment and Integration System (PAAIS) used to automate the alignment and integration of space vehicle components. A PAAIS includes ground support apparatuses, a track assembly with a plurality of energy-emitting components and an energy-receiving component containing a plurality of energy-receiving surfaces. Communication components and processors allow communication and feedback through PAAIS.

  18. A precise CT phantom alignment procedure.

    Science.gov (United States)

    Schneiders, N J; Bushong, S C

    1980-01-01

    Two of the AAPM CT performance phantom inserts require precise alignment. We present a method for aligning an insert which makes use of the partial volume effect. We demonstrate that the procedure is sensitive to tilts of less than one degree and, using the slice thickness insert, allows reproducible positioning.

  19. SOA-Driven Business-Software Alignment

    NARCIS (Netherlands)

    Shishkov, Boris; Sinderen, van Marten; Quartel, Dick

    2006-01-01

    The alignment of business processes and their supporting application software is a major concern during the initial software design phases. This paper proposes a design approach addressing this problem of business-software alignment. The approach takes an initial business model as a basis in derivin

  20. Aligning application architecture to the business context

    NARCIS (Netherlands)

    Wieringa, R.J.; Blanken, H.M.; Fokkinga, M.M.; Grefen, P.W.P.J.; Eder, J.; Missikoff, M.

    2003-01-01

    Alignment of application architecture to business architecture is a central problem in the design, acquisition and implementation of information systems in current large-scale information-processing organizations. Current research in architecture alignment is either too strategic or too software imp

  1. What is the Constructivism in Constructive Alignment?

    Science.gov (United States)

    Jervis, Loretta M.; Jervis, Les

    2005-01-01

    This paper examines the concept of constructive alignment in respect of science education. The concept is placed in the context of its two contributory components--constructivism and instructional alignment. The former has a well-established body of critical literature that highlights the challenges of constructivism for both science and science…

  2. Vacuum alignment with(out) elementary scalars

    CERN Document Server

    Alanne, Tommi; Meroni, Aurora; Sannino, Francesco

    2016-01-01

    We systematically elucidate differences and similarities of the vacuum alignment issue in composite and renormalizable elementary extensions of the Standard Model featuring a pseudo-Goldstone Higgs. We also provide general conditions for the stability of the vacuum in the elementary framework, thereby extending previous studies of the vacuum alignment.

  3. Predicting relatedness of bacterial genomes using the chaperonin-60 universal target (cpn60 UT): application to Thermoanaerobacter species.

    Science.gov (United States)

    Verbeke, Tobin J; Sparling, Richard; Hill, Janet E; Links, Matthew G; Levin, David; Dumonceaux, Tim J

    2011-05-01

    D.R. Zeigler determined that the sequence identity of bacterial genomes can be predicted accurately using the sequence identities of a corresponding set of genes that meet certain criteria [32]. This three-gene model for comparing bacterial genome pairs requires the determination of the sequence identities for recN, thdF, and rpoA. This involves the generation of approximately 4.2kb of genomic DNA sequence from each organism to be compared, and also normally requires that oligonucleotide primers be designed for amplification and sequencing based on the sequences of closely related organisms. However, we have developed an analogous mathematical model for predicting the sequence identity of whole genomes based on the sequence identity of the 542-567 base pair chaperonin-60 universal target (cpn60 UT). The cpn60 UT is accessible in nearly all bacterial genomes with a single set of universal primers, and its length is such that it can be completely sequenced in one pair of overlapping sequencing reads via di-deoxy sequencing. These mathematical models were applied to a set of Thermoanaerobacter isolates from a wood chip compost pile and it was shown that both the one-gene cpn60 UT-based model and the three-gene model based on recN, rpoA, and thdF predicted that these isolates could be classified as Thermoanaerobacter thermohydrosulfuricus. Furthermore, it was found that the genomic prediction model using cpn60 UT gave similar results to whole-genome sequence alignments over a broad range of taxa, suggesting that this method may have general utility for screening isolates and predicting their taxonomic affiliations.

  4. Circos: an information aesthetic for comparative genomics.

    Science.gov (United States)

    Krzywinski, Martin; Schein, Jacqueline; Birol, Inanç; Connors, Joseph; Gascoyne, Randy; Horsman, Doug; Jones, Steven J; Marra, Marco A

    2009-09-01

    We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.

  5. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
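
    As a rough illustration of the two quantities named above, the sketch below computes a length-weighted average nucleotide identity and an alignment fraction from a list of matched genes. The abstract does not give the formulas, so the definitions used here (identity weighted by aligned length, and aligned length divided by total gene length) are assumptions, as are the function and parameter names. No cut-off values are included because the abstract does not state them.

```python
def ani_and_af(hits, total_gene_length):
    """Return (ANI in percent, alignment fraction) for one genome against another.

    hits: iterable of (percent_identity, aligned_length) for matched genes.
    total_gene_length: summed length of all genes in the query genome.
    """
    hits = list(hits)
    aligned = sum(length for _, length in hits)
    if aligned == 0 or total_gene_length <= 0:
        raise ValueError("need at least one aligned gene and a positive total length")
    # Weight each gene's identity by its aligned length.
    ani = sum(identity * length for identity, length in hits) / aligned
    af = aligned / total_gene_length
    return ani, af
```

    Reporting both values matters because a high identity computed over a small aligned fraction says little about the genomes as a whole.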

  6. Semiautomatic beam-based LHC collimator alignment

    CERN Document Server

    Valentino, Gianluca; Bruce, Roderik; Wollmann, Daniel; Sammut, Nicholas; Rossi, Adriana; Redaelli, Stefano

    2012-01-01

    Full beam-based alignment of the LHC collimation system was a time-consuming procedure (up to 28 hours) as the collimators were set up manually. A yearly alignment campaign has been sufficient for now, although in the future due to tighter tolerances this may lead to a decrease in the cleaning efficiency if machine parameters such as the beam orbit drift over time. Automating the collimator setup procedure can reduce the beam time for collimator setup and allow for more frequent alignments, therefore reducing the risk of performance degradation. This article describes the design and testing of a semiautomatic algorithm as a first step towards a fully automatic setup procedure. The parameters used to measure the accuracy and performance of the alignment are defined and determined from experimental data. A comparison of these measured parameters at 450 GeV and 3.5 TeV with manual and semiautomatic alignment is provided.

  7. Bokeh Mirror Alignment for Cherenkov Telescopes

    CERN Document Server

    Ahnen, M L; Balbo, M; Bergmann, M; Biland, A; Blank, M; Bretz, T; Bruegge, K A; Buss, J; Domke, M; Dorner, D; Einecke, S; Hempfling, C; Hildebrand, D; Hughes, G; Lustermann, W; Mannheim, K; Mueller, S A; Neise, D; Neronov, A; Noethe, M; Overkemping, A -K; Paravac, A; Pauss, F; Rhode, W; Shukla, A; Temme, F; Thaele, J; Toscano, S; Vogler, P; Walter, R; Wilbert, A

    2016-01-01

    Imaging Atmospheric Cherenkov Telescopes (IACTs) need imaging optics with large apertures and high image intensities to map the faint Cherenkov light emitted from cosmic ray air showers onto their image sensors. Segmented reflectors fulfill these needs, and because they are composed of mass-produced mirror facets they are inexpensive and lightweight. However, as the overall image is a superposition of the individual facet images, alignment remains a challenge. Here we present a simple yet extendable method to align a segmented reflector using its Bokeh. Bokeh alignment does not need a star or good weather nights but can be done even during daytime. Bokeh alignment optimizes the facet orientations by comparing the segmented reflector's Bokeh to a predefined template. The optimal Bokeh template is highly constricted by the reflector's aperture and is easily accessible. The Bokeh is observed using the out-of-focus image of a nearby point-like light source at a distance of about 10 focal lengths. We introduce Bokeh alignment ...

  8. Grassmannian Differential Limited Feedback for Interference Alignment

    CERN Document Server

    Ayach, Omar El

    2011-01-01

    Channel state information (CSI) in the interference channel can be used to precode, align, and reduce the dimension of interference at the receivers, to achieve the channel's maximum multiplexing gain, through what is known as interference alignment. Most interference alignment algorithms require knowledge of all the interfering channels to compute the alignment precoders. CSI, considered available at the receivers, can be shared with the transmitters via limited feedback. When alignment is done by coding over frequency extensions in a single antenna system, the required CSI lies on the Grassmannian manifold and its structure can be exploited in feedback. Unfortunately, the number of channels to be shared grows with the square of the number of users creating too much overhead with conventional feedback methods. This paper proposes Grassmannian differential feedback to reduce feedback overhead by exploiting both the channel's temporal correlation and Grassmannian structure. The performance of the proposed algo...

  9. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments.

    Science.gov (United States)

    Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R; Verstrepen, Kevin J; Thevelein, Johan M; Tohme, Joe

    2014-04-01

    Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species.

  10. Algorithms for Automatic Alignment of Arrays

    Science.gov (United States)

    Chatterjee, Siddhartha; Gilbert, John R.; Oliker, Leonid; Schreiber, Robert; Sheffler, Thomas J.

    1996-01-01

    Aggregate data objects (such as arrays) are distributed across the processor memories when compiling a data-parallel language for a distributed-memory machine. The mapping determines the amount of communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: an alignment that maps all the objects to an abstract template, followed by a distribution that maps the template to the processors. This paper describes algorithms for solving the various facets of the alignment problem: axis and stride alignment, static and mobile offset alignment, and replication labeling. We show that optimal axis and stride alignment is NP-complete for general program graphs, and give a heuristic method that can explore the space of possible solutions in a number of ways. We show that some of these strategies can give better solutions than a simple greedy approach proposed earlier. We also show how local graph contractions can reduce the size of the problem significantly without changing the best solution. This allows more complex and effective heuristics to be used. We show how to model the static offset alignment problem using linear programming, and we show that loop-dependent mobile offset alignment is sometimes necessary for optimum performance. We describe an algorithm for determining mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself or can be used to improve performance. We describe an algorithm based on network flow that replicates objects so as to minimize the total amount of broadcast communication in replication.

  11. Accurate phylogenetic classification of DNA fragments based on sequence composition

    Energy Technology Data Exchange (ETDEWEB)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
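
    As a rough illustration of what "sequence composition" can mean as a classifier input, the sketch below turns a DNA fragment into a vector of k-mer frequencies. It is not PhyloPythia's implementation: the function name `kmer_profile`, the choice of k, and the normalization are assumptions, and the classifier itself is not reproduced here.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k=3):
    """Return relative frequencies of all 4**k DNA k-mers in a fixed order.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored, so the frequencies sum to 1 whenever any valid k-mer exists.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# A hypothetical fragment; each position in the vector is one k-mer frequency.
profile = kmer_profile("ACGTACGTTTGACC", k=2)
print(len(profile))
```

    Vectors of this kind have a fixed length regardless of fragment length, which is what makes them usable as features for fragments as short as those mentioned in the abstract.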

  12. Identification of probable genomic packaging signal sequence from SARS-CoV genome by bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    QIN Lei; XIONG Bin; LUO Cheng; GUO Zong-Ming; HAO Pei; SU Jiong; NAN Peng; FENG Ying; SHI Yi-Xiang; YU Xiao-Jing; LUO Xiao-Min; CHEN Kai-Xian; SHEN Xu; SHEN Jian-Hua; ZOU Jian-Ping; ZHAO Guo-Ping; SHI Tie-Liu; HE Wei-Zhong; ZHONG Yang; JIANG Hua-Liang; LI Yi-Xue

    2003-01-01

    AIM: To predict the probable genomic packaging signal of SARS-CoV by bioinformatics analysis. The derived packaging signal may be used to design antisense RNA and RNA interference (RNAi) drugs for treating SARS. METHODS: Based on studies of the genomic packaging signals of MHV and BCoV, especially the information about primary and secondary structures, the putative genomic packaging signal of SARS-CoV was analyzed using bioinformatic tools. Multi-alignment of the genomic sequences was performed among SARS-CoV, MHV, BCoV, PEDV and HCoV 229E. Secondary structures of RNA sequences were also predicted for the identification of the possible genomic packaging signals. Meanwhile, the N and M proteins of all five viruses were analyzed to study the evolutionary relationship with genomic packaging signals. RESULTS: The putative genomic packaging signal of SARS-CoV is located at the 3′ end of ORF1b, near that of MHV and BCoV, which is the most variable region of this gene. The RNA secondary structure of the SARS-CoV genomic packaging signal is very similar to that of MHV and BCoV. The same result was also obtained in studying the genomic packaging signals of PEDV and HCoV 229E. Furthermore, the genomic sequence multi-alignment indicated that the locations of the packaging signals of SARS-CoV, PEDV, and HCoV overlapped each other. It seems that the mutation rate of packaging signal sequences is much higher than that of the N protein, while the M protein shows only subtle variations. CONCLUSIONS: The probable genomic packaging signal of SARS-CoV is analogous to that of MHV and BCoV, with the corresponding secondary RNA structure located in a similar region of ORF1b. The positions where genomic packaging signals exist have undergone rounds of mutation, which may consequently influence the primary structures of the N and M proteins.

  13. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  14. Sequencing intractable DNA to close microbial genomes.

    Science.gov (United States)

    Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  15. Silicon Alignment Pins: An Easy Way to Realize a Wafer-to-Wafer Alignment

    Science.gov (United States)

    Jung-Kubiak, Cecile; Reck, Theodore J.; Lin, Robert H.; Peralta, Alejandro; Gill, John J.; Lee, Choonsup; Siles, Jose; Toda, Risaku; Chattopadhyay, Goutam; Cooper, Ken B.; Mehdi, Imran; Thomas, Bertrand

    2013-01-01

    Submillimeter heterodyne instruments play a critical role in addressing fundamental questions regarding the evolution of galaxies as well as being a crucial tool in planetary science. To make these instruments compatible with small platforms, especially for the study of the outer planets, or to enable the development of multi-pixel arrays, it is essential to reduce the mass, power, and volume of the existing single-pixel heterodyne receivers. Silicon micromachining technology is naturally suited for making these submillimeter and terahertz components, where precision and accuracy are essential. Waveguide and channel cavities are etched in a silicon bulk material using deep reactive ion etching (DRIE) techniques. Power amplifiers, multiplier and mixer chips are then integrated and the silicon pieces are stacked together to form a supercompact receiver front end. By using silicon micromachined packages for these components, instrument mass can be reduced and higher levels of integration can be achieved. A method is needed to assemble accurately these silicon pieces together, and a technique was developed here using etched pockets and silicon pins to align two wafers together.

  16. Accurate strand-specific quantification of viral RNA.

    Directory of Open Access Journals (Sweden)

    Nicole E Plaskon

    The presence of full-length complements of viral genomic RNA is a hallmark of RNA virus replication within an infected cell. As such, methods for detecting and measuring specific strands of viral RNA in infected cells and tissues are important in the study of RNA viruses. Strand-specific quantitative real-time PCR (ssqPCR) assays are increasingly being used for this purpose, but the accuracy of these assays depends on the assumption that the amount of cDNA measured during the quantitative PCR (qPCR) step accurately reflects amounts of a specific viral RNA strand present in the RT reaction. To specifically test this assumption, we developed multiple ssqPCR assays for the positive-strand RNA virus o'nyong-nyong (ONNV) that were based upon the most prevalent ssqPCR assay design types in the literature. We then compared various parameters of the ONNV-specific assays. We found that an assay employing standard unmodified virus-specific primers failed to discern the difference between cDNAs generated from virus specific primers and those generated through false priming. Further, we were unable to accurately measure levels of ONNV (-) strand RNA with this assay when higher levels of cDNA generated from the (+) strand were present. Taken together, these results suggest that assays of this type do not accurately quantify levels of the anti-genomic strand present during RNA virus infectious cycles. However, an assay permitting the use of a tag-specific primer was able to distinguish cDNAs transcribed from ONNV (-) strand RNA from other cDNAs present, thus allowing accurate quantification of the anti-genomic strand. We also report the sensitivities of two different detection strategies and chemistries, SYBR® Green and DNA hydrolysis probes, used with our tagged ONNV-specific ssqPCR assays. Finally, we describe development, design and validation of ssqPCR assays for chikungunya virus (CHIKV), the recent cause of large outbreaks of disease in the Indian Ocean

  17. BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark.

    Science.gov (United States)

    Thompson, Julie D; Koehl, Patrice; Ripp, Raymond; Poch, Olivier

    2005-10-01

    Multiple sequence alignment is one of the cornerstones of modern molecular biology. It is used to identify conserved motifs, to determine protein domains, in 2D/3D structure prediction by homology and in evolutionary studies. Recently, high-throughput technologies such as genome sequencing and structural proteomics have led to an explosion in the amount of sequence and structure information available. In response, several new multiple alignment methods have been developed that improve both the efficiency and the quality of protein alignments. Consequently, the benchmarks used to evaluate and compare these methods must also evolve. We present here the latest release of the most widely used multiple alignment benchmark, BAliBASE, which provides high quality, manually refined, reference alignments based on 3D structural superpositions. Version 3.0 of BAliBASE includes new, more challenging test cases, representing the real problems encountered when aligning large sets of complex sequences. Using a novel, semiautomatic update protocol, the number of protein families in the benchmark has been increased and representative test cases are now available that cover most of the protein fold space. The total number of proteins in BAliBASE has also been significantly increased from 1444 to 6255 sequences. In addition, full-length sequences are now provided for all test cases, which represent difficult cases for both global and local alignment programs. Finally, the BAliBASE Web site (http://www-bio3d-igbmc.u-strasbg.fr/balibase) has been completely redesigned to provide a more user-friendly, interactive interface for the visualization of the BAliBASE reference alignments and the associated annotations.
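
    To make concrete how a test alignment can be compared with a reference alignment, the sketch below computes a sum-of-pairs style score: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The abstract does not describe the scoring used with the benchmark, so this is an assumed, commonly used measure rather than the benchmark's own program; the function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Collect residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings with '-' as the gap.
    Each residue is identified by (sequence index, residue index without gaps).
    """
    pairs = set()
    positions = [0] * len(alignment)
    for col in range(len(alignment[0])):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != "-":
                present.append((s, positions[s]))
                positions[s] += 1
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(aligned_pairs(test) & ref) / len(ref)


reference = ["AC-GT", "ACAGT"]
test = ["ACG-T", "ACAGT"]
print(sum_of_pairs_score(test, reference))
```

    Because residues are identified by their ungapped positions, the score is unaffected by columns that contain only gaps or by where gap-only padding is placed.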

  18. Alignment and prediction of cis-regulatory modules based on a probabilistic model of evolution.

    Science.gov (United States)

    He, Xin; Ling, Xu; Sinha, Saurabh

    2009-03-01

    Cross-species comparison has emerged as a powerful paradigm for predicting cis-regulatory modules (CRMs) and understanding their evolution. The comparison requires reliable sequence alignment, which remains a challenging task for less conserved noncoding sequences. Furthermore, the existing models of DNA sequence evolution generally do not explicitly treat the special properties of CRM sequences. To address these limitations, we propose a model of CRM evolution that captures different modes of evolution of functional transcription factor binding sites (TFBSs) and the background sequences. A particularly novel aspect of our work is a probabilistic model of gains and losses of TFBSs, a process being recognized as an important part of regulatory sequence evolution. We present a computational framework that uses this model to solve the problems of CRM alignment and prediction. Our alignment method is similar to existing methods of statistical alignment but uses the conserved binding sites to improve alignment. Our CRM prediction method deals with the inherent uncertainties of binding site annotations and sequence alignment in a probabilistic framework. In simulated as well as real data, we demonstrate that our program is able to improve both alignment and prediction of CRM sequences over several state-of-the-art methods. Finally, we used alignments produced by our program to study binding site conservation in genome-wide binding data of key transcription factors in the Drosophila blastoderm, with two intriguing results: (i) the factor-bound sequences are under strong evolutionary constraints even if their neighboring genes are not expressed in the blastoderm and (ii) binding sites in distal bound sequences (relative to transcription start sites) tend to be more conserved than those in proximal regions. Our approach is implemented as software, EMMA (Evolutionary Model-based cis-regulatory Module Analysis), ready to be applied in a broad biological context.

  19. Alignment and prediction of cis-regulatory modules based on a probabilistic model of evolution.

    Directory of Open Access Journals (Sweden)

    Xin He

    2009-03-01

    Cross-species comparison has emerged as a powerful paradigm for predicting cis-regulatory modules (CRMs) and understanding their evolution. The comparison requires reliable sequence alignment, which remains a challenging task for less conserved noncoding sequences. Furthermore, the existing models of DNA sequence evolution generally do not explicitly treat the special properties of CRM sequences. To address these limitations, we propose a model of CRM evolution that captures different modes of evolution of functional transcription factor binding sites (TFBSs) and the background sequences. A particularly novel aspect of our work is a probabilistic model of gains and losses of TFBSs, a process being recognized as an important part of regulatory sequence evolution. We present a computational framework that uses this model to solve the problems of CRM alignment and prediction. Our alignment method is similar to existing methods of statistical alignment but uses the conserved binding sites to improve alignment. Our CRM prediction method deals with the inherent uncertainties of binding site annotations and sequence alignment in a probabilistic framework. In simulated as well as real data, we demonstrate that our program is able to improve both alignment and prediction of CRM sequences over several state-of-the-art methods. Finally, we used alignments produced by our program to study binding site conservation in genome-wide binding data of key transcription factors in the Drosophila blastoderm, with two intriguing results: (i) the factor-bound sequences are under strong evolutionary constraints even if their neighboring genes are not expressed in the blastoderm and (ii) binding sites in distal bound sequences (relative to transcription start sites) tend to be more conserved than those in proximal regions. Our approach is implemented as software, EMMA (Evolutionary Model-based cis-regulatory Module Analysis), ready to be applied in a broad biological context.

  20. Proper alignment of the microscope.

    Science.gov (United States)

    Rottenfusser, Rudi

    2013-01-01

    The light microscope is merely the first element of an imaging system in a research facility. Such a system may include high-speed and/or high-resolution image acquisition capabilities, confocal technologies, and super-resolution methods of various types. Yet more than ever, the proverb "garbage in-garbage out" remains a fact. Image manipulations may be used to conceal a suboptimal microscope setup, but an artifact-free image can only be obtained when the microscope is optimally aligned, both mechanically and optically. Something else is often overlooked in the quest to get the best image out of the microscope: Proper sample preparation! The microscope optics can only do its job when its design criteria are matched to the specimen or vice versa. The specimen itself, the mounting medium, the cover slip, and the type of immersion medium (if applicable) are all part of the total optical makeup. To get the best results out of a microscope, understanding the functions of all of its variable components is important. Only then one knows how to optimize these components for the intended application. Different approaches might be chosen to discuss all of the microscope's components. We decided to follow the light path which starts with the light source and ends at the camera or the eyepieces. To add more transparency to this sequence, the section up to the microscope stage was called the "Illuminating Section", to be followed by the "Imaging Section" which starts with the microscope objective. After understanding the various components, we can start "working with the microscope." To get the best resolution and contrast from the microscope, the practice of "Koehler Illumination" should be understood and followed by every serious microscopist. Step-by-step instructions as well as illustrations of the beam path in an upright and inverted microscope are included in this chapter. A few practical considerations are listed in Section 3.

  1. PlantLoc: an accurate web server for predicting plant protein subcellular localization by substantiality motif

    OpenAIRE

    Tang, Shengnan; Li, Tonghua; Cong, Peisheng; Xiong, Wenwei; Wang, Zhiheng; Sun, Jiangming

    2013-01-01

    Knowledge of subcellular localizations (SCLs) of plant proteins relates to their functions and aids in understanding the regulation of biological processes at the cellular level. We present PlantLoc, a highly accurate and fast webserver for predicting the multi-label SCLs of plant proteins. The PlantLoc server has two innovative characters: building localization motif libraries by a recursive method without alignment and Gene Ontology information; and establishing simple architecture for rapi...

  2. Human-mouse comparative genomics: successes and failures to reveal functional regions of the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Baroukh, Nadine; Rubin, Edward M.

    2003-05-15

    Deciphering the genetic code embedded within the human genome remains a significant challenge despite the human genome consortium's recent success at defining its linear sequence (Lander et al. 2001; Venter et al. 2001). While useful strategies exist to identify a large percentage of protein encoding regions, efforts to accurately define functional sequences in the remaining ~97 percent of the genome lag. Our primary interest has been to utilize the evolutionary relationship and the universal nature of genomic sequence information in vertebrates to reveal functional elements in the human genome. This has been achieved through the combined use of vertebrate comparative genomics to pinpoint highly conserved sequences as candidates for biological activity and transgenic mouse studies to address the functionality of defined human DNA fragments. Accordingly, we describe strategies and insights into functional sequences in the human genome through the use of comparative genomics coupled with functional studies in the mouse.

  3. Galaxy alignment on large and small scales

    CERN Document Server

    Kang, X; Wang, Y O; Dutton, A; Macciò, A

    2014-01-01

    Galaxies are not randomly distributed across the universe but show different kinds of alignment on different scales. On small scales, satellite galaxies tend to be distributed along the major axis of the central galaxy, with a dependence on galaxy properties such that both red satellites and red centrals show stronger alignment than their blue counterparts. On large scales, it is found that the major axes of Luminous Red Galaxies (LRGs) are correlated up to 30 Mpc/h. Using hydro-dynamical simulation with star formation, we investigate the origin of galaxy alignment on different scales. It is found that most red satellite galaxies stay in the inner region of the dark matter halo, inside which the shape of the central galaxy is well aligned with the dark matter distribution. Red centrals have stronger alignment than blue ones as they live in massive haloes and the central galaxy-halo alignment increases with halo mass. On large scales, the alignment of LRGs is also from the galaxy-halo shape correlation, but with some ex...

  4. Sparse alignment for robust tensor learning.

    Science.gov (United States)

    Lai, Zhihui; Wong, Wai Keung; Xu, Yong; Zhao, Cairong; Sun, Mingming

    2014-10-01

    Multilinear/tensor extensions of manifold learning based algorithms have been widely used in computer vision and pattern recognition. This paper first provides a systematic analysis of the multilinear extensions for the most popular methods by using alignment techniques, thereby obtaining a general tensor alignment framework. From this framework, it is easy to show that the manifold learning based tensor learning methods are intrinsically different from the alignment techniques. Based on the alignment framework, a robust tensor learning method called sparse tensor alignment (STA) is then proposed for unsupervised tensor feature extraction. Different from the existing tensor learning methods, L1- and L2-norms are introduced to enhance the robustness in the alignment step of the STA. The advantage of the proposed technique is that the difficulty in selecting the size of the local neighborhood can be avoided in the manifold learning based tensor feature extraction algorithms. Although STA is an unsupervised learning method, the sparsity encodes the discriminative information in the alignment step and provides the robustness of STA. Extensive experiments on the well-known image databases as well as action and hand gesture databases by encoding object images as tensors demonstrate that the proposed STA algorithm gives the most competitive performance when compared with the tensor-based unsupervised learning methods.

  5. Gene finding in novel genomes

    Directory of Open Access Journals (Sweden)

    Korf Ian

    2004-05-01

    Background: Computational gene prediction continues to be an important problem, especially for genomes with little experimental data. Results: I introduce the SNAP gene finder which has been designed to be easily adaptable to a variety of genomes. In novel genomes without an appropriate gene finder, I demonstrate that employing a foreign gene finder can produce highly inaccurate results, and that the most compatible parameters may not come from the nearest phylogenetic neighbor. I find that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate. Conclusion: Since gene prediction is sensitive to species-specific parameters, every genome needs a dedicated gene finder.

  6. Genomic selection in small dairy cattle populations

    DEFF Research Database (Denmark)

    Thomasen, Jørn Rind

    Genomic selection provides more accurate estimation of genetic merit for breeding candidates without own recordings and is now an integrated part of most dairy breeding schemes. However, the method has turned out to be less efficient in the numerically smaller breeds. This thesis focuses on optimization of genomic selection for a small dairy cattle breed such as Danish Jersey. Implementing genetically superior breeding schemes thus requires more accurate genomic predictions. Besides international collaboration, genotyping of cows is an efficient way to obtain more accurate genomic predictions...

  7. Precise Alignment and Permanent Mounting of Thin and Lightweight X-ray Segments

    Science.gov (United States)

    Biskach, Michael P.; Chan, Kai-Wing; Hong, Melinda N.; Mazzarella, James R.; McClelland, Ryan S.; Norman, Michael J.; Saha, Timo T.; Zhang, William W.

    2012-01-01

    To provide observations to support current research efforts in high energy astrophysics, future X-ray telescope designs must provide matching or better angular resolution while significantly increasing the total collecting area. In such a design the permanent mounting of thin and lightweight segments is critical to the overall performance of the complete X-ray optic assembly. The thin and lightweight segments used in the assembly of the modules are designed to maintain and/or exceed the resolution of existing X-ray telescopes while providing a substantial increase in collecting area. Such thin and delicate X-ray segments are easily distorted and yet must be aligned to the arcsecond level and retain accurate alignment for many years. The Next Generation X-ray Optic (NGXO) group at NASA Goddard Space Flight Center has designed, assembled, and implemented new hardware and procedures with the short term goal of aligning three pairs of X-ray segments in a technology demonstration module while maintaining 10 arcsec alignment through environmental testing as part of the eventual design and construction of a full sized module capable of housing hundreds of X-ray segments. The recent attempts at multiple segment pair alignment and permanent mounting are described along with an overview of the procedure used. A look into what the next year will bring for the alignment and permanent segment mounting effort illustrates some of the challenges left to overcome before an attempt to populate a full sized module can begin.

  8. Automatic laser beam alignment using blob detection for an environment monitoring spectroscopy

    Science.gov (United States)

    Khidir, Jarjees; Chen, Youhua; Anderson, Gary

    2013-05-01

    This paper describes a fully automated system to align an infra-red laser beam with a small retro-reflector over a wide range of distances. The component development and test were especially used for an open-path spectrometer gas detection system. Using blob detection under OpenCV library, an automatic alignment algorithm was designed to achieve fast and accurate target detection in a complex background environment. Test results are presented to show that the proposed algorithm has been successfully applied to various target distances and environment conditions.
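
    The idea of steering a beam from a detected blob can be sketched without any imaging library. The code below is a pure-Python stand-in, not the OpenCV-based algorithm of the paper: it thresholds a small grayscale grid, groups bright pixels into 4-connected blobs, and reports how far the largest blob's centroid is from the image center. All function names, the threshold, and the sample image are assumptions for illustration.

```python
from collections import deque


def find_blobs(image, threshold):
    """Group pixels with value >= threshold into 4-connected blobs."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(pixels)
    return blobs


def pointing_error(image, threshold):
    """Offset (rows, cols) of the largest blob's centroid from the image center."""
    blobs = find_blobs(image, threshold)
    if not blobs:
        return None
    largest = max(blobs, key=len)
    cy = sum(y for y, _ in largest) / len(largest)
    cx = sum(x for _, x in largest) / len(largest)
    return (cy - (rows_center(image)), cx - (cols_center(image)))


def rows_center(image):
    return (len(image) - 1) / 2


def cols_center(image):
    return (len(image[0]) - 1) / 2


frame = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [8, 0, 0, 0, 0],
]
print(pointing_error(frame, threshold=5))
```

    A control loop could drive the pointing mechanism until this offset approaches zero; picking the largest blob is one simple way to ignore small bright spots in a cluttered background.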

  9. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment.

    Science.gov (United States)

    Remmert, Michael; Biegert, Andreas; Hauser, Andreas; Söding, Johannes

    2011-12-25

    Sequence-based protein function and structure prediction depends crucially on sequence-search sensitivity and accuracy of the resulting sequence alignments. We present an open-source, general-purpose tool that represents both query and database sequences by profile hidden Markov models (HMMs): 'HMM-HMM-based lightning-fast iterative sequence search' (HHblits; http://toolkit.genzentrum.lmu.de/hhblits/). Compared to the sequence-search tool PSI-BLAST, HHblits is faster owing to its discretized-profile prefilter, has 50-100% higher sensitivity and generates more accurate alignments.

  10. The UCSC Genome Browser database: 2017 update

    Science.gov (United States)

    Tyner, Cath; Barber, Galt P.; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Eisenhart, Christopher; Fischer, Clayton M.; Gibson, David; Gonzalez, Jairo Navarro; Guruvadoo, Luvina; Haeussler, Maximilian; Heitner, Steve; Hinrichs, Angie S.; Karolchik, Donna; Lee, Brian T.; Lee, Christopher M.; Nejad, Parisa; Raney, Brian J.; Rosenbloom, Kate R.; Speir, Matthew L.; Villarreal, Chris; Vivian, John; Zweig, Ann S.; Haussler, David; Kuhn, Robert M.; Kent, W. James

    2017-01-01

    Since its 2001 debut, the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) team has provided continuous support to the international genomics and biomedical communities through a web-based, open source platform designed for the fast, scalable display of sequence alignments and annotations landscaped against a vast collection of quality reference genome assemblies. The browser's publicly accessible databases are the backbone of a rich, integrated bioinformatics tool suite that includes a graphical interface for data queries and downloads, alignment programs, command-line utilities and more. This year's highlights include newly designed home and gateway pages; a new ‘multi-region’ track display configuration for exon-only, gene-only and custom regions visualization; new genome browsers for three species (brown kiwi, crab-eating macaque and Malayan flying lemur); eight updated genome assemblies; extended support for new data types such as CRAM, RNA-seq expression data and long-range chromatin interaction pairs; and the unveiling of a new supported mirror site in Japan. PMID:27899642

  11. The UCSC Genome Browser database: 2017 update.

    Science.gov (United States)

    Tyner, Cath; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Eisenhart, Christopher; Fischer, Clayton M; Gibson, David; Gonzalez, Jairo Navarro; Guruvadoo, Luvina; Haeussler, Maximilian; Heitner, Steve; Hinrichs, Angie S; Karolchik, Donna; Lee, Brian T; Lee, Christopher M; Nejad, Parisa; Raney, Brian J; Rosenbloom, Kate R; Speir, Matthew L; Villarreal, Chris; Vivian, John; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James

    2017-01-04

    Since its 2001 debut, the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) team has provided continuous support to the international genomics and biomedical communities through a web-based, open source platform designed for the fast, scalable display of sequence alignments and annotations landscaped against a vast collection of quality reference genome assemblies. The browser's publicly accessible databases are the backbone of a rich, integrated bioinformatics tool suite that includes a graphical interface for data queries and downloads, alignment programs, command-line utilities and more. This year's highlights include newly designed home and gateway pages; a new 'multi-region' track display configuration for exon-only, gene-only and custom regions visualization; new genome browsers for three species (brown kiwi, crab-eating macaque and Malayan flying lemur); eight updated genome assemblies; extended support for new data types such as CRAM, RNA-seq expression data and long-range chromatin interaction pairs; and the unveiling of a new supported mirror site in Japan.

  12. NCBI prokaryotic genome annotation pipeline.

    Science.gov (United States)

    Tatusova, Tatiana; DiCuccio, Michael; Badretdin, Azat; Chetvernin, Vyacheslav; Nawrocki, Eric P; Zaslavsky, Leonid; Lomsadze, Alexandre; Pruitt, Kim D; Borodovsky, Mark; Ostell, James

    2016-08-19

    Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment-based methods with methods of predicting protein-coding and RNA genes and other functional elements directly from sequence. A new gene finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, NCBI's new Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/.

  13. Coelostat and heliostat - Theory of alignment

    Science.gov (United States)

    Demianski, M.; Pasachoff, J. M.

    1984-06-01

    For perfectly aligned heliostats and coelostats tracking at the solar rate and half the solar rate, respectively, the solar beam has no translational motion. But, particularly in the field at eclipses, it is not possible to align heliostats and coelostats with infinite precision. The authors derive the effect of small misalignments on the translational motion of the beam, and give tables to allow the calculation of the accuracy to which the instruments must be mounted and adjusted to attain a desired accuracy over a given duration. Further, it is shown how to derive the necessary adjustments to improve alignment, given measurements of the tracking error.

  14. Rotational Alignment Altered by Source Position Correlations

    Science.gov (United States)

    Jacobs, Chris S.; Heflin, M. B.; Lanyi, G. E.; Sovers, O. J.; Steppe, J. A.

    2010-01-01

    In the construction of modern Celestial Reference Frames (CRFs) the overall rotational alignment is only weakly constrained by the data. Therefore, common practice has been to apply a 3-dimensional No-Net-Rotation (NNR) constraint in order to align an under-construction frame to the ICRF. We present evidence that correlations amongst source position parameters must be accounted for in order to properly align a CRF at the 5-10 μas level of uncertainty found in current work. Failure to do so creates errors at the 10-40 μas level.

  15. Robust and resistant 2D shape alignment

    DEFF Research Database (Denmark)

    Larsen, Rasmus; Eiriksson, Hrafnkell

    2001-01-01

    We express the alignment of 2D shapes as the minimization of the norm of a linear vector function. The minimization is done in the $l_1$, $l_2$ and the $l_\infty$ norms using well known standard numerical methods. In particular, the $l_1$ and the $l_\infty$ norm alignments are formulated as linear programming problems. The linear vector function formulation along with the different norms results in alignment methods that are both resistant from influence from outliers, robust wrt. errors in the annotation and capable of handling missing datapoints.
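
    As an illustration of the least-squares (l2) case only, the sketch below aligns one 2D point set onto another with a similarity transform (rotation, scale, translation). It is not the authors' code: the function names `align_l2` and `apply_alignment` are hypothetical, and the complex-number formulation is an assumption chosen because it makes the transform linear in its parameters, which matches the abstract's "linear vector function" wording. The l1 and l-infinity cases would need a linear programming solver and are not shown.

```python
def align_l2(shape, target):
    """Least-squares similarity alignment of `shape` onto `target`.

    Both arguments are equal-length lists of (x, y) landmarks in
    corresponding order. Points are treated as complex numbers, so the
    transform is w ~ a * z + t with complex a (rotation and scale) and
    complex t (translation). Returns (a, t).
    """
    z = [complex(x, y) for x, y in shape]
    w = [complex(x, y) for x, y in target]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    # Centre both shapes; the optimal translation follows from the means.
    zc = [p - z_mean for p in z]
    wc = [p - w_mean for p in w]
    denom = sum(abs(p) ** 2 for p in zc)
    if denom == 0:
        raise ValueError("shape is degenerate (all landmarks coincide)")
    # Closed-form minimiser of sum |a * zc_i - wc_i|^2.
    a = sum(p.conjugate() * q for p, q in zip(zc, wc)) / denom
    t = w_mean - a * z_mean
    return a, t


def apply_alignment(shape, a, t):
    """Apply the transform z -> a * z + t to a list of (x, y) points."""
    out = []
    for x, y in shape:
        p = a * complex(x, y) + t
        out.append((p.real, p.imag))
    return out


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
# The same square rotated by 90 degrees, scaled by 2 and shifted by (3, -1).
moved = [(3, -1), (3, 1), (1, 1), (1, -1)]
a, t = align_l2(square, moved)
aligned = apply_alignment(square, a, t)
```

    Because the squared residual is dominated by large errors, a single badly placed landmark can pull this estimate noticeably, which is the motivation the abstract gives for also considering other norms.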

  16. Alignment and Sensitive Detection of DNA by a Moving Interface

    Science.gov (United States)

    Bensimon, A.; Simon, A.; Chiffaudel, A.; Croquette, V.; Heslot, F.; Bensimon, D.

    1994-09-01

    In a process called "molecular combing," DNA molecules attached at one end to a solid surface were extended and aligned by a receding air-water interface and left to dry on the surface. Molecular combing was observed to extend the length of the bacteriophage λ DNA molecule to 21.5 ± 0.5 micrometers (unextended length, 16.2 micrometers). With the combing process, it was possible to (i) extend a chromosomal Escherichia coli DNA fragment (10^6 base pairs) and (ii) detect a minute quantity of DNA (10^3 molecules). These results open the way for a faster physical mapping of the genome and for the detection of small quantities of target DNA from a population of molecules.

  17. Draft genome of a commonly misdiagnosed multidrug resistant pathogen Candida auris

    OpenAIRE

    2015-01-01

    Background Candida auris is a multidrug resistant, emerging agent of fungemia in humans. Its actual global distribution remains obscure as the current commercial methods of clinical diagnosis misidentify it as C. haemulonii. Here we report the first draft genome of C. auris to explore the genomic basis of virulence and unique differences that could be employed for differential diagnosis. Results More than 99.5 % of the C. auris genomic reads did not align to the current whole (or draft) genom...

  18. A collection of plant-specific genomic data and resources at NCBI.

    Science.gov (United States)

    Tatusova, Tatiana; Smith-White, Brian; Ostell, James

    2007-01-01

    The National Center for Biotechnology Information (NCBI) provides a data-rich environment in support of genomic research by collecting the biological data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains and integrating the data with analytical, search, and retrieval resources through the NCBI Web site. Entrez, an integrated search and retrieval system, enables text searches across various diverse biological databases maintained at NCBI. Map Viewer, the genome browser developed at NCBI, displays aligned genetic, physical, and sequence maps for eukaryotic genomes including those of many plants. A specialized plant query page allows maps from all plant genomes available in the Map Viewer to be searched to produce a display of aligned maps from several species. Customized Plant Basic Local Alignment Search Tool (PlantBLAST) allows the user to perform sequence similarity searches in a special collection of mapped plant sequence data and to view the resulting alignments within a genomic context using Map Viewer. In addition, pre-computed sequence similarities, such as those for proteins offered by BLAST Link (BLink), enable fluid navigation from un-annotated to annotated sequences, quickening the pace of discovery. Plant Genome Central (PGC) is a Web portal that provides centralized access to all NCBI plant genome resources. Also, there are links to plant-specific Web resources external to NCBI such as organism-specific databases, genome-sequencing project Web pages, and homepages of genomic bioinformatics organizations.

  19. A Fast and Specific Alignment Method for Minisatellite Maps

    Directory of Open Access Journals (Sweden)

    Eric Rivals

    2006-01-01

    Background: Variable minisatellites count among the most polymorphic markers of eukaryotic and prokaryotic genomes. This variability can affect gene coding regions, like in the prion protein gene, or gene regulation regions, like for the cystatin B gene, and be associated with or implicated in diseases: the Creutzfeldt-Jakob disease and the myoclonus epilepsy type 1, for our examples. When it affects neutrally evolving regions, the polymorphism in length (i.e., in number of copies) of minisatellites proved useful in population genetics. Motivation: In these tandem repeat sequences, different mutational mechanisms let the number of copies, as well as the copies themselves, vary. Especially, the interspersion of events of tandem duplication/contraction and of punctual mutation makes the succession of variant repeats much more informative than the sole allele length. To exploit this information requires the ability to align minisatellite alleles by accounting for both punctual mutations and tandem duplications. Results: We propose a minisatellite maps alignment program that improves on previous solutions. Our new program is faster, simpler, considers an extended evolutionary model, and is available to the community. We test it on the data set of 609 alleles of the MSY1 (DYF155S1) human minisatellite and confirm its ability to recover known evolutionary signals. Our experiments highlight that the informativeness of minisatellites resides in their length and composition polymorphisms. Exploiting both simultaneously is critical to unravel the implications of variable minisatellites in the control of gene expression and diseases. Availability: Software is available at http://atgc.lirmm.fr/ms_align/

  20. Aligned Fibrous Scaffold Induced Aligned Growth of Corneal Stroma Cells in vitro Culture

    Institute of Scientific and Technical Information of China (English)

    GAO Yan; YAN Jing; CUI Xue-jun; WANG Hong-yan; WANG Qing

    2012-01-01

    To investigate the contribution of fibre arrangement to guiding the aligned growth of corneal stroma cells, aligned and randomly oriented fibrous scaffolds of gelatin and poly-L-lactic acid (PLLA) were fabricated by electrospinning. A comparative study of two different systems with corneal stroma cells on randomly organized and aligned fibres was conducted. The efficiency of the scaffolds for inducing the aligned growth of cells was assessed by morphological observation and 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl-tetrazolium bromide (MTT) assay. Results show that the cells cultured on both randomly oriented and aligned scaffolds maintained normal morphology, spread well and proliferated over the long term. Importantly, corneal stroma cells grew in a highly ordered manner on the aligned scaffold, while the cells grew in a disordered manner on the randomly oriented scaffold. Moreover, the cells exhibited higher viability on the aligned scaffold than on the randomly oriented scaffold. These results indicate that preparing aligned fibrous scaffolds by electrospinning provides an effective approach to the aligned growth of corneal stroma cells in vitro. Our findings that fibre arrangement plays a crucial role in guiding the aligned growth of cells may be helpful to the development of better biomaterials for tissue engineered cornea.

  1. Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

    Energy Technology Data Exchange (ETDEWEB)

    Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R

    2007-12-10

    EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.

  2. Fractal MapReduce decomposition of sequence alignment

    Directory of Open Access Journals (Sweden)

    Almeida Jonas S

    2012-05-01

    Background: The dramatic fall in the cost of genomic sequencing, and the increasing convenience of distributed cloud computing resources, positions the MapReduce coding pattern as a cornerstone of scalable bioinformatics algorithm development. In some cases an algorithm will find a natural distribution via use of map functions to process vectorized components, followed by a reduce of aggregate intermediate results. However, for some data analysis procedures such as sequence analysis, a more fundamental reformulation may be required. Results: In this report we describe a solution to sequence comparison that can be thoroughly decomposed into multiple rounds of map and reduce operations. The route taken makes use of iterated maps, a fractal analysis technique that has been found to provide an "alignment-free" solution to sequence analysis and comparison. That is, a solution that does not require dynamic programming, relying on a numeric Chaos Game Representation (CGR) data structure. This claim is demonstrated in this report by calculating the length of the longest similar segment by inspecting only the USM coordinates of two analogous units, with no resort to dynamic programming. Conclusions: The procedure described is an attempt at extreme decomposition and parallelization of sequence alignment in anticipation of a volume of genomic sequence data that cannot be met by current algorithmic frameworks. The solution found is delivered with a browser-based application (webApp), highlighting the browser's emergence as an environment for high performance distributed computing. Availability: Public distribution of accompanying software library with open source and version control at http://usm.github.com. Also available as a webApp through Google Chrome's WebStore http://chrome.google.com/webstore: search with "usm".
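
    To make the Chaos Game Representation idea concrete, here is a minimal sketch of the classic forward CGR iteration for DNA. It is not the USM library referenced in the abstract: the function name `cgr_points`, the corner assignment and the starting point are all assumptions (one common convention among several). Each step moves halfway from the current point towards the corner of the incoming nucleotide.

```python
# Corner assignment for the unit square; an assumed convention,
# not taken from the abstract.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR coordinate reached after each base of `seq`.

    Raises KeyError for symbols other than A, C, G, T.
    """
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway towards the corner of the current base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


# Two sequences that differ in their prefix but share the suffix "CGT".
end_a = cgr_points("AAAACGT")[-1]
end_b = cgr_points("TTTTCGT")[-1]
```

    Since every step halves the distance to a fixed corner, two sequences that share their last k bases end up within 2^-k of each other in each coordinate. This is the property that lets coordinate comparison stand in for explicit symbol-by-symbol matching; how the published method turns it into a longest-similar-segment estimate is not reproduced here.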

  3. A quantitative account of genomic island acquisitions in prokaryotes

    Directory of Open Access Journals (Sweden)

    Roos Tom E

    2011-08-01

    Background: Microbial genomes do not merely evolve through the slow accumulation of mutations, but also, and often more dramatically, by taking up new DNA in a process called horizontal gene transfer. These innovation leaps in the acquisition of new traits can take place via the introgression of single genes, but also through the acquisition of large gene clusters, which are termed Genomic Islands. Since only a small proportion of all the DNA diversity has been sequenced, it can be hard to find the appropriate donors for acquired genes via sequence alignments from databases. In contrast, relative oligonucleotide frequencies represent a remarkably stable genomic signature in prokaryotes, which facilitates compositional comparisons as an alignment-free alternative for phylogenetic relatedness. In this project, we test whether Genomic Islands identified in individual bacterial genomes have a similar genomic signature, in terms of relative dinucleotide frequencies, and can therefore be expected to originate from a common donor species. Results: When multiple Genomic Islands are present within a single genome, we find that up to 28% of these are compositionally very similar to each other, indicative of frequent recurring acquisitions from the same donor to the same acceptor. Conclusions: This represents the first quantitative assessment of common directional transfer events in prokaryotic evolutionary history. We suggest that many of the resident Genomic Islands per prokaryotic genome originated from the same source, which may have implications with respect to their regulatory interactions, and for the elucidation of the common origins of these acquired gene clusters.
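
    A relative dinucleotide frequency signature of the kind mentioned above can be sketched as follows. The abstract does not give the authors' exact formula, so this assumes the commonly used odds ratio (observed dinucleotide frequency divided by the product of the two mononucleotide frequencies) on a single strand, with a mean absolute difference as the distance; the function names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Map each of the 16 dinucleotides XY to f_XY / (f_X * f_Y).

    Assumes len(seq) >= 2. Symbols other than A, C, G, T are counted in
    the totals but get no signature entry. A dinucleotide whose bases do
    not occur in the sequence is reported as 0.0.
    """
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = len(seq)
    n_di = len(seq) - 1
    signature = {}
    for x, y in product(BASES, repeat=2):
        fx = mono[x] / n_mono
        fy = mono[y] / n_mono
        fxy = di[x + y] / n_di
        signature[x + y] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)


sig = dinucleotide_signature("ACAC")
same = signature_distance("ACGTTGCA", "ACGTTGCA")
diff = signature_distance("AAAA", "ACAC")
```

    Under these assumptions, two islands acquired from the same donor would be expected to give a small `signature_distance`, with no alignment between them required. Any similarity threshold would have to be calibrated on real genomes and is not implied by this sketch.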

  4. PriFi - Using a Multiple Alignment of Related Sequences to Find Primers for  Amplification of Homologs

    DEFF Research Database (Denmark)

    Fredslund, Jakob; Schauser, Leif; Madsen, Lene Heegaard;

    2005-01-01

    Using a comparative approach, the web program PriFi (http://cgi-www.daimi.au.dk/cgi-chili/PriFi/main) designs pairs of primers useful for PCR amplification of genomic DNA in species where prior sequence information is not available. The program works with an alignment of DNA sequences from...

  5. PASS2: an automated database of protein alignments organised as structural superfamilies

    Directory of Open Access Journals (Sweden)

    Sowdhamini Ramanathan

    2004-04-01

    Background: The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionarily related, sequentially distant proteins. Description: An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities.
Clustering of members allows us to understand diversification of

  6. Aligning parallel arrays to reduce communication

    Science.gov (United States)

    Sheffler, Thomas J.; Schreiber, Robert; Gilbert, John R.; Chatterjee, Siddhartha

    1994-01-01

    Axis and stride alignment is an important optimization in compiling data-parallel programs for distributed-memory machines. We previously developed an optimal algorithm for aligning array expressions. Here, we examine alignment for more general program graphs. We show that optimal alignment is NP-complete in this setting, so we study heuristic methods. This paper makes two contributions. First, we show how local graph transformations can reduce the size of the problem significantly without changing the best solution. This allows more complex and effective heuristics to be used. Second, we give a heuristic that can explore the space of possible solutions in a number of ways. We show that some of these strategies can give better solutions than a simple greedy approach proposed earlier. Our algorithms have been implemented; we present experimental results showing their effect on the performance of some example programs running on the CM-5.

  7. Alignment free characterization of 2D gratings

    CERN Document Server

    Madsen, Morten Hannibal; Hansen, Poul-Erik; Jørgensen, Jan Friis

    2015-01-01

    Fast characterization of 2-dimensional gratings is demonstrated using a Fourier lens optical system and a differential optimization algorithm. It is shown that both the grating specific parameters such as the basis vectors and the angle between them and the alignment of the sample, such as the rotation of the sample around the x-, y-, and z-axis, can be deduced from a single measurement. More specifically, the lattice vectors and the angle between them have been measured, while the corrections of the alignment parameters are used to improve the quality of the measurement, and hence reduce the measurement uncertainty. Alignment free characterization is demonstrated on both a 2D hexagonal grating with a period of 700 nm and a checkerboard grating with a pitch of 3000 nm. The method can also be used for both automatic alignment and in-line characterization of gratings.

  8. Faster exon assembly by sparse spliced alignment

    CERN Document Server

    Tiskin, Alexander

    2007-01-01

    Assembling a gene from candidate exons is an important problem in computational biology. Among the most successful approaches to this problem is spliced alignment, proposed by Gelfand et al., which scores different candidate exon chains within a DNA sequence of length $m$ by comparing them to a known related gene sequence of length $n$, $m = \Theta(n)$. Gelfand et al. gave an algorithm for spliced alignment running in time $O(n^3)$. Kent et al. considered sparse spliced alignment, where the number of candidate exons is $O(n)$, and proposed an algorithm for this problem running in time $O(n^{2.5})$. We improve on this result, by proposing an algorithm for sparse spliced alignment running in time $O(n^{2.25})$. Our approach is based on a new framework of quasi-local string comparison.

  9. Alignment of the NOMAD-STAR detector

    CERN Document Server

    Cervera-Villanueva, A

    2000-01-01

    This note describes the alignment of the NOMAD-STAR detector. This is the B4C-silicon target installed in the NOMAD spectrometer in 1997. NOMAD-STAR is composed of modules of 12 silicon detectors each giving a total length of 72 cm. Ten of these modules (called ladders) are assembled to form a layer. There are five layers interleaved with passive boron carbide plates. The total surface of silicon is 1.14 m^2. Energetic muons from the flat-top of the CERN SPS cycle provide the necessary information to perform a very precise software alignment. This alignment is needed to ensure that the impact parameter measurement needed for the identification of taus in a detector like NOMAD-STAR will not be limited by the error in the alignment. (15 refs).

  10. Robust and Efficient Parametric Face Alignment

    NARCIS (Netherlands)

    Tzimiropoulos, Georgios; Zafeiriou, Stefanos; Pantic, Maja

    2011-01-01

    We propose a correlation-based approach to parametric object alignment particularly suitable for face analysis applications which require efficiency and robustness against occlusions and illumination changes. Our algorithm registers two images by iteratively maximizing their correlation coefficient

  11. Molecular focusing and alignment with plasmon fields.

    Science.gov (United States)

    Artamonov, Maxim; Seideman, Tamar

    2010-12-01

    We show the possibility of simultaneously aligning molecules and focusing their center-of-mass motion near a metal nanoparticle in the field intensity gradient created by the surface plasmon enhancement of incident light. The rotational motion is described quantum mechanically while the translation is treated classically. The effects of the nanoparticle shape on the alignment and focusing are explored. Our results carry interesting implications to the field of molecular nanoplasmonics and suggest several potential applications in nanochemistry.

  12. Unscented Kalman filter for SINS alignment

    Institute of Scientific and Technical Information of China (English)

    Zhou Zhanxin; Gao Yanan; Chen Jiabin

    2007-01-01

    In order to improve the filter accuracy for the nonlinear error model of strapdown inertial navigation system (SINS) alignment, an Unscented Kalman Filter (UKF) is presented and simulated for SINS alignment on stationary and moving bases. Simulation results show the superior performance of this approach when compared with classical suboptimal techniques such as the extended Kalman filter in cases of large initial misalignment. The UKF also performs well in cases of small initial misalignment.

  13. The Nonlinear Evolution of Galaxy Intrinsic Alignments

    OpenAIRE

    Lee, Jounghun; Pen, Ue-Li

    2007-01-01

    The non-Gaussian contribution to the intrinsic halo spin alignments is analytically modeled and numerically detected. Assuming that the growth of non-Gaussianity in the density fluctuations caused the tidal field to have nonlinear-order effect on the orientations of the halo angular momentum, we model the intrinsic halo spin alignments as a linear scaling of the density correlations on large scales, which is different from the previous quadratic-scaling model based on the linear tidal torque ...

  14. Technology Alignment and Portfolio Prioritization (TAPP)

    Science.gov (United States)

    Funaro, Gregory V.; Alexander, Reginald A.

    2015-01-01

    Technology Alignment and Portfolio Prioritization (TAPP) is a method being developed by the Advanced Concepts Office, at NASA Marshall Space Flight Center. The TAPP method expands on current technology assessment methods by incorporating the technological structure underlying technology development, e.g., organizational structures and resources, institutional policy and strategy, and the factors that motivate technological change. This paper discusses the methods ACO is currently developing to better perform technology assessments while taking into consideration Strategic Alignment, Technology Forecasting, and Long Term Planning.

  15. Aligned natural inflation: Monodromies of two axions

    Directory of Open Access Journals (Sweden)

    Rolf Kappl

    2014-10-01

    Natural (axionic) inflation [1] can accommodate sizeable primordial tensor modes but suffers from the necessity of trans-Planckian variations of the inflaton field. This problem can be solved via the mechanism of aligned axions [2], where the aligned axion spirals down in the potential of other axions. We elaborate on the mechanism in view of the recently reported observations of the BICEP2 collaboration [3].

  16. Aligned natural inflation: Monodromies of two axions

    Energy Technology Data Exchange (ETDEWEB)

    Kappl, Rolf, E-mail: kappl@th.physik.uni-bonn.de; Krippendorf, Sven, E-mail: krippendorf@th.physik.uni-bonn.de; Nilles, Hans Peter, E-mail: nilles@th.physik.uni-bonn.de

    2014-10-07

    Natural (axionic) inflation [1] can accommodate sizeable primordial tensor modes but suffers from the necessity of trans-Planckian variations of the inflaton field. This problem can be solved via the mechanism of aligned axions [2], where the aligned axion spirals down in the potential of other axions. We elaborate on the mechanism in view of the recently reported observations of the BICEP2 collaboration [3].

  17. Optimal Nonlinear Filter for INS Alignment

    Institute of Scientific and Technical Information of China (English)

    赵瑞; 顾启泰

    2002-01-01

    All the methods to handle the inertial navigation system (INS) alignment were sub-optimal in the past. In this paper, particle filtering (PF) as an optimal method is used for solving the problem of INS alignment. A sub-optimal two-step filtering algorithm is presented to improve the real-time performance of PF. The approach combines particle filtering with Kalman filtering (KF). Simulation results illustrate the superior performance of these approaches when compared with extended Kalman filtering (EKF).

  18. Orthodontics Align Crooked Teeth and Boost Self-Esteem

    Science.gov (United States)


  19. Radiative torque alignment: Essential Physical Processes

    CERN Document Server

    Hoang, Thiem

    2007-01-01

    We study physical processes that affect the alignment of grains subject to radiative torques (RATs). To describe the action of the RATs we use the analytical model (AMO) of RATs introduced in Paper I, namely, in Lazarian & Hoang (2007). We focus our discussion on the RAT alignment by anisotropic radiation flux in respect to magnetic field. Such an alignment does not invoke paramagnetic, i.e. Davis-Greenstein, dissipation, but, nevertheless, grains tend to align with long axes perpendicular to magnetic field. We use phase space trajectory maps to describe the alignment. When we account for thermal fluctuations within grain material, we show that for grains, which are characterized by a triaxial ellipsoid of inertia, the zero-J attractor point obtained in our earlier study develops into a low-J attractor point. Value at the latter point is the order of thermal angular momentum corresponding to the grain temperature. We show that the alignment of grains with long axes parallel to magnetic field (``wrong alig...

  20. Alignments between galaxies, satellite systems and haloes

    CERN Document Server

    Shao, Shi; Frenk, Carlos S; Gao, Liang; Crain, Robert A; Schaller, Matthieu; Schaye, Joop; Theuns, Tom

    2016-01-01

    The spatial distribution of the satellite populations of the Milky Way and Andromeda are puzzling in that they are nearly perpendicular to the disks of their central galaxies. To understand the origin of such configurations we study the alignment of the central galaxy, satellite system and dark matter halo in the largest of the "Evolution and Assembly of GaLaxies and their Environments" (EAGLE) simulation. We find that centrals and their satellite systems tend to be well aligned with their haloes, with a median misalignment angle of $33^{\circ}$ in both cases. While the centrals are better aligned with the inner $10$ kpc halo, the satellite systems are better aligned with the entire halo indicating that satellites preferentially trace the outer halo. The central - satellite alignment is weak (median misalignment angle of $52^{\circ}$) and we find that around $20\%$ of systems have a misalignment angle larger than $78^{\circ}$, which is the value for the Milky Way. The central - satellite alignment is a conseq...