WorldWideScience

Sample records for higher sequence similarity

  1. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios far beyond the state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
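
    The idea of referential compression described in this abstract (storing only the differences between an input and a known reference) can be illustrated with a minimal sketch. This is not FRESCO's algorithm: the greedy k-mer matching, the function names `compress` and `decompress`, and the `min_match` parameter are all assumptions made for illustration.

```python
def compress(reference, target, min_match=4):
    """Encode target as copies from reference plus raw literals.

    Output is a list of ("M", start, length) copy operations and
    ("L", text) literal operations. Hypothetical format, for illustration.
    """
    k = min_match
    # Index each k-mer of the reference by its first occurrence.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)

    ops, literal, pos = [], [], 0
    while pos < len(target):
        start = index.get(target[pos:pos + k])
        if start is None:
            # No k-mer match here: keep the character as a literal.
            literal.append(target[pos])
            pos += 1
            continue
        # Extend the match as far as both sequences agree.
        length = 0
        while (start + length < len(reference)
               and pos + length < len(target)
               and reference[start + length] == target[pos + length]):
            length += 1
        if literal:
            ops.append(("L", "".join(literal)))
            literal = []
        ops.append(("M", start, length))
        pos += length
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(reference, ops):
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "M":
            _, start, length = op
            parts.append(reference[start:start + length])
        else:
            parts.append(op[1])
    return "".join(parts)
```

    In this sketch, a target identical to the reference collapses to a single copy operation, which is why highly similar sequences compress so well under this kind of scheme.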

  2. Interference effects in learning similar sequences of discrete movements

    NARCIS (Netherlands)

    Koedijker, J.M.; Oudejans, R.R.D.; Beek, P.J.

    2010-01-01

    Three experiments were conducted to examine proactive and retroactive interference effects in learning two similar sequences of discrete movements. In each experiment, the participants in the experimental group practiced two movement sequences on consecutive days (1 on each day, order

  3. BLAST and FASTA similarity searching for multiple sequence alignment.

    Science.gov (United States)

    Pearson, William R

    2014-01-01

    BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry (homology). The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.

  4. Tensor products of higher almost split sequences

    OpenAIRE

    Pasquali, Andrea

    2015-01-01

    We investigate how the higher almost split sequences over a tensor product of algebras are related to those over each factor. Herschend and Iyama gave a precise criterion for when the tensor product of an $n$-representation finite algebra and an $m$-representation finite algebra is $(n+m)$-representation finite. In this case we give a complete description of the higher almost split sequences over the tensor product by expressing every higher almost split sequence as the mapping cone of a suit...

  5. Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

    Science.gov (United States)

    Ikram, Najmul; Qadir, Muhammad Abdul; Afzal, Muhammad Tanvir

    2018-01-01

    Sequence similarity is a commonly used measure to compare proteins. With the increasing use of ontologies, semantic (function) similarity is gaining importance. The correlation between these measures has been applied in the evaluation of new semantic similarity methods, and in protein function prediction. In this research, we investigate the relationship between the two similarity methods. The results suggest the absence of a strong correlation between sequence and semantic similarities. There is a large number of proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, the term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity have distinct behavior. These findings are significant for future work on protein comparison, and will help to better understand the semantic similarity between proteins.

  6. Exploring the relationship between sequence similarity and accurate phylogenetic trees.

    Science.gov (United States)

    Cantarel, Brandi L; Morrison, Hilary G; Pearson, William

    2006-11-01

    We have characterized the relationship between accurate phylogenetic reconstruction and sequence similarity, testing whether high levels of sequence similarity can consistently produce accurate evolutionary trees. We generated protein families with known phylogenies using a modified version of the PAML/EVOLVER program that produces insertions and deletions as well as substitutions. Protein families were evolved over a range of 100-400 point accepted mutations; at these distances 63% of the families shared significant sequence similarity. Protein families were evolved using balanced and unbalanced trees, with ancient or recent radiations. In families sharing statistically significant similarity, about 60% of multiple sequence alignments were 95% identical to true alignments. To compare recovered topologies with true topologies, we used a score that reflects the fraction of clades that were correctly clustered. As expected, the accuracy of the phylogenies was greatest in the least divergent families. About 88% of phylogenies clustered over 80% of clades in families that shared significant sequence similarity, using Bayesian, parsimony, distance, and maximum likelihood methods. However, for protein families with short ancient branches (ancient radiation), only 30% of the most divergent (but statistically significant) families produced accurate phylogenies, and only about 70% of the second most highly conserved families, with median expectation values better than 10^-60, produced accurate trees. These values represent upper bounds on expected tree accuracy for sequences with a simple divergence history; proteins from 700 Giardia families, with a similar range of sequence similarities but considerably more gaps, produced much less accurate trees. For our simulated insertions and deletions, correct multiple sequence alignments did not perform much better than those produced by T-COFFEE, and including sequences with expressed sequence tag-like sequencing errors did not

  7. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
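
    As a rough illustration of loading similarity search results into a relational database and summarizing them by taxonomic group, the sketch below uses Python's built-in `sqlite3` module. The `hit` table, its columns, and the helper names are hypothetical; they are not the schema of seqdb_demo or search_demo, which this abstract does not describe.

```python
import sqlite3


def build_demo(hits):
    """Create an in-memory database holding search hits.

    Each hit is (query, subject, taxon, evalue). Hypothetical schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hit (query TEXT, subject TEXT, taxon TEXT, evalue REAL)"
    )
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?, ?)", hits)
    return conn


def queries_with_homolog_in(conn, taxon, max_evalue=1e-6):
    """Return query proteins with at least one significant hit in a taxon."""
    rows = conn.execute(
        "SELECT DISTINCT query FROM hit "
        "WHERE taxon = ? AND evalue <= ? ORDER BY query",
        (taxon, max_evalue),
    )
    return [row[0] for row in rows]
```

    The expectation-value threshold of 1e-6 is an arbitrary choice for the example; a real analysis would pick a cutoff suited to the database size and the question being asked.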

  8. The HMMER Web Server for Protein Sequence Similarity Search.

    Science.gov (United States)

    Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D

    2017-12-08

    Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user defined parameters. © 2017 by John Wiley & Sons, Inc.

  9. Sequence Similarity Presenter: a tool for the graphic display of similarities of long sequences for use in presentations.

    Science.gov (United States)

    Fröhlich, K U

    1994-04-01

    A new method for the presentation of alignments of long sequences is described. The degree of identity for the aligned sequences is averaged for sections of a fixed number of residues. The resulting values are converted to shades of gray, with white corresponding to lack of identity and black corresponding to perfect identity. A sequence alignment is represented as a bar filled with varying shades of gray. The display is compact and allows for a fast and intuitive recognition of the distribution of regions with a high similarity. It is well suited for the presentation of alignments of long sequences, e.g. of protein superfamilies, in plenary lectures. The method is implemented as a HyperCard stack for Apple Macintosh computers. Several options for the modification of the output are available (e.g. background reduction, size of the summation window, consideration of amino acid similarity, inclusion of graphic markers to indicate specific domains). The output is a PostScript file which can be printed, imported as EPS or processed further with Adobe Illustrator.
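
    The display method in this abstract (identity averaged over fixed-size sections, then mapped to shades of gray) can be sketched in a few lines. The function names, the default window size, and the decision to count gap columns (`-`) as non-identical are assumptions for illustration, not details taken from the HyperCard implementation.

```python
def window_identity(seq_a, seq_b, window=10):
    """Fraction of identical positions per window of an aligned pair.

    Both sequences must already be aligned (equal length). Gap
    characters are treated as mismatches (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    fractions = []
    for start in range(0, len(seq_a), window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        fractions.append(matches / len(a))
    return fractions


def to_gray(fraction):
    """Map identity to an 8-bit gray level: 0.0 is white (255), 1.0 is black (0)."""
    return round(255 * (1.0 - fraction))
```

    Rendering the gray levels as adjacent rectangles would give the bar described in the abstract; the last window may be shorter than the others and is averaged over its actual length.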

  10. Self-similarity of higher-order moving averages

    Science.gov (United States)

    Arianos, Sergio; Carbone, Anna; Türk, Christian

    2011-10-01

    In this work, higher-order moving average polynomials are defined by straightforward generalization of the standard moving average. The self-similarity of the polynomials is analyzed for fractional Brownian series and quantified in terms of the Hurst exponent H by using the detrending moving average method. We prove that the exponent H of the fractional Brownian series and of the detrending moving average variance asymptotically agree for the first-order polynomial. Such asymptotic values are compared with the results obtained by the simulations. The higher-order polynomials correspond to trend estimates at shorter time scales as the degree of the polynomial increases. Importantly, the increase of polynomial degree does not require to change the moving average window. Thus trends at different time scales can be obtained on data sets with the same size. These polynomials could be interesting for those applications relying on trend estimates over different time horizons (financial markets) or on filtering at different frequencies (image analysis).

  11. Model-free aftershock forecasts constructed from similar sequences in the past

    Science.gov (United States)

    van der Elst, N.; Page, M. T.

    2017-12-01

    The basic premise behind aftershock forecasting is that sequences in the future will be similar to those in the past. Forecast models typically use empirically tuned parametric distributions to approximate past sequences, and project those distributions into the future to make a forecast. While parametric models do a good job of describing average outcomes, they are not explicitly designed to capture the full range of variability between sequences, and can suffer from over-tuning of the parameters. In particular, parametric forecasts may produce a high rate of "surprises" - sequences that land outside the forecast range. Here we present a non-parametric forecast method that cuts out the parametric "middleman" between training data and forecast. The method is based on finding past sequences that are similar to the target sequence, and evaluating their outcomes. We quantify similarity as the Poisson probability that the observed event count in a past sequence reflects the same underlying intensity as the observed event count in the target sequence. Event counts are defined in terms of differential magnitude relative to the mainshock. The forecast is then constructed from the distribution of past sequences' outcomes, weighted by their similarity. We compare the similarity forecast with the Reasenberg and Jones (RJ95) method, for a set of 2807 global aftershock sequences of M≥6 mainshocks. We implement a sequence-specific RJ95 forecast using a global average prior and Bayesian updating, but do not propagate epistemic uncertainty. The RJ95 forecast is somewhat more precise than the similarity forecast: 90% of observed sequences fall within a factor of two of the median RJ95 forecast value, whereas the fraction is 85% for the similarity forecast. However, the surprise rate is much higher for the RJ95 forecast; 10% of observed sequences fall in the upper 2.5% of the (Poissonian) forecast range. The surprise rate is less than 3% for the similarity forecast. The similarity

  12. Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%

    DEFF Research Database (Denmark)

    Havgaard, Jakob Hull; Lyngsø, Rune B.; Stormo, Gary D.

    2005-01-01

    detect two genes with low sequence similarity, where the genes are part of a larger genomic region. Results: Here we present such an approach for pairwise local alignment which is based on FOLDALIGN and the Sankoff algorithm for simultaneous structural alignment of multiple sequences. We include...... the ability to conduct mutual scans of two sequences of arbitrary length while searching for common local structural motifs of some maximum length. This drastically reduces the complexity of the algorithm. The scoring scheme includes structural parameters corresponding to those available for free energy....... The structure prediction performance for a family is typically around 0.7 using Matthews correlation coefficient. In case (2), the algorithm is successful at locating RNA families with an average sensitivity of 0.8 and a positive predictive value of 0.9 using a BLAST-like hit selection scheme. Availability...

  13. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

    Science.gov (United States)

    Francis, Warren R; Wörheide, Gert

    2017-06-01

    One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. Similar representations of sequence knowledge in young and older adults: A study of effector independent transfer

    Directory of Open Access Journals (Sweden)

    Jonathan Sebastiaan Barnhoorn

    2016-08-01

    Older adults show reduced motor performance and changes in motor skill development. To better understand these changes, we studied differences in sequence knowledge representations between young and older adults using a transfer task. Transfer, or the ability to apply motor skills flexibly, is highly relevant in day-to-day motor activity and facilitates generalization of learning to new contexts. By using movement types that are completely unrelated in terms of muscle activation and response location, we focused on transfer facilitated by the early, visuospatial system. We tested 32 right-handed older adults (65 – 74) and 32 young adults (18 – 30). During practice of a discrete sequence production task, participants learned two 6-element sequences using either unimanual key-presses (KPs) or by moving a lever with lower arm flexion-extension (FE) movements. Each sequence was performed 144 times. They then performed a test phase consisting of familiar and random sequences performed with the type of movements not used during practice. Both age groups displayed transfer from FE to KP movements as indicated by faster performance on the familiar sequences in the test phase. Only young adults transferred their sequence knowledge from KP to FE movements. In both directions, the young showed higher transfer than older adults. These results suggest that the older participants, like the young, represented their sequences in an abstract visuospatial manner. Transfer was asymmetric in both age groups: there was more transfer from FE to KP movements than vice versa. This similar asymmetry is a further indication that the types of representations that older adults develop are comparable to those that young adults develop. We furthermore found that older adults improved less during FE practice, gained less explicit knowledge, displayed a smaller visuospatial working memory capacity and had lower processing speed than young adults. Despite the many differences

  15. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    Science.gov (United States)

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.

  16. Application of discrete Fourier inter-coefficient difference for assessing genetic sequence similarity.

    Science.gov (United States)

    King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach

    2014-01-01

    Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.

  17. Correlation between protein sequence similarity and x-ray diffraction quality in the protein data bank.

    Science.gov (United States)

    Lu, Hui-Meng; Yin, Da-Chuan; Ye, Ya-Jing; Luo, Hui-Min; Geng, Li-Qiang; Li, Hai-Sheng; Guo, Wei-Hong; Shang, Peng

    2009-01-01

    As the most widely utilized technique to determine the 3-dimensional structure of protein molecules, X-ray crystallography can provide structure of the highest resolution among the developed techniques. The resolution obtained via X-ray crystallography is known to be influenced by many factors, such as the crystal quality, diffraction techniques, and X-ray sources, etc. In this paper, the authors found that the protein sequence could also be one of the factors. We extracted information of the resolution and the sequence of proteins from the Protein Data Bank (PDB), classified the proteins into different clusters according to the sequence similarity, and statistically analyzed the relationship between the sequence similarity and the best resolution obtained. The results showed that there was a pronounced correlation between the sequence similarity and the obtained resolution. These results indicate that protein structure itself is one variable that may affect resolution when X-ray crystallography is used.

  18. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Directory of Open Access Journals (Sweden)

    Holly J Atkinson

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

  19. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies.

    Science.gov (United States)

    Atkinson, Holly J; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C

    2009-01-01

    The dramatic increase in heterogeneous types of biological data--in particular, the abundance of new protein sequences--requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity--GPCRs and kinases from humans, and the crotonase superfamily of enzymes--we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.

  20. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

    Science.gov (United States)

    Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

    2018-05-15

    Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared to the other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length, or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package that is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact: yueljiang@163.com. Supplementary data are available at Bioinformatics online.

  1. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

    Directory of Open Access Journals (Sweden)

    Yunyun Liang

    2015-01-01

    Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.

  2. Adhesive proteins of stalked and acorn barnacles display homology with low sequence similarities.

    Directory of Open Access Journals (Sweden)

    Jaimie-Leigh Jonker

    Barnacle adhesion underwater is an important phenomenon to understand for the prevention of biofouling and potential biotechnological innovations, yet so far, identifying what makes barnacle glue proteins 'sticky' has proved elusive. Examination of a broad range of species within the barnacles may be instructive to identify conserved adhesive domains. We add to extensive information from the acorn barnacles (order Sessilia) by providing the first protein analysis of a stalked barnacle adhesive, Lepas anatifera (order Lepadiformes). It was possible to separate the L. anatifera adhesive into at least 10 protein bands using SDS-PAGE. Intense bands were present at approximately 30, 70, 90 and 110 kilodaltons (kDa). Mass spectrometry for protein identification was followed by de novo sequencing which detected 52 peptides of 7-16 amino acids in length. None of the peptides matched published or unpublished transcriptome sequences, but some amino acid sequence similarity was apparent between L. anatifera and closely-related Dosima fascicularis. Antibodies against two acorn barnacle proteins (ab-cp-52k and ab-cp-68k) showed cross-reactivity in the adhesive glands of L. anatifera. We also analysed the similarity of adhesive proteins across several barnacle taxa, including Pollicipes pollicipes (a stalked barnacle in the order Scalpelliformes). Sequence alignment of published expressed sequence tags clearly indicated that P. pollicipes possesses homologues for the 19 kDa and 100 kDa proteins in acorn barnacles. Homology aside, sequence similarity in amino acid and gene sequences tended to decline as taxonomic distance increased, with minimum similarities of 18-26%, depending on the gene. The results indicate that some adhesive proteins (e.g. 100 kDa) are more conserved within barnacles than others (20 kDa).

  3. Detecting atypical examples of known domain types by sequence similarity searching: the SBASE domain library approach.

    Science.gov (United States)

    Dhir, Somdutta; Pacurar, Mircea; Franklin, Dino; Gáspári, Zoltán; Kertész-Farkas, Attila; Kocsor, András; Eisenhaber, Frank; Pongor, Sándor

    2010-11-01

    SBASE is a project initiated to detect known domain types and predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992, Pongor et al, Nucl. Acids. Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing which is based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.

  4. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  5. On the Power and Limits of Sequence Similarity Based Clustering of Proteins Into Families

    DEFF Research Database (Denmark)

    Wiwie, Christian; Röttger, Richard

    2017-01-01

    Over the last decades, we have observed an ongoing tremendous growth of available sequencing data fueled by the advancements in wet-lab technology. The sequencing information is only the beginning of the actual understanding of how organisms survive and prosper. It is, for instance, equally...... important to also unravel the proteomic repertoire of an organism. A classical computational approach for detecting protein families is a sequence-based similarity calculation coupled with a subsequent cluster analysis. In this work we have intensively analyzed various clustering tools on a large scale. We...... used the data to investigate the behavior of the tools' parameters underlining the diversity of the protein families. Furthermore, we trained regression models for predicting the expected performance of a clustering tool for an unknown data set and aimed to also suggest optimal parameters...

  6. Bidirectional gene sequences with similar homology to functional proteins of alkane degrading bacterium Pseudomonas frederiksbergensis DNA

    International Nuclear Information System (INIS)

    Megeed, A.A.

    2011-01-01

    The potential for two overlapping fragments of DNA from a clone of the newly isolated alkane-degrading bacterium Pseudomonas frederiksbergensis to encode sequences with similar homology to two parts of functional proteins is described. One strand contains a sequence with high homology to alkane monooxygenase (alkB), a member of the alkane hydroxylase family, and the other strand contains a sequence with some homology to the alcohol dehydrogenase gene (alkJ). Overlapping of genes on opposite strands has been reported in eukaryotic species, and is now reported in a bacterial species. The sequence comparisons and ORF results revealed that the regulation and organization of the genes involved in alkane oxidation in Pseudomonas frederiksbergensis vary among the different known alkane-degrading bacteria. The homologues of the known alkane monooxygenase (alkB) and rubredoxin (alkG) in the alk gene cluster are oriented in the same direction, whereas alcohol dehydrogenase (alkJ) is oriented in the opposite direction. Such genomes encode messages on both strands of the DNA, or in overlapping but different reading frames of the same strand of DNA. The creation of novel genes from pre-existing sequences, known as overprinting, is a widespread phenomenon in small viruses. Here, the origin and evolution of the gene overlap to bacteriophages belonging to the family Microviridae have been investigated. Such a phenomenon is most widely described in extremely small genomes such as those of viruses or small plasmids, yet here is a unique phenomenon. (author)

  7. Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

    Energy Technology Data Exchange (ETDEWEB)

    Ovacik, Meric A. [Chemical and Biochemical Engineering Department, Rutgers University, Piscataway, NJ 08854 (United States); Androulakis, Ioannis P., E-mail: yannis@rci.rutgers.edu [Chemical and Biochemical Engineering Department, Rutgers University, Piscataway, NJ 08854 (United States); Biomedical Engineering Department, Rutgers University, Piscataway, NJ 08854 (United States)

    2013-09-15

    Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogenetic relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy.

  8. Enzyme sequence similarity improves the reaction alignment method for cross-species pathway comparison

    International Nuclear Information System (INIS)

    Ovacik, Meric A.; Androulakis, Ioannis P.

    2013-01-01

    Pathway-based information has become an important source of information for both establishing evolutionary relationships and understanding the mode of action of a chemical or pharmaceutical among species. Cross-species comparison of pathways can address two broad questions: comparison in order to inform evolutionary relationships and to extrapolate species differences used in a number of different applications including drug and toxicity testing. Cross-species comparison of metabolic pathways is complex as there are multiple features of a pathway that can be modeled and compared. Among the various methods that have been proposed, reaction alignment has emerged as the most successful at predicting phylogenetic relationships based on NCBI taxonomy. We propose an improvement of the reaction alignment method by accounting for sequence similarity in addition to reaction alignment method. Using nine species, including human and some model organisms and test species, we evaluate the standard and improved comparison methods by analyzing glycolysis and citrate cycle pathways conservation. In addition, we demonstrate how organism comparison can be conducted by accounting for the cumulative information retrieved from nine pathways in central metabolism as well as a more complete study involving 36 pathways common in all nine species. Our results indicate that reaction alignment with enzyme sequence similarity results in a more accurate representation of pathway specific cross-species similarities and differences based on NCBI taxonomy

  9. Testing statistical significance scores of sequence comparison methods with structure similarity

    Directory of Open Access Journals (Sweden)

    Leunissen Jack AM

    2006-10-01

    Background: In the past years the Smith-Waterman sequence comparison algorithm has gained popularity due to improved implementations and rapidly increasing computing power. However, the quality and sensitivity of a database search is not only determined by the algorithm but also by the statistical significance testing for an alignment. The e-value is the most commonly used statistical validation method for sequence database searching. The CluSTr database and the Protein World database have been created using an alternative statistical significance test: a Z-score based on Monte-Carlo statistics. Several papers have described the superiority of the Z-score as compared to the e-value, using simulated data. We were interested if this could be validated when applied to existing, evolutionarily related protein sequences. Results: All experiments are performed on the ASTRAL SCOP database. The Smith-Waterman sequence comparison algorithm with both e-value and Z-score statistics is evaluated, using ROC, CVE and AP measures. The BLAST and FASTA algorithms are used as reference. We find that two out of three Smith-Waterman implementations with e-value are better at predicting structural similarities between proteins than the Smith-Waterman implementation with Z-score. SSEARCH especially has very high scores. Conclusion: The compute-intensive Z-score does not have a clear advantage over the e-value. The Smith-Waterman implementations give generally better results than their heuristic counterparts. We recommend using the SSEARCH algorithm combined with e-values for pairwise sequence comparisons.
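
    To make the Monte-Carlo Z-score concrete, here is a minimal Python sketch under stated assumptions: the score of the real pair is compared with scores obtained after shuffling one of the sequences, and the Z-score is the distance from the shuffled mean in units of the shuffled standard deviation. The simple match/mismatch scoring, the linear gap penalty, the number of shuffles, and the function names are illustrative choices, not the settings used for CluSTr, Protein World, or the study above.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (score only, no traceback)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def monte_carlo_z_score(a, b, shuffles=100, seed=0):
    """Z-score of the observed score against scores for shuffled copies of b."""
    rng = random.Random(seed)
    observed = smith_waterman_score(a, b)
    residues = list(b)
    background = []
    for _ in range(shuffles):
        rng.shuffle(residues)  # keeps the composition, destroys the order
        background.append(smith_waterman_score(a, "".join(residues)))
    spread = statistics.pstdev(background)
    if spread == 0:
        return 0.0  # degenerate background: no meaningful Z-score
    return (observed - statistics.mean(background)) / spread
```

    Each Z-score costs one alignment per shuffle, which illustrates why the abstract describes this statistic as compute intensive compared with an e-value.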

  10. Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

    Science.gov (United States)

    Tan, Yen Hock; Huang, He; Kihara, Daisuke

    2006-08-15

    Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.

  11. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment

    Directory of Open Access Journals (Sweden)

    Manzini Giovanni

    2007-07-01

    Background: Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness is tested on various data sets yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available and no comparison of USM with existing methods, both based on alignments and not, seems to be available. Results: We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, that naturally complements the many theoretical and the preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. The first one, performed via ROC

  12. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment.

    Science.gov (United States)

    Ferragina, Paolo; Giancarlo, Raffaele; Greco, Valentina; Manzini, Giovanni; Valiente, Gabriel

    2007-07-13

    Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness is tested on various data sets yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available and no comparison of USM with existing methods, both based on alignments and not, seems to be available. We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, that naturally complements the many theoretical and the preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. 
The first one, performed via ROC (Receiver Operating Curve) analysis, aims at
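
    As an illustration of how a compression-based dissimilarity of this kind can be approximated in practice, the sketch below computes the commonly used NCD formula, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The choice of zlib as the compressor and the function names are assumptions made for this example; the study above evaluates 25 compressors, and zlib is not claimed to be among its recommendations.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression dissimilarity between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

    Values near 0 suggest that one string adds little information to the other, while values near 1 suggest the strings share almost nothing the compressor can exploit. Real compressors are imperfect, so results can fall slightly outside the interval [0, 1].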

  13. Lower prevalence but similar fitness in a parasitic fungus at higher radiation levels near Chernobyl.

    Science.gov (United States)

    Aguileta, Gabriela; Badouin, Helene; Hood, Michael E; Møller, Anders P; Le Prieur, Stephanie; Snirc, Alodie; Siguenza, Sophie; Mousseau, Timothy A; Shykoff, Jacqui A; Cuomo, Christina A; Giraud, Tatiana

    2016-07-01

    Nuclear disasters at Chernobyl and Fukushima provide examples of effects of acute ionizing radiation on mutations that can affect the fitness and distribution of species. Here, we investigated the prevalence of Microbotryum lychnidis-dioicae, a pollinator-transmitted fungal pathogen of plants causing anther-smut disease in Chernobyl, its viability, fertility and karyotype variation, and the accumulation of nonsynonymous mutations in its genome. We collected diseased flowers of Silene latifolia from locations ranging by more than two orders of magnitude in background radiation, from 0.05 to 21.03 μGy/h. Disease prevalence decreased significantly with increasing radiation level, possibly due to lower pollinator abundance and altered pollinator behaviour. Viability and fertility, measured as the budding rate of haploid sporidia following meiosis from the diploid teliospores, did not vary with increasing radiation levels and neither did karyotype overall structure and level of chromosomal size heterozygosity. We sequenced the genomes of twelve samples from Chernobyl and of four samples collected from uncontaminated areas and analysed alignments of 6068 predicted genes, corresponding to 1.04 × 10^7 base pairs. We found no dose-dependent differences in substitution rates (neither dN, dS, nor dN/dS). Thus, we found no significant evidence of increased deleterious mutation rates at higher levels of background radiation in this plant pathogen. We even found lower levels of nonsynonymous substitution rates in contaminated areas compared to control regions, suggesting that purifying selection was stronger in contaminated than uncontaminated areas. We briefly discuss the possibilities for a mechanistic basis of radio resistance in this nonmelanized fungus. © 2016 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  14. Scaling Relations of Local Magnitude versus Moment Magnitude for Sequences of Similar Earthquakes in Switzerland

    KAUST Repository

    Bethmann, F.

    2011-03-22

    Theoretical considerations and empirical regressions show that, in the magnitude range between 3 and 5, local magnitude, ML, and moment magnitude, Mw, scale 1:1. Previous studies suggest that for smaller magnitudes this 1:1 scaling breaks down. However, the scatter between ML and Mw at small magnitudes is usually large and the resulting scaling relations are therefore uncertain. In an attempt to reduce these uncertainties, we first analyze the ML versus Mw relation based on 195 events, induced by the stimulation of a geothermal reservoir below the city of Basel, Switzerland. Values of ML range from 0.7 to 3.4. From these data we derive a scaling of ML ~ 1.5 Mw over the given magnitude range. We then compare peak Wood-Anderson amplitudes to the low-frequency plateau of the displacement spectra for six sequences of similar earthquakes in Switzerland in the range of 0.5 ≤ ML ≤ 4.1. Because effects due to the radiation pattern and to the propagation path between source and receiver are nearly identical at a particular station for all events in a given sequence, the scatter in the data is substantially reduced. Again we obtain a scaling equivalent to ML ~ 1.5 Mw. Based on simulations using synthetic source time functions for different magnitudes and Q values estimated from spectral ratios between downhole and surface recordings, we conclude that the observed scaling can be explained by attenuation and scattering along the path. Other effects that could explain the observed magnitude scaling, such as a possible systematic increase of stress drop or rupture velocity with moment magnitude, are masked by attenuation along the path.

  15. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra

  16. The genomic sequence of cowpea aphid-borne mosaic virus and its similarities with other potyviruses

    NARCIS (Netherlands)

    Mlotshwa, S.; Verver, J.; Sithole-Niang, I.; Kampen, van T.; Kammen, van A.; Wellink, J.

    2002-01-01

    The genomic sequence of a Zimbabwe isolate of Cowpea aphid-borne mosaic virus (CABMV-Z) was determined by sequencing overlapping viral cDNA clones generated by RT-PCR using degenerate and/or specific primers. The sequence is 9465 nucleotides in length excluding the 3' terminal poly (A) tail and

  17. Characterization of CG6178 gene product with high sequence similarity to firefly luciferase in Drosophila melanogaster.

    Science.gov (United States)

    Oba, Yuichi; Ojika, Makoto; Inouye, Satoshi

    2004-03-31

    This is the first identification of a long-chain fatty acyl-CoA synthetase in Drosophila by enzymatic characterization. The gene product of CG6178 (CG6178) in Drosophila melanogaster genome, which has a high sequence similarity to firefly luciferase, has been expressed and characterized. CG6178 showed long-chain fatty acyl-CoA synthetic activity in the presence of ATP, CoA and Mg2+, suggesting a fatty acyl adenylate is an intermediate. Recently, it was revealed that firefly luciferase has two catalytic functions, monooxygenase (luciferase) and AMP-mediated CoA ligase (fatty acyl-CoA synthetase). However, unlike firefly luciferase, CG6178 did not show luminescence activity in the presence of firefly luciferin, ATP, CoA and Mg2+. The enzymatic properties of CG6178 including substrate specificity, pH dependency and optimal temperature were close to those of firefly luciferase and rat fatty acyl-CoA synthetase. Further, phylogenetic analyses strongly suggest that the firefly luciferase gene may have evolved from a fatty acyl-CoA synthetase gene as a common ancestral gene.

  18. CLONING AND SEQUENCING OF THE GENE FOR A LACTOCOCCAL ENDOPEPTIDASE, AN ENZYME WITH SEQUENCE SIMILARITY TO MAMMALIAN ENKEPHALINASE

    NARCIS (Netherlands)

    Mierau, Igor; Tan, Paris S.T.; Haandrikman, Alfred J.; Kok, Jan; Leenhouts, Kees J.; Konings, Wil N.; Venema, Gerard

    The gene specifying an endopeptidase of Lactococcus lactis, named pepO, was cloned from a genomic library of L. lactis subsp. cremoris P8-247 in lambdaEMBL3 and was subsequently sequenced. pepO is probably the last gene of an operon encoding the binding-protein-dependent oligopeptide transport

  19. Simultaneous identification of long similar substrings in large sets of sequences

    Directory of Open Access Journals (Sweden)

    Wittig Burghardt

    2007-05-01

    Background: Sequence comparison faces new challenges today, with many complete genomes and large libraries of transcripts known. Gene annotation pipelines match these sequences in order to identify genes and their alternative splice forms. However, the software currently available cannot simultaneously compare sets of sequences as large as necessary especially if errors must be considered. Results: We therefore present a new algorithm for the identification of almost perfectly matching substrings in very large sets of sequences. Its implementation, called ClustDB, is considerably faster and can handle 16 times more data than VMATCH, the most memory efficient exact program known today. ClustDB simultaneously generates large sets of exactly matching substrings of a given minimum length as seeds for a novel method of match extension with errors. It generates alignments of maximum length with a considered maximum number of errors within each overlapping window of a given size. Such alignments are not optimal in the usual sense but faster to calculate and often more appropriate than traditional alignments for genomic sequence comparisons, EST and full-length cDNA matching, and genomic sequence assembly. The method is used to check the overlaps and to reveal possible assembly errors for 1377 Medicago truncatula BAC-size sequences published at http://www.medicago.org/genome/assembly_table.php?chr=1. Conclusion: The program ClustDB proves that window alignment is an efficient way to find long sequence sections of homogenous alignment quality, as expected in case of random errors, and to detect systematic errors resulting from sequence contaminations. Such inserts are systematically overlooked in long alignments controlled by only tuning penalties for mismatches and gaps. ClustDB is freely available for academic use.

  20. An Alignment-Free Algorithm in Comparing the Similarity of Protein Sequences Based on Pseudo-Markov Transition Probabilities among Amino Acids.

    Science.gov (United States)

    Li, Yushuang; Song, Tian; Yang, Jiasheng; Zhang, Yi; Yang, Jialiang

    2016-01-01

    In this paper, we have proposed a novel alignment-free method for comparing the similarity of protein sequences. We first encode a protein sequence into a 440 dimensional feature vector consisting of a 400 dimensional Pseudo-Markov transition probability vector among the 20 amino acids, a 20 dimensional content ratio vector, and a 20 dimensional position ratio vector of the amino acids in the sequence. By evaluating the Euclidean distances among the representing vectors, we compare the similarity of protein sequences. We then apply this method to the ND5 dataset consisting of the ND5 protein sequences of 9 species, and the F10 and G11 datasets representing two of the xylanase-containing glycoside hydrolase families, i.e., families 10 and 11. As a result, our method achieves a correlation coefficient of 0.962 with the canonical protein sequence aligner ClustalW in the ND5 dataset, much higher than those of 5 other popular alignment-free methods. In addition, we successfully separate the xylanase sequences in the F10 family and the G11 family and illustrate that the F10 family is more heat stable than the G11 family, consistent with a few previous studies. Moreover, we prove mathematically an identity equation involving the Pseudo-Markov transition probability vector and the amino acids content ratio vector.
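
    The abstract gives the layout of the 440 dimensional vector (400 + 20 + 20) but not the formulas behind each part, so the Python sketch below fills them in with loudly labelled assumptions: transition probabilities are estimated from adjacent residue pairs, the content ratio is the relative frequency of each amino acid, and the position ratio is the sum of the 1-based positions of each amino acid divided by the sum of all positions. These definitions and all names are hypothetical stand-ins, not the authors' Pseudo-Markov construction.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def feature_vector(seq):
    """Build a 440-dimensional vector (400 + 20 + 20) for a non-empty protein sequence.

    All three component definitions are assumptions made for illustration.
    Sequences containing non-standard residue codes raise KeyError.
    """
    n = len(seq)
    pair_counts = [[0] * 20 for _ in range(20)]
    content = [0] * 20
    position_sum = [0] * 20
    for pos, aa in enumerate(seq, start=1):
        i = INDEX[aa]
        content[i] += 1
        position_sum[i] += pos
        if pos < n:  # seq[pos] is the residue that follows position pos
            pair_counts[i][INDEX[seq[pos]]] += 1
    transitions = []
    for row in pair_counts:
        total = sum(row)
        transitions.extend(c / total if total else 0.0 for c in row)
    content_ratio = [c / n for c in content]
    all_positions = n * (n + 1) / 2
    position_ratio = [p / all_positions for p in position_sum]
    return transitions + content_ratio + position_ratio


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

    Whatever the exact component definitions, the comparison step is the same: each sequence becomes a fixed-length vector, so pairwise distances can be computed without aligning the sequences.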

  1. Ultra-fast sequence clustering from similarity networks with SiLiX

    Directory of Open Access Journals (Sweden)

    Duret Laurent

    2011-04-01

    Background: The number of gene sequences that are available for comparative genomics approaches is increasing extremely quickly. A current challenge is to be able to handle this huge number of sequences in order to build families of homologous sequences in a reasonable time. Results: We present the software package SiLiX, which implements a novel method that reconsiders single linkage clustering with a graph theoretical approach. A parallel version of the algorithms is also presented. As a demonstration of the ability of our software, we clustered more than 3 million sequences from about 2 billion BLAST hits in 7 minutes, with a high clustering quality, both in terms of sensitivity and specificity. Conclusions: Compared with state-of-the-art software, SiLiX presents the best up-to-date capabilities to face the problem of clustering large collections of sequences. SiLiX is freely available at http://lbbe.univ-lyon1.fr/SiLiX.
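
    Single linkage clustering viewed as a graph problem amounts to finding connected components of the similarity network. The sketch below uses a standard union-find structure to do that; it is a generic illustration, not the SiLiX algorithm, and it assumes the similarity hits have already been filtered by whatever identity or overlap criteria apply. The function name and the integer sequence identifiers are hypothetical.

```python
def single_linkage_families(n_sequences, hits):
    """Group sequences 0..n_sequences-1 into families.

    `hits` is an iterable of (i, j) pairs, each an accepted similarity hit.
    Two sequences end up in the same family if a chain of hits connects them.
    """
    parent = list(range(n_sequences))

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two components

    families = {}
    for s in range(n_sequences):
        families.setdefault(find(s), []).append(s)
    return sorted(families.values())


print(single_linkage_families(6, [(0, 1), (1, 2), (3, 4)]))
```

    Because each hit is processed once and never stored, the memory footprint depends on the number of sequences rather than the number of hits, which is one reason this kind of approach suits very large hit lists.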

  2. Lower- Versus Higher-Income Populations In The Alternative Quality Contract: Improved Quality And Similar Spending.

    Science.gov (United States)

    Song, Zirui; Rose, Sherri; Chernew, Michael E; Safran, Dana Gelb

    2017-01-01

    As population-based payment models become increasingly common, it is crucial to understand how such payment models affect health disparities. We evaluated health care quality and spending among enrollees in areas with lower versus higher socioeconomic status in Massachusetts before and after providers entered into the Alternative Quality Contract, a two-sided population-based payment model with substantial incentives tied to quality. We compared changes in process measures, outcome measures, and spending between enrollees in areas with lower and higher socioeconomic status from 2006 to 2012 (outcome measures were measured after the intervention only). Quality improved for all enrollees in the Alternative Quality Contract after their provider organizations entered the contract. Process measures improved 1.2 percentage points per year more among enrollees in areas with lower socioeconomic status than among those in areas with higher socioeconomic status. Outcome measure improvement was no different between the subgroups; neither were changes in spending. Larger or comparable improvements in quality among enrollees in areas with lower socioeconomic status suggest a potential narrowing of disparities. Strong pay-for-performance incentives within a population-based payment model could encourage providers to focus on improving quality for more disadvantaged populations. Project HOPE—The People-to-People Health Foundation, Inc.

  3. Similar Representations of Sequence Knowledge in Young and Older Adults: A Study of Effector Independent Transfer

    NARCIS (Netherlands)

    Barnhoorn, Jonathan Sebastiaan; Döhring, Falko R.; van Asseldonk, Edwin H.F.; Verwey, Willem B.

    2016-01-01

    Older adults show reduced motor performance and changes in motor skill development. To better understand these changes, we studied differences in sequence knowledge representations between young and older adults using a transfer task. Transfer, or the ability to apply motor skills flexibly, is

  4. A behavioral similarity measure between labeled Petri nets based on principal transition sequences

    NARCIS (Netherlands)

    Wang, J.; He, T.; Wen, L.; Wu, N.; Hofstede, ter A.H.M.; Su, J.; Meersman, R.; Dillon, T.S.; Herrero, P.

    2010-01-01

    Being able to determine the degree of similarity between process models is important for management, reuse, and analysis of business process models. In this paper we propose a novel method to determine the degree of similarity between process models, which exploits their semantics. Our approach is

  5. Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus.

    Directory of Open Access Journals (Sweden)

    Kui Lin

    2014-01-01

    Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbp, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to its postulated sister Dikarya.

  6. HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.

    Science.gov (United States)

    O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D

    2015-04-01

    The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. First-order and higher order sequence learning in specific language impairment.

    Science.gov (United States)

    Clark, Gillian M; Lum, Jarrad A G

    2017-02-01

    A core claim of the procedural deficit hypothesis of specific language impairment (SLI) is that the disorder is associated with poor implicit sequence learning. This study investigated whether implicit sequence learning problems in SLI are present for first-order conditional (FOC) and higher order conditional (HOC) sequences. Twenty-five children with SLI and 27 age-matched, nonlanguage-impaired children completed 2 serial reaction time tasks. On 1 version, the sequence to be implicitly learnt comprised a FOC sequence and on the other a HOC sequence. Results showed that the SLI group learned the HOC sequence (ηp² = .285, p = .005) but not the FOC sequence (ηp² = .099, p = .118). The control group learned both sequences (FOC ηp² = .497, HOC ηp² = .465, ps < .001). The SLI group's difficulty learning the FOC sequence is consistent with the procedural deficit hypothesis. However, the study provides new evidence that multiple mechanisms may underpin the learning of FOC and HOC sequences. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  8. Identification of similar regions of protein structures using integrated sequence and structure analysis tools

    Directory of Open Access Journals (Sweden)

    Heiland Randy

    2006-03-01

    Background: Understanding protein function from its structure is a challenging problem. Sequence-based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that complements sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enables users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that make that match using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view to prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results: Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST and HMMer as well as with S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein. Conclusion: With structural genomics initiatives determining structures with little, if any, functional characterization

  9. When Does Between-Sequence Phonological Similarity Promote Irrelevant Sound Disruption?

    Science.gov (United States)

    Marsh, John E.; Vachon, Francois; Jones, Dylan M.

    2008-01-01

    Typically, the phonological similarity between to-be-recalled items and to-be-ignored (TBI) auditory stimuli has no impact if recall in serial order is required. However, in the present study, the authors have shown that the free recall, but not serial recall, of lists of phonologically related to-be-remembered items was disrupted by an irrelevant sound stream…

  10. Protein sequences and redox titrations indicate that the electron acceptors in reaction centers from heliobacteria are similar to Photosystem I

    Science.gov (United States)

    Trost, J. T.; Brune, D. C.; Blankenship, R. E.

    1992-01-01

    Photosynthetic reaction centers isolated from Heliobacillus mobilis exhibit a single major protein on SDS-PAGE of 47 000 Mr. Attempts to sequence the reaction center polypeptide indicated that the N-terminus is blocked. After enzymatic and chemical cleavage, four peptide fragments were sequenced from the Heliobacillus mobilis apoprotein. Only one of these sequences showed significant specific similarity to any of the protein and deduced protein sequences in the GenBank database. This fragment is identical with 56% of the residues, including both cysteines, found in a highly conserved region that is proposed to bind iron-sulfur center Fx in the Photosystem I reaction center peptide that is the psaB gene product. The similarity to the psaA gene product in this region is 48%. Redox titrations of laser-flash-induced photobleaching with millisecond decay kinetics on isolated reaction centers from Heliobacterium gestii indicate a midpoint potential of -414 mV with n = 2 titration behavior. In membranes, the behavior is intermediate between n = 1 and n = 2, and the apparent midpoint potential is -444 mV. This is compared to the behavior in Photosystem I, where the intermediate electron acceptor A1, thought to be a phylloquinone molecule, has been proposed to undergo a double reduction at low redox potentials in the presence of viologen redox mediators. These results strongly suggest that the acceptor side electron transfer system in reaction centers from heliobacteria is indeed analogous to that found in Photosystem I. The sequence similarities indicate that the divergence of the heliobacteria from the Photosystem I line occurred before the gene duplication and subsequent divergence that led to the heterodimeric protein core of the Photosystem I reaction center.

  11. Sequence diversity, cytotoxicity and antigenic similarities of the leukotoxin of isolates of Mannheimia species from mastitis in domestic sheep.

    Science.gov (United States)

    Omaleki, Lida; Browning, Glenn F; Barber, Stuart R; Allen, Joanne L; Srikumaran, Subramaniam; Markham, Philip F

    2014-11-07

    Species within the genus Mannheimia are among the most important causes of ovine mastitis. Isolates of these species can express leukotoxin A (LktA), a primary virulence factor of these bacteria. To examine the significance of variation in the LktA, the sequences of the lktA genes in a panel of isolates from cases of ovine mastitis were compared. The cross-neutralising capacities of rat antisera raised against LktA of one Mannheimia glucosida, one haemolytic Mannheimia ruminalis, and two Mannheimia haemolytica isolates were also examined to assess the effect that variation in the lktA gene can have on protective immunity against leukotoxins with differing sequences. The lktA nucleotide distance between the M. haemolytica isolates was greater than between the M. glucosida isolates, with the M. haemolytica isolates divisible into two groups based on their lktA sequences. Comparison of the topology of phylogenetic trees of 16S rDNA and lktA sequences revealed differences in the relationships between some isolates, suggesting horizontal gene transfer. Cross neutralisation data obtained with monospecific anti-LktA rat sera were used to derive antigenic similarity coefficients for LktA from the four Mannheimia species isolates. Similarity coefficients indicated that LktA of the two M. haemolytica isolates were least similar, while LktA from M. glucosida was most similar to those for one of the M. haemolytica isolates and the haemolytic M. ruminalis isolate. The results suggested that vaccination with the M. glucosida leukotoxin would generate the greatest cross-protection against ovine mastitis caused by Mannheimia species with these alleles. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. Musicians' and nonmusicians' short-term memory for verbal and musical sequences: comparing phonological similarity and pitch proximity.

    Science.gov (United States)

    Williamson, Victoria J; Baddeley, Alan D; Hitch, Graham J

    2010-03-01

    Language-music comparative studies have highlighted the potential for shared resources or neural overlap in auditory short-term memory. However, there is a lack of behavioral methodologies for comparing verbal and musical serial recall. We developed a visual grid response that allowed both musicians and nonmusicians to perform serial recall of letter and tone sequences. The new method was used to compare the phonological similarity effect with the impact of an operationalized musical equivalent: pitch proximity. Over the course of three experiments, we found that short-term memory for tones had several similarities to verbal memory, including limited capacity and a significant effect of pitch proximity in nonmusicians. Despite being vulnerable to phonological similarity when recalling letters, however, musicians showed no effect of pitch proximity, a result that we suggest might reflect strategy differences. Overall, the findings support a limited degree of correspondence in the way that verbal and musical sounds are processed in auditory short-term memory.

  13. A Novel Phytase with Sequence Similarity to Purple Acid Phosphatases Is Expressed in Cotyledons of Germinating Soybean Seedlings

    Science.gov (United States)

    Hegeman, Carla E.; Grabau, Elizabeth A.

    2001-01-01

    Phytic acid (myo-inositol hexakisphosphate) is the major storage form of phosphorus in plant seeds. During germination, stored reserves are used as a source of nutrients by the plant seedling. Phytic acid is degraded by the activity of phytases to yield inositol and free phosphate. Due to the lack of phytases in the non-ruminant digestive tract, monogastric animals cannot utilize dietary phytic acid and it is excreted into manure. High phytic acid content in manure results in elevated phosphorus levels in soil and water and accompanying environmental concerns. The use of phytases to degrade seed phytic acid has potential for reducing the negative environmental impact of livestock production. A phytase was purified to electrophoretic homogeneity from cotyledons of germinated soybeans (Glycine max L. Merr.). Peptide sequence data generated from the purified enzyme facilitated the cloning of the phytase sequence (GmPhy) employing a polymerase chain reaction strategy. The introduction of GmPhy into soybean tissue culture resulted in increased phytase activity in transformed cells, which confirmed the identity of the phytase gene. It is surprising that the soybean phytase was unrelated to previously characterized microbial or maize (Zea mays) phytases, which were classified as histidine acid phosphatases. The soybean phytase sequence exhibited a high degree of similarity to purple acid phosphatases, a class of metallophosphoesterases. PMID:11500558

  14. "Venom" of the slow loris: sequence similarity of prosimian skin gland protein and Fel d 1 cat allergen.

    Science.gov (United States)

    Krane, Sonja; Itagaki, Yasuhiro; Nakanishi, Koji; Weldon, Paul J

    2003-02-01

    Bites inflicted on humans by the slow loris (Nycticebus coucang), a prosimian from Indonesia, are painful and elicit anaphylaxis. Toxins from N. coucang are thought to originate in the brachial organ, a naked, gland-laden area of skin situated on the flexor surface of the arm that is licked during grooming. We isolated a major component of the brachial organ secretions from N. coucang, an approximately 18 kDa protein composed of two 70-90 amino-acid chains linked by one or more disulfide bonds. The N-termini of these peptide chains exhibit nearly 70% sequence similarity (37% identity, chain 1; 54% identity, chain 2) with the two chains of Fel d 1, the major allergen from the domestic cat (Felis catus). The extensive sequence similarity between the brachial organ component of N. coucang and the cat allergen suggests that they exhibit immunogenic cross-reactivity. This work clarifies the chemical nature of the brachial organ exudate and suggests a possible mode of action underlying the noxious effects of slow loris bites.

  15. Bumblebees (Bombus terrestris) and honeybees (Apis mellifera) prefer similar colours of higher spectral purity over trained colours.

    Science.gov (United States)

    Rohde, Katja; Papiorek, Sarah; Lunau, Klaus

    2013-03-01

    Differences in the concentration of pigments as well as their composition and spatial arrangement cause intraspecific variation in the spectral signature of flowers. Known colour preferences and requirements for flower-constant foraging bees predict different responses to colour variability. In experimental settings, we simulated small variations of unicoloured petals and variations in the spatial arrangement of colours within tricoloured petals using artificial flowers and studied their impact on the colour choices of bumblebees and honeybees. Workers were trained to artificial flowers of a given colour and then given the simultaneous choice between three test colours: either the training colour, one colour of lower and one of higher spectral purity, or the training colour, one colour of lower and one of higher dominant wavelength; in all cases the perceptual contrast between the training colour and the additional test colours was similarly small. Bees preferred artificial test flowers which resembled the training colour, with the exception that they preferred test colours with higher spectral purity over trained colours. When tested at artificial flowers displaying a centripetal or centrifugal arrangement of three equally sized colours with small differences in spectral purity, bees did not prefer either type of artificial flower, but preferentially chose the most spectrally pure area for the first antenna contact at both types of artificial flowers. Our results indicate that innate preferences for flower colours of high spectral purity in pollinators might exert selective pressure on the evolution of flower colours.

  16. Location of the redox-active thiols of ribonucleotide reductase: sequence similarity between the Escherichia coli and Lactobacillus leichmannii enzymes

    International Nuclear Information System (INIS)

    Lin, A.N.I.; Ashley, G.W.; Stubbe, J.

    1987-01-01

    The redox-active thiols of Escherichia coli ribonucleoside diphosphate reductase and of Lactobacillus leichmannii ribonucleoside triphosphate reductase have been located by a procedure involving (1) prereduction of enzyme with dithiothreitol, (2) specific oxidation of the redox-active thiols by treatment with substrate in the absence of exogenous reductant, (3) alkylation of other thiols with iodoacetamide, and (4) reduction of the disulfides with dithiothreitol and alkylation with [1-14C]iodoacetamide. The dithiothreitol-reduced E. coli B1 subunit is able to convert 3 equiv of CDP to dCDP and is labeled with 5.4 equiv of 14C. Sequencing of tryptic peptides shows that 2.8 equiv of 14C is on cysteines-752 and -757 at the C-terminus of B1, while 1.0-1.5 equiv of 14C is on cysteines-222 and -227. It thus appears that two sets of redox-active dithiols are involved in substrate reduction. The L. leichmannii reductase is able to convert 1.1 equiv of CTP to dCTP and is labeled with 2.1 equiv of 14C. Sequencing of tryptic peptides shows that 1.4 equiv of 14C is located on the two cysteines of C-E-G-G-A-C-P-I-K. This peptide shows remarkable and unexpected similarity to the thiol-containing region of the C-terminal peptide of E. coli B1, C-E-S-G-A-C-K-I

  17. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    Directory of Open Access Journals (Sweden)

    Apurva Barve

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies the presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time the presence of the XPA gene in hydra. The putative protein sequence of hydra XPA contains a nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins, such as regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla.

  18. In vitro identification and in silico utilization of interspecies sequence similarities using GeneChip® technology

    Directory of Open Access Journals (Sweden)

    Ye Shui Q

    2005-05-01

    Background: Genomic approaches in large animal models (canine, ovine, etc.) are challenging due to insufficient genomic information for these species and the lack of availability of corresponding microarray platforms. To address this problem, we speculated that conserved interspecies genetic sequences can be experimentally detected by cross-species hybridization. The Affymetrix platform probe redundancy offers flexibility in selecting individual probes with high sequence similarities between related species for gene expression analysis. Results: Gene expression profiles of 40 canine samples were generated using the human HG-U133A GeneChip (U133A). Due to interspecies genetic differences, only 14 ± 2% of canine transcripts were detected by U133A probe sets, whereas profiling of 40 human samples detected 49 ± 6% of human transcripts. However, when these probe sets were deconstructed into individual probes and the performance of each probe was examined, we found that 47% of human probes were able to find their targets in canine tissues and generate a detectable hybridization signal. Therefore, we restricted gene expression analysis to these probes and observed a 60% increase in the number of identified canine transcripts. These results were validated by comparison of transcripts identified by our restricted analysis of cross-species hybridization with transcripts identified by hybridization of total canine lung mRNA to the new Affymetrix Canine GeneChip®. Conclusion: The experimental identification and restriction of gene expression analysis to probes with detectable hybridization signal drastically increases transcript detection of canine-human hybridization, suggesting the possibility of broad utilization of cross-hybridizations of related species using GeneChip technology.

  19. Intrinsic atopic dermatitis shows similar TH2 and higher TH17 immune activation compared with extrinsic atopic dermatitis.

    Science.gov (United States)

    Suárez-Fariñas, Mayte; Dhingra, Nikhil; Gittler, Julia; Shemer, Avner; Cardinale, Irma; de Guzman Strong, Cristina; Krueger, James G; Guttman-Yassky, Emma

    2013-08-01

    Atopic dermatitis (AD) is classified as extrinsic and intrinsic, representing approximately 80% and 20% of patients with the disease, respectively. Although sharing a similar clinical phenotype, only extrinsic AD is characterized by high serum IgE levels. Because most patients with AD exhibit high IgE levels, an "allergic"/IgE-mediated disease pathogenesis was hypothesized. However, current models associate AD with T-cell activation, particularly TH2/TH22 polarization, and epidermal barrier defects. We sought to define whether both variants share a common pathogenesis. We stratified 51 patients with severe AD into extrinsic AD (n = 42) and intrinsic AD (n = 9) groups (with similar mean disease activity/SCORAD scores) and analyzed the molecular and cellular skin pathology of lesional and nonlesional intrinsic AD and extrinsic AD by using gene expression (real-time PCR) and immunohistochemistry. A significant correlation between IgE levels and SCORAD scores (r = 0.76, P extrinsic AD. Marked infiltrates of T cells and dendritic cells and corresponding epidermal alterations (keratin 16, Mki67, and S100A7/A8/A9) defined lesional skin of patients with both variants. However, higher activation of all inflammatory axes (including TH2) was detected in patients with intrinsic AD, particularly TH17 and TH22 cytokines. Positive correlations between TH17-related molecules and SCORAD scores were only found in patients with intrinsic AD, whereas only patients with extrinsic AD showed positive correlations between SCORAD scores and TH2 cytokine (IL-4 and IL-5) levels and negative correlations with differentiation products (loricrin and periplakin). Although differences in TH17 and TH22 activation exist between patients with intrinsic AD and those with extrinsic AD, we identified common disease-defining features of T-cell activation, production of polarized cytokines, and keratinocyte responses to immune products. 
Our data indicate that a TH2 bias is not the sole cause of high Ig

  20. Pulmonary parenchyma segmentation in thin CT image sequences with spectral clustering and geodesic active contour model based on similarity

    Science.gov (United States)

    He, Nana; Zhang, Xiaolong; Zhao, Juanjuan; Zhao, Huilan; Qiang, Yan

    2017-07-01

    While the popular thin layer scanning technology of spiral CT has helped to improve diagnoses of lung diseases, the large volumes of scanning images produced by the technology also dramatically increase the load on physicians in lesion detection. Computer-aided diagnosis techniques like lesion segmentation in thin CT sequences have been developed to address this issue, but it remains a challenge to achieve high segmentation efficiency and accuracy without much manual intervention. In this paper, we present our research on automated segmentation of lung parenchyma with an improved geodesic active contour model, the geodesic active contour model based on similarity (GACBS). Combining a spectral clustering algorithm based on Nystrom (SCN) with GACBS, this algorithm first extracts key image slices, then uses these slices to generate an initial contour of pulmonary parenchyma of un-segmented slices with an interpolation algorithm, and finally segments lung parenchyma of un-segmented slices. Experimental results show that the segmentation results generated by our method are close to what manual segmentation can produce, with an average volume overlap ratio of 91.48%.
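
    The abstract reports a volume overlap ratio without defining it. The sketch below shows one common way such a figure is computed, assuming the ratio is the intersection of the automatic and manual segmentations divided by their union (voxel counts). That definition, the function name, and the representation of a mask as a collection of (slice, row, column) tuples are all assumptions made for illustration, not details taken from the paper.

```python
def volume_overlap_ratio(auto_voxels, manual_voxels):
    """Intersection-over-union of two segmentations given as voxel coordinates."""
    auto_set = set(auto_voxels)
    manual_set = set(manual_voxels)
    union = auto_set | manual_set
    if not union:
        return 1.0  # two empty segmentations agree trivially
    return len(auto_set & manual_set) / len(union)


# Hypothetical toy masks: four voxels each, three of them shared.
auto = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
manual = [(0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 2, 1)]
print(volume_overlap_ratio(auto, manual))
```

    Under this definition, three shared voxels out of five distinct ones give a ratio of 0.6; an average over all cases in a test set would then be reported as a single percentage.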

  1. On universal common ancestry, sequence similarity, and phylogenetic structure: the sins of P-values and the virtues of Bayesian evidence

    Directory of Open Access Journals (Sweden)

    Theobald Douglas L

    2011-11-01

    Background: The universal common ancestry (UCA) of all known life is a fundamental component of modern evolutionary theory, supported by a wide range of qualitative molecular evidence. Nevertheless, recently both the status and nature of UCA have been questioned. In earlier work I presented a formal, quantitative test of UCA in which model selection criteria overwhelmingly choose common ancestry over independent ancestry, based on a dataset of universally conserved proteins. These model-based tests are founded in likelihoodist and Bayesian probability theory, in opposition to classical frequentist null hypothesis tests such as Karlin-Altschul E-values for sequence similarity. In a recent comment, Koonin and Wolf (K&W) claim that the model preference for UCA is "a trivial consequence of significant sequence similarity". They support this claim with a computational simulation, derived from universally conserved proteins, which produces similar sequences lacking phylogenetic structure. The model selection tests prefer common ancestry for this artificial data set. Results: For the real universal protein sequences, hierarchical phylogenetic structure (induced by genealogical history) is the overriding reason why the tests choose UCA; sequence similarity is a relatively minor factor. First, for cases of conflicting phylogenetic structure, the tests choose independent ancestry even with highly similar sequences. Second, certain models, like star trees and K&W's profile model (corresponding to their simulation), readily explain sequence similarity yet lack phylogenetic structure. However, these are extremely poor models for the real proteins, even worse than independent ancestry models, though they explain K&W's artificial data well. Finally, K&W's simulation is an implementation of a well-known phylogenetic model, and it produces sequences that mimic homologous proteins. 
Therefore the model selection tests work appropriately with the artificial

  2. Phylogenetic similarity of the canine parvovirus wild-type isolates on the basis of VP1/VP2 gene fragment sequence analysis.

    Science.gov (United States)

    Rypul, K; Chmielewski, R; Smielewska-Loś, E; Klimentowski, S

    2002-04-01

    Biological material was taken from dogs with diarrhoea. Faecal samples were taken from live animals, and intestinal tract fragments (i.e. small intestine and stomach) were taken from dead animals. In total, 18 specimens were investigated from dogs housed alone or in large groups. To test for the presence of the virus, latex (On Site Biotech, Uppsala, Sweden) and direct immunofluorescence tests were performed. At the same time, polymerase chain reaction (PCR) with primers complementary to a conserved region of VP1/VP2 was carried out. The products of amplification were analysed on 2% agarose gel. The purified products were cloned with the Template Generation System (Finnzymes, Espoo, Finland) using a transposition reaction, and positive clones were identified using the 'colony screening by PCR' method. The sequencing gave 12 sequences of VP1/VP2 gene fragments that were of high similarity. Among the 12 analysed sequences, six exhibited 88% similarity, four exhibited 100% similarity and two exhibited 71% similarity.
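
    The abstract above reports pairwise similarity percentages but does not say how they were computed. As a minimal sketch, assuming the sequences are already aligned to equal length and that "similarity" means percent identity over aligned columns, the calculation could look like this (the function name and example sequences are hypothetical):

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Assumes seq_a and seq_b are already aligned (same length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable positions")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical fragments, not data from the study.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

    Other definitions (for example, excluding all gapped columns from the denominator) give different percentages, so the definition should be stated alongside any reported value.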

  3. Remarkable sequence similarity between the dinoflagellate-infecting marine girus and the terrestrial pathogen African swine fever virus

    Directory of Open Access Journals (Sweden)

    Claverie Jean-Michel

    2009-10-01

    Heterocapsa circularisquama DNA virus (HcDNAV; previously designated as HcV) is a giant virus (girus) with a ~356-kbp double-stranded DNA (dsDNA) genome. HcDNAV lytically infects the bivalve-killing marine dinoflagellate H. circularisquama, and currently represents the sole DNA virus isolated from dinoflagellates, one of the most abundant protists in marine ecosystems. Its morphological features, genome type, and host range previously suggested that HcDNAV might be a member of the family Phycodnaviridae of Nucleo-Cytoplasmic Large DNA Viruses (NCLDVs), though no supporting sequence data was available. NCLDVs currently include two families found in aquatic environments (Phycodnaviridae, Mimiviridae), one mostly infecting terrestrial animals (Poxviridae), another isolated from fish, amphibians and insects (Iridoviridae), and the last one (Asfarviridae) exclusively represented by the animal pathogen African swine fever virus (ASFV), the agent of a fatal hemorrhagic disease in domestic swine. In this study, we determined the complete sequence of the type B DNA polymerase (PolB) gene of HcDNAV. The viral PolB was transcribed at least from 6 h post inoculation (hpi), suggesting its crucial function for viral replication. Most unexpectedly, the HcDNAV PolB sequence was found to be closely related to the PolB sequence of ASFV. In addition, the amino acid sequence of HcDNAV PolB showed a rare amino acid substitution within a highly conserved motif: YSDTDS was found in HcDNAV PolB instead of YGDTDS in most dsDNA viruses. Together with the previous observation of ASFV-like sequences in the Sorcerer II Global Ocean Sampling metagenomic datasets, our results further reinforce the idea that the terrestrial ASFV has its evolutionary origin in marine environments.

  4. Intrinsic atopic dermatitis (AD) shows similar Th2 and higher Th17 immune activation compared to extrinsic AD

    Science.gov (United States)

    Suárez-Fariñas, M; Dhingra, N; Gittler, J; Shemer, A; Cardinale, I; de Guzman Strong, C; Krueger, JG; Guttman-Yassky, E

    2013-01-01

    Background Atopic dermatitis (AD) is classified as extrinsic (ADe) and intrinsic (ADi), representing approximately 80% and 20% of patients, respectively. While sharing a similar clinical phenotype, only ADe is characterized by high serum IgE. Since most AD patients exhibit high IgE, an “allergic”/IgE-mediated disease pathogenesis was hypothesized. However, current models associate AD with T-cell activation, particularly Th2/Th22 polarization, and epidermal barrier defects. Objective To define if both variants share a common pathogenesis. Methods We stratified 51 severe AD patients as ADe (42) and ADi (9) (with similar mean disease activity/SCORAD), and analyzed the molecular and cellular skin pathology of lesional and non-lesional ADi and ADe using gene-expression (RT-PCR) and immunohistochemistry. Results A significant correlation between IgE levels and SCORAD (r=0.76, p... extrinsic and intrinsic AD variants might be treated with T-cell targeted therapeutics or agents that modify keratinocyte responses. PMID:23777851

  5. Structural and sequence variants in patients with Silver-Russell syndrome or similar features-Curation of a disease database

    DEFF Research Database (Denmark)

    Tümer, Zeynep; López-Hernández, Julia Angélica; Netchine, Irène

    2018-01-01

    data of these patients. The clinical features are scored according to the Netchine-Harbison clinical scoring system (NH-CSS), which has recently been accepted as standard by consensus. The structural and sequence variations are reviewed and where necessary redescribed according to recent...

  6. Analysis of HIV-1 intersubtype recombination breakpoints suggests region with high pairing probability may be a more fundamental factor than sequence similarity affecting HIV-1 recombination.

    Science.gov (United States)

    Jia, Lei; Li, Lin; Gui, Tao; Liu, Siyang; Li, Hanping; Han, Jingwan; Guo, Wei; Liu, Yongjian; Li, Jingyun

    2016-09-21

    With increasing data on HIV-1, a more relevant molecular model describing mechanism details of HIV-1 genetic recombination usually requires upgrades. Currently an incomplete structural understanding of the copy choice mechanism along with several other issues in the field that lack elucidation led us to perform an analysis of the correlation between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarity to further explore structural mechanisms. Near full length sequences of URFs from Asia, Europe, and Africa (one sequence/patient), and representative sequences of worldwide CRFs were retrieved from the Los Alamos HIV database. Their recombination patterns were analyzed by jpHMM in detail. Then the relationships between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarities were investigated. Pearson correlation test showed that all URF groups and the CRF group exhibit the same breakpoint distribution pattern. Additionally, the Wilcoxon two-sample test indicated a significant and inexplicable limitation of recombination in regions with high pairing probability. These regions have been found to be strongly conserved across distinct biological states (i.e., strong intersubtype similarity), and genetic similarity has been determined to be a very important factor promoting recombination. Thus, the results revealed an unexpected disagreement between intersubtype similarity and breakpoint distribution, which were further confirmed by genetic similarity analysis. Our analysis reveals a critical conflict between results from natural HIV-1 isolates and those from HIV-1-based assay vectors in which genetic similarity has been shown to be a very critical factor promoting recombination. These results indicate the region with high-pairing probabilities may be a more fundamental factor affecting HIV-1 recombination than sequence similarity in natural HIV-1 infections. Our

  7. Monoclonal Antibodies Against Fusicoccin with Binding Characteristics Similar to the Putative Fusicoccin Receptor of Higher Plants

    Science.gov (United States)

    Feyerabend, Martin; Weiler, Elmar W.

    1987-01-01

    Monoclonal antibodies were raised against fusicoccin. The toxin, linked to bovine serum albumin through its t-pentenyl moiety, served as immunogen. Hybridomas secreting anti-fusicoccin antibodies were screened by radioimmunoassay employing a novel radioactive derivative, [³H]-nor-fusicoccin-alcohol of high specific activity (1.5 × 10¹⁴ Bq/mole). The two monoclonal antibodies reported here are of high apparent affinity for fusicoccin (0.71 × 10⁻⁹ molar and 1.85 × 10⁻⁹ molar). This is comparable to the apparent affinity of rabbit antiserum raised against the same type of conjugate (9.3 × 10⁻⁹ molar). A method for the single step purification of the monoclonal antibodies from ascites fluid is reported. A solid-phase immunoassay, using alkaline phosphatase as enzyme, exhibits a measuring range from 0.1 to 1.5 picomoles (about 70 picograms to 1 nanogram) of fusicoccin. The displacement of [³H]-nor-fusicoccin-alcohol from the antibodies by compounds structurally related to fusicoccin exhibits similar selectivity as a microsomal binding assay with the same tracer as radiolabeled probe. PMID:16665786
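
    The measuring range above is quoted both in picomoles and in mass units. A minimal sketch of that conversion follows, assuming a molar mass of about 680.8 g/mol for fusicoccin; this value is not given in the abstract and is an assumption here, as are the constant and function names:

```python
# Assumed molar mass of fusicoccin in g/mol (not stated in the abstract).
FUSICOCCIN_MOLAR_MASS = 680.8


def picomoles_to_picograms(pmol: float, molar_mass: float = FUSICOCCIN_MOLAR_MASS) -> float:
    """Convert an amount in picomoles to a mass in picograms.

    pmol * (g/mol) = pg, because the 1e-12 factor is carried by both units.
    """
    return pmol * molar_mass


if __name__ == "__main__":
    low = picomoles_to_picograms(0.1)   # lower end of the measuring range
    high = picomoles_to_picograms(1.5)  # upper end of the measuring range
    print(f"{low:.1f} pg to {high:.1f} pg")
```

    Under this assumption the range works out to roughly 68 pg to 1021 pg, consistent with the "about 70 picograms to 1 nanogram" stated in the abstract.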

  8. Nuclear and cpDNA sequences combined provide strong inference of higher phylogenetic relationships in the phlox family (Polemoniaceae).

    Science.gov (United States)

    Johnson, Leigh A; Chan, Lauren M; Weese, Terri L; Busby, Lisa D; McMurry, Samuel

    2008-09-01

    Members of the phlox family (Polemoniaceae) serve as useful models for studying various evolutionary and biological processes. Despite its biological importance, no family-wide phylogenetic estimate based on multiple DNA regions with complete generic sampling is available. Here, we analyze one nuclear and five chloroplast DNA sequence regions (nuclear ITS, chloroplast matK, trnL intron plus trnL-trnF intergenic spacer, and the trnS-trnG, trnD-trnT, and psbM-trnD intergenic spacers) using parsimony and Bayesian methods, as well as assessments of congruence and long branch attraction, to explore phylogenetic relationships among 84 ingroup species representing all currently recognized Polemoniaceae genera. Relationships inferred from the ITS and concatenated chloroplast regions are similar overall. A combined analysis provides strong support for the monophyly of Polemoniaceae and subfamilies Acanthogilioideae, Cobaeoideae, and Polemonioideae. Relationships among subfamilies, and thus for the precise root of Polemoniaceae, remain poorly supported. Within the largest subfamily, Polemonioideae, four clades corresponding to tribes Polemonieae, Phlocideae, Gilieae, and Loeselieae receive strong support. The monogeneric Polemonieae appears sister to Phlocideae. Relationships within Polemonieae, Phlocideae, and Gilieae are mostly consistent between analyses and data permutations. Many relationships within Loeselieae remain uncertain. Overall, inferred phylogenetic relationships support a higher-level classification for Polemoniaceae proposed in 2000.

  9. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

    Energy Technology Data Exchange (ETDEWEB)

    Zemla, A; Lang, D; Kostova, T; Andino, R; Zhou, C

    2010-11-29

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected

  10. An alignment-free method to find similarity among protein sequences via the general form of Chou's pseudo amino acid composition.

    Science.gov (United States)

    Gupta, M K; Niyogi, R; Misra, M

    2013-01-01

    In this paper, we propose a method to create a 60-dimensional feature vector for protein sequences via the general form of pseudo amino acid composition. The construction of the feature vector is based on the contents of amino acids, the total distance of each amino acid from the first amino acid in the protein sequence, and the distribution of the 20 amino acids. The resulting cosine distance matrix (also called the similarity matrix) is used to construct the phylogenetic tree by the neighbour joining method. In order to show the applicability of our approach, we tested it on three proteins: 1) ND5 protein sequences from nine species, 2) ND6 protein sequences from eight species, and 3) 50 coronavirus spike proteins. The results are in agreement with known history and the output from the multiple sequence alignment program ClustalW, which is widely used. We have also compared our phylogenetic results with six other recently proposed alignment-free methods. These comparisons show that our proposed method gives a more consistent biological relationship than the others. In addition, the time complexity is linear and less space is required than for other alignment-free methods that use graphical representation. It should be noted that the multiple sequence alignment method has exponential time complexity.
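
    The abstract above names three groups of 20 features but does not give their formulas. The following is a minimal sketch of one possible reading, not the authors' exact definition: it assumes "content" is the residue count, "total distance" is the sum of 0-based positions measured from the first residue, and "distribution" is the variance of those positions. All function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def feature_vector(seq: str) -> list[float]:
    """Build a 60-dimensional vector: 20 counts, 20 position sums, 20 position variances."""
    positions = {aa: [] for aa in AMINO_ACIDS}
    for index, residue in enumerate(seq.upper()):
        if residue in positions:  # non-standard symbols are skipped
            positions[residue].append(index)  # distance from the first residue

    contents, totals, spreads = [], [], []
    for aa in AMINO_ACIDS:
        pos = positions[aa]
        n = len(pos)
        total = sum(pos)
        mean = total / n if n else 0.0
        variance = sum((p - mean) ** 2 for p in pos) / n if n else 0.0
        contents.append(float(n))
        totals.append(float(total))
        spreads.append(variance)
    return contents + totals + spreads


def cosine_distance(u: list[float], v: list[float]) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical short peptides, not sequences from the paper.
    a = feature_vector("MKTAYIAKQR")
    b = feature_vector("MKTAYLAKQK")
    print(len(a), cosine_distance(a, b))
```

    Each sequence is scanned once, which matches the linear time complexity claimed in the abstract; the pairwise distances would then be collected into a matrix and passed to a neighbour joining implementation.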

  11. An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

    Science.gov (United States)

    Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

    2016-02-18

    The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the here described PD/VD protocol were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. 
Finally, we provide a general motif-HMM scanner which is easily accessible through

  12. A protein-tyrosine phosphatase with sequence similarity to the SH2 domain of the protein-tyrosine kinases.

    Science.gov (United States)

    Shen, S H; Bastien, L; Posner, B I; Chrétien, P

    1991-08-22

    The phosphorylation of proteins at tyrosine residues is critical in cellular signal transduction, neoplastic transformation and control of the mitotic cycle. These mechanisms are regulated by the activities of both protein-tyrosine kinases (PTKs) and protein-tyrosine phosphatases (PTPases). As in the PTKs, there are two classes of PTPases: membrane associated, receptor-like enzymes and soluble proteins. Here we report the isolation of a complementary DNA clone encoding a new form of soluble PTPase, PTP1C. The enzyme possesses a large noncatalytic region at the N terminus which unexpectedly contains two adjacent copies of the Src homology region 2 (the SH2 domain) found in various nonreceptor PTKs and other cytoplasmic signalling proteins. As with other SH2 sequences, the SH2 domains of PTP1C formed high-affinity complexes with the activated epidermal growth factor receptor and other phosphotyrosine-containing proteins. These results suggest that the SH2 regions in PTP1C may interact with other cellular components to modulate its own phosphatase activity against interacting substrates. PTPase activity may thus directly link growth factor receptors and other signalling proteins through protein-tyrosine phosphorylation.

  13. Fold-recognition and comparative modeling of human α2,3-sialyltransferases reveal their sequence and structural similarities to CstII from Campylobacter jejuni

    Directory of Open Access Journals (Sweden)

    Balaji Petety V

    2006-04-01

    Background The 3-D structure of none of the eukaryotic sialyltransferases (SiaTs) has been determined so far. Sequence alignment algorithms such as BLAST and PSI-BLAST could not detect a homolog of these enzymes from the protein databank. SiaTs, thus, belong to the hard/medium target category in the CASP experiments. The objective of the current work is to model the 3-D structures of human SiaTs which transfer the sialic acid in α2,3-linkage viz., ST3Gal I, II, III, IV, V, and VI, using fold-recognition and comparative modeling methods. The pair-wise sequence similarity among these six enzymes ranges from 41 to 63%. Results Unlike the sequence similarity servers, fold-recognition servers identified CstII, an α2,3/8 dual-activity SiaT from Campylobacter jejuni, as the homolog of all six ST3Gals; the level of sequence similarity between CstII and ST3Gals is only 15–20% and the similarity is restricted to well-characterized motif regions of ST3Gals. Deriving template-target sequence alignments for the entire ST3Gal sequence was not straightforward: the fold-recognition servers could not find a template for the region preceding the L-motif and that between the L- and S-motifs. Multiple structural templates were identified to model these regions and template identification-modeling-evaluation had to be performed iteratively to choose the most appropriate templates. The modeled structures have acceptable stereochemical properties and are also able to provide qualitative rationalizations for some of the site-directed mutagenesis results reported in literature. Apart from the predicted models, an unexpected but valuable finding from this study is the sequential and structural relatedness of family GT42 and family GT29 SiaTs. Conclusion The modeled 3-D structures can be used for docking and other modeling studies and for the rational identification of residues to be mutated to impart desired properties such as altered stability, substrate

  14. Curriculum Mapping in Higher Education: A Case Study and Proposed Content Scope and Sequence Mapping Tool

    Science.gov (United States)

    Arafeh, Sousan

    2016-01-01

    Best practice in curriculum development and implementation requires that discipline-based standards or requirements embody both curricular and programme scopes and sequences. Ensuring these are present and aligned in course/programme content, activities and assessments to support student success requires formalised and systematised review and…

  15. Sequencing of the needle transcriptome from Norway spruce (Picea abies Karst L.) reveals lower substitution rates, but similar selective constraints in gymnosperms and angiosperms

    Directory of Open Access Journals (Sweden)

    Chen Jun

    2012-11-01

    Background A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale. In agreement with a recent study we find that the synonymous substitution rate per year (0.6 × 10

  16. Characterization of human MMTV-like (HML) elements similar to a sequence that was highly expressed in a human breast cancer: further definition of the HML-6 group.

    Science.gov (United States)

    Yin, H; Medstrand, P; Kristofferson, A; Dietrich, U; Aman, P; Blomberg, J

    1999-03-30

    Previously, we found a retroviral sequence, HML-6.2BC1, to be expressed at high levels in a multifocal ductal breast cancer from a 41-year-old woman who also developed ovarian carcinoma. The sequence of a human genomic clone (HML-6.28) selected by high-stringency hybridization with HML-6.2BC1 is reported here. It was 99% identical to HML-6.2BC1 and gave the same restriction fragments as total DNA. HML-6.28 is a 4.7-kb provirus with a 5'LTR, truncated in RT. Data from two similar genomic clones and sequences found in GenBank are also reported. Overlaps between them gave a rather complete picture of the HML-6.2BC1-like human endogenous retroviral elements. Work with somatic cell hybrids and FISH localized HML-6.28 to chromosome 6, band p21, close to the MHC region. The causal role of HML-6.28 in breast cancer remains unclear. Nevertheless, the ca. 20 Myr old HML-6 sequences enabled the definition of common and unique features of type A, B, and D (ABD) retroviruses. In Gag, HML-6 has no intervening sequences between matrix and capsid proteins, unlike extant exogenous ABD viruses, possibly an ancestral feature. Alignment of the dUTPase showed it to be present in all ABD viruses, but gave a phylogenetic tree different from trees made from other ABD genes, indicating a distinct phylogeny of dUTPase. A conserved 24-mer sequence in the amino terminus of some ABD envelope genes suggested a conserved function. Copyright 1999 Academic Press.

  17. Subset of Kappa and Lambda Germline Sequences Result in Light Chains with a Higher Molecular Mass Phenotype.

    Science.gov (United States)

    Barnidge, David R; Lundström, Susanna L; Zhang, Bo; Dasari, Surendra; Murray, David L; Zubarev, Roman A

    2015-12-04

    In our previous work, we showed that electrospray ionization of intact polyclonal kappa and lambda light chains isolated from normal serum generates two distinct, Gaussian-shaped, molecular mass distributions representing the light-chain repertoire. During the analysis of a large (>100) patient sample set, we noticed a low-intensity molecular mass distribution with a mean of approximately 24 250 Da, roughly 800 Da higher than the typical kappa molecular-mass distribution mean of 23 450 Da. We also observed distinct clones in this region that did not appear to contain any typical post-translational modifications that would account for such a large mass shift. To determine the origin of the high molecular mass clones, we performed de novo bottom-up mass spectrometry on a purified IgM monoclonal light chain that had a calculated molecular mass of 24 275.03 Da. The entire sequence of the monoclonal light chain was determined using multienzyme digestion and de novo sequence-alignment software and was found to belong to the germline allele IGKV2-30. The alignment of kappa germline sequences revealed ten IGKV2 and one IGKV4 sequences that contained additional amino acids in their CDR1 region, creating the high-molecular-mass phenotype. We also performed an alignment of lambda germline sequences, which showed additional amino acids in the CDR2 region and the FR3 region of functional germline sequences that result in a high-molecular-mass phenotype. The work presented here illustrates the ability of mass spectrometry to provide information on the diversity of light-chain molecular mass phenotypes in circulation, which reflects the germline sequences selected by the immunoglobulin-secreting B-cell population.

  18. SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity.

    Science.gov (United States)

    Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong

    2016-01-01

    Knowledge of protein function is important for biological, medical and therapeutic studies, but the functions of many proteins are still unknown. There is a need for improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented similarity-based and other methods in predicting diverse classes of proteins including distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors for protein representation, (3) improved predictive performances due to the use of more enriched training datasets and a greater variety of protein descriptors, (4) a newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) a newly added batch submission option for supporting the classification of multiple proteins. Moreover, 2 more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added for facilitating collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.

  19. Molecular phylogeny and species separation of five morphologically similar Holosticha-complex ciliates (Protozoa, Ciliophora) using ARDRA riboprinting and multigene sequence data

    Science.gov (United States)

    Gao, Feng; Yi, Zhenzhen; Gong, Jun; Al-Rasheid, Khaled A. S.; Song, Weibo

    2010-05-01

    To separate and redefine the ambiguous Holosticha-complex, a confusing group of hypotrichous ciliates, six strains belonging to five morphospecies of three genera, Holosticha heterofoissneri, Anteholosticha sp. pop1, Anteholosticha sp. pop2, A. manca, A. gracilis and Nothoholosticha fasciola, were analyzed using 12 restriction enzymes on the basis of amplified ribosomal DNA restriction analysis. Nine of the 12 enzymes could digest the DNA products, four (Hinf I, Hind III, Msp I, Taq I) yielded species-specific restriction patterns, and Hind III and Taq I produced different patterns for two Anteholosticha sp. populations. Distinctly different restriction digestion haplotypes and similarity indices can be used to separate the species. The secondary structures of the five species were predicted based on the ITS2 transcripts and there were several minor differences among species, while two Anteholosticha sp. populations were identical. In addition, phylogenies based on the SSrRNA gene sequences were reconstructed using multiple algorithms, which grouped them generally into four clades, and indicated that the genus Anteholosticha should be a convergent assemblage. The fact that Holosticha species clustered with the oligotrichs and choreotrichs, though with very low support values, indicated that the topology may be very divergent and unreliable when the number of sequence data used in the analyses is too low.

  20. Human Treponema pallidum 11q/j isolate belongs to subsp. endemicum but contains two loci with a sequence in TP0548 and TP0488 similar to subsp. pertenue and subsp. pallidum, respectively.

    Directory of Open Access Journals (Sweden)

    Lenka Mikalová

    2017-03-01

    Treponema pallidum subsp. endemicum (TEN) is the causative agent of endemic syphilis (bejel). An unusual human TEN 11q/j isolate was obtained from a syphilis-like primary genital lesion from a patient that returned to France from Pakistan. The TEN 11q/j isolate was characterized using nested PCR followed by Sanger sequencing and/or direct Illumina sequencing. Altogether, 44 chromosomal regions were analyzed. Overall, the 11q/j isolate clustered with TEN strains Bosnia A and Iraq B as expected from previous TEN classification of the 11q/j isolate. However, the 11q/j sequence in a 505 bp-long region at the TP0488 locus was similar to Treponema pallidum subsp. pallidum (TPA) strains, but not to TEN Bosnia A and Iraq B sequences, suggesting a recombination event at this locus. Similarly, the 11q/j sequence in a 613 bp-long region at the TP0548 locus was similar to Treponema pallidum subsp. pertenue (TPE) strains, but not to TEN sequences. A detailed analysis of two recombinant loci found in the 11q/j clinical isolate revealed that the recombination event occurred just once, in the TP0488, with the donor sequence originating from a TPA strain. Since TEN Bosnia A and Iraq B were found to contain TPA-like sequences at the TP0548 locus, the recombination at TP0548 took place in a treponeme that was an ancestor to both TEN Bosnia A and Iraq B. The sequence of 11q/j isolate in TP0548 represents an ancestral TEN sequence that is similar to yaws-causing treponemes. In addition to the importance of the 11q/j isolate for reconstruction of the TEN phylogeny, this case emphasizes the possible role of TEN strains in development of syphilis-like lesions.

  1. A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

    Science.gov (United States)

    Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

    1995-04-01

    The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).

  2. Monomorphism in humans and sequence differences among higher primates for a sequence tagged site (STS) in homeo box cluster 2 as assayed by denaturing gradient electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Ruano, G.; Ruddle, F.H.; Kidd, K.K. (Yale Univ., New Haven, CT (United States)); Gray, M.R. (Tufts Univ., Boston, MA (United States)); Miki, Tetsuro (Osaka Univ. (Japan)); Ferguson-Smith, A.C. (Inst. of Animal Physiology and Genetics Research, Cambridge (United Kingdom))

    1990-03-11

    The human homeo box cluster 2 (HOX2) contains genes coding for DNA binding proteins involved in developmental control and is highly conserved between mouse and man. The authors have applied in concert the Polymerase Chain Reaction (PCR) and Denaturing Gradient Electrophoresis (DGE) to amplify defined primate HOX2 segments and to detect sequence differences among them. They have sequenced a PstI fragment 4 kb upstream from HOX 2.2 and synthesized primers delimiting both halves of a 630 bp segment within it. PCR on various unrelated humans and SC-PCR on chimpanzee, gorilla, orangutan and gibbon yielded products of the same length for each primer pair.

  3. Whole-genome sequencing of asian lung cancers: second-hand smoke unlikely to be responsible for higher incidence of lung cancer among Asian never-smokers.

    Science.gov (United States)

    Krishnan, Vidhya G; Ebert, Philip J; Ting, Jason C; Lim, Elaine; Wong, Swee-Seong; Teo, Audrey S M; Yue, Yong G; Chua, Hui-Hoon; Ma, Xiwen; Loh, Gary S L; Lin, Yuhao; Tan, Joanna H J; Yu, Kun; Zhang, Shenli; Reinhard, Christoph; Tan, Daniel S W; Peters, Brock A; Lincoln, Stephen E; Ballinger, Dennis G; Laramie, Jason M; Nilsen, Geoffrey B; Barber, Thomas D; Tan, Patrick; Hillmer, Axel M; Ng, Pauline C

    2014-11-01

    Asian nonsmoking populations have a higher incidence of lung cancer compared with their European counterparts. There is a long-standing hypothesis that the increase of lung cancer in Asian never-smokers is due to environmental factors such as second-hand smoke. We analyzed whole-genome sequencing of 30 Asian lung cancers. Unsupervised clustering of mutational signatures separated the patients into two categories of either all the never-smokers or all the smokers or ex-smokers. In addition, nearly one third of the ex-smokers and smokers classified with the never-smoker-like cluster. The somatic variant profiles of Asian lung cancers were similar to that of European origin with G.C>T.A being predominant in smokers. We found EGFR and TP53 to be the most frequently mutated genes with mutations in 50% and 27% of individuals, respectively. Among the 16 never-smokers, 69% had an EGFR mutation compared with 29% of 14 smokers/ex-smokers. Asian never-smokers had lung cancer signatures distinct from the smoker signature and their mutation profiles were similar to European never-smokers. The profiles of Asian and European smokers are also similar. Taken together, these results suggested that the same mutational mechanisms underlie the etiology for both ethnic groups. Thus, the high incidence of lung cancer in Asian never-smokers seems unlikely to be due to second-hand smoke or other carcinogens that cause oxidative DNA damage, implying that routine EGFR testing is warranted in the Asian population regardless of smoking status. ©2014 American Association for Cancer Research.

  4. The de Bono LAMS Sequence Series: Template Designs as Knowledge-Mobilising Strategy for 21st Century Higher Education

    Science.gov (United States)

    Dobozy, Eva

    2012-01-01

    In this paper, the five interlocking de Bono LAMS sequences are introduced as a new form of generic template designs. This transdisciplinary knowledge-mobilising strategy is based on Edward de Bono's attention-directing ideas and thinking skills, commonly known as the CoRT tools. The development of the de Bono LAMS sequence series is an important…

  5. Effects of circulating member B of the family with sequence similarity 3 on the risk of developing metabolic syndrome and its components: A 5-year prospective study.

    Science.gov (United States)

    Wang, Haoyu; Yu, Fadong; Zhang, Zhuo; Hou, Yuanyuan; Teng, Weiping; Shan, Zhongyan; Lai, Yaxin

    2017-11-27

    Member B of the family with sequence similarity 3 (FAM3B), also known as pancreatic-derived factor, is mainly synthesized and secreted by islet β-cells, and plays a role in abnormal metabolism of glucose and lipids. However, the prospective association of FAM3B with metabolic disorders remains unclear. The present study aimed to reveal the predictive relationship between this pancreas-specific cytokine and metabolic syndrome (MetS). A total of 210 adults (88 men and 122 women) without MetS, aged between 40 and 65 years, were recruited and received a comprehensive health examination. Baseline serum FAM3B levels were determined by sandwich enzyme-linked immunosorbent assay. Subsequently, all participants underwent a follow-up examination after 5 years. MetS was identified in accordance with the International Diabetes Federation criteria. During follow up, 35.7% of participants developed MetS. In comparison with the non-MetS group, participants with MetS had an increased serum FAM3B at baseline (21.85 ng/mL [19.38, 24.17 ng/mL] vs 28.56 ng/mL [25.32, 38.10 ng/mL], P < 0.001). Moreover, serum FAM3B was significantly associated with variations in fasting plasma insulin (r = -0.306, P < 0.001), homeostasis model assessment of β-cell function (r = -0.328, P < 0.001) and homeostasis model assessment of insulin resistance (r = -0.191, P = 0.006). Furthermore, a positive correlation between baseline FAM3B and the incidence of MetS was observed, even after multivariable adjustment (relative risk 1.23 [1.15, 1.31], P < 0.001). The optimal cut-off value of FAM3B for predicting MetS, based on the Youden Index, was 23.98 ng/mL. Elevated circulating FAM3B might be considered as a predictor of new-onset MetS and its progression. © 2017 The Authors. Journal of Diabetes Investigation published by Asian Association for the Study of Diabetes (AASD) and John Wiley & Sons Australia, Ltd.
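
    The Youden Index mentioned in this abstract selects the cut-off that maximises J = sensitivity + specificity − 1. The abstract does not describe how the authors computed it, so the following is only a minimal sketch of the general technique; the function name `youden_cutoff` and the toy values are hypothetical and are not study data.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A sample is predicted positive when its value is >= cutoff.
    `labels` uses 1 for cases and 0 for non-cases.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    # Try every observed value as a candidate threshold.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best_j:  # strict '>' keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical biomarker levels (ng/mL) and outcomes; NOT data from the study.
values = [18.0, 20.0, 21.0, 27.0, 23.0, 26.0, 28.0, 30.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
cutoff, j = youden_cutoff(values, labels)
print(cutoff, j)
```

    In practice the candidate thresholds are usually taken from a ROC curve; scanning the observed values, as assumed here, gives the same maximum for small samples.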

  6. Sequence similarity between the erythrocyte binding domain 1 of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals binding residues for the Duffy Antigen Receptor for Chemokines

    OpenAIRE

    Bolton, Michael J; Garry, Robert F

    2011-01-01

    Abstract Background The surface glycoprotein (SU, gp120) of the human immunodeficiency virus (HIV) must bind to a chemokine receptor, CCR5 or CXCR4, to invade CD4+ cells. Plasmodium vivax uses the Duffy Binding Protein (DBP) to bind the Duffy Antigen Receptor for Chemokines (DARC) and invade reticulocytes. Results Variable loop 3 (V3) of HIV-1 SU and domain 1 of the Plasmodium vivax DBP share a sequence similarity. The site of amino acid sequence similarity was necessary, but not sufficient, ...

  7. Fusion protein gene nucleotide sequence similarities, shared antigenic sites and phylogenetic analysis suggest that phocid distemper virus 2 and canine distemper virus belong to the same virus entity.

    NARCIS (Netherlands)

    I.K.G. Visser (Ilona); R.W.J. van der Heijden (Roger); M.W.G. van de Bildt (Marco); M.J.H. Kenter (Marcel); C. Örvell; A.D.M.E. Osterhaus (Albert)

    1993-01-01

    Nucleotide sequencing of the fusion protein (F) gene of phocid distemper virus-2 (PDV-2), recently isolated from Baikal seals (Phoca sibirica), revealed an open reading frame (nucleotides 84 to 2075) with two potential in-frame ATG translation initiation codons. We suggest that the

  8. A HIGHER EFFICIENCY OF CONVERTING GAS TO STARS PUSHES GALAXIES AT z ∼ 1.6 WELL ABOVE THE STAR-FORMING MAIN SEQUENCE

    Energy Technology Data Exchange (ETDEWEB)

    Silverman, J. D.; Rujopakarn, W. [Kavli Institute for the Physics and Mathematics of the Universe (WPI), The University of Tokyo Institutes for Advanced Study, The University of Tokyo, Kashiwa, Chiba 277-8583 (Japan); Daddi, E.; Liu, D. [Laboratoire AIM, CEA/DSM-CNRS-Universite Paris Diderot, Irfu/Service d’Astrophysique, CEA Saclay (France); Rodighiero, G. [Dipartimento di Fisica e Astronomia, Universita di Padova, vicolo Osservatorio, 3, I-35122 Padova (Italy); Sargent, M. [Astronomy Centre, Department of Physics and Astronomy, University of Sussex, Brighton BN1 9QH (United Kingdom); Renzini, A. [Instituto Nazionale de Astrofisica, Osservatorio Astronomico di Padova, v.co dell’Osservatorio 5, I-35122 Padova (Italy); Feruglio, C. [IRAM—Institut de RadioAstronomie Millimétrique, 300 rue de la Piscine, F-38406 Saint Martin d’Hères (France); Kashino, D. [Division of Particle and Astrophysical Science, Graduate School of Science, Nagoya University, Nagoya 464-8602 (Japan); Sanders, D. [Institute for Astronomy, University of Hawaii, 2680 Woodlawn Drive, Honolulu, HI 96822 (United States); Kartaltepe, J. [National Optical Astronomy Observatory, 950 N. Cherry Avenue, Tucson, AZ 85719 (United States); Nagao, T. [Graduate School of Science and Engineering, Ehime University, 2-5 Bunkyo-cho, Matsuyama 790-8577 (Japan); Arimoto, N. [Subaru Telescope, 650 North A’ohoku Place, Hilo, HI-96720 (United States); Berta, S.; Lutz, D. [Max-Planck-Institut für extraterrestrische Physik, D-84571 Garching (Germany); Béthermin, M. [European Southern Observatory, Karl-Schwarzschild-Strasse 2, D-85748 Garching (Germany); Koekemoer, A., E-mail: john.silverman@ipmu.jp [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD, 21218 (United States); and others

    2015-10-20

    Local starbursts have a higher efficiency of converting gas into stars, as compared to typical star-forming galaxies at a given stellar mass, possibly indicative of different modes of star formation. With the peak epoch of galaxy formation occurring at z > 1, it remains to be established whether such an efficient mode of star formation is occurring at high redshift. To address this issue, we measure the molecular gas content of seven high-redshift (z ∼ 1.6) starburst galaxies with the Atacama Large Millimeter/submillimeter Array and IRAM/Plateau de Bure Interferometer. Our targets are selected from the sample of Herschel far-infrared-detected galaxies having star formation rates (∼300–800 M⊙ yr⁻¹) elevated (≳4×) above the star-forming main sequence (MS) and included in the FMOS-COSMOS near-infrared spectroscopic survey of star-forming galaxies at z ∼ 1.6 with Subaru. We detect CO emission in all cases at high levels of significance, indicative of high gas fractions (∼30%–50%). Even more compelling, we firmly establish with a clean and systematic selection that starbursts, identified as MS outliers, at high redshift generally have a lower ratio of CO to total infrared luminosity as compared to typical MS star-forming galaxies, although with a smaller offset than expected based on past studies of local starbursts. We put forward a hypothesis that there exists a continuous increase in star formation efficiency with elevation from the MS with galaxy mergers as a possible physical driver. Along with a heightened star formation efficiency, our high-redshift sample is similar in other respects to local starbursts, such as being metal rich and having a higher ionization state of the interstellar medium.

  9. Despite higher glucocorticoid levels and stress responses in female rats, both sexes exhibit similar stress-induced changes in hippocampal neurogenesis.

    Science.gov (United States)

    Hulshof, Henriëtte J; Novati, Arianna; Luiten, Paul G M; den Boer, Johan A; Meerlo, Peter

    2012-10-01

    Sex differences in stress reactivity may be one of the factors underlying the increased sensitivity for the development of psychopathologies in women. Particularly, an increased hypothalamic-pituitary-adrenal (HPA) axis reactivity in females may exacerbate stress-induced changes in neuronal plasticity and neurogenesis, which in turn may contribute to an increased sensitivity to psychopathology. The main aim of the present study was to examine male-female differences in stress-induced changes in different aspects of hippocampal neurogenesis, i.e. cell proliferation, differentiation and survival. Both sexes were exposed to a wide variety of stressors, where after differences in HPA-axis reactivity and neurogenesis were assessed. To study the role of oestradiol in potential sex differences, ovariectomized females received low or high physiological oestradiol level replacement pellets. The results show that females in general have a higher basal and stress-induced HPA-axis activity than males, with minimal differences between the two female groups. Cell proliferation in the dorsal hippocampus was significantly higher in high oestradiol females compared to low oestradiol females and males, while doublecortin (DCX) expression as a marker of cell differentiation was significantly higher in males compared to females, independent of oestradiol level. Stress exposure did not significantly influence cell proliferation or survival of new cells, but did reduce DCX expression. In conclusion, despite the male-female differences in HPA-axis activity, the effect of repeated stress exposure on hippocampal cell differentiation was not significantly different between sexes. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Intraspecific sequence comparisons reveal similar rates of non-collinear gene insertion in the B and D genomes of bread wheat

    Czech Academy of Sciences Publication Activity Database

    Bartoš, Jan; Vlček, Čestmír; Choulet, F.; Džunková, Mária; Cviková, Kateřina; Šafář, Jan; Šimková, Hana; Pačes, Jan; Strnad, Hynek; Sourdille, P.; Berges, H.; Cattonaro, F.; Feuillet, C.; Doležel, Jaroslav

    2012-01-01

    Vol. 12, No. 155 (2012), pp. 1-10 ISSN 1471-2229 R&D Projects: GA ČR GAP501/10/1778 Grant - others: GA MŠk(CZ) ED0007/01/01 Program: ED Institutional research plan: CEZ:AV0Z50380511; CEZ:AV0Z50520514 Keywords: Wheat * BAC sequencing * Homoeologous genomes Subject RIV: EB - Genetics; Molecular Biology Impact factor: 4.354, year: 2012

  11. Sequence similarity between the cp gene and the transgene in transgenic papayas = Similaridade de seqüência entre o gene cp do vírus e do transgene presente em mamoeiros transgênicos

    NARCIS (Netherlands)

    Souza, M.T.; Teixeira, M.; Gonsalves, D.

    2005-01-01

    The Papaya ringspot virus (PRSV) coat protein transgene present in 'Rainbow' and 'SunUp' papayas disclose high sequence similarity (>89%) to the cp gene from PRSV BR and TH. Despite this, both isolates are able to break down the resistance in 'Rainbow', while only the latter is able to do so in

  12. Judgments of brand similarity

    NARCIS (Netherlands)

    Bijmolt, THA; Wedel, M; Pieters, RGM; DeSarbo, WS

    This paper provides empirical insight into the way consumers make pairwise similarity judgments between brands, and how familiarity with the brands, serial position of the pair in a sequence, and the presentation format affect these judgments. Within the similarity judgment process both the

  13. Pregnant women with HIV in rural Nigeria have higher rates of antiretroviral treatment initiation, but similar loss to follow-up as non-pregnant women and men.

    Science.gov (United States)

    Aliyu, Muktar H; Blevins, Meridith; Megazzini, Karen M; Parrish, Deidra D; Audet, Carolyn M; Chan, Naomi; Odoh, Chisom; Gebi, Usman I; Muhammad, Mukhtar Y; Shepherd, Bryan E; Wester, C William; Vermund, Sten H

    2015-11-01

    We examined antiretroviral therapy (ART) initiation and retention by sex and pregnancy status in rural Nigeria. We studied HIV-infected ART-naïve patients aged ≥15 years entering care from June 2009 to September 2013. We calculated the probability of early ART initiation and cumulative incidence of loss to follow-up (LTFU) during the first year of ART, and examined the association between LTFU and sex/pregnancy using Cox regression. The cohort included 3813 ART-naïve HIV-infected adults (2594 women [68.0%], 273 [11.8%] of them pregnant). The proportion of pregnant clients initiating ART within 90 days of enrollment (78.0%, 213/273) was higher than among non-pregnant women (54.3%, 1261/2321) or men (53.0%, 650/1219), both p… Pregnant women initiated ART sooner than non-pregnant women and men (median [IQR] days from enrollment to ART initiation for pregnant women=7 days [0-21] vs 14 days [7-49] for non-pregnant women and 14 days [7-42] for men; p… Pregnant women with HIV in rural Nigeria were more likely to initiate ART but were no more likely to be retained in care. Our findings underscore the importance of effective retention strategies across all patient groups, regardless of sex and pregnancy status. © The Author 2015. Published by Oxford University Press on behalf of Royal Society of Tropical Medicine and Hygiene. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  14. A realist synthesis of cross-border patient movement from low and middle income countries to similar or higher income countries.

    Science.gov (United States)

    Durham, Jo; Blondell, Sarah J

    2017-08-29

    Patient travel across borders to access healthcare is becoming increasingly common and widespread. Patients moving from high income to middle income countries for healthcare is well documented, with patients seeking treatments that are cheaper or more readily available than at home. Less well understood is when patients move from one low income country to another or from a low income country to a higher income country. In this paper, a realist review was undertaken to explore why, in what contexts and how patients from lower income countries travel to countries with the same, or more advanced, economies for planned healthcare. Based on an initial scoping of the literature and discussions with key informants, we generated an initial theory and set of propositions about why, how, who and in what contexts people cross international borders for planned healthcare. We then systematically located and synthesized (1) peer-reviewed studies from the Scopus, Embase, Web of Science and Econlit databases; (2) non-indexed reports using key informants and Google; and (3) papers from the reference lists of included documents, to glean supportive or contradictory evidence for our initial propositions. As we reviewed the literature and extracted our data, we drew on the work of Pierre Bourdieu to understand the interplay between material and non-material capital and cognitive processes in decisions to cross borders for healthcare. Patient travel was largely undertaken due to a lack of services in the home country and/or unacceptability of local services, with decisions on when, and where, to travel, usually made within the patient's social networks. They were able to travel via use of multiple resources, including social networks, economic and cultural capital, and habitus. Those patients with greater volumes of the aforementioned factors had greater healthcare options; however, even those with limited resources engaged in patient travel. Patient movement challenges traditional

  15. Whole-exome sequencing identified a homozygous FNBP4 mutation in a family with a condition similar to microphthalmia with limb anomalies.

    Science.gov (United States)

    Kondo, Yukiko; Koshimizu, Eriko; Megarbane, Andre; Hamanoue, Haruka; Okada, Ippei; Nishiyama, Kiyomi; Kodera, Hirofumi; Miyatake, Satoko; Tsurusaki, Yoshinori; Nakashima, Mitsuko; Doi, Hiroshi; Miyake, Noriko; Saitsu, Hirotomo; Matsumoto, Naomichi

    2013-07-01

    Microphthalmia with limb anomalies (MLA), also known as Waardenburg anophthalmia syndrome or ophthalmoacromelic syndrome, is a rare autosomal recessive disorder. Recently, we and others successfully identified SMOC1 as the causative gene for MLA. However, there are several MLA families without SMOC1 abnormality, suggesting locus heterogeneity in MLA. We aimed to identify a pathogenic mutation in one Lebanese family having an MLA-like condition without SMOC1 mutation by whole-exome sequencing (WES) combined with homozygosity mapping. A c.683C>T (p.Thr228Met) in FNBP4 was found as a primary candidate, drawing the attention that FNBP4 and SMOC1 may potentially modulate BMP signaling. Copyright © 2013 Wiley Periodicals, Inc.

  16. The Impact of Protein Structure and Sequence Similarity on the Accuracy of Machine-Learning Scoring Functions for Binding Affinity Prediction.

    Science.gov (United States)

    Li, Hongjian; Peng, Jiangjun; Leung, Yee; Leung, Kwong-Sak; Wong, Man-Hon; Lu, Gang; Ballester, Pedro J

    2018-03-14

    It has recently been claimed that the outstanding performance of machine-learning scoring functions (SFs) is exclusively due to the presence of training complexes with highly similar proteins to those in the test set. Here, we revisit this question using 24 similarity-based training sets, a widely used test set, and four SFs. Three of these SFs employ machine learning instead of the classical linear regression approach of the fourth SF (X-Score which has the best test set performance out of 16 classical SFs). We have found that random forest (RF)-based RF-Score-v3 outperforms X-Score even when 68% of the most similar proteins are removed from the training set. In addition, unlike X-Score, RF-Score-v3 is able to keep learning with an increasing training set size, becoming substantially more predictive than X-Score when the full 1105 complexes are used for training. These results show that machine-learning SFs owe a substantial part of their performance to training on complexes with dissimilar proteins to those in the test set, against what has been previously concluded using the same data. Given that a growing amount of structural and interaction data will be available from academic and industrial sources, this performance gap between machine-learning SFs and classical SFs is expected to enlarge in the future.

  17. Protein-Level Integration Strategy of Multiengine MS Spectra Search Results for Higher Confidence and Sequence Coverage.

    Science.gov (United States)

    Zhao, Panpan; Zhong, Jiayong; Liu, Wanting; Zhao, Jing; Zhang, Gong

    2017-12-01

    Multiple search engines based on various models have been developed to search MS/MS spectra against a reference database, providing different results for the same data set. How to integrate these results efficiently with minimal compromise on false discoveries is an open question due to the lack of an independent, reliable, and highly sensitive standard. We took advantage of the translating mRNA sequencing (RNC-seq) result as a standard to evaluate the integration strategies of the protein identifications from various search engines. We used seven mainstream search engines (Andromeda, Mascot, OMSSA, X!Tandem, pFind, InsPecT, and ProVerB) to search the same label-free MS data sets of human cell lines Hep3B, MHCCLM3, and MHCC97H from the Chinese C-HPP Consortium for Chromosomes 1, 8, and 20. As expected, the union of seven engines resulted in a boosted false identification rate, whereas the intersection of seven engines remarkably decreased the identification power. We found that accepting identifications made by at least two out of seven engines maximized the protein identification power while minimizing the ratio of suspicious/translation-supported identifications (STR), as monitored by our STR index, based on RNC-Seq. Furthermore, this strategy also significantly improves the peptide coverage of the protein amino acid sequence. In summary, we demonstrated a simple strategy to significantly improve the performance of shotgun mass spectrometry by integrating multiple search engines at the protein level, maximizing the utilization of the current MS spectra without additional experimental work.
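
    The three integration strategies compared in this abstract (union, intersection, and "at least two out of seven engines") can be illustrated with a small voting sketch. This is only an assumption-laden illustration, not the authors' pipeline: the function name `integrate_identifications`, the protein labels, and the engine outputs below are hypothetical, and only three engines are shown for brevity.

```python
from collections import Counter


def integrate_identifications(engine_results, min_engines=2):
    """Keep proteins reported by at least `min_engines` search engines."""
    votes = Counter()
    for proteins in engine_results.values():
        # Use a set so that each engine votes at most once per protein.
        votes.update(set(proteins))
    return {protein for protein, n in votes.items() if n >= min_engines}


# Hypothetical protein identifications per engine; NOT data from the study.
engine_results = {
    "Mascot": ["P1", "P2", "P3"],
    "X!Tandem": ["P2", "P3", "P4"],
    "OMSSA": ["P3", "P5"],
}

union = set().union(*engine_results.values())            # most permissive
intersection = set.intersection(
    *(set(v) for v in engine_results.values())           # most restrictive
)
consensus = integrate_identifications(engine_results)    # >= 2 engines agree

print(sorted(union), sorted(intersection), sorted(consensus))
```

    On this toy input the consensus set sits between the union and the intersection, which mirrors the trade-off between false identifications and identification power described above.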

  18. Sequence similarity between the erythrocyte binding domain of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals a functional heparin binding motif involved in binding to the Duffy antigen receptor for chemokines

    OpenAIRE

    Bolton, Michael J; Garry, Robert F

    2011-01-01

    Abstract Background The HIV surface glycoprotein gp120 (SU, gp120) and the Plasmodium vivax Duffy binding protein (PvDBP) bind to chemokine receptors during infection and have a site of amino acid sequence similarity in their binding domains that often includes a heparin binding motif (HBM). Infection by either pathogen has been found to be inhibited by polyanions. Results Specific polyanions that inhibit HIV infection and bind to the V3 loop of X4 strains also inhibited DBP-mediated infectio...

  19. Sequence similarity between the erythrocyte binding domain 1 of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals binding residues for the Duffy Antigen Receptor for Chemokines

    Directory of Open Access Journals (Sweden)

    Garry Robert F

    2011-01-01

    Abstract Background The surface glycoprotein (SU, gp120) of the human immunodeficiency virus (HIV) must bind to a chemokine receptor, CCR5 or CXCR4, to invade CD4+ cells. Plasmodium vivax uses the Duffy Binding Protein (DBP) to bind the Duffy Antigen Receptor for Chemokines (DARC) and invade reticulocytes. Results Variable loop 3 (V3) of HIV-1 SU and domain 1 of the Plasmodium vivax DBP share a sequence similarity. The site of amino acid sequence similarity was necessary, but not sufficient, for DARC binding and contained a consensus heparin binding site essential for DARC binding. Both HIV-1 and P. vivax can be blocked from binding to their chemokine receptors by the chemokine RANTES and its analog AOP-RANTES. Site-directed mutagenesis of the heparin binding motif in members of the DBP family, the P. knowlesi alpha, beta and gamma proteins, abrogated their binding to erythrocytes. Positively charged residues within domain 1 are required for binding of P. vivax and P. knowlesi erythrocyte binding proteins. Conclusion A heparin binding site motif in members of the DBP family may form part of a conserved erythrocyte receptor binding pocket.

  20. Sequence similarity between the erythrocyte binding domain of the Plasmodium vivax Duffy binding protein and the V3 loop of HIV-1 strain MN reveals a functional heparin binding motif involved in binding to the Duffy antigen receptor for chemokines

    Directory of Open Access Journals (Sweden)

    Bolton Michael J

    2011-11-01

    Abstract Background The HIV surface glycoprotein gp120 (SU, gp120) and the Plasmodium vivax Duffy binding protein (PvDBP) bind to chemokine receptors during infection and have a site of amino acid sequence similarity in their binding domains that often includes a heparin binding motif (HBM). Infection by either pathogen has been found to be inhibited by polyanions. Results Specific polyanions that inhibit HIV infection and bind to the V3 loop of X4 strains also inhibited DBP-mediated infection of erythrocytes and DBP binding to the Duffy Antigen Receptor for Chemokines (DARC). A peptide including the HBM of PvDBP had similar affinity for heparin as RANTES and V3 loop peptides, and could be specifically inhibited from heparin binding by the same polyanions that inhibit DBP binding to DARC. However, some V3 peptides can competitively inhibit RANTES binding to heparin, but not the PvDBP HBM peptide. Three other members of the DBP family have an HBM sequence that is necessary for erythrocyte binding; however, only the protein which binds to DARC, the P. knowlesi alpha protein, is inhibited by heparin from binding to erythrocytes. Heparitinase digestion does not affect the binding of DBP to erythrocytes. Conclusion The HBMs of DBPs that bind to DARC have similar heparin binding affinities as some V3 loop peptides and chemokines, are responsible for specific sulfated polysaccharide inhibition of parasite binding and invasion of red blood cells, and are more likely to bind to negative charges on the receptor than cell surface glycosaminoglycans.

  1. Sequence similarity between the viral cp gene and the transgene in transgenic papayas Similaridade de seqüência entre o gene cp do vírus e do transgene presente em mamoeiros transgênicos

    Directory of Open Access Journals (Sweden)

    Manoel Teixeira Souza Júnior

    2005-05-01

    The Papaya ringspot virus (PRSV) coat protein transgene present in 'Rainbow' and 'SunUp' papayas shows high sequence similarity (>89%) to the cp gene from PRSV BR and TH. Despite this, both isolates are able to break down the resistance in 'Rainbow', while only the latter is able to do so in 'SunUp'. The objective of this work was to evaluate the degree of sequence similarity between the cp gene in the challenge isolate and the cp transgene in transgenic papayas resistant to PRSV. The production of a hybrid virus containing the genome backbone of PRSV HA up to the Apa I site in the NIb gene, and downstream from there, the sequence of PRSV TH was undertaken. This hybrid virus, PRSV HA/TH, was obtained and used to challenge 'Rainbow', 'SunUp', and an R2 population derived from line 63-1, all resistant to PRSV HA. PRSV HA/TH broke down the resistance in both papaya varieties and in the 63-1 population, demonstrating that sequence similarity is a major factor in the mechanism of resistance used by transgenic papayas expressing the cp gene. A comparative analysis of the cp gene present in line 55-1 and 63-1-derived transgenic plants and in PRSV HA, BR, and TH was also performed. O gene da capa protéica (cp) do vírus da mancha anelar do mamoeiro (Papaya ringspot virus, PRSV), presente nos mamoeiros 'Rainbow' e 'SunUp', tem alta similaridade de seqüência (>89%) com o gene cp dos isolados PRSV BR e TH. Apesar deste alto grau de similaridade, ambos isolados são capazes de quebrar a resistência observada em 'Rainbow', ao passo que TH quebra a resistência em 'SunUp'. O objetivo deste trabalho foi avaliar o grau de similaridade de seqüência entre o gene cp do vírus desafiante e do transgene em mamoeiros transgênicos resistentes a PRSV. Produziu-se um vírus híbrido contendo o genoma do isolado PRSV HA até o sítio de restrição Apa I no gene NIb, e, a partir deste ponto, este vírus continha o genoma do isolado PRSV TH. PRSV HA/TH foi utilizado

  2. Low-dose factor VIII infusion in Chinese adult haemophilia A patients: pharmacokinetics evidence that daily infusion results in higher trough level than with every-other-day infusion with similar factor VIII consumption.

    Science.gov (United States)

    Hua, B; Lee, A; Fan, L; Li, K; Zhang, Y; Poon, M-C; Zhao, Y

    2017-05-01

    Pharmacokinetic (PK) modelling suggests that improved trough levels are achieved by using a more frequent infusion strategy. However, no clinical study data exist to confirm or quantify the improvement in trough level, particularly for low-dose prophylaxis in patients with haemophilia A. To provide evidence that low-dose daily (ED) prophylaxis can increase trough levels without increasing FVIII consumption compared to every-other-day (EOD) infusion. A cross-over study on 5 IU kg⁻¹ FVIII daily vs. 10 IU kg⁻¹ EOD infusions, each for 14 days, was conducted at the PUMCH-HTC. On the ED schedule, trough (immediately prior to infusion) and peak FVIII:C levels (30 min after infusion) were measured on days 1-5, and trough levels alone on days 7, 9, 11 and 13. For the EOD schedule, troughs, peaks and 4-h postinfusion levels were measured on day 1; troughs and peaks on days 3, 5, and 7; troughs alone on days 9, 11 and 13; and 24-h postinfusion levels on days 2, 4 and 6. FVIII inhibitors were assessed on days 0 and 14 during both infusion schedules. Six patients were enrolled. PK evidence showed that daily prophylaxis achieved higher (~2 times) steady-state FVIII trough levels compared to EOD with the same total factor consumption. The daily prophylaxis had good acceptability among patients and reduced chronic pain in the joints in some patients. Our PK study shows that low-dose factor VIII daily infusion results in a higher trough level than EOD infusion with similar factor VIII consumption in Chinese adult haemophilia A patients. © 2017 John Wiley & Sons Ltd.

  3. Family with sequence similarity 83, member B is a predictor of poor prognosis and a potential therapeutic target for lung adenocarcinoma expressing wild-type epidermal growth factor receptor.

    Science.gov (United States)

    Yamaura, Takumi; Ezaki, Junji; Okabe, Naoyuki; Takagi, Hironori; Ozaki, Yuki; Inoue, Takuya; Watanabe, Yuzuru; Fukuhara, Mitsuro; Muto, Satoshi; Matsumura, Yuki; Hasegawa, Takeo; Hoshino, Mika; Osugi, Jun; Shio, Yutaka; Waguri, Satoshi; Tamura, Hirosumi; Imai, Jun-Ichi; Ito, Emi; Yanagisawa, Yuka; Honma, Reiko; Watanabe, Shinya; Suzuki, Hiroyuki

    2018-02-01

    Lung adenocarcinoma (ADC) patients with tumors that harbor no targetable driver gene mutation, such as epidermal growth factor receptor (EGFR) gene mutations, have unfavorable prognosis, and thus, novel therapeutic targets are required. Family with sequence similarity 83, member B (FAM83B) is a biomarker for squamous cell lung cancer. FAM83B has also recently been shown to serve an important role in the EGFR signaling pathway. In the present study, the molecular and clinical impact of FAM83B in lung ADC was investigated. Matched tumor and adjacent normal tissue samples were obtained from 216 patients who underwent complete lung resection for primary lung ADC and were examined for FAM83B expression using cDNA microarray analysis. The associations between FAM83B expression and clinicopathological parameters, including patient survival, were examined. FAM83B was highly expressed in tumors from males, smokers and in tumors with wild-type EGFR. Multivariate analyses further confirmed that wild-type EGFR tumors were significantly positively associated with FAM83B expression. In survival analysis, FAM83B expression was associated with poor outcomes in disease-free survival and overall survival, particularly when stratified against tumors with wild-type EGFR. Furthermore, FAM83B knockdown was performed to investigate its phenotypic effect on lung ADC cell lines. Gene silencing by FAM83B RNA interference induced growth suppression in the HLC-1 and H1975 lung ADC cell lines. FAM83B may be involved in lung ADC tumor proliferation and can be a predictor of poor survival. FAM83B is also a potential novel therapeutic target for ADC with wild-type EGFR.

  4. Domain similarity based orthology detection.

    Science.gov (United States)

    Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich

    2015-05-13

    Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .
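
    The idea of comparing proteins by domain content can be sketched as a cosine similarity over domain counts. This is an illustrative sketch under stated assumptions, not porthoDom's actual implementation: the function name is hypothetical, and the domain identifiers are used only as example labels.

```python
from collections import Counter
from math import sqrt

def domain_cosine(domains_a, domains_b):
    """Cosine similarity between two proteins given as lists of domain IDs."""
    counts_a, counts_b = Counter(domains_a), Counter(domains_b)
    # Dot product over the domains the two proteins share
    dot = sum(counts_a[d] * counts_b[d] for d in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0                          # a protein without domains matches nothing
    return dot / (norm_a * norm_b)

# Example labels only; any hashable domain identifier works
p1 = ["PF00069", "PF00017", "PF00017"]
p2 = ["PF00069", "PF00017"]
p3 = ["PF00400"]
score_close = domain_cosine(p1, p2)
score_far = domain_cosine(p1, p3)
```

    Because each protein is reduced to a handful of domain counts, such a score is much cheaper to compute than a residue-level alignment, which is what makes it usable as a pre-grouping step before the ortholog search.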

  5. Molecular characterisation and similarity relationships among Iranian basil (Ocimum basilicum L.) accessions using inter simple sequence repeat markers

    Directory of Open Access Journals (Sweden)

    Mohammad Aghaei

    2012-06-01

    Full Text Available The study of genetic relationships is a prerequisite for plant breeding activities as well as for conservation of genetic resources. In the present study, genetic diversity among 50 Iranian basil (Ocimum basilicum L.) accessions was determined using inter simple sequence repeat (ISSR) markers. Thirty-eight alleles were generated at 12 ISSR loci. The number of alleles per locus ranged from 1 to 5 with an average of 3.17. The maximum number of alleles was observed at the A7, 818, 825 and 849 loci, and their size ranged from 300 to 2500 bp. A similarity matrix based on Jaccard's coefficient for all 50 basil accessions gave values from 0.60 to 1.00. The maximum similarity (1.00) was observed between the "Urmia" and "Shahr-e-Rey II" accessions as well as between the "Urmia" and "Qazvin II" accessions. The lowest similarity (0.60) was observed between the "Tuyserkan I" and "Gom II" accessions. The unweighted pair-group method using arithmetic average (UPGMA) clustering algorithm classified the studied accessions into three distinct groups. All of the basil accessions, with the exception of "Babol III", "Ahvaz II", "Yazd II" and "Ardebil I", were placed in groups I and II. Leaf colour was a specific characteristic that influenced the clustering of Iranian basil accessions. Because of this relationship, the results of the principal coordinate analysis (PCoA) approximately corresponded to those obtained through cluster analysis. Our results revealed that the geographical distribution of genotypes could not be used as a basis for crossing parents to obtain high heterosis; parent selection must instead be based on genetic studies.
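
    Jaccard's coefficient on marker data can be sketched as follows. The band profiles below are made-up toy vectors (1 = band present, 0 = band absent), not data from the study, and the function name is hypothetical.

```python
def jaccard(profile_a, profile_b):
    """Jaccard coefficient between two binary band-presence profiles."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    present = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two profiles with no bands at all are treated as identical here (a convention)
    return shared / present if present else 1.0

# Toy profiles over five marker bands (illustrative only)
acc_1 = [1, 1, 1, 0, 0]
acc_2 = [1, 1, 1, 0, 0]
acc_3 = [1, 1, 1, 1, 1]

sim_identical = jaccard(acc_1, acc_2)
sim_partial = jaccard(acc_1, acc_3)
```

    Computing this value for every pair of accessions yields the similarity matrix that a clustering method such as UPGMA then takes as input. Joint absences are ignored, which is why Jaccard is a common choice for dominant markers.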

  6. Multimodal sequence learning.

    Science.gov (United States)

    Kemény, Ferenc; Meier, Beat

    2016-02-01

    While sequence learning research models complex phenomena, previous studies have mostly focused on unimodal sequences. The goal of the current experiment is to put implicit sequence learning into a multimodal context: to test whether it can operate across different modalities. We used the Task Sequence Learning paradigm to test whether sequence learning varies across modalities, and whether participants are able to learn multimodal sequences. Our results show that implicit sequence learning is very similar regardless of the source modality. However, the presence of correlated task and response sequences was required for learning to take place. The experiment provides new evidence for implicit sequence learning of abstract conceptual representations. In general, the results suggest that correlated sequences are necessary for implicit sequence learning to occur. Moreover, they show that elements from different modalities can be automatically integrated into one unitary multimodal sequence. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. Similarity of eigenstates in generalized labyrinth tilings

    International Nuclear Information System (INIS)

    Thiem, Stefanie; Schreiber, Michael

    2010-01-01

    The eigenstates of d-dimensional quasicrystalline models with a separable Hamiltonian are studied within the tight-binding model. The approach is based on mathematical sequences, constructed by an inflation rule $P = \{w \to s,\ s \to s w s^{b-1}\}$ describing the weak/strong couplings of atoms in a quasiperiodic chain. Higher-dimensional quasiperiodic tilings are constructed as a direct product of these chains and their eigenstates can be directly calculated by multiplying the energies E or wave functions ψ of the chain, respectively. Applying this construction rule, the grid in d dimensions splits into $2^{d-1}$ different tilings, for which we investigated the characteristics of the wave functions. For the standard two-dimensional labyrinth tiling constructed from the octonacci sequence (b = 2) the lattice breaks up into two identical lattices, which consequently yield the same eigenstates. While this is not the case for b ≠ 2, our numerical results show that the wave functions of the different grids become increasingly similar for large system sizes. This can be explained by the fact that the structure of the $2^{d-1}$ grids mainly differs at the boundaries and thus for large systems the eigenstates approach each other. This property allows us to analytically derive properties of the higher-dimensional generalized labyrinth tilings from the one-dimensional results. In particular participation numbers and corresponding scaling exponents have been determined.
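
    The inflation rule for the one-dimensional chain can be written as a string substitution. This is a minimal sketch in which 'w' and 's' stand for weak and strong couplings; the function names and the choice of 'w' as the seed are assumptions made for illustration.

```python
def inflate(word, b=2):
    """Apply the rule w -> s, s -> s w s^(b-1) once to every letter."""
    rules = {"w": "s", "s": "s" + "w" + "s" * (b - 1)}
    return "".join(rules[letter] for letter in word)

def chain(iterations, b=2, seed="w"):
    """Coupling sequence after the given number of inflation steps."""
    word = seed
    for _ in range(iterations):
        word = inflate(word, b)
    return word

octonacci_3 = chain(3)                      # b = 2 gives the octonacci sequence
lengths = [len(chain(n)) for n in range(5)]
```

    For b = 2 the chain lengths obey the recursion l(n) = 2 l(n-1) + l(n-2), so the sequence grows quickly with the number of inflation steps.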

  8. Higher expression of vascular endothelial growth factor (VEGF) and its receptor VEGFR-2 (Flk-1) and metalloproteinase-9 (MMP-9) in a rat model of peritoneal endometriosis is similar to cancer diseases

    Directory of Open Access Journals (Sweden)

    Nasciutti Luiz E

    2010-01-01

    Full Text Available Abstract. Background: Endometriosis is a common disease characterized by the presence of a functional endometrium outside the uterine cavity, causing pelvic pain, dysmenorrhea, and infertility. This disease has been associated with the development of different types of malignancies; therefore new blood vessels are essential for the survival of the endometrial implant. Our previous observations on humans showed that angiogenesis is predominantly found in rectosigmoid endometriosis, a deeply infiltrating disease. In this study, we have established the experimental model of rat peritoneal endometriosis to evaluate the process of angiogenesis and to compare it with eutopic endometrium. Methods: We have investigated the morphological characteristics of these lesions and the vascular density, VEGF and its receptor Flk-1 and MMP-9 expression, and activated macrophage distribution, using immunohistochemistry and RT-PCR. Results: As expected, the auto-transplantation of endometrium pieces into the peritoneal cavity is a well-established method for endometriosis induction in rats. The lesions were cystic and vascularized, and demonstrated histological hallmarks of human pathology, such as endometrial glands and stroma. The vascular density and the presence of VEGF, Flk-1 and MMP-9 were significantly higher in endometriotic lesions than in eutopic endometrium, and confirmed the angiogenic potential of these lesions. We also observed an increase in the number of activated macrophages (ED-1 positive cells) in the endometriotic lesions, showing a positive correlation with VEGF. Conclusion: The present endometriosis model would be useful for investigating the mechanisms of the angiogenesis process involved in the peritoneal attachment of endometrial cells, as well as the effects of therapeutic drugs, particularly those with antiangiogenic activity.

  9. Organizational culture and evaluation of higher education institutions: similarities and differences

    Directory of Open Access Journals (Sweden)

    José Augusto Dela Coleta

    2007-12-01

    Full Text Available Considering eight dimensions of organizational culture already identified in previous studies of business organizations (power distance, uncertainty avoidance, individualism, masculinity, assertiveness, achievement orientation, future orientation, and affiliative orientation), this study measured the levels of presence and variability of these factors in 14 higher education institutions (HEI): public and private universities, university centers, and private colleges. The data were collected from 490 university professors, using a set of Likert-type scales to measure the eight factors of organizational culture, the evaluation of the qualities of the HEI, and the feelings of belonging to them. The results showed statistically significant differences between and within the different classes of HEI; significant negative correlations between the evaluations and the factors power distance, individualism, and masculinity; and positive correlations between the evaluations and uncertainty avoidance, assertiveness, future orientation, achievement orientation, and affiliative orientation.

  10. New Similarity Functions

    DEFF Research Database (Denmark)

    Yazdani, Hossein; Ortiz-Arroyo, Daniel; Kwasnicka, Halina

    2016-01-01

    spaces, in addition to their similarity in the vector space. Prioritized Weighted Feature Distance (PWFD) works similarly to WFD, but provides the ability to give priorities to desirable features. The accuracy of the proposed functions is compared with other similarity functions on several data sets....... Our results show that the proposed functions work better than other methods proposed in the literature....

  11. Phoneme Similarity and Confusability

    Science.gov (United States)

    Bailey, T.M.; Hahn, U.

    2005-01-01

    Similarity between component speech sounds influences language processing in numerous ways. Explanation and detailed prediction of linguistic performance consequently requires an understanding of these basic similarities. The research reported in this paper contrasts two broad classes of approach to the issue of phoneme similarity-theoretically…

  12. Molecular similarity measures.

    Science.gov (United States)

    Maggiora, Gerald M; Shanmugasundaram, Veerabahu

    2011-01-01

    Molecular similarity is a pervasive concept in chemistry. It is essential to many aspects of chemical reasoning and analysis and is perhaps the fundamental assumption underlying medicinal chemistry. Dissimilarity, the complement of similarity, also plays a major role in a growing number of applications of molecular diversity in combinatorial chemistry, high-throughput screening, and related fields. How molecular information is represented, called the representation problem, is important to the type of molecular similarity analysis (MSA) that can be carried out in any given situation. In this work, four types of mathematical structure are used to represent molecular information: sets, graphs, vectors, and functions. Molecular similarity is a pairwise relationship that induces structure into sets of molecules, giving rise to the concept of chemical space. Although all three concepts - molecular similarity, molecular representation, and chemical space - are treated in this chapter, the emphasis is on molecular similarity measures. Similarity measures, also called similarity coefficients or indices, are functions that map pairs of compatible molecular representations that are of the same mathematical form into real numbers usually, but not always, lying on the unit interval. This chapter presents a somewhat pedagogical discussion of many types of molecular similarity measures, their strengths and limitations, and their relationship to one another. An expanded account of the material on chemical spaces presented in the first edition of this book is also provided. It includes a discussion of the topography of activity landscapes and the role that activity cliffs in these landscapes play in structure-activity studies.

  13. Revisiting Inter-Genre Similarity

    DEFF Research Database (Denmark)

    Sturm, Bob L.; Gouyon, Fabien

    2013-01-01

    We revisit the idea of "inter-genre similarity" (IGS) for machine learning in general, and music genre recognition in particular. We show analytically that the probability of error for IGS is higher than naive Bayes classification with zero-one loss (NB). We show empirically that IGS does...... not perform well, even for data that satisfies all its assumptions....

  14. Pythoscape: A framework for generation of large protein similarity networks

    OpenAIRE

    Babbitt, Patricia; Barber, AE; Babbitt, PC

    2012-01-01

    Pythoscape is a framework implemented in Python for processing large protein similarity networks for visualization in other software packages. Protein similarity networks are graphical representations of sequence, structural and other similarities among pr

  15. Similarity Measure of Graphs

    Directory of Open Access Journals (Sweden)

    Amine Labriji

    2017-07-01

    Full Text Available Identifying the similarity of graphs is an active research topic in the Semantic Web, artificial intelligence, shape recognition and information retrieval. One of the fundamental problems of graph databases is finding graphs similar to a query graph. Existing approaches to this problem are usually based on the nodes and arcs of the two graphs, regardless of parental semantic links. For instance, a common connection is not counted toward the similarity of two graphs when the graphs share no concepts, whether the measure is based on the union of the two graphs, on the maximum common sub-graph (SCM), or on the graph edit distance. This is inadequate in the context of information retrieval. To overcome this problem, we suggest a new measure of similarity between graphs, based on the similarity measure of Wu and Palmer. We have shown that this new measure satisfies the properties of a similarity measure, and we applied it to examples. The results show that our measure runs faster than existing approaches. In addition, we compared the relevance of the similarity values obtained; the new graph measure appears advantageous and contributes to solving the problem mentioned above.
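
    The Wu and Palmer measure that the proposed graph measure builds on scores two concepts by the depth of their deepest common ancestor in a taxonomy. The sketch below shows only that concept-level measure on a made-up toy hierarchy; it is not the graph measure of the paper, and the function names, the taxonomy, and the convention that the root has depth 1 are assumptions.

```python
def path_to_root(node, parent):
    """List of nodes from the given node up to the root of the taxonomy."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def depth(node, parent):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node, parent))

def wu_palmer(a, b, parent):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b, parent))
    # First ancestor of a (walking upward) that is also an ancestor of b;
    # assumes a single-rooted tree, so a common ancestor always exists
    lcs = next(n for n in path_to_root(a, parent) if n in ancestors_b)
    return 2 * depth(lcs, parent) / (depth(a, parent) + depth(b, parent))

# Toy taxonomy as a child -> parent mapping (illustrative only)
taxonomy = {"cat": "mammal", "dog": "mammal", "mammal": "animal", "bird": "animal"}
sim_siblings = wu_palmer("cat", "dog", taxonomy)
sim_distant = wu_palmer("cat", "bird", taxonomy)
```

    Two concepts with no label in common can still receive a non-zero score through a shared parent, which is the kind of parental semantic link that purely node-and-arc based measures leave out.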

  16. Processes of Similarity Judgment

    Science.gov (United States)

    Larkey, Levi B.; Markman, Arthur B.

    2005-01-01

    Similarity underlies fundamental cognitive capabilities such as memory, categorization, decision making, problem solving, and reasoning. Although recent approaches to similarity appreciate the structure of mental representations, they differ in the processes posited to operate over these representations. We present an experiment that…

  17. Marriage Matters: Spousal Similarity in Life Satisfaction

    OpenAIRE

    Ulrich Schimmack; Richard Lucas

    2006-01-01

    Examined the concurrent and cross-lagged spousal similarity in life satisfaction over a 21-year period. Analyses were based on married couples (N = 847) in the German Socio-Economic Panel (SOEP). Concurrent spousal similarity was considerably higher than one-year retest similarity, revealing spousal similarity in the variable component of life satisfaction. Spousal similarity systematically decreased with length of retest interval, revealing similarity in the changing component of life sati...

  18. The semantic similarity ensemble

    Directory of Open Access Journals (Sweden)

    Andrea Ballatore

    2013-12-01

    Full Text Available Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE) as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.

  19. Gender similarities and differences.

    Science.gov (United States)

    Hyde, Janet Shibley

    2014-01-01

    Whether men and women are fundamentally different or similar has been debated for more than a century. This review summarizes major theories designed to explain gender differences: evolutionary theories, cognitive social learning theory, sociocultural theory, and expectancy-value theory. The gender similarities hypothesis raises the possibility of theorizing gender similarities. Statistical methods for the analysis of gender differences and similarities are reviewed, including effect sizes, meta-analysis, taxometric analysis, and equivalence testing. Then, relying mainly on evidence from meta-analyses, gender differences are reviewed in cognitive performance (e.g., math performance), personality and social behaviors (e.g., temperament, emotions, aggression, and leadership), and psychological well-being. The evidence on gender differences in variance is summarized. The final sections explore applications of intersectionality and directions for future research.

  20. Similarity or difference?

    DEFF Research Database (Denmark)

    Villadsen, Anders Ryom

    2013-01-01

    While the organizational structures and strategies of public organizations have attracted substantial research attention among public management scholars, little research has explored how these organizational core dimensions are interconnected and influenced by pressures for similarity....... In this paper I address this topic by exploring the relation between expenditure strategy isomorphism and structure isomorphism in Danish municipalities. Different literatures suggest that organizations exist in concurrent pressures for being similar to and different from other organizations in their field......-shaped relation exists between expenditure strategy isomorphism and structure isomorphism in a longitudinal quantitative study of Danish municipalities....

  1. Comparing Harmonic Similarity Measures

    NARCIS (Netherlands)

    de Haas, W.B.; Robine, M.; Hanna, P.; Veltkamp, R.C.; Wiering, F.

    2010-01-01

    We present an overview of the most recent developments in polyphonic music retrieval and an experiment in which we compare two harmonic similarity measures. In contrast to earlier work, in this paper we specifically focus on the symbolic chord description as the primary musical representation and

  2. Roles of repetitive sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bell, G.I.

    1991-12-31

    The DNA of higher eukaryotes contains many repetitive sequences. The study of repetitive sequences is important, not only because many have important biological function, but also because they provide information on genome organization, evolution and dynamics. In this paper, I will first discuss some generic effects that repetitive sequences will have upon genome dynamics and evolution. In particular, it will be shown that repetitive sequences foster recombination among, and turnover of, the elements of a genome. I will then consider some examples of repetitive sequences, notably minisatellite sequences and telomere sequences as examples of tandem repeats, without and with respectively known function, and Alu sequences as an example of interspersed repeats. Some other examples will also be considered in less detail.

  3. Similar or different?

    DEFF Research Database (Denmark)

    Cornér, Solveig; Pyhältö, Kirsi; Peltonen, Jouni

    2018-01-01

    Previous research has identified researcher community and supervisory support as key determinants of the doctoral journey contributing to students’ persistence and robustness. However, we still know little about cross-cultural variation in the researcher community and supervisory support experien...... counter partners, whereas the Finnish students perceived lower levels of instrumental support than the Danish students. The findings imply that seemingly similar contexts hold valid differences in experienced social support and educational strategies at the PhD level....... experienced by PhD students within the same discipline. This study explores the support experiences of 381 PhD students within the humanities and social sciences from three research-intensive universities in Denmark (n=145) and Finland (n=236). The mixed methods design was utilized. The data were collected...... counter partners. The results also indicated that the only form of support in which the students expressed more matched support than mismatched support was informational support. Further investigation showed that the Danish students reported a high level of mismatch in emotional support than their Finnish...

  4. Compressional Alfven Eigenmode Similarity Study

    Science.gov (United States)

    Heidbrink, W. W.; Fredrickson, E. D.; Gorelenkov, N. N.; Rhodes, T. L.

    2004-11-01

    NSTX and DIII-D are nearly ideal for Alfven eigenmode (AE) similarity experiments, having similar neutral beams, fast-ion to Alfven speed ratio v_f/v_A, fast-ion pressure, and plasma shape, but with a factor of 2 difference in the major radius. Toroidicity-induced AE with ~100 kHz frequencies were compared in an earlier study [1]; this paper focuses on higher frequency AE with f ~ 1 MHz. Compressional AE (CAE) on NSTX have a polarization, dependence on the fast-ion distribution function, frequency scaling, and low-frequency limit that are qualitatively consistent with CAE theory [2]. Global AE (GAE) are also observed. On DIII-D, coherent modes in this frequency range are observed during low-field (0.6 T) similarity experiments. Experiments will compare the CAE stability limits on DIII-D with the NSTX stability limits, with the aim of determining if CAE will be excited by alphas in a reactor. Predicted differences in the frequency splitting Δf between excited modes will also be used. [1] W.W. Heidbrink, et al., Plasma Phys. Control. Fusion 45, 983 (2003). [2] E.D. Fredrickson, et al., Princeton Plasma Physics Laboratory Report PPPL-3955 (2004).

  5. Protein structural similarity search by Ramachandran codes

    Directory of Open Access Journals (Sweden)

    Chang Chih-Hung

    2007-08-01

    Full Text Available Abstract. Background: Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity searches. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, accuracy is usually sacrificed and the speed still cannot match that of sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE) and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools should be applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era.
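
    The linear-encoding idea can be illustrated by turning backbone dihedral angles into letters. Note the substitution: SARST organizes the Ramachandran map by nearest-neighbor clustering, whereas this sketch uses a plain fixed grid, which is simpler and is not the published method. The function names, the 4 x 4 grid, and the example angles are all assumptions.

```python
import string

def ramachandran_letter(phi, psi, bins=4):
    """Map one (phi, psi) pair in degrees to a letter via a fixed grid.

    A fixed grid is a stand-in for the clustered map used by SARST.
    """
    width = 360 / bins
    row = int(((phi + 180) % 360) // width)
    col = int(((psi + 180) % 360) // width)
    return string.ascii_uppercase[row * bins + col]

def encode_structure(dihedrals, bins=4):
    """Encode a list of (phi, psi) pairs as a one-dimensional text string."""
    return "".join(ramachandran_letter(phi, psi, bins) for phi, psi in dihedrals)

# Illustrative angles: two helix-like residues followed by a strand-like one
encoded = encode_structure([(-60, -45), (-60, -45), (-120, 130)])
```

    Once every structure is a string over a small alphabet, ordinary sequence search tools can be run on the strings, which is where the speed advantage of linear encoding comes from.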

  6. Repdigits in k-Lucas sequences

    Indian Academy of Sciences (India)

    57(2) 2000 243-254) proved that 11 is the largest number with only one distinct digit (the so-called repdigit) in the sequence $(L_n^{(2)})_n$. In this paper, we address a similar problem in the family of k-Lucas sequences. We also show that the k-Lucas sequences have similar properties to those of k-Fibonacci sequences ...
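
    A small search makes the repdigit question concrete. The sketch assumes the usual definition of the k-generalized Lucas sequence, in which each term is the sum of the k preceding terms and the initial values are 0, ..., 0, 2, 1 (with L_0 = 2 and L_1 = 1); that definition is not spelled out in the snippet above, and the function names are hypothetical.

```python
def k_lucas(k, count):
    """First `count` terms L_0, L_1, ... of the k-Lucas sequence (k >= 2).

    Assumed initial values: k - 2 zeros followed by L_0 = 2 and L_1 = 1.
    """
    window = [0] * (k - 2) + [2, 1]        # the k most recent terms
    terms = [2, 1]
    while len(terms) < count:
        nxt = sum(window)                  # each term is the sum of the previous k
        terms.append(nxt)
        window = window[1:] + [nxt]
    return terms[:count]

def is_repdigit(n):
    """True if n is written with a single distinct decimal digit."""
    return len(set(str(n))) == 1

lucas = k_lucas(2, 60)                     # k = 2 gives the classical Lucas numbers
largest_repdigit = max(t for t in lucas if is_repdigit(t))
```

    A bounded search like this can only suggest a candidate for the largest repdigit; showing that no larger one exists requires the kind of number-theoretic argument the paper refers to.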

  7. Seniority bosons from similarity transformations

    International Nuclear Information System (INIS)

    Geyer, H.B.

    1986-01-01

    The requirement of associating in the boson space seniority with twice the number of non-s bosons defines a similarity transformation which re-expresses the Dyson pair boson images in terms of seniority bosons. In particular the fermion S-pair creation operator is mapped onto an operator which, unlike the pair boson image, does not change the number of non-s bosons. The original results of Otsuka, Arima and Iachello are recovered by this procedure while at the same time they are generalized to include g-bosons or even bosons with J>4 as well as any higher order boson terms. Furthermore the seniority boson images are valid for an arbitrary number of d- or g-bosons - a result which is not readily obtainable within the framework of the usual Marumori- or OAI-method

  8. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.

  9. Domain similarity based orthology detection

    OpenAIRE

    Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich

    2015-01-01

    Background Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationa...

  10. Circulating irisin levels are lower in patients with either stable coronary artery disease (CAD) or myocardial infarction (MI) versus healthy controls, whereas follistatin and activin A levels are higher and can discriminate MI from CAD with similar to CK-MB accuracy.

    Science.gov (United States)

    Anastasilakis, Athanasios D; Koulaxis, Dimitrios; Kefala, Nikoleta; Polyzos, Stergios A; Upadhyay, Jagriti; Pagkalidou, Eirini; Economou, Fotios; Anastasilakis, Chrysostomos D; Mantzoros, Christos S

    2017-08-01

    Several myokines are produced by cardiac muscle. We investigated changes in myokine levels at the time of acute myocardial infarction (MI) and following reperfusion in relation to controls. Patients with MI (MI Group, n=31) treated with percutaneous coronary intervention (PCI) were compared to patients with stable coronary artery disease (CAD) subjected to scheduled PCI (CAD Group, n=40) and controls with symptoms mimicking CAD without stenosis in angiography (Control Group, n=43). The number and degree of stenosis were recorded. Irisin, follistatin, follistatin-like 3, activin A and B, ALT, AST, CK and CK-MB were measured at baseline and 6 or 24h after the intervention. MI and CAD patients had lower irisin than controls (p<0.001). MI patients had higher follistatin, activin A, CK, CK-MB and AST than CAD patients and controls (all p≤0.001). None of the myokines changed following reperfusion. Circulating irisin was associated with the degree of stenosis in all patients (p=0.05). Irisin was not inferior to CK-MB in predicting MI, while follistatin and activin A could discriminate MI from CAD patients with accuracy similar to that of CK-MB. None of these myokines was altered following PCI, in contrast to CK-MB. Irisin levels are lower in MI and CAD, implying that their production may depend on myocardial blood supply. Follistatin and activin A are higher in MI than in CAD, suggesting increased release due to myocardial necrosis. They can predict MI with accuracy similar to CK-MB, and their role in the diagnosis of MI remains to be confirmed by prospective large clinical studies. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Similarity as an organising principle in short-term memory.

    Science.gov (United States)

    LeCompte, D C; Watkins, M J

    1993-03-01

    The role of stimulus similarity as an organising principle in short-term memory was explored in a series of seven experiments. Each experiment involved the presentation of a short sequence of items that were drawn from two distinct physical classes and arranged such that item class changed after every second item. Following presentation, one item was re-presented as a probe for the 'target' item that had directly followed it in the sequence. Memory for the sequence was considered organised by class if probability of recall was higher when the probe and target were from the same class than when they were from different classes. Such organisation was found when one class was auditory and the other was visual (spoken vs. written words, and sounds vs. pictures). It was also found when both classes were auditory (words spoken in a male voice vs. words spoken in a female voice) and when both classes were visual (digits shown in one location vs. digits shown in another). It is concluded that short-term memory can be organised on the basis of sensory modality and on the basis of certain features within both the auditory and visual modalities.

  12. Higher Education

    African Journals Online (AJOL)

    Kunle Amuwo: Higher Education Transformation: A Paradigm Shift in South Africa? ... ty of such skills, especially at the middle management levels within the higher ... istics and virtues of differentiation and diversity. .... may be forced to close shop for lack of capacity to attract ..... necessarily lead to racial and gender equity,.

  13. GROUPING WEB ACCESS SEQUENCES USING SEQUENCE ALIGNMENT METHOD

    OpenAIRE

    BHUPENDRA S CHORDIA; KRISHNAKANT P ADHIYA

    2011-01-01

    In web usage mining, grouping of web access sequences can be used to determine the behavior or intent of a set of users. The central problem in grouping web sessions is how to measure the similarity between web sessions. There are many shortcomings in traditional measurement methods. The task of grouping web sessions based on similarity, which consists of maximizing the intra-group similarity while minimizing the inter-group similarity, is done using a sequence alignment method. This paper introduces a new method to group we...

  14. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data have remained unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  15. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr......

  16. Higher Education

    Science.gov (United States)

    & Development (LDRD) National Security Education Center (NSEC) Office of Science Programs Richard P Databases National Security Education Center (NSEC) Center for Nonlinear Studies Engineering Institute Scholarships STEM Education Programs Teachers (K-12) Students (K-12) Higher Education Regional Education

  17. A Unified Theoretical Framework for Cognitive Sequencing.

    Science.gov (United States)

    Savalia, Tejas; Shukla, Anuj; Bapi, Raju S

    2016-01-01

    The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking and hierarchical organization are important aspects in sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit vs. explicit and goal-directed vs. habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require engagement of attentional processes. With repetition, these goal-directed plans become habits with concomitant disengagement of attention. Thus, attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex loops) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks.

  18. A Unified Theoretical Framework for Cognitive Sequencing

    Directory of Open Access Journals (Sweden)

    Tejas Savalia

    2016-11-01

    The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking and hierarchical organization are important aspects in sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit versus explicit and goal-directed versus habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require engagement of attentional processes. With repetition, these goal-directed plans become habits with concomitant disengagement of attention. Thus attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex loops) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks.

  19. A space-efficient algorithm for local similarities.

    Science.gov (United States)

    Huang, X Q; Hardison, R C; Miller, W

    1990-10-01

    Existing dynamic-programming algorithms for identifying similar regions of two sequences require time and space proportional to the product of the sequence lengths. Often this space requirement is more limiting than the time requirement. We describe a dynamic-programming local-similarity algorithm that needs only space proportional to the sum of the sequence lengths. The method can also find repeats within a single long sequence. To illustrate the algorithm's potential, we discuss comparison of a 73,360 nucleotide sequence containing the human beta-like globin gene cluster and a corresponding 44,594 nucleotide sequence for rabbit, a problem well beyond the capabilities of other dynamic-programming software.
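
    The space saving described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it computes only the best local-similarity score (not the aligned regions themselves), keeps two rows of the dynamic-programming table so that memory grows with the length of one sequence, and assumes a linear gap penalty. The function name and the scoring values are hypothetical choices made for illustration.

```python
def local_similarity_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local-alignment score of a and b using O(len(b)) memory.

    Assumed scoring: +1 match, -1 mismatch, -2 per gap position.
    Time is still proportional to len(a) * len(b).
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # current row; older rows are discarded
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

    Recovering the aligned regions themselves within the same space bound requires extra bookkeeping (for example, a divide-and-conquer pass), which this sketch omits.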

  20. Novel HBV recombinants between genotypes B and C in 3'-terminal reverse transcriptase (RT) sequences are associated with enhanced viral DNA load, higher RT point mutation rates and place of birth among Chinese patients.

    Science.gov (United States)

    Liu, Baoming; Yang, Jing-Xian; Yan, Ling; Zhuang, Hui; Li, Tong

    2018-01-01

    As one of the major global public health concerns, hepatitis B virus (HBV) can be divided into at least eight genotypes, which may be related to disease severity and treatment response. We previously demonstrated that genotypes B and C HBV, with distinct geographical distribution in China, had divergent genotype-dependent amino acid polymorphisms and variations in the reverse transcriptase (RT) gene region, a target of antiviral therapy using nucleos(t)ide analogues. Recently, recombination between HBV genotypes B and C was reported to occur in the RT region. However, the frequency and clinical significance of such recombination are poorly understood. Here full-length HBV RT sequences from 201 Chinese chronic hepatitis B (CHB) patients were amplified and sequenced, among which 31.34% (63/201) were genotype B whereas 68.66% (138/201) were genotype C. Although no intergenotypic recombination was detected among C-genotype HBV, 38.10% (24/63) of B-genotype HBV had recombination with genotype C in the 3'-terminal RT sequences. The patients with B/C intergenotypic recombinants had significantly (Pdistribution feature in China. Our findings provide novel insight into the virological, clinical and epidemiological features of new HBV B/C intergenotypic recombinants at the 3' end of RT sequences among Chinese CHB patients. The highly complex genetic background of the novel recombinant HBV carrying new mutations affecting RT protein may contribute to an enhanced heterogeneity in treatment response or prognosis among CHB patients. Published by Elsevier B.V.

  1. Pythoscape: a framework for generation of large protein similarity networks.

    Science.gov (United States)

    Barber, Alan E; Babbitt, Patricia C

    2012-11-01

    Pythoscape is a framework implemented in Python for processing large protein similarity networks for visualization in other software packages. Protein similarity networks are graphical representations of sequence, structural and other similarities among proteins for which pairwise all-by-all similarity connections have been calculated. Mapping of biological and other information to network nodes or edges enables hypothesis creation about sequence-structure-function relationships across sets of related proteins. Pythoscape provides several options to calculate pairwise similarities for input sequences or structures, applies filters to network edges and defines sets of similar nodes and their associated data as single nodes (termed representative nodes) for compression of network information and output data or formatted files for visualization.
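
    The edge filtering and node grouping mentioned in this abstract can be sketched in a few lines. This does not use or reproduce the Pythoscape API: the function names, the score dictionary, and the choice to group nodes by connected component are all assumptions made for illustration.

```python
def build_network(scores, threshold):
    """Build an undirected graph, keeping edges with similarity >= threshold.

    `scores` maps (node_u, node_v) pairs to a pairwise similarity value.
    Every node is kept, even if all of its edges are filtered out.
    """
    graph = {}
    for (u, v), s in scores.items():
        graph.setdefault(u, set())
        graph.setdefault(v, set())
        if s >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def representative_nodes(graph):
    """Collapse each connected component into one sorted tuple of members."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], []
        seen.add(start)
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            members.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(tuple(sorted(members)))
    return groups
```

    With scores `{("p1", "p2"): 0.9, ("p2", "p3"): 0.8, ("p3", "p4"): 0.1}` and a threshold of 0.5, the sketch groups `p1`, `p2` and `p3` into one representative node and leaves `p4` on its own.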

  2. Testing Self-Similarity Through Lamperti Transformations

    KAUST Repository

    Lee, Myoungji

    2016-07-14

    Self-similar processes have been widely used in modeling real-world phenomena occurring in environmetrics, network traffic, image processing, and stock pricing, to name but a few. The estimation of the degree of self-similarity has been studied extensively, while statistical tests for self-similarity are scarce and limited to processes indexed in one dimension. This paper proposes a statistical hypothesis test procedure for self-similarity of a stochastic process indexed in one dimension and multi-self-similarity for a random field indexed in higher dimensions. If self-similarity is not rejected, our test provides a set of estimated self-similarity indexes. The key is to test stationarity of the inverse Lamperti transformations of the process. The inverse Lamperti transformation of a self-similar process is a strongly stationary process, revealing a theoretical connection between the two processes. To demonstrate the capability of our test, we test self-similarity of fractional Brownian motions and sheets, their time deformations and mixtures with Gaussian white noise, and the generalized Cauchy family. We also apply the self-similarity test to real data: annual minimum water levels of the Nile River, network traffic records, and surface heights of food wrappings. © 2016, International Biometric Society.
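
    The link between self-similarity and stationarity that the test relies on can be written out for the one-dimensional case. The notation below (X, Y, H) is assumed here for illustration and is not taken from the paper.

```latex
% X is self-similar with index H > 0 if, for every c > 0,
\[
  \{X(ct)\}_{t>0} \overset{d}{=} \{c^{H} X(t)\}_{t>0} .
\]
% Mapping X to a process on the whole real line,
\[
  Y(s) = e^{-Hs}\, X(e^{s}), \qquad s \in \mathbb{R},
\]
% gives a strictly stationary process: for any shift h, taking c = e^{h},
\[
  Y(s+h) = e^{-Hs} e^{-Hh} X(e^{h} e^{s})
  \overset{d}{=} e^{-Hs} e^{-Hh} e^{Hh} X(e^{s}) = Y(s),
\]
% where the equality in distribution holds jointly over s.
```

    Under this reading, rejecting stationarity of Y for every candidate H amounts to rejecting self-similarity of X.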

  3. Higher Education.

    Science.gov (United States)

    Hendrickson, Robert M.

    This chapter reports 1982 cases involving aspects of higher education. Interesting cases noted dealt with the federal government's authority to regulate state employees' retirement and raised the questions of whether Title IX covers employment, whether financial aid makes a college a program under Title IX, and whether sex segregated mortality…

  4. Information filtering based on transferring similarity.

    Science.gov (United States)

    Sun, Duo; Zhou, Tao; Liu, Jian-Guo; Liu, Run-Ran; Jia, Chun-Xiao; Wang, Bing-Hong

    2009-07-01

    In this Brief Report, we propose an index of user similarity, namely, the transferring similarity, which involves all high-order similarities between users. Accordingly, we design a modified collaborative filtering algorithm, which provides remarkably more accurate predictions than the standard collaborative filtering. More interestingly, we find that the algorithmic performance will approach its optimal value when the parameter, contained in the definition of transferring similarity, gets close to its critical value, before which the series expansion of transferring similarity is convergent and after which it is divergent. Our study is complementary to the one reported in [E. A. Leicht, P. Holme, and M. E. J. Newman, Phys. Rev. E 73, 026120 (2006)], and is relevant to the missing link prediction problem.
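
    One form consistent with the abstract's description (a similarity that sums all high-order paths and has a critical parameter value) is a geometric series in the direct similarity matrix. The symbols S, T and ε below are assumptions for illustration; the paper's exact definition may differ.

```latex
% S: matrix of direct user-user similarities (assumed symmetric),
% \varepsilon: the tunable parameter, T: the transferring similarity.
\[
  T = S + \varepsilon S^{2} + \varepsilon^{2} S^{3} + \cdots
    = \sum_{k=0}^{\infty} \varepsilon^{k} S^{k+1} .
\]
% When the series converges it sums to a closed form,
\[
  T = (I - \varepsilon S)^{-1} S ,
\]
% and convergence requires
\[
  |\varepsilon| \, \rho(S) < 1 ,
\]
% where \rho(S) is the largest absolute eigenvalue of S, so the critical
% value under this assumed form would be \varepsilon_c = 1/\rho(S).
```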

  5. Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR

    Energy Technology Data Exchange (ETDEWEB)

    D'Souza, T.M.; Boominathan, K.; Reddy, C.A. [Michigan State Univ., East Lansing, MI (United States)

    1996-10-01

    Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.

  6. A COMPARISON OF SEMANTIC SIMILARITY MODELS IN EVALUATING CONCEPT SIMILARITY

    Directory of Open Access Journals (Sweden)

    Q. X. Xu

    2012-08-01

    Semantic similarities are important in concept definition, recognition, categorization, interpretation, and integration. Many semantic similarity models have been established to evaluate semantic similarities of objects and/or concepts. To find out the suitability and performance of different models in evaluating concept similarities, we make a comparison of four main types of models in this paper: the geometric model, the feature model, the network model, and the transformational model. Fundamental principles and main characteristics of these models are first introduced and compared. Land use and land cover concepts of NLCD92 are employed as examples in the case study. The results demonstrate that correlations between these models are very high, possibly because all these models are designed to simulate the similarity judgement of the human mind.

  7. Renewing the Respect for Similarity

    Directory of Open Access Journals (Sweden)

    Shimon Edelman

    2012-07-01

    In psychology, the concept of similarity has traditionally evoked a mixture of respect, stemming from its ubiquity and intuitive appeal, and concern, due to its dependence on the framing of the problem at hand and on its context. We argue for a renewed focus on similarity as an explanatory concept, by surveying established results and new developments in the theory and methods of similarity-preserving associative lookup and dimensionality reduction, critical components of many cognitive functions, as well as of intelligent data management in computer vision. We focus in particular on the growing family of algorithms that support associative memory by performing hashing that respects local similarity, and on the uses of similarity in representing structured objects and scenes. Insofar as these similarity-based ideas and methods are useful in cognitive modeling and in AI applications, they should be included in the core conceptual toolkit of computational neuroscience.
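
    Hashing that respects local similarity can be illustrated with a small sketch of one well-known member of that family, random-hyperplane hashing, in which vectors separated by a small angle tend to receive signatures that differ in few bits. The abstract does not name a specific scheme, so this choice, along with the function names and parameters, is an assumption made for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(s, t):
    """Number of positions at which two signatures differ."""
    return sum(a != b for a, b in zip(s, t))
```

    Signatures depend only on a vector's direction, so rescaling a vector by a positive factor leaves its signature unchanged. Items whose signatures collide in many bits become candidates for associative lookup without a full pairwise comparison.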

  8. Method and apparatus for biological sequence comparison

    Science.gov (United States)

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.

  9. Cross-kingdom similarities in microbiome functions

    NARCIS (Netherlands)

    Mendes, R.; Raaijmakers, J.M.

    2015-01-01

    Recent advances in medical research have revealed how humans rely on their microbiome for diverse traits and functions. Similarly, microbiomes of other higher organisms play key roles in disease, health, growth and development of their host. Exploring microbiome functions across kingdoms holds

  10. Self-similar cosmological models

    Energy Technology Data Exchange (ETDEWEB)

    Chao, W Z [Cambridge Univ. (UK). Dept. of Applied Mathematics and Theoretical Physics

    1981-07-01

    The kinematics and dynamics of self-similar cosmological models are discussed. The degrees of freedom of the solutions of Einstein's equations for different types of models are listed. The relation between kinematic quantities and the classifications of the self-similarity group is examined. All dust local rotational symmetry models have been found.

  11. Self-similar factor approximants

    International Nuclear Information System (INIS)

    Gluzman, S.; Yukalov, V.I.; Sornette, D.

    2003-01-01

    The problem of reconstructing functions from their asymptotic expansions in powers of a small variable is addressed by deriving an improved type of approximants. The derivation is based on the self-similar approximation theory, which presents the passage from one approximant to another as the motion realized by a dynamical system with the property of group self-similarity. The derived approximants, because of their form, are called self-similar factor approximants. These complement the previously obtained self-similar exponential approximants and self-similar root approximants. The specific feature of self-similar factor approximants is that their control functions, providing convergence of the computational algorithm, are completely defined from the accuracy-through-order conditions. These approximants contain the Padé approximants as a particular case, and in some limit they can be reduced to the self-similar exponential approximants previously introduced by two of us. It is proved that the self-similar factor approximants are able to reproduce exactly a wide class of functions, which include a variety of nonalgebraic functions. For other functions, not pertaining to this exactly reproducible class, the factor approximants provide very accurate approximations, whose accuracy significantly surpasses that of the most accurate Padé approximants. This is illustrated by a number of examples showing the generality and accuracy of the factor approximants even when conventional techniques meet serious difficulties.
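
    For orientation, the product form usually associated with factor approximants can be written as follows. The abstract gives no formulas, so the notation (f_0, a_n, A_i, n_i) is assumed here rather than quoted from the paper.

```latex
% Given an asymptotic expansion for small x,
\[
  f(x) \simeq f_{0}(x) \sum_{n=0}^{k} a_{n} x^{n}, \qquad a_{0} = 1 ,
\]
% a factor approximant of order k takes the product form
\[
  f_{k}^{*}(x) = f_{0}(x) \prod_{i=1}^{N_{k}} \left( 1 + A_{i} x \right)^{n_{i}} .
\]
% The parameters A_i and n_i are fixed by the accuracy-through-order
% conditions: re-expanding f_k^* in powers of x must reproduce the
% known coefficients,
\[
  f_{k}^{*}(x) = f_{0}(x) \left( \sum_{n=0}^{k} a_{n} x^{n} + O(x^{k+1}) \right) .
\]
```

    Read this way, a single factor with a large exponent and small A_i tends toward an exponential, which is one way to picture the limit mentioned in the abstract.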

  12. Dynamic similarity in erosional processes

    Science.gov (United States)

    Scheidegger, A.E.

    1963-01-01

    A study is made of the dynamic similarity conditions obtaining in a variety of erosional processes. The pertinent equations for each type of process are written in dimensionless form; the similarity conditions can then easily be deduced. The processes treated are: raindrop action, slope evolution and river erosion. © 1963 Istituto Geofisico Italiano.

  13. Personalized recommendation with corrected similarity

    International Nuclear Information System (INIS)

    Zhu, Xuzhen; Tian, Hui; Cai, Shimin

    2014-01-01

    Personalized recommendation has attracted a surge of interdisciplinary research. In particular, similarity-based methods in real recommendation systems have achieved great success. However, the computed similarities are overestimated or underestimated, mainly because of the defective strategy of unidirectional similarity estimation. In this paper, we address this drawback by leveraging mutual correction of forward and backward similarity estimations, and propose a new personalized recommendation index, i.e., corrected similarity based inference (CSI). Extensive experiments on four benchmark datasets show a greater improvement of CSI in comparison with mainstream baselines. A detailed analysis is presented to unveil and understand the origin of the difference between CSI and mainstream indices.

  14. Towards Personalized Medicine: Leveraging Patient Similarity and Drug Similarity Analytics

    Science.gov (United States)

    Zhang, Ping; Wang, Fei; Hu, Jianying; Sorrentino, Robert

    2014-01-01

    The rapid adoption of electronic health records (EHR) provides a comprehensive source for exploratory and predictive analytics to support clinical decision-making. In this paper, we investigate how to utilize EHR to tailor treatments to individual patients based on their likelihood to respond to a therapy. We construct a heterogeneous graph which includes two domains (patients and drugs) and encodes three relationships (patient similarity, drug similarity, and patient-drug prior associations). We describe a novel approach for performing a label propagation procedure to spread the label information representing the effectiveness of different drugs for different patients over this heterogeneous graph. The proposed method has been applied to a real-world EHR dataset to help identify personalized treatments for hypercholesterolemia. The experimental results demonstrate the effectiveness of the approach and suggest that the combination of appropriate patient similarity and drug similarity analytics could lead to actionable insights for personalized medicine. In particular, by leveraging drug similarity in combination with patient similarity, our method could perform well even on new or rarely used drugs for which there are few records of known past performance. PMID:25717413

  15. Functional enrichment analyses and construction of functional similarity networks with high confidence function prediction by PFP

    Directory of Open Access Journals (Sweden)

    Kihara Daisuke

    2010-05-01

    Background: A new paradigm of biological investigation takes advantage of technologies that produce large high throughput datasets, including genome sequences, interactions of proteins, and gene expression. The ability of biologists to analyze and interpret such data relies on functional annotation of the included proteins, but even in highly characterized organisms many proteins can lack the functional evidence necessary to infer their biological relevance. Results: Here we have applied high confidence function predictions from our automated prediction system, PFP, to three genome sequences, Escherichia coli, Saccharomyces cerevisiae, and Plasmodium falciparum (malaria). The number of annotated genes is increased by PFP to over 90% for all of the genomes. Using the large coverage of the function annotation, we introduced the functional similarity networks which represent the functional space of the proteomes. Four different functional similarity networks are constructed for each proteome, one each by considering similarity in a single Gene Ontology (GO) category, i.e. Biological Process, Cellular Component, and Molecular Function, and another one by considering overall similarity with the funSim score. The functional similarity networks are shown to have higher modularity than the protein-protein interaction network. Moreover, the funSim score network is distinct from the single GO-score networks by showing a higher clustering degree exponent value and thus has a higher tendency to be hierarchical. In addition, examining function assignments to the protein-protein interaction network and local regions of genomes has identified numerous cases where subnetworks or local regions have functionally coherent proteins. These results will help in interpreting interactions of proteins and gene orders in a genome. Several examples of both analyses are highlighted. Conclusion: The analyses demonstrate that applying high confidence predictions from PFP

  16. Large margin classification with indefinite similarities

    KAUST Repository

    Alabdulmohsin, Ibrahim

    2016-01-07

    Classification with indefinite similarities has attracted attention in the machine learning community. This is partly due to the fact that many similarity functions that arise in practice are not symmetric positive semidefinite, i.e. the Mercer condition is not satisfied, or the Mercer condition is difficult to verify. Examples of such indefinite similarities in machine learning applications are ample, including, for instance, the BLAST similarity score between protein sequences, human-judged similarities between concepts and words, and the tangent distance or the shape matching distance in computer vision. Nevertheless, previous works on classification with indefinite similarities are not fully satisfactory. They have either introduced sources of inconsistency in handling past and future examples using kernel approximation, settled for local-minimum solutions using non-convex optimization, or produced non-sparse solutions by learning in Krein spaces. Despite the large volume of research devoted to this subject lately, we demonstrate in this paper how an old idea, namely the 1-norm support vector machine (SVM) proposed more than 15 years ago, has several advantages over more recent work. In particular, the 1-norm SVM method is conceptually simpler, which makes it easier to implement and maintain. It is competitive with, if not superior to, all other methods in terms of predictive accuracy. Moreover, it produces solutions that are often sparser than those of more recent methods by several orders of magnitude. In addition, we provide various theoretical justifications by relating 1-norm SVM to well-established learning algorithms such as neural networks, SVM, and nearest neighbor classifiers. Finally, we conduct a thorough experimental evaluation, which reveals that the evidence in favor of 1-norm SVM is statistically significant.
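
    A common way to state the 1-norm SVM with similarities as features is sketched below. The abstract does not spell out the formulation, so the setup (each example represented by its similarities to the m training examples) and the symbols s, w, b, ξ and C are assumptions for illustration.

```latex
% Training pairs (x_i, y_i), y_i \in \{-1, +1\}, i = 1, \dots, m;
% s(\cdot,\cdot) is any similarity function, not necessarily
% symmetric or positive semidefinite.
\[
  \min_{w,\, b,\, \xi} \;\; \|w\|_{1} + C \sum_{i=1}^{m} \xi_{i}
\]
\[
  \text{subject to} \quad
  y_{i} \Big( \sum_{j=1}^{m} w_{j}\, s(x_{i}, x_{j}) + b \Big) \ge 1 - \xi_{i},
  \qquad \xi_{i} \ge 0 .
\]
```

    Because the objective and constraints are linear once the absolute values are split into positive and negative parts, this is a linear program for any choice of s, and the 1-norm penalty drives many w_j to zero, which matches the sparsity noted in the abstract.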

  17. Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

    Science.gov (United States)

    Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

    2017-07-01

    A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using a permutation of the four DNA base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now matured into advanced high-throughput sequencing technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results in which it is difficult to determine whether a code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q, and PA + PT + PG + PC = 1. Using Quaternion representations, we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces better-quality match and mismatch scores between two DNA base codes. In implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. As a result, we found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
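
    The dot-product scoring idea in this abstract can be illustrated with a minimal Python sketch (the study itself used Octave). Each base is a probability vector (PA, PT, PG, PC), and the substitution score is derived from the dot product of two such vectors. The linear rescaling of the dot product (`match_scale`, `offset`), the gap penalty, and all function names are assumptions made for illustration; the abstract does not state the actual scoring parameters.

```python
# Illustrative sketch only: parameter values and names are assumptions.
BASES = "ATGC"  # component order follows Q = (PA, PT, PG, PC)

def one_hot(base):
    """Probability vector for an unambiguous base."""
    return tuple(1.0 if b == base else 0.0 for b in BASES)

def dot_score(p, q):
    """Dot product of two base-probability vectors."""
    return sum(a * b for a, b in zip(p, q))

def needleman_wunsch(seq_a, seq_b, gap=-1.0, match_scale=2.0, offset=-1.0):
    """Global alignment score of two sequences of probability vectors.

    With the assumed defaults, a certain match scores +1, a certain
    mismatch scores -1, and ambiguous bases fall in between.
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match_scale * dot_score(seq_a[i - 1], seq_b[j - 1]) + offset
            score[i][j] = max(
                score[i - 1][j - 1] + s,   # align the two bases
                score[i - 1][j] + gap,     # gap in seq_b
                score[i][j - 1] + gap,     # gap in seq_a
            )
    return score[n][m]

# Subject "GATT" against a target whose third base is ambiguous (A or T).
subject = [one_hot(b) for b in "GATT"]
target = [one_hot("G"), one_hot("A"), (0.5, 0.5, 0.0, 0.0), one_hot("T")]
print(needleman_wunsch(subject, target))
```

    Under these assumed parameters the ambiguous position contributes a neutral score instead of a hard mismatch, which is the kind of graded scoring the dot product makes possible.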

  18. Clustering and visualizing similarity networks of membrane proteins.

    Science.gov (United States)

    Hu, Geng-Ming; Mai, Te-Lun; Chen, Chi-Ming

    2015-08-01

    We proposed a fast and unsupervised clustering method, minimum span clustering (MSC), for analyzing the sequence-structure-function relationship of biological networks, and demonstrated its validity in clustering the sequence/structure similarity networks (SSN) of 682 membrane protein (MP) chains. The MSC clustering of MPs based on their sequence information was found to be consistent with their tertiary structures and functions. For the largest seven clusters predicted by MSC, the consistency in chain function within the same cluster is found to be 100%. From analyzing the edge distribution of SSN for MPs, we found a characteristic threshold distance for the boundary between clusters, over which SSN of MPs could be properly clustered by an unsupervised sparsification of the network distance matrix. The clustering results of MPs from both MSC and the unsupervised sparsification methods are consistent with each other, and have high intracluster similarity and low intercluster similarity in sequence, structure, and function. Our study showed a strong sequence-structure-function relationship of MPs. We discussed evidence of convergent evolution of MPs and suggested applications in finding structural similarities and predicting biological functions of MP chains based on their sequence information. © 2015 Wiley Periodicals, Inc.

  19. Self-similar analysis of the spherical implosion process

    International Nuclear Information System (INIS)

    Ishiguro, Yukio; Katsuragi, Satoru.

    1976-07-01

    The implosion process caused by laser-heating ablation has been studied by self-similarity analysis. Attention is paid to the possibility of existence of the self-similar solution which reproduces the implosion process of high compression. Details of the self-similar analysis are reproduced and conclusions are drawn quantitatively on the gas compression by a single shock. The compression process by a sequence of shocks is discussed in self-similarity. The gas motion followed by a homogeneous isentropic compression is represented by a self-similar motion. (auth.)

  20. Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR.

    Science.gov (United States)

    D'Souza, T M; Boominathan, K; Reddy, C A

    1996-01-01

    Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. PMID:8837429

  1. Lineup member similarity effects on children's eyewitness identification

    OpenAIRE

    Fitzgerald, Ryan J.; Whiting, Brittany F.; Therrien, Natalie M.; Price, Heather L.

    2014-01-01

    To date, research investigating the similarity among lineup members has focused on adult eyewitnesses. In the present research, children made identifications from lineups containing members of lower or higher similarity to a target person. In Experiment 1, following a live interaction, children's (6–14 years) correct identification rate was reduced in higher-similarity relative to lower-similarity lineups. In Experiment 2, children (6–12 years) and adults watched a video containing a target p...

  2. Assessing Analytical Similarity of Proposed Amgen Biosimilar ABP 501 to Adalimumab.

    Science.gov (United States)

    Liu, Jennifer; Eris, Tamer; Li, Cynthia; Cao, Shawn; Kuhns, Scott

    2016-08-01

    ABP 501 is being developed as a biosimilar to adalimumab. Comprehensive comparative analytical characterization studies have been conducted and completed. The objective of this study was to assess analytical similarity between ABP 501 and two adalimumab reference products (RPs), licensed by the United States Food and Drug Administration (adalimumab [US]) and authorized by the European Union (adalimumab [EU]), using state-of-the-art analytical methods. Comprehensive analytical characterization incorporating orthogonal analytical techniques was used to compare products. Physicochemical property comparisons comprised the primary structure related to amino acid sequence and post-translational modifications including glycans; higher-order structure; primary biological properties mediated by target and receptor binding; product-related substances and impurities; host-cell impurities; general properties of the finished drug product, including strength and formulation; subvisible and submicron particles and aggregates; and forced thermal degradation. ABP 501 had the same amino acid sequence and similar post-translational modification profiles compared with adalimumab RPs. Primary structure, higher-order structure, and biological activities were similar for the three products. Product-related size and charge variants and aggregate and particle levels were also similar. ABP 501 had very low residual host-cell protein and DNA. The finished ABP 501 drug product has the same strength with regard to protein concentration and fill volume as adalimumab RPs. ABP 501 and the RPs had a similar stability profile both in normal storage and thermal stress conditions. Based on the comprehensive analytical similarity assessment, ABP 501 was found to be similar to adalimumab with respect to physicochemical and biological properties.

  3. Similarity measures for face recognition

    CERN Document Server

    Vezzetti, Enrico

    2015-01-01

    Face recognition has several applications, including security (authentication and identification of device users and criminal suspects) and medicine (corrective surgery and diagnosis). Facial recognition programs rely on algorithms that can compare and compute the similarity between two sets of images. This eBook explains some of the similarity measures used in facial recognition systems in a single volume. Readers will learn about various measures including Minkowski distances, Mahalanobis distances, Hausdorff distances, and cosine-based distances, among other methods. The book also summarizes errors that may occur in face recognition methods. Computer scientists "facing face" and looking to select and test different methods of computing similarities will benefit from this book. The book is also a useful tool for students undertaking computer vision courses.
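
    As a rough illustration of three of the measures named above, the following Python sketch gives their textbook definitions on plain coordinate tuples. It is not taken from the book; the function names and the choice of Euclidean distance inside the Hausdorff distance are assumptions.

```python
import math

def minkowski(x, y, p=2):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def cosine_distance(x, y):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two finite point sets,
    using Euclidean distance between individual points (an assumption)."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(minkowski(p, q) for q in dst) for p in src)
    return max(directed(points_a, points_b), directed(points_b, points_a))

print(minkowski((0, 0), (3, 4)))             # Euclidean
print(minkowski((0, 0), (3, 4), p=1))        # Manhattan
print(cosine_distance((1, 0), (0, 1)))       # orthogonal vectors
print(hausdorff([(0, 0), (1, 0)], [(0, 0), (4, 0)]))
```

    In a face-recognition setting the tuples would be feature vectors or landmark coordinates extracted from images; the Mahalanobis distance is omitted here because it additionally needs a covariance estimate.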

  4. ON SOME RECURRENCE TYPE SMARANDACHE SEQUENCES

    OpenAIRE

    MAJUMDAR, A.A.K.; GUNARTO, H.

    2000-01-01

    In this paper, we study some properties of ten recurrence type Smarandache sequences, namely, the Smarandache odd, even, prime product, square product, higher-power product, permutation, consecutive, reverse, symmetric, and pierced chain sequences.

  5. A Novel Hybrid Similarity Calculation Model

    Directory of Open Access Journals (Sweden)

    Xiaoping Fan

    2017-01-01

    This paper addresses the problems of similarity calculation in traditional nearest-neighbor collaborative filtering recommendation algorithms, especially their failure to describe dynamic user preference. To address the problem of user interest drift, a new hybrid similarity calculation model is proposed. The model consists of two parts: the first uses function fitting to describe users’ rating behaviors and rating preferences, and the second employs the Random Forest algorithm to take user attribute features into account. The paper then combines the two parts to build a new hybrid similarity calculation model for user recommendation. Experimental results show that, for data sets of different sizes, the model’s prediction precision is higher than that of the traditional recommendation algorithms.

  6. Fast business process similarity search

    NARCIS (Netherlands)

    Yan, Z.; Dijkman, R.M.; Grefen, P.W.P.J.

    2012-01-01

    Nowadays, it is common for organizations to maintain collections of hundreds or even thousands of business processes. Techniques exist to search through such a collection, for business process models that are similar to a given query model. However, those techniques compare the query model to each

  7. Glove boxes and similar containments

    International Nuclear Information System (INIS)

    Anon.

    1975-01-01

    According to the present invention a glove box or similar containment is provided with an exhaust system including a vortex amplifier venting into the system, the vortex amplifier also having its main inlet in fluid flow connection with the containment and a control inlet in fluid flow connection with the atmosphere outside the containment. (U.S.)

  8. Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads

    Directory of Open Access Journals (Sweden)

    Chengxi Ye

    2016-06-01

    Motivation. The third generation sequencing (3GS) technology generates long sequences of thousands of bases. However, its current error rates are estimated in the range of 15–40%, significantly higher than those of the prevalent next generation sequencing (NGS) technologies (less than 1%). Fundamental bioinformatics tasks such as de novo genome assembly and variant calling require high-quality sequences that need to be extracted from these long but erroneous 3GS sequences. Results. We describe a versatile and efficient linear complexity consensus algorithm Sparc to facilitate de novo genome assembly. Sparc builds a sparse k-mer graph using a collection of sequences from a targeted genomic region. The heaviest path which approximates the most likely genome sequence is searched through a sparsity-induced reweighted graph as the consensus sequence. Sparc supports using NGS and 3GS data together, which leads to significant improvements in both cost efficiency and computational efficiency. Experiments with Sparc show that our algorithm can efficiently provide high-quality consensus sequences using both PacBio and Oxford Nanopore sequencing technologies. With only 30× PacBio data, Sparc can reach a consensus with error rate <0.5%. With the more challenging Oxford Nanopore data, Sparc can also achieve similar error rate when combined with NGS data. Compared with the existing approaches, Sparc calculates the consensus with higher accuracy, and uses approximately 80% less memory and time. Availability. The source code is available for download at https://github.com/yechengxi/Sparc.

  9. An Alfven eigenmode similarity experiment

    International Nuclear Information System (INIS)

    Heidbrink, W W; Fredrickson, E; Gorelenkov, N N; Hyatt, A W; Kramer, G; Luo, Y

    2003-01-01

    The major radius dependence of Alfven mode stability is studied by creating plasmas with similar minor radius, shape, magnetic field (0.5 T), density ($n_e \simeq 3\times10^{19}$ m$^{-3}$), electron temperature (1.0 keV) and beam ion population (near-tangential 80 keV deuterium injection) on both NSTX and DIII-D. The major radius of NSTX is half the major radius of DIII-D. The super-Alfvenic beam ions that drive the modes have overlapping values of $v_f/v_A$ in the two devices. Observed beam-driven instabilities include toroidicity-induced Alfven eigenmodes (TAE). The stability threshold for the TAE is similar in the two devices. As expected theoretically, the most unstable toroidal mode number n is larger in DIII-D.

  10. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  11. Purifying selection acts on coding and non-coding sequences of paralogous genes in Arabidopsis thaliana.

    Science.gov (United States)

    Hoffmann, Robert D; Palmgren, Michael

    2016-06-13

    Whole-genome duplications in the ancestors of many diverse species provided the genetic material for evolutionary novelty. Several models explain the retention of paralogous genes. However, how these models are reflected in the evolution of coding and non-coding sequences of paralogous genes is unknown. Here, we analyzed the coding and non-coding sequences of paralogous genes in Arabidopsis thaliana and compared these sequences with those of orthologous genes in Arabidopsis lyrata. Paralogs with lower expression than their duplicate had more nonsynonymous substitutions, were more likely to fractionate, and exhibited less similar expression patterns with their orthologs in the other species. Also, lower-expressed genes had greater tissue specificity. Orthologous conserved non-coding sequences in the promoters, introns, and 3' untranslated regions were less abundant at lower-expressed genes compared to their higher-expressed paralogs. A gene ontology (GO) term enrichment analysis showed that paralogs with similar expression levels were enriched in GO terms related to ribosomes, whereas paralogs with different expression levels were enriched in terms associated with stress responses. Loss of conserved non-coding sequences in one gene of a paralogous gene pair correlates with reduced expression levels that are more tissue specific. Together with increased mutation rates in the coding sequences, this suggests that similar forces of purifying selection act on coding and non-coding sequences. We propose that coding and non-coding sequences evolve concurrently following gene duplication.

  12. Similarity analysis between quantum images

    Science.gov (United States)

    Zhou, Ri-Gui; Liu, XingAo; Zhu, Changming; Wei, Lai; Zhang, Xiafen; Ian, Hou

    2018-06-01

    Similarity analysis between quantum images is essential in quantum image processing because it provides a foundation for other fields, such as quantum image matching and quantum pattern recognition. In this paper, a quantum scheme based on a novel quantum image representation and the quantum amplitude amplification algorithm is proposed. At the end of the paper, three examples and simulation experiments show that the measurement result must be 0 when two images are the same, and that the measurement result has a high probability of being 1 when two images are different.

  13. Similarity flows in relativistic hydrodynamics

    International Nuclear Information System (INIS)

    Blaizot, J.P.; Ollitrault, J.Y.

    1986-01-01

    In ultra-relativistic heavy ion collisions, one expects in particular to observe a deconfinement transition leading to the formation of a quark-gluon plasma. In the framework of the hydrodynamic model, experimental signatures of such a plasma may be looked for as observable consequences of a first order transition on the evolution of the system. In most of the possible scenarios, the phase transition is accompanied by discontinuities in the hydrodynamic flow, such as shock waves. The method presented in this paper has been developed to treat such discontinuous flows without too much numerical effort. It relies heavily on the use of similarity solutions of the hydrodynamic equations.

  14. Universal sequence map (USM) of arbitrary discrete sequences

    Directory of Open Access Journals (Sweden)

    Almeida Jonas S

    2002-02-01

    Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale-independent sequence analysis, without the context of fixed memory length. A simple example would consist in being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale-independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules.
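
    The iterative mapping described above can be sketched in Python as a generalization of the Chaos Game Representation: each unit is assigned a binary corner of a unit hypercube, and every step moves the current point halfway toward the corner of the next unit. The corner assignment, the midpoint starting position, the function names, and the use of -log2 of the largest coordinate difference as a similarity estimate are assumptions made for illustration and may differ in detail from the published USM procedure.

```python
import math

def usm_forward(seq, alphabet=None):
    """Map a discrete sequence to a trajectory in the unit hypercube.

    Each unique unit gets a binary corner; the iteration moves halfway
    from the current point toward the corner of the next unit (CGR-style).
    """
    symbols = sorted(set(seq)) if alphabet is None else list(alphabet)
    # Number of binary dimensions needed to give every unit its own corner.
    n_dims = max(1, (len(symbols) - 1).bit_length())
    corner = {s: [(i >> d) & 1 for d in range(n_dims)]
              for i, s in enumerate(symbols)}
    point = [0.5] * n_dims  # assumed starting position: hypercube centre
    coords = []
    for s in seq:
        point = [(p + c) / 2.0 for p, c in zip(point, corner[s])]
        coords.append(tuple(point))
    return coords

def shared_suffix_estimate(point_a, point_b):
    """Rough estimate of how many preceding units two positions share."""
    delta = max(abs(a - b) for a, b in zip(point_a, point_b))
    return math.inf if delta == 0 else -math.log2(delta)

a = usm_forward("TTACG", alphabet="ACGT")
b = usm_forward("AAACG", alphabet="ACGT")
# The last positions share the 3-unit suffix "ACG".
print(shared_suffix_estimate(a[-1], b[-1]))
```

    Because each step halves the distance between two trajectories that read the same unit, positions with a longer common history end up closer together, which is why a map distance can serve as a similarity estimate without fixing a memory length in advance.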

  15. Self-similar gravitational clustering

    International Nuclear Information System (INIS)

    Efstathiou, G.; Fall, S.M.; Hogan, C.

    1979-01-01

    The evolution of gravitational clustering is considered and several new scaling relations are derived for the multiplicity function. These include generalizations of the Press-Schechter theory to different densities and cosmological parameters. The theory is then tested against multiplicity function and correlation function estimates for a series of 1000-body experiments. The results are consistent with the theory and show some dependence on initial conditions and cosmological density parameter. The statistical significance of the results, however, is fairly low because of several small number effects in the experiments. There is no evidence for a non-linear bootstrap effect or a dependence of the multiplicity function on the internal dynamics of condensed groups. Empirical estimates of the multiplicity function by Gott and Turner have a feature near the characteristic luminosity predicted by the theory. The scaling relations allow the inference from estimates of the galaxy luminosity function that galaxies must have suffered considerable dissipation if they originally formed from a self-similar hierarchy. A method is also developed for relating the multiplicity function to similar measures of clustering, such as those of Bhavsar, for the distribution of galaxies on the sky. These are shown to depend on the luminosity function in a complicated way. (author)

  16. Static multiplicities in heterogeneous azeotropic distillation sequences

    DEFF Research Database (Denmark)

    Esbjerg, Klavs; Andersen, Torben Ravn; Jørgensen, Sten Bay

    1998-01-01

    In this paper the results of a bifurcation analysis on heterogeneous azeotropic distillation sequences are given. Two sequences suitable for ethanol dehydration are compared: The 'direct' and the 'indirect' sequence. It is shown, that the two sequences, despite their similarities, exhibit very...... different static behavior. The method of Petlyuk and Avet'yan (1971), Bekiaris et al. (1993), which assumes infinite reflux and infinite number of stages, is extended to and applied on heterogeneous azeotropic distillation sequences. The predictions are substantiated through simulations. The static sequence...

  17. Self-similar pattern formation and continuous mechanics of self-similar systems

    Directory of Open Access Journals (Sweden)

    A. V. Dyskin

    2007-01-01

    In many cases, the critical state of systems that reached the threshold is characterised by self-similar pattern formation. We produce an example of pattern formation of this kind: formation of a self-similar distribution of interacting fractures. Their formation starts with crack growth due to the action of stress fluctuations. It is shown that even when the fluctuations have zero average, the cracks generated by them could grow far beyond the scale of the stress fluctuations. Further development of the fracture system is controlled by crack interaction leading to the emergence of self-similar crack distributions. As a result, the medium with fractures becomes discontinuous at any scale. We develop a continuum fractal mechanics to model its physical behaviour. We introduce a continuous sequence of continua of increasing scales covering this range of scales. The continuum of each scale is specified by the representative averaging volume elements of the corresponding size. These elements determine the resolution of the continuum. Each continuum hides the cracks of scales smaller than the volume element size while larger fractures are modelled explicitly. Using the developed formalism we investigate the stability of self-similar crack distributions with respect to crack growth and show that while the self-similar distribution of isotropically oriented cracks is stable, the distribution of parallel cracks is not. For isotropically oriented cracks, the scaling of permeability is determined. For permeable materials (rocks) with self-similar crack distributions, permeability scales as the cube of the crack radius. This property could be used for detecting this specific mechanism of formation of self-similar crack distributions.

  18. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov,Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by
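
    The comparison of motif abundances against a background set can be illustrated with a small Python sketch that counts k-mers in a foreground and a background collection and reports their frequency ratio. This is a generic enrichment calculation, not the authors' statistical procedure; the function names, the pseudocount, and the toy sequences are assumptions.

```python
from collections import Counter

def kmer_freqs(seqs, k):
    """Relative frequency of every k-mer observed in a set of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {motif: c / total for motif, c in counts.items()}

def enrichment(foreground, background, k, pseudo=1e-6):
    """Ratio of foreground to background frequency for each k-mer.

    The pseudocount (an assumed value) avoids division by zero for
    motifs that never occur in the background.
    """
    fg = kmer_freqs(foreground, k)
    bg = kmer_freqs(background, k)
    return {motif: f / (bg.get(motif, 0.0) + pseudo) for motif, f in fg.items()}

# Toy data: an AT-rich "conserved" set versus a mixed background.
ratios = enrichment(["AAAATTTT", "TTTTAAAA"], ["ACGTACGT", "AATTCCGG"], k=2)
for motif, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(motif, round(ratio, 2))
```

    Choosing the background matters as much as the counting: as the abstract notes, the same motifs can look over- or under-represented depending on whether the reference set is non-conserved sequence, near-promoter sequence, or composition-matched random sequence.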

  19. Alaska, Gulf spills share similarities

    International Nuclear Information System (INIS)

    Usher, D.

    1991-01-01

    The accidental Exxon Valdez oil spill in Alaska and the deliberate dumping of crude oil into the Persian Gulf as a tactic of war contain both glaring differences and surprising similarities. Public reaction and public response were much greater to the Exxon Valdez spill in pristine Prince William Sound than to the war-related tragedy in the Persian Gulf. More than 12,000 workers helped in the Alaskan cleanup; only 350 have been involved in Kuwait. But in both instances, environmental damages appear to be less than anticipated. Nature's highly effective self-cleansing action is primarily responsible for minimizing the damages. One positive action growing out of the two incidents is increased international cooperation and participation in oil-spill clean-up efforts. In 1990, in the aftermath of the Exxon Valdez spill, 94 nations signed an international accord on cooperation in future spills. The spills can be historic environmental landmarks leading to creation of more sophisticated response systems worldwide.

  20. Defining a similarity threshold for a functional protein sequence pattern: The signal peptide cleavage site

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Engelbrecht, Jacob; von Heijne, Gunnar

    1996-01-01

    When preparing data sets of amino acid or nucleotide sequences it is necessary to exclude redundant or homologous sequences in order to avoid overestimating the predictive performance of an algorithm. For some time methods for doing this have been available in the area of protein structure...... prediction. We have developed a similar procedure based on pair-wise alignments for sequences with functional sites. We show how a correlation coefficient between sequence similarity and functional homology can be used to compare the efficiency of different similarity measures and choose a nonarbitrary...

  1. Optimal neighborhood indexing for protein similarity search.

    Science.gov (United States)

    Peterlongo, Pierre; Noé, Laurent; Lavenier, Dominique; Nguyen, Van Hoa; Kucherov, Gregory; Giraud, Mathieu

    2008-12-16

    Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood reduces the memory involved in the process by 35%, without sacrificing the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.
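
    A minimal Python sketch of the indexing idea follows: each seed occurrence is stored together with its right-hand neighborhood, and the neighborhood is written in a reduced amino acid alphabet so that each stored residue needs fewer bits. The grouping of the 20 amino acids into 7 classes, the seed and neighborhood lengths, and all names are hypothetical choices for illustration; they are not the alphabets or parameters derived in the paper.

```python
from collections import defaultdict

# Hypothetical reduction of the 20 amino acids to 7 classes (illustrative only).
GROUPS = ["AGST", "C", "DENQ", "FWY", "HKR", "ILMV", "P"]
REDUCE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}

def reduce_seq(residues):
    """Rewrite a protein fragment in the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in residues)

def build_index(protein, seed_len=3, nbr_len=4):
    """Index every seed with its position and reduced right neighborhood.

    Keeping the neighborhood inside the index lets a search filter
    candidate hits without a random access into the sequence bank.
    """
    index = defaultdict(list)
    for pos in range(len(protein) - seed_len - nbr_len + 1):
        seed = protein[pos:pos + seed_len]
        right = protein[pos + seed_len:pos + seed_len + nbr_len]
        index[seed].append((pos, reduce_seq(right)))
    return index

index = build_index("MKVLAAGIVKMKVLSTGA")
for pos, neighborhood in index["MKV"]:
    print(pos, neighborhood)
```

    With 7 classes a neighborhood residue fits in 3 bits instead of the 5 bits needed for 20 amino acids, which is the kind of trade-off that allows a longer neighborhood in a smaller index; comparing a reduced neighborhood with a full-alphabet query is what motivates the rectangular score matrices mentioned in the abstract.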

  2. Optimal neighborhood indexing for protein similarity search

    Directory of Open Access Journals (Sweden)

    Nguyen Van

    2008-12-01

    Background: Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. Results: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood reduces the memory involved in the process by 35%, without sacrificing the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. Conclusion: We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.

  3. Contrasting HIV phylogenetic relationships and V3 loop protein similarities

    Energy Technology Data Exchange (ETDEWEB)

    Korber, B. (Los Alamos National Lab., NM (United States) Santa Fe Inst., NM (United States)); Myers, G. (Los Alamos National Lab., NM (United States))

    1992-01-01

    At least five distinct sequence subtypes of HIV-1 can be identified from the major centers of the AIDS pandemic. While it is too early to tell whether these subtypes are serologically or phenotypically similar or distinct in terms of properties such as pathogenicity and transmissibility, we can begin to investigate their potential for phenotypic divergence at the protein sequence level. Phylogenetic analysis of HIV DNA sequences is being widely used to examine lineages of different viral strains as they evolve and spread throughout the globe. We have identified five distinct HIV-1 subtypes (designated A-E), or clades, based on phylogenetic clustering patterns generated from genetic information from both the gag and envelope (env) genes from a spectrum of international isolates. Our initial observations concerning both HIV-1 and HIV-2 sequences indicate that conserved patterns in protein chemistry may indeed exist across distant lineages. Such patterns in V3 loop amino acid chemistry may be indicative of stable lineages or convergence within this highly variable, though functionally and immunologically critical, region. We think that there may be parallels between the apparently stable HIV-2 V3 lineage and the previously mentioned HIV-1 V3 loops which are very similar at the protein level despite being distant by cladistic analysis, and which do not possess the distinctive positively charged residues. Highly conserved V3 loop protein sequences are also encountered in SIVAGMs and CIVs (chimpanzee viral strains), which do not appear to be pathogenic in their wild-caught natural hosts.

  4. Contrasting HIV phylogenetic relationships and V3 loop protein similarities

    Energy Technology Data Exchange (ETDEWEB)

    Korber, B. [Los Alamos National Lab., NM (United States)]|[Santa Fe Inst., NM (United States); Myers, G. [Los Alamos National Lab., NM (United States)

    1992-12-31

    At least five distinct sequence subtypes of HIV-1 can be identified from the major centers of the AIDS pandemic. While it is too early to tell whether these subtypes are serologically or phenotypically similar or distinct in terms of properties such as pathogenicity and transmissibility, we can begin to investigate their potential for phenotypic divergence at the protein sequence level. Phylogenetic analysis of HIV DNA sequences is being widely used to examine lineages of different viral strains as they evolve and spread throughout the globe. We have identified five distinct HIV-1 subtypes (designated A-E), or clades, based on phylogenetic clustering patterns generated from genetic information from both the gag and envelope (env) genes from a spectrum of international isolates. Our initial observations concerning both HIV-1 and HIV-2 sequences indicate that conserved patterns in protein chemistry may indeed exist across distant lineages. Such patterns in V3 loop amino acid chemistry may be indicative of stable lineages or convergence within this highly variable, though functionally and immunologically critical, region. We think that there may be parallels between the apparently stable HIV-2 V3 lineage and the previously mentioned HIV-1 V3 loops which are very similar at the protein level despite being distant by cladistic analysis, and which do not possess the distinctive positively charged residues. Highly conserved V3 loop protein sequences are also encountered in SIVAGMs and CIVs (chimpanzee viral strains), which do not appear to be pathogenic in their wild-caught natural hosts.

  5. Polynomial sequences generated by infinite Hessenberg matrices

    Directory of Open Access Journals (Sweden)

    Verde-Star Luis

    2017-01-01

    Full Text Available We show that an infinite lower Hessenberg matrix generates polynomial sequences that correspond to the rows of infinite lower triangular invertible matrices. Orthogonal polynomial sequences are obtained when the Hessenberg matrix is tridiagonal. We study properties of the polynomial sequences and their corresponding matrices which are related to recurrence relations, companion matrices, matrix similarity, construction algorithms, and generating functions. When the Hessenberg matrix is also Toeplitz, the polynomial sequences turn out to be of interpolatory type and we obtain additional results. For example, we show that every nonderogatory finite square matrix is similar to a unique Toeplitz-Hessenberg matrix.

  6. Memory and learning with rapid audiovisual sequences

    Science.gov (United States)

    Keller, Arielle S.; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed. PMID:26575193

  7. Memory and learning with rapid audiovisual sequences.

    Science.gov (United States)

    Keller, Arielle S; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed.

  8. Comparative analysis of sequences from PT 2013

    DEFF Research Database (Denmark)

    Mikkelsen, Susie Sommer

    Sheatfish and not EHNV. Generally, mistakes occurred at the ends of the sequences. This can be due to several factors. One is that the sequence has not been trimmed of the sequence primer sites. Another is the lack of quality control of the chromatogram. Finally, sequencing in just one direction can result...... diseases in Europe. As part of the EURL proficiency test for fish diseases it is required to sequence any RANA virus isolates found in any of the samples. It is also highly recommended to sequence the ISA virus to determine whether it be HPRΔ or HPR0. Furthermore, it is recommended that any VHSV and IHNV...... isolates be genotyped. As part of the evaluation of the proficiency results it was decided this year to look into the quality and similarity of the sequence results for selected viruses. Ampoule III in the proficiency test 2013 contained an EHNV isolate. The EURL received 43 sequences from 41 laboratories...

  9. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by analyzing DNA in a laboratory test. This study focused on local alignment of leukemia and non-leukemia data obtained from NCBI in the form of DNA sequences, using the Smith-Waterman algorithm. The Smith-Waterman algorithm was invented by T. F. Smith and M. S. Waterman in 1981. The algorithm tries to find the greatest possible similarity between a pair of sequences by assigning a negative value to each unequal base pair (mismatch) and a positive value to each equal base pair (match). The maximum positive value marks the end of the alignment, and the minimum value marks its start. This study uses sequences of leukemia and 3 sequences of non-leukemia.
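
    The local alignment procedure described in this abstract can be illustrated with a minimal Python sketch. The scoring values (match = 2, mismatch = -1, gap = -2), the function name `smith_waterman`, and the example sequences are assumptions chosen for illustration; the record does not state the parameters or data the authors used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Scoring values are illustrative assumptions, not taken from the study.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero: that is what makes the alignment local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTAA", "GGACGTCC"))
```

    The highest-scoring cell marks where the alignment ends, and the traceback stops at the first zero cell, which marks where it begins.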

  10. Conservation of RNA sequence and cross-linking ability in ribosomes from a higher eukaryote: photochemical cross-linking of the anticodon of P site bound tRNA to the penultimate cytidine of the UACACACG sequence in Artemia salina 18S rRNA

    International Nuclear Information System (INIS)

    Ciesiolka, J.; Nurse, K.; Klein, J.; Ofengand, J.

    1985-01-01

    The complex of Artemia salina ribosomes and Escherichia coli acetylvalyl-tRNA could be cross-linked by irradiation with near-UV light. Cross-linking required the presence of the codon GUU, GUA being ineffective. The acetylvalyl group could be released from the cross-linked tRNA by treatment with puromycin, demonstrating that cross-linking had occurred at the P site. This was true both for pGUU- and also for poly(U2,G)-dependent cross-linking. All of the cross-linking was to the 18S rRNA of the small ribosomal subunit. Photolysis of the cross-link at 254 nm occurred with the same kinetics as that for the known cyclobutane dimer between this tRNA and Escherichia coli 16S rRNA. T1 RNase digestion of the cross-linked tRNA yielded an oligonucleotide larger in molecular weight than any from un-cross-linked rRNA or tRNA or from a prephotolyzed complex. Extended electrophoresis showed this material to consist of two oligomers of similar mobility, a faster one-third component and a slower two-thirds component. Each oligomer yielded two components on 254-nm photolysis. The slower band from each was the tRNA T1 oligomer CACCUCCCUVACAAGp, which includes the anticodon. The faster band was the rRNA 9-mer UACACACCGp and its derivative UACACACUG. Unexpectedly, the dephosphorylated and slower moving 9-mer was derived from the faster moving dimer. Deamination of the penultimate C to U is probably due to cyclobutane dimer formation and was evidence for that nucleotide being the site of cross-linking. Direct confirmation of the cross-linking site was obtained by Z-gel analysis

  11. Rapid Diagnostics of Onboard Sequences

    Science.gov (United States)

    Starbird, Thomas W.; Morris, John R.; Shams, Khawaja S.; Maimone, Mark W.

    2012-01-01

    Keeping track of sequences onboard a spacecraft is challenging. When reviewing Event Verification Records (EVRs) of sequence executions on the Mars Exploration Rover (MER), operators often found themselves wondering which version of a named sequence the EVR corresponded to. The lack of this information drastically impacts the operators' diagnostic capabilities as well as their situational awareness with respect to the commands the spacecraft has executed, since the EVRs do not provide argument values or explanatory comments. Having this information immediately available can be instrumental in diagnosing critical events and can significantly enhance the overall safety of the spacecraft. This software provides auditing capability that can eliminate that uncertainty while diagnosing critical conditions. Furthermore, the RESTful interface provides a simple way for sequencing tools to automatically retrieve binary compiled sequence SCMFs (Space Command Message Files) on demand. It also enables developers to change the underlying database, while maintaining the same interface to the existing applications. The logging capabilities are also beneficial to operators when they are trying to recall how they solved a similar problem many days ago: this software enables automatic recovery of SCMF and RML (Robot Markup Language) sequence files directly from the command EVRs, eliminating the need for people to find and validate the corresponding sequences. To address the lack of auditing capability for sequences onboard a spacecraft during earlier missions, extensive logging support was added on the Mars Science Laboratory (MSL) sequencing server. This server is responsible for generating all MSL binary SCMFs from RML input sequences. The sequencing server logs every SCMF it generates into a MySQL database, as well as the high-level RML file and dictionary name inputs used to create the SCMF. The SCMF is then indexed by a hash value that is automatically included in all command

  12. Evidence for Genetic Similarity of Vegetative Compatibility Groupings in Sclerotinia homoeocarpa

    Directory of Open Access Journals (Sweden)

    Seog Won Chang

    2014-12-01

    Full Text Available Vegetative compatibility groups (VCGs) are determined for many fungi to test for the ability of fungal isolates to undergo heterokaryon formation. In several fungal plant pathogens, isolates belonging to a VCG have been shown to share significantly higher genetic similarity than those of different VCGs. In this study we sought to examine the relationship between VCG and genetic similarity of an important cool season turfgrass pathogen, Sclerotinia homoeocarpa. Twenty-two S. homoeocarpa isolates from the Midwest and Eastern US, which were previously characterized in several studies, were all evaluated for VCG using an improved nit mutant assay. These isolates were also genotyped using 19 microsatellites developed from partial genome sequence of S. homoeocarpa. Additionally, partial sequences of mitochondrial genes cytochrome oxidase II and mitochondrial small subunit (mtSSU) rRNA, and the atp6-rns intergenic spacer, were generated for isolates from each nit mutant VCG to determine if mitochondrial haplotypes differed among VCGs. Of the 22 isolates screened, 15 were amenable to the nit mutant VCG assay and were grouped into six VCGs. The 19 microsatellites gave 57 alleles for this set. Unweighted pair group methods with arithmetic mean (UPGMA) tree of binary microsatellite data were used to produce a dendrogram of the isolate genotypes based on microsatellite alleles, which showed high genetic similarity of nit mutant VCGs. Analysis of molecular variance of microsatellite data demonstrates that the current nit mutant VCGs explain the microsatellite genotypic variation among isolates better than the previous nit mutant VCGs or the conventionally determined VCGs. Mitochondrial sequences were identical among all isolates, suggesting that this marker type may not be informative for US populations of S. homoeocarpa.

  13. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Full Text Available Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
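
    The lowest common ancestor idea mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the general technique, not CREST's implementation: the function name `lowest_common_ancestor`, the root-to-leaf list representation of lineages, and the example taxa are assumptions, and the rank similarity criteria the abstract mentions are not modeled.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared root-to-leaf prefix of the given lineages.

    Each lineage is a list of taxon names ordered from the root downward,
    e.g. one lineage per reference sequence that a query aligned to.
    """
    if not lineages:
        return []
    common = []
    # zip stops at the shortest lineage, so the prefix never overruns any input.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            common.append(taxa_at_rank[0])
        else:
            break
    return common


if __name__ == "__main__":
    hits = [
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Pseudomonadales"],
        ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
    ]
    # The query is assigned to the deepest taxon shared by all hits.
    print(lowest_common_ancestor(hits))
```

    Assigning a query to the deepest taxon shared by all of its hits trades resolution for reliability: ambiguous hits push the assignment up to a higher rank instead of forcing a possibly wrong low-rank label.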

  14. The RNA world, automatic sequences and oncogenetics

    Energy Technology Data Exchange (ETDEWEB)

    Tahir Shah, K

    1993-04-01

    We construct a model of the RNA world in terms of naturally evolving nucleotide sequences assuming only Crick-Watson base pairing and self-cleaving/splicing capability. These sequences have the following properties. (1) They are recognizable by an automaton (or automata). That is, for each k-sequence, there exists a k-automaton which accepts, recognizes or generates the k-sequence. These are known as automatic sequences. Fibonacci and Morse-Thue sequences are the most natural outcome of pre-biotic chemical conditions. (2) Infinite (resp. large) sequences are self-similar (resp. nearly self-similar) under certain rewrite rules and consequently give rise to fractal (resp. fractal-like) structures. Computationally, such sequences can also be generated by their corresponding deterministic parallel rewrite system, known as a DOL system. The self-similar sequences are fixed points of their respective rewrite rules. Some of these automatic sequences can read or "accept" other sequences, while others can detect errors and trigger error-correcting mechanisms. They can be enlarged and have block and/or palindrome structure. Linear recurring sequences such as the Fibonacci sequence are simply feedback shift registers, a well-known model of information processing machines. We show that a mutation of any rewrite rule can cause a combinatorial explosion of error and relate this to oncogenetical behavior. On the other hand, a mutation of sequences that are not rewrite rules leads to normal evolutionary change. Known experimental results support our hypothesis. (author). Refs.
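
    The deterministic parallel rewrite systems mentioned in this abstract (DOL systems, also written D0L) can be illustrated with a short Python sketch. The rule sets below are the standard textbook rules for the Fibonacci word and the Thue-Morse sequence; the function name `d0l_iterate` and the choice of symbols are assumptions for illustration and are not taken from the report.

```python
def d0l_iterate(axiom, rules, steps):
    """Apply a deterministic parallel rewrite system `steps` times.

    Every symbol of the current word is replaced simultaneously by its
    image under `rules` (a dict mapping one symbol to a string).
    """
    word = axiom
    for _ in range(steps):
        word = "".join(rules[symbol] for symbol in word)
    return word


# Standard rewrite rules (illustrative symbols).
FIBONACCI = {"a": "ab", "b": "a"}      # generates the Fibonacci word
THUE_MORSE = {"0": "01", "1": "10"}    # generates the Thue-Morse sequence

if __name__ == "__main__":
    for n in range(5):
        print(n, d0l_iterate("a", FIBONACCI, n))
    print(d0l_iterate("0", THUE_MORSE, 3))
```

    Each iterate is a prefix of the next one, which is the sense in which the infinite limit word is a fixed point of its rewrite rule.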

  15. The RNA world, automatic sequences and oncogenetics

    International Nuclear Information System (INIS)

    Tahir Shah, K.

    1993-04-01

    We construct a model of the RNA world in terms of naturally evolving nucleotide sequences assuming only Crick-Watson base pairing and self-cleaving/splicing capability. These sequences have the following properties. 1) They are recognizable by an automaton (or automata). That is, for each k-sequence, there exists a k-automaton which accepts, recognizes or generates the k-sequence. These are known as automatic sequences. Fibonacci and Morse-Thue sequences are the most natural outcome of pre-biotic chemical conditions. 2) Infinite (resp. large) sequences are self-similar (resp. nearly self-similar) under certain rewrite rules and consequently give rise to fractal (resp. fractal-like) structures. Computationally, such sequences can also be generated by their corresponding deterministic parallel rewrite system, known as a DOL system. The self-similar sequences are fixed points of their respective rewrite rules. Some of these automatic sequences can read or 'accept' other sequences, while others can detect errors and trigger error-correcting mechanisms. They can be enlarged and have block and/or palindrome structure. Linear recurring sequences such as the Fibonacci sequence are simply feedback shift registers, a well-known model of information processing machines. We show that a mutation of any rewrite rule can cause a combinatorial explosion of error and relate this to oncogenetical behavior. On the other hand, a mutation of sequences that are not rewrite rules leads to normal evolutionary change. Known experimental results support our hypothesis. (author). Refs

  16. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternative spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.

  17. Sleep and memory consolidation: motor performance and proactive interference effects in sequence learning.

    Science.gov (United States)

    Borragán, Guillermo; Urbain, Charline; Schmitz, Rémy; Mary, Alison; Peigneux, Philippe

    2015-04-01

    That post-training sleep supports the consolidation of sequential motor skills remains debated. Performance improvement and sensitivity to proactive interference are both putative measures of long-term memory consolidation. We tested sleep-dependent memory consolidation for visuo-motor sequence learning using a proactive interference paradigm. Thirty-three young adults were trained on sequence A on Day 1, then had Regular Sleep (RS) or were Sleep Deprived (SD) on the night after learning. After two recovery nights, they were tested on the same sequence A, then had to learn a novel, potentially competing sequence B. We hypothesized that proactive interference effects on sequence B due to the prior learning of sequence A would be higher in the RS condition, considering that proactive interference is an indirect marker of the robustness of sequence A, which should be better consolidated over post-training sleep. Results highlighted sleep-dependent improvement for sequence A, with faster RTs overnight for RS participants only. Moreover, the beneficial impact of sleep was specific to the consolidation of motor but not sequential skills. Proactive interference effects on learning a new material at Day 4 were similar between RS and SD participants. These results suggest that post-training sleep contributes to optimizing motor but not sequential components of performance in visuo-motor sequence learning. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Social values as arguments: similar is convincing

    Science.gov (United States)

    Maio, Gregory R.; Hahn, Ulrike; Frost, John-Mark; Kuppens, Toon; Rehman, Nadia; Kamble, Shanmukh

    2014-01-01

    Politicians, philosophers, and rhetors engage in co-value argumentation: appealing to one value in order to support another value (e.g., “equality leads to freedom”). Across four experiments in the United Kingdom and India, we found that the psychological relatedness of values affects the persuasiveness of the arguments that bind them. Experiment 1 found that participants were more persuaded by arguments citing values that fulfilled similar motives than by arguments citing opposing values. Experiments 2 and 3 replicated this result using a wider variety of values, while finding that the effect is stronger among people higher in need for cognition and that the effect is mediated by the greater plausibility of co-value arguments that link motivationally compatible values. Experiment 4 extended the effect to real-world arguments taken from political propaganda and replicated the mediating effect of argument plausibility. The findings highlight the importance of value relatedness in argument persuasiveness. PMID:25147529

  19. Soldier motivation – different or similar?

    DEFF Research Database (Denmark)

    Brænder, Morten; Andersen, Lotte Bøgh

    Recent research in military sociology has shown that in addition to their strong peer motivation modern soldiers are oriented toward contributing to society. It has not, however, been tested how soldier motivation differs from the motivation of other citizens in this respect. In this paper......, by means of public service motivation, a concept developed within the public administration literature, we compare soldier and civilian motivation. The contribution of this paper is an analysis of whether and how Danish combat soldiers differ from other Danes in regard to public service motivation. Using...... surveys with similar questions, we find that soldiers are more normatively motivated to contribute to society than other citizens (higher commitment to the public interest), while their affectively based motivation is lower (lower compassion). This points towards a potential problem in regard...

  20. Social Values as Arguments: Similar is Convincing

    Directory of Open Access Journals (Sweden)

    Gregory R Maio

    2014-08-01

    Full Text Available Politicians, philosophers, and rhetors engage in co-value argumentation: appealing to one value in order to support another value (e.g., equality leads to freedom). Across four experiments in the United Kingdom and India, we found that the psychological relatedness of values affects the persuasiveness of the arguments that bind them. Experiment 1 found that participants were more persuaded by arguments citing values that fulfilled similar motives than by arguments citing opposing values. Experiments 2 and 3 replicated this result using a wider variety of values, while finding that the effect is stronger among people higher in need for cognition and that the effect is mediated by the greater plausibility of co-value arguments that link motivationally compatible values. Experiment 4 extended the effect to real-world arguments taken from political propaganda and replicated the mediating effect of argument plausibility. The findings highlight the importance of value relatedness in argument persuasiveness.

  1. Development of similarity theory for control systems

    Science.gov (United States)

    Myshlyaev, L. P.; Evtushenko, V. F.; Ivushkin, K. A.; Makarov, G. V.

    2018-05-01

    The area of effective application of the traditional similarity theory and the necessity of its development for systems are discussed. The main statements underlying the similarity theory of control systems are given. The conditions for the similarity of control systems and the need for similarity control are formulated. Methods and algorithms for estimating and controlling the similarity of control systems, and the results of research on control systems based on their similarity, are presented. The similarity control of systems includes the current evaluation of the degree of similarity of control systems and the development of actions controlling similarity, and the corresponding targeted change in the state of any element of control systems.

  2. Sequence Read Archive (SRA)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome...

  3. A Signal Processing Method to Explore Similarity in Protein Flexibility

    Directory of Open Access Journals (Sweden)

    Simina Vasilache

    2010-01-01

    Full Text Available Understanding mechanisms of protein flexibility is of great importance to structural biology. The ability to detect similarities between proteins and their patterns is vital in discovering new information about unknown protein functions. A Distance Constraint Model (DCM) provides a means to generate a variety of flexibility measures based on a given protein structure. Although information about mechanical properties of flexibility is critical for understanding protein function for a given protein, the question of whether certain characteristics are shared across homologous proteins is difficult to assess. For a proper assessment, a quantified measure of similarity is necessary. This paper begins to explore image processing techniques to quantify similarities in signals and images that characterize protein flexibility. The dataset considered here consists of three different families of proteins, with three proteins in each family. The similarities and differences found within flexibility measures across homologous proteins do not align with sequence-based evolutionary methods.

  4. A path-based measurement for human miRNA functional similarities using miRNA-disease associations

    Science.gov (United States)

    Ding, Pingjian; Luo, Jiawei; Xiao, Qiu; Chen, Xiangtao

    2016-09-01

    Compared with sequence and expression similarity, miRNA functional similarity is especially important for biological research and for applications such as miRNA clustering, miRNA function prediction, miRNA synergism identification and disease miRNA prioritization. However, existing methods typically rely on predicted miRNA targets, which have high false-positive and false-negative rates, to calculate miRNA functional similarity. Meanwhile, it is difficult to achieve high reliability of miRNA functional similarity with miRNA-disease associations. Therefore, improved measurement of miRNA functional similarity is increasingly needed. In this study, we develop a novel path-based method for calculating miRNA functional similarity from miRNA-disease associations, called MFSP. Compared with other methods, our method obtains higher average functional similarity for intra-family and intra-cluster selected groups and lower average functional similarity for inter-family and inter-cluster miRNA pairs. In addition, smaller p-values are achieved when applying the Wilcoxon rank-sum test and the Kruskal-Wallis test to different miRNA groups. The relationship between miRNA functional similarity and other information sources is exhibited. Furthermore, the miRNA functional network constructed with MFSP is a scale-free and small-world network. Moreover, the higher AUC for miRNA-disease prediction indicates the ability of MFSP to uncover miRNA functional similarity.

  5. Shift workers have a similar diet quality but higher energy intake than day workers

    NARCIS (Netherlands)

    Hulsegge, Gerben; Boer, Jolanda Ma; van der Beek, Allard J; Verschuren, Wm Monique; Sluijs, Ivonne; Vermeulen, Roel; Proper, Karin I

    2016-01-01

    OBJECTIVE: Shift work is associated with adverse health outcomes, and an unhealthy diet may be a contributing factor. We compared diet quantity and quality between day and shift workers, and studied exposure-response relationships regarding frequency of night shifts and years of shift work. METHODS:

  6. Lower prevalence but similar fitness in a parasitic fungus at higher radiation levels near Chernobyl

    OpenAIRE

    Aguileta, Gabriela; Badouin, Helene; Hood, Michael E; Møller, Anders Pape; Le Prieur, Stephanie; Snirc, Alodie; Siguenza, Sophie; Mousseau, Timothy A.; Shykoff, Jacqui; Cuomo, Christina A.; Giraud, Tatiana

    2016-01-01

    Nuclear disasters at Chernobyl and Fukushima provide examples of effects of acute ionizing radiation on mutations that can affect the fitness and distribution of species. Here, we investigated the prevalence of Microbotryum lychnidis-dioicae, a pollinator-transmitted fungal pathogen of plants causing anther-smut disease in Chernobyl, its viability, fertility and karyotype variation, and the accumulation of nonsynonymous mutations in its genome. We collected diseased fl...

  7. Modeling Timbre Similarity of Short Music Clips.

    Science.gov (United States)

    Siedenburg, Kai; Müllensiefen, Daniel

    2017-01-01

    There is evidence from a number of recent studies that most listeners are able to extract information related to song identity, emotion, or genre from music excerpts with durations in the range of tenths of seconds. Because of these very short durations, timbre as a multifaceted auditory attribute appears as a plausible candidate for the type of features that listeners make use of when processing short music excerpts. However, the importance of timbre in listening tasks that involve short excerpts has not yet been demonstrated empirically. Hence, the goal of this study was to develop a method for exploring to what degree similarity judgments of short music clips can be modeled with low-level acoustic features related to timbre. We utilized the similarity data from two large samples of participants: Sample I was obtained via an online survey, used 16 clips of 400 ms length, and contained responses of 137,339 participants. Sample II was collected in a lab environment, used 16 clips of 800 ms length, and contained responses from 648 participants. Our model used two sets of audio features which included commonly used timbre descriptors and the well-known Mel-frequency cepstral coefficients as well as their temporal derivatives. In order to predict pairwise similarities, the resulting distances between clips in terms of their audio features were used as predictor variables with partial least-squares regression. We found that a sparse selection of three to seven features from both descriptor sets (mainly encoding the coarse shape of the spectrum as well as spectrotemporal variability) best predicted similarities across the two sets of sounds. Notably, the inclusion of non-acoustic predictors of musical genre and record release date allowed much better generalization performance and explained up to 50% of shared variance (R²) between observations and model predictions. Overall, the results of this study empirically demonstrate that both acoustic features related

  8. Similarity of Symbol Frequency Distributions with Heavy Tails

    Directory of Open Access Journals (Sweden)

    Martin Gerlach

    2016-04-01

    Quantifying the similarity between symbolic sequences is a traditional problem in information theory which requires comparing the frequencies of symbols in different sequences. In numerous modern applications, ranging from DNA over music to texts, the distribution of symbol frequencies is characterized by heavy-tailed distributions (e.g., Zipf’s law). The large number of low-frequency symbols in these distributions poses major difficulties to the estimation of the similarity between sequences; e.g., they hinder an accurate finite-size estimation of entropies. Here, we show analytically how the systematic (bias) and statistical (fluctuations) errors in these estimations depend on the sample size N and on the exponent γ of the heavy-tailed distribution. Our results are valid for the Shannon entropy (α=1), its corresponding similarity measures (e.g., the Jensen-Shannon divergence), and also for measures based on the generalized entropy of order α. For small α’s, including α=1, the errors decay slower than the 1/N decay observed in short-tailed distributions. For α larger than a critical value α^{*}=1+1/γ≤2, the 1/N decay is recovered. We show the practical significance of our results by quantifying the evolution of the English language over the last two centuries using a complete α spectrum of measures. We find that frequent words change more slowly than less frequent words and that α=2 provides the most robust measure to quantify language change.
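
    As a rough illustration of the α=1 case, the Jensen-Shannon divergence between the symbol-frequency distributions of two sequences can be computed with a plug-in (naive frequency) estimator. The sketch below is not the authors' code: the function names are hypothetical, equal weights of 1/2 for the two sequences are assumed, and the plug-in estimator is precisely the kind of finite-sample estimate whose bias the abstract analyzes.

```python
import math
from collections import Counter


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def jensen_shannon_divergence(seq_a, seq_b):
    """Plug-in JSD (in bits) between the symbol frequencies of two sequences.

    Assumes equal weights (1/2 each) and non-empty sequences.
    """
    if not seq_a or not seq_b:
        raise ValueError("both sequences must be non-empty")
    counts_a, counts_b = Counter(seq_a), Counter(seq_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    symbols = sorted(set(counts_a) | set(counts_b))
    p = [counts_a[s] / n_a for s in symbols]
    q = [counts_b[s] / n_b for s in symbols]
    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    # JSD = H(mixture) - average of the individual entropies
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2
```

    With base-2 logarithms the value lies between 0 (identical frequency distributions) and 1 bit (no shared symbols). Sequences may be strings of characters or lists of word tokens.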

  9. Is the phonological similarity effect in working memory due to proactive interference?

    Science.gov (United States)

    Baddeley, Alan D; Hitch, Graham J; Quinlan, Philip T

    2018-04-12

    Immediate serial recall of verbal material is highly sensitive to impairment attributable to phonological similarity. Although this has traditionally been interpreted as a within-sequence similarity effect, Engle (2007) proposed an interpretation based on interference from prior sequences, a phenomenon analogous to that found in the Peterson short-term memory (STM) task. We use the method of serial reconstruction to test this in an experiment contrasting the standard paradigm in which successive sequences are drawn from the same set of phonologically similar or dissimilar words and one in which the vowel sound on which similarity is based is switched from trial to trial, a manipulation analogous to that producing release from PI in the Peterson task. A substantial similarity effect occurs under both conditions although there is a small advantage from switching across similar sequences. There is, however, no evidence for the suggestion that the similarity effect will be absent from the very first sequence tested. Our results support the within-sequence similarity rather than a between-list PI interpretation. Reasons for the contrast with the classic Peterson short-term forgetting task are briefly discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  10. Estimating the annotation error rate of curated GO database sequence annotations

    Directory of Open Access Journals (Sweden)

    Brown Alfred L

    2007-05-01

    Background: Annotations that describe the function of sequences are enormously important to researchers during laboratory investigations and when making computational inferences. However, there has been little investigation into the data quality of sequence function annotations. Here we have developed a new method of estimating the error rate of curated sequence annotations, and applied this to the Gene Ontology (GO) sequence database (GOSeqLite). This method involved artificially adding errors to sequence annotations at known rates, and used regression to model the impact on the precision of annotations based on BLAST matched sequences. Results: We estimated the error rate of curated GO sequence annotations in the GOSeqLite database (March 2006) at between 28% and 30%. Annotations made without use of sequence similarity based methods (non-ISS) had an estimated error rate of between 13% and 18%. Annotations made with the use of sequence similarity methodology (ISS) had an estimated error rate of 49%. Conclusion: While the overall error rate is reasonably low, it would be prudent to treat all ISS annotations with caution. Electronic annotators that use ISS annotations as the basis of predictions are likely to have higher false prediction rates, and for this reason designers of these systems should consider avoiding ISS annotations where possible. Electronic annotators that use ISS annotations to make predictions should be viewed sceptically. We recommend that curators thoroughly review ISS annotations before accepting them as valid. Overall, users of curated sequence annotations from the GO database should feel assured that they are using a comparatively high quality source of information.

  11. On different forms of self similarity

    International Nuclear Information System (INIS)

    Aswathy, R.K.; Mathew, Sunil

    2016-01-01

    Fractal geometry is mainly based on the idea of self-similar forms. To be self-similar, a shape must be able to be divided into parts that are smaller copies, which are more or less similar to the whole. There are different forms of self similarity in nature and mathematics. In this paper, some of the topological properties of super self similar sets are discussed. It is proved that in a complete metric space with two or more elements, the set of all non super self similar sets is dense in the set of all non-empty compact subsets. It is also proved that the product of self similar sets is super self similar in product metric spaces and that super self similarity is preserved under isometry. A characterization of super self similar sets using contracting sub self similarity is also presented. Some relevant counterexamples are provided. The concepts of exact super and sub self similarity are introduced and a necessary and sufficient condition for a set to be exact super self similar in terms of condensation iterated function systems (Condensation IFS’s) is obtained. A method to generate exact sub self similar sets using condensation IFS’s and the denseness of exact super self similar sets are also discussed.

  12. Sequence requirement of the ade6-4095 meiotic recombination hotspot in Schizosaccharomyces pombe.

    Science.gov (United States)

    Foulis, Steven J; Fowler, Kyle R; Steiner, Walter W

    2018-02-01

    Homologous recombination occurs at a greatly elevated frequency in meiosis compared to mitosis and is initiated by programmed double-strand DNA breaks (DSBs). DSBs do not occur at uniform frequency throughout the genome in most organisms, but occur preferentially at a limited number of sites referred to as hotspots. The locations of hotspots have been determined at nucleotide-level resolution in both the budding and fission yeasts, and while several patterns have emerged regarding preferred locations for DSB hotspots, it remains unclear why particular sites experience DSBs at much higher frequency than other sites with seemingly similar properties. Short sequence motifs, which are often sites for binding of transcription factors, are known to be responsible for a number of hotspots. In this study we identified the minimum sequence required for activity of one such motif identified in a screen of random sequences capable of producing recombination hotspots. The experimentally determined sequence, GGTCTRGACC, closely matches the previously inferred sequence. Full hotspot activity requires an effective sequence length of 9.5 bp, whereas moderate activity requires an effective sequence length of approximately 8.2 bp and shows significant association with DSB hotspots. In combination with our previous work, this result is consistent with a large number of different sequence motifs capable of producing recombination hotspots, and supports a model in which hotspots can be rapidly regenerated by mutation as they are lost through recombination.
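
    The motif GGTCTRGACC uses the IUPAC nucleotide code R, which stands for a purine (A or G). A minimal sketch of how such a degenerate motif could be located in a DNA string is given below; the function names and the partial IUPAC table are illustrative assumptions, only the given strand is scanned, and nothing here reproduces the study's experimental or statistical procedure.

```python
import re

# Partial IUPAC nucleotide table (illustrative subset, not the full code).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def motif_to_regex(motif):
    """Translate an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif="GGTCTRGACC"):
    """Return 0-based start positions of every motif occurrence on the given strand."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]
```

    For example, find_motif("AAGGTCTAGACCTT") returns [2], while a sequence carrying C at the R position, such as "GGTCTCGACC", returns an empty list.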

  13. Complete genome sequence of an isolate of Potato virus X (PVX) infecting Cape gooseberry (Physalis peruviana) in Colombia.

    Science.gov (United States)

    Gutiérrez, Pablo A; Alzate, Juan F; Montoya, Mauricio Marín

    2015-06-01

    Transcriptome analysis of a Cape gooseberry (Physalis peruviana) plant with leaf symptoms of a mild yellow mosaic typical of a viral disease revealed an infection with Potato virus X (PVX). The genome sequence of the PVX-Physalis isolate comprises 6435 nt and exhibits higher sequence similarity to members of the Eurasian group of PVX (~95 %) than to the American group (~77 %). Genome organization is similar to other PVX isolates with five open reading frames coding for proteins RdRp, TGBp1, TGBp2, TGBp3, and CP. 5' and 3' untranslated regions revealed all regulatory motifs typically found in PVX isolates. The PVX-Physalis genome is the only complete sequence available for a Potexvirus in Colombia and is a new addition to the restricted number of available sequences of PVX isolates infecting plant species different to potato.

  14. Microbiota epitope similarity either dampens or enhances the immunogenicity of disease-associated antigenic epitopes.

    Directory of Open Access Journals (Sweden)

    Sebastian Carrasco Pro

    The microbiome influences adaptive immunity and molecular mimicry influences T cell reactivity. Here, we evaluated whether the sequence similarity of various antigens to the microbiota dampens or increases immunogenicity of T cell epitopes. Sets of epitopes and control sequences derived from 38 antigenic categories (infectious pathogens, allergens, autoantigens) were retrieved from the Immune Epitope Database (IEDB). Their similarity to microbiome sequences was calculated using the BLOSUM62 matrix. We found that sequence similarity was associated with either dampened (tolerogenic; e.g. most allergens) or increased (inflammatory; e.g. Dengue and West Nile viruses) likelihood of a peptide being immunogenic as a function of epitope source category. Ten-fold cross-validation and validation using sets of manually curated epitopes and non-epitopes derived from allergens were used to confirm these initial observations. Furthermore, the genus from which the microbiome homologous sequences were derived influenced whether a tolerogenic versus inflammatory modulatory effect was observed, with Fusobacterium most associated with inflammatory influences and Bacteroides most associated with tolerogenic influences. We validated these effects using PBMCs stimulated with various sets of microbiome peptides. "Tolerogenic" microbiome peptides elicited IL-10 production, "inflammatory" peptides elicited mixed IL-10/IFNγ production, while microbiome epitopes homologous to self were completely unreactive for both cytokines. We also tested the sequence similarity of cockroach epitopes to specific microbiome sequences derived from households of cockroach allergic individuals and non-allergic controls. Microbiomes from cockroach allergic households were less likely to contain sequences homologous to previously defined cockroach allergens. These results are compatible with the hypothesis that microbiome sequences may contribute to the tolerization of T cells for allergen

  15. Sequence stratigraphy on an early wet Mars

    Science.gov (United States)

    Barker, Donald C.; Bhattacharya, Janok P.

    2018-02-01

    The evolution of Mars as a water-bearing body is of considerable interest for the understanding of its early history and evolution. The principles of terrestrial sequence stratigraphy provide a useful conceptual framework to hypothesize about the stratigraphic history of the planet's northern plains. We present a model based on the hypothesized presence of an early ocean and the accumulation of lowland sediments eroded from highland terrain during the time of the valley networks and later outflow channels. Ancient, global environmental changes, induced by a progressively cooling climate would have led to a protracted loss of surface and near surface water from low-latitudes and eventual cold-trapping at higher latitudes - resulting in a unique and prolonged, perpetual forced regression within basins and lowland depositional environments. The Messinian Salinity Crisis (MSC) serves as a potential terrestrial analogue of the depositional and environmental consequences relating to the progressive removal of large standing bodies of water. We suggest that the evolution of similar conditions on Mars would have led to the emplacement of diagnostic sequences of deposits and regional scale unconformities, consistent with intermittent resurfacing of the northern plains and the progressive loss of an early ocean by the end of the Hesperian era.

  16. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform.

    Science.gov (United States)

    Wen, Chongqing; Wu, Liyou; Qin, Yujia; Van Nostrand, Joy D; Ning, Daliang; Sun, Bo; Xue, Kai; Liu, Feifei; Deng, Ye; Liang, Yuting; Zhou, Jizhong

    2017-01-01

    Illumina's MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of 16S rRNA gene from each sample was sequenced in triplicate with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1-3%, p [...]). [...] deep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when not (44%). However, if singletons were not removed the overlap between two technical replicates (not considering sequence abundance) plateaus at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined. These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates
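
    The gap between unweighted and abundance-weighted overlap can be illustrated with a small sketch. The abstract does not give the exact overlap formulas, so the definitions below are assumptions: unweighted overlap is taken as shared OTUs divided by all OTUs seen in either replicate, and weighted overlap as the fraction of all sequences that fall in shared OTUs. The Bray-Curtis dissimilarity follows its standard definition. All function names and the toy counts are hypothetical.

```python
def present(counts):
    """Set of OTU identifiers with a non-zero count."""
    return {otu for otu, n in counts.items() if n > 0}


def otu_overlap(rep_a, rep_b):
    """Assumed definition: shared OTUs / OTUs found in either replicate."""
    a, b = present(rep_a), present(rep_b)
    return len(a & b) / len(a | b)


def weighted_otu_overlap(rep_a, rep_b):
    """Assumed definition: fraction of all sequences belonging to shared OTUs."""
    shared = present(rep_a) & present(rep_b)
    in_shared = sum(rep_a[o] + rep_b[o] for o in shared)
    return in_shared / (sum(rep_a.values()) + sum(rep_b.values()))


def bray_curtis(rep_a, rep_b):
    """Standard Bray-Curtis dissimilarity between two count profiles."""
    otus = set(rep_a) | set(rep_b)
    shared_min = sum(min(rep_a.get(o, 0), rep_b.get(o, 0)) for o in otus)
    return 1 - 2 * shared_min / (sum(rep_a.values()) + sum(rep_b.values()))


# Toy replicates: two abundant shared OTUs plus one singleton unique to each.
rep_1 = {"otu1": 90, "otu2": 9, "otu3": 1}
rep_2 = {"otu1": 85, "otu2": 14, "otu4": 1}
```

    With these toy counts the unweighted overlap is 0.5, the weighted overlap is 0.99 and the Bray-Curtis dissimilarity is about 0.06, which mirrors how rare OTUs can depress unweighted overlap while abundance-based measures stay close.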

  17. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  18. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  19. The relationships within the mathematical content of teachers’ lesson sequences

    Science.gov (United States)

    Shahrill, M.; Prahmana, R. C. I.; Roslan, R.

    2017-12-01

    This study explored how mathematics content is carried through by means of the problems presented during lessons. Following the definitions and the coding criteria from the TIMSS 1999 Video Study, a total of 163 mathematics problems were identified in the video-recorded lesson sequences of four Bruneian mathematics teachers teaching at the Year 8 level. These problems were classified according to the four basic kinds of relationships: mathematically related, thematically related, repetition and unrelated. Drawing on the mathematical content of the teachers’ lesson sequences, the findings revealed variations among the mathematical problems coded as repetition and thematically related, between the four Brunei classes. The aggregated results obtained from the four classes highlighted several points of discussion, such as the relatively higher proportion of repetition problems (52%) from one teacher in particular; the percentage similarities of thematically related problems for all four classes (ranging from 26% to 33%); and the incredibly varied results for mathematically related problems across the four Brunei classes.

  20. Sequence Factorization with Multiple References.

    Directory of Open Access Journals (Sweden)

    Sebastian Wandelt

    The success of high-throughput sequencing has led to an increasing number of projects which sequence large populations of a species. Storage and analysis of sequence data is a key challenge in these projects, because of the sheer size of the datasets. Compression is one simple technology to deal with this challenge. Referential factorization and compression schemes, which store only the differences between input sequence and a reference sequence, gained lots of interest in this field. Highly similar sequences, e.g., human genomes, can be compressed with a compression ratio of 1,000:1 and more, up to two orders of magnitude better than with standard compression techniques. Recently, it was shown that the compression against multiple references from the same species can boost the compression ratio up to 4,000:1. However, a detailed analysis of using multiple references is lacking, e.g., for main memory consumption and optimality. In this paper, we describe one key technique for the referential compression against multiple references: the factorization of sequences. Based on the notion of an optimal factorization, we propose optimization heuristics and identify parameter settings which greatly influence (1) the size of the factorization, (2) the time for factorization, and (3) the required amount of main memory. We evaluate a total of 30 setups with a varying number of references on data from three different species. Our results show a wide range of factorization sizes (optimal to an overhead of up to 300%), factorization speed (0.01 MB/s to more than 600 MB/s), and main memory usage (few dozen MB to dozens of GB). Based on our evaluation, we identify the best configurations for common use cases. Our evaluation shows that multi-reference factorization is much better than single-reference factorization.
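
    The idea of factorizing a sequence against one or more references can be sketched with a naive greedy scheme: at each position, take the longest substring that also occurs in some reference and store a pointer to it, falling back to a literal symbol when no sufficiently long match exists. This is a deliberately simple, quadratic-time illustration, not the algorithm or heuristics evaluated in the paper; the function names, the tuple layout and the min_match threshold are all assumptions.

```python
def longest_match(target, i, ref):
    """Return (start, length) of the longest prefix of target[i:] that occurs in ref."""
    best_start, best_len = -1, 0
    length = 1
    while i + length <= len(target):
        start = ref.find(target[i:i + length])
        if start == -1:
            break  # a longer prefix cannot occur if this one does not
        best_start, best_len = start, length
        length += 1
    return best_start, best_len


def factorize(target, references, min_match=4):
    """Greedy factorization of target against a non-empty list of references.

    Emits ("match", ref_id, start, length) or ("literal", symbol) tuples.
    """
    factors, i = [], 0
    while i < len(target):
        candidates = [(longest_match(target, i, ref), ref_id)
                      for ref_id, ref in enumerate(references)]
        (start, length), ref_id = max(candidates, key=lambda c: c[0][1])
        if length >= min_match:
            factors.append(("match", ref_id, start, length))
            i += length
        else:
            factors.append(("literal", target[i]))
            i += 1
    return factors


def reconstruct(factors, references):
    """Invert factorize(): rebuild the original sequence from its factors."""
    parts = []
    for factor in factors:
        if factor[0] == "match":
            _, ref_id, start, length = factor
            parts.append(references[ref_id][start:start + length])
        else:
            parts.append(factor[1])
    return "".join(parts)
```

    As a toy example, factorizing "TTTTGGGGAAAACCCC" against the single reference "AAAACCCCGGGG" yields six factors (four literals and two matches), whereas adding "TTTTGGGGAAAA" as a second reference reduces this to two matches. Real implementations would replace the repeated substring search with an index over the references.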

  1. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  2. Large margin classification with indefinite similarities

    KAUST Repository

    Alabdulmohsin, Ibrahim; Cisse, Moustapha; Gao, Xin; Zhang, Xiangliang

    2016-01-01

    Classification with indefinite similarities has attracted attention in the machine learning community. This is partly due to the fact that many similarity functions that arise in practice are not symmetric positive semidefinite, i.e. the Mercer

  3. Testing Self-Similarity Through Lamperti Transformations

    KAUST Repository

    Lee, Myoungji; Genton, Marc G.; Jun, Mikyoung

    2016-01-01

    extensively, while statistical tests for self-similarity are scarce and limited to processes indexed in one dimension. This paper proposes a statistical hypothesis test procedure for self-similarity of a stochastic process indexed in one dimension and multi

  4. A Comparison of Molecular Typing Methods Applied to Enterobacter cloacae complex: hsp60 Sequencing, Rep-PCR, and MLST

    Directory of Open Access Journals (Sweden)

    Roberto Viau

    2017-02-01

    Molecular typing using repetitive sequence-based PCR (rep-PCR) and hsp60 sequencing was applied to a collection of diverse Enterobacter cloacae complex isolates. To determine the most practical method for reference laboratories, we analyzed 71 E. cloacae complex isolates from sporadic and outbreak occurrences originating from 4 geographic areas. While rep-PCR was more discriminating, hsp60 sequencing provided a broader and a more objective geographical tracking method similar to multilocus sequence typing (MLST). In addition, we suggest that MLST may have higher discriminative power compared to hsp60 sequencing, although rep-PCR remains the most discriminative method for local outbreak investigations. Rep-PCR can also be an effective and inexpensive method for local outbreak investigation.

  5. Whole-body magnetic resonance imaging for staging and follow-up of pediatric patients with Hodgkin's lymphoma: comparison of different sequences

    International Nuclear Information System (INIS)

    Nava, Daniel; Oliveira, Heverton Cesar de

    2011-01-01

    Objective: to compare the performance of the T1, T2, STIR and DWIBS (diffusion-weighted whole-body imaging with background body signal suppression) sequences in the staging and follow-up of pediatric patients with Hodgkin's lymphoma in lymph node chains, parenchymal organs and bone marrow, and to evaluate interobserver agreement. Materials and methods: the authors studied 12 patients with confirmed diagnosis of Hodgkin's lymphoma. The patients were referred for whole body magnetic resonance imaging with T1-weighted, T2-weighted, STIR and DWIBS sequences. Results: the number of lymph node sites characterized as affected by the disease on T1- and T2-weighted sequences showed similar results (8 sites for both sequences), but lower than DWIBS and STIR sequences (11 and 12 sites, respectively). The bone marrow involvement by lymphoma showed the same values for the T1-, T2-weighted and DWIBS sequences (17 lesions), higher than the value found on STIR (13 lesions). A high rate of interobserver agreement was observed as the four sequences were analyzed. Conclusion: STIR and DWIBS sequences detected the highest number of lymph node sites characterized as affected by the disease. Similar results were demonstrated by all the sequences in the evaluation of parenchymal organs and bone marrow. A high interobserver agreement was observed as the four sequences were analyzed. (author)

  6. Personality similarity and life satisfaction in couples

    OpenAIRE

    Furler Katrin; Gomez Veronica; Grob Alexander

    2013-01-01

    The present study examined the association between personality similarity and life satisfaction in a large nationally representative sample of 1608 romantic couples. Similarity effects were computed for the Big Five personality traits as well as for personality profiles with global and differentiated indices of similarity. Results showed substantial actor and partner effects indicating that both partners' personality traits were related to both partners' life satisfaction. Personality similar...

  7. Retinoid-binding proteins: similar protein architectures bind similar ligands via completely different ways.

    Directory of Open Access Journals (Sweden)

    Yu-Ru Zhang

    BACKGROUND: Retinoids are a class of compounds that are chemically related to vitamin A, which is an essential nutrient that plays a key role in vision, cell growth and differentiation. In vivo, retinoids must bind with specific proteins to perform their necessary functions. Plasma retinol-binding protein (RBP) and epididymal retinoic acid binding protein (ERABP) carry retinoids in bodily fluids, while cellular retinol-binding proteins (CRBPs) and cellular retinoic acid-binding proteins (CRABPs) carry retinoids within cells. Interestingly, although all of these transport proteins possess similar structures, the modes of binding for the different retinoid ligands with their carrier proteins are different. METHODOLOGY/PRINCIPAL FINDINGS: In this work, we analyzed the various retinoid transport mechanisms using structure and sequence comparisons, binding site analyses and molecular dynamics simulations. Our results show that within the same family of proteins and subcellular location, the orientation of a retinoid molecule within a binding protein is the same, whereas when different families of proteins are considered, the orientation of the bound retinoid is completely different. In addition, none of the amino acid residues involved in ligand binding is conserved between the transport proteins. However, for each specific binding protein, the amino acids involved in the ligand binding are conserved. The results of this study allow us to propose a possible transport model for retinoids. CONCLUSIONS/SIGNIFICANCE: Our results reveal the differences in the binding modes between the different retinoid-binding proteins.

  8. DNA SEQUENCE SIMILARITY REQUIREMENTS FOR INTERSPECIFIC RECOMBINATION IN BACILLUS. (R825348)

    Science.gov (United States)

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  9. Similar network activated by young and elderly adults during the acquisition of a motor sequence.

    NARCIS (Netherlands)

    Daselaar, S.M.; Veltman, D.J.; Rombouts, S.A.R.B.; Raaijmakers, J.G.W.; Jonker, C.

    2003-01-01

    Age-related impairments in episodic memory have been related to a deficiency in semantic processing, based on the finding that elderly adults typically benefit less than young adults from deep, semantic as opposed to shallow, nonsemantic processing of study items. In the present study, we tested the

  10. Scaling Relations of Local Magnitude versus Moment Magnitude for Sequences of Similar Earthquakes in Switzerland

    KAUST Repository

    Bethmann, F.; Deichmann, N.; Mai, Paul Martin

    2011-01-01

    Theoretical considerations and empirical regressions show that, in the magnitude range between 3 and 5, local magnitude, ML, and moment magnitude, Mw, scale 1:1. Previous studies suggest that for smaller magnitudes this 1:1 scaling breaks down

  11. An approach to large scale identification of non-obvious structural similarities between proteins

    Science.gov (United States)

    Cherkasov, Artem; Jones, Steven JM

    2004-01-01

    Background: A new sequence-independent bioinformatics approach allowing genome-wide search for proteins with similar three-dimensional structures has been developed. By utilizing the numerical output of the sequence threading it establishes putative non-obvious structural similarities between proteins. When applied to the testing set of proteins with known three-dimensional structures the developed approach was able to recognize structurally similar proteins with high accuracy. Results: The method has been developed to identify pathogenic proteins with low sequence identity and high structural similarity to host analogues. Such protein structure relationships would be hypothesized to arise through convergent evolution or through ancient horizontal gene transfer events, now undetectable using current sequence alignment techniques. The pathogen proteins, which could mimic or interfere with host activities, would represent candidate virulence factors. The developed approach utilizes the numerical outputs from the sequence-structure threading. It identifies the potential structural similarity between a pair of proteins by correlating the threading scores of the corresponding two primary sequences against the library of the standard folds. This approach allowed up to 64% sensitivity and 99.9% specificity in distinguishing protein pairs with high structural similarity. Conclusion: Preliminary results obtained by comparison of the genomes of Homo sapiens and several strains of Chlamydia trachomatis have demonstrated the potential usefulness of the method in the identification of bacterial proteins with known or potential roles in virulence. PMID:15147578

  12. An approach to large scale identification of non-obvious structural similarities between proteins

    Directory of Open Access Journals (Sweden)

    Cherkasov Artem

    2004-05-01

    Background: A new sequence-independent bioinformatics approach allowing genome-wide search for proteins with similar three-dimensional structures has been developed. By utilizing the numerical output of the sequence threading it establishes putative non-obvious structural similarities between proteins. When applied to the testing set of proteins with known three-dimensional structures the developed approach was able to recognize structurally similar proteins with high accuracy. Results: The method has been developed to identify pathogenic proteins with low sequence identity and high structural similarity to host analogues. Such protein structure relationships would be hypothesized to arise through convergent evolution or through ancient horizontal gene transfer events, now undetectable using current sequence alignment techniques. The pathogen proteins, which could mimic or interfere with host activities, would represent candidate virulence factors. The developed approach utilizes the numerical outputs from the sequence-structure threading. It identifies the potential structural similarity between a pair of proteins by correlating the threading scores of the corresponding two primary sequences against the library of the standard folds. This approach allowed up to 64% sensitivity and 99.9% specificity in distinguishing protein pairs with high structural similarity. Conclusion: Preliminary results obtained by comparison of the genomes of Homo sapiens and several strains of Chlamydia trachomatis have demonstrated the potential usefulness of the method in the identification of bacterial proteins with known or potential roles in virulence.

  13. Average is Boring: How Similarity Kills a Meme's Success

    Science.gov (United States)

    Coscia, Michele

    2014-09-01

    Every day we are exposed to different ideas, or memes, competing with each other for our attention. Previous research explained popularity and persistence heterogeneity of memes by assuming them in competition for limited attention resources, distributed in a heterogeneous social network. Little has been said about what characteristics make a specific meme more likely to be successful. We propose a similarity-based explanation: memes with higher similarity to other memes have a significant disadvantage in their potential popularity. We employ a meme similarity measure based on semantic text analysis and computer vision to prove that a meme is more likely to be successful and to thrive if its characteristics make it unique. Our results show that indeed successful memes are located in the periphery of the meme similarity space and that our similarity measure is a promising predictor of a meme success.

  14. Long sequence correlation coprocessor

    Science.gov (United States)

    Gage, Douglas W.

    1994-09-01

    A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation scores for several (for example, 16) adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence and increments the counter if the bits are the same. A shift register ensures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence, so that the different counters simultaneously calculate the correlation coefficients for different alignments of the two sequences.
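
    The counting scheme described in this abstract can be sketched in software. The following is only an illustrative model, not the patented circuit: the function name `correlation_scores` and its parameters are assumptions made for this example, and the inner loop runs sequentially where the hardware is described as using parallel counters.

```python
def correlation_scores(first, second, num_alignments=16):
    """Count matching bits for several adjacent alignments of two bit sequences.

    scores[k] is the number of positions i with first[i] == second[i + k].
    """
    # Number of bits of `first` that have a partner in `second` at every offset.
    usable = min(len(first), len(second) - num_alignments + 1)
    if usable <= 0:
        raise ValueError("second sequence is too short for the requested alignments")
    scores = [0] * num_alignments          # one counter per alignment
    for i in range(usable):                # one pass per bit of the first sequence
        bit = first[i]
        for k in range(num_alignments):    # parallel in the hardware described above
            if bit == second[i + k]:
                scores[k] += 1             # increment only when the bits agree
    return scores


first = [1, 0, 1, 1]
second = [1, 0, 1, 1, 0, 1]
scores = correlation_scores(first, second, num_alignments=3)
print(scores)
```

    The alignment with the largest count is the best bitwise match among the offsets examined.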

  15. Anomaly Detection in Sequences

    Data.gov (United States)

    National Aeronautics and Space Administration — We present a set of novel algorithms which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that...

  16. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  17. sequenceMiner algorithm

    Data.gov (United States)

    National Aeronautics and Space Administration — Detecting and describing anomalies in large repositories of discrete symbol sequences. sequenceMiner has been open-sourced! Download the file below to try it out....

  18. UNSOLVED AND LATENT CRIME: DIFFERENCES AND SIMILARITIES

    Directory of Open Access Journals (Sweden)

    Mikhail Kleymenov

    2017-01-01

    UDC 343. The purpose of the article is to study the specific legal and informational nature of unsolved crime in comparison with the phenomenon of delinquency, and to analyze it specifically in order to improve the efficiency of law enforcement. The research methods are abstract-logical, systematic and statistical analysis, and the study of documents. The main results of the research are as follows. Unsolved crime has a specific legal, statistical and informational nature as a crime phenomenon, which is expressed in the cumulative statistical population of unsolved crimes. The body of unsolved crimes is the sum of the acts whose cases have been suspended but not terminated. In these cases the guilt of the perpetrator has not been proven, the cases have not been considered by a court, and no conviction has been issued. Unsolved crime must be registered. Latent crime has a different informational nature. The main feature of latent crime is that it is unknown to the law enforcement bodies charged with identifying, registering and recording crimes. Latent crime is not recorded. At the same time, there is a "border" area between latent and unsolved crime, which includes crimes concealed from registration. In modern Russia, most such crimes are concealed from registration by issuing a decision refusing to initiate a criminal case. In its criminogenic consequences, unsolved crime poses a greater danger to the public than latent crime. For the first time in the criminological literature, the article presents a dedicated analysis of the differences and similarities between unsolved and latent crime. The analysis proves the need for radical changes in the current Russian assessment of the state of crime and of law enforcement performance in solving crimes. The article argues that unsolved crime is a separate and, in contrast to latent crime, poorly understood phenomenon. However, unsolved and latent crime have common features and areas of interaction.

  19. Multifractal and higher-dimensional zeta functions

    International Nuclear Information System (INIS)

    Véhel, Jacques Lévy; Mendivil, Franklin

    2011-01-01

    In this paper, we generalize the zeta function for a fractal string (as in Lapidus and Frankenhuijsen 2006 Fractal Geometry, Complex Dimensions and Zeta Functions: Geometry and Spectra of Fractal Strings (New York: Springer)) in several directions. We first modify the zeta function to be associated with a sequence of covers instead of the usual definition involving gap lengths. This modified zeta function allows us to define both a multifractal zeta function and a zeta function for higher-dimensional fractal sets. In the multifractal case, the critical exponents of the zeta function ζ(q, s) yield the usual multifractal spectrum of the measure. The presence of complex poles for ζ(q, s) indicates oscillations in the continuous partition function of the measure, and thus gives more refined information about the multifractal spectrum of a measure. In the case of a self-similar set in R^n, the modified zeta function yields asymptotic information about both the 'box' counting function of the set and the n-dimensional volume of the ε-dilation of the set.

  20. Similarity increases altruistic punishment in humans.

    Science.gov (United States)

    Mussweiler, Thomas; Ockenfels, Axel

    2013-11-26

    Humans are attracted to similar others. As a consequence, social networks are homogeneous in sociodemographic, intrapersonal, and other characteristics--a principle called homophily. Despite abundant evidence showing the importance of interpersonal similarity and homophily for human relationships, their behavioral correlates and cognitive foundations are poorly understood. Here, we show that perceived similarity substantially increases altruistic punishment, a key mechanism underlying human cooperation. We induced (dis)similarity perception by manipulating basic cognitive mechanisms in an economic cooperation game that included a punishment phase. We found that similarity-focused participants were more willing to punish others' uncooperative behavior. This influence of similarity is not explained by group identity, which has the opposite effect on altruistic punishment. Our findings demonstrate that pure similarity promotes reciprocity in ways known to encourage cooperation. At the same time, the increased willingness to punish norm violations among similarity-focused participants provides a rationale for why similar people are more likely to build stable social relationships. Finally, our findings show that altruistic punishment is differentially involved in encouraging cooperation under pure similarity vs. in-group conditions.

  1. Characterization of a highly toxic strain of Bacillus thuringiensis serovar kurstaki very similar to the HD-73 strain.

    Science.gov (United States)

    Reinoso-Pozo, Yaritza; Del Rincón-Castro, Ma Cristina; Ibarra, Jorge E

    2016-09-01

    The LBIT-1200 strain of Bacillus thuringiensis was recently isolated from soil, and showed 6.4- and 9.5-fold increases in toxicity against Manduca sexta and Trichoplusia ni, respectively, compared to HD-73. However, LBIT-1200 was still highly similar to HD-73, including the production of bipyramidal crystals containing only one protein of ∼130 kDa, a flagellin gene sequence related to the kurstaki serotype, plasmid and RepPCR patterns similar to HD-73, no production of β-exotoxin and no presence of VIP genes. Sequencing of its cry gene showed the presence of a cry1Ac-type gene with four amino acid differences, including two amino acid replacements in domain III, compared to Cry1Ac1, which may explain its higher toxicity. In conclusion, the LBIT-1200 strain is a variant of the HD-73 strain but shows a much higher toxicity, which makes this new strain an important candidate to be developed as a bioinsecticide once it passes other tests throughout its biotechnological development. © FEMS 2016.

  2. Comparative genome sequencing of Drosophila pseudoobscura: Chromosomal, gene and cis-element evolution

    Energy Technology Data Exchange (ETDEWEB)

    Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Todd, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catherine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenee; Verduzco, Daniel; Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.

    2004-04-01

    The genome sequence of a second fruit fly, D. pseudoobscura, presents an opportunity for comparative analysis of a primary model organism D. melanogaster. The vast majority of Drosophila genes have remained on the same arm, but within each arm gene order has been extensively reshuffled leading to the identification of approximately 1300 syntenic blocks. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 35 My since divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome wide average consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than control sequences between the species but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a picture of repeat mediated chromosomal rearrangement, and high co-adaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila.

  3. Notions of similarity for systems biology models.

    Science.gov (United States)

    Henkel, Ron; Hoehndorf, Robert; Kacprowski, Tim; Knüpfer, Christian; Liebermeister, Wolfram; Waltemath, Dagmar

    2018-01-01

    Systems biology models are rapidly increasing in complexity, size and numbers. When building large models, researchers rely on software tools for the retrieval, comparison, combination and merging of models, as well as for version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of 'similarity' may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here we survey existing methods for the comparison of models, introduce quantitative measures for model similarity, and discuss potential applications of combined similarity measures. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on a combination of different model aspects. The six aspects that we define as potentially relevant for similarity are underlying encoding, references to biological entities, quantitative behaviour, qualitative behaviour, mathematical equations and parameters and network structure. We argue that future similarity measures will benefit from combining these model aspects in flexible, problem-specific ways to mimic users' intuition about model similarity, and to support complex model searches in databases. © The Author 2016. Published by Oxford University Press.

  4. Similar speaker recognition using nonlinear analysis

    International Nuclear Information System (INIS)

    Seo, J.P.; Kim, M.S.; Baek, I.C.; Kwon, Y.H.; Lee, K.S.; Chang, S.W.; Yang, S.I.

    2004-01-01

    Speech features in conventional speaker identification systems are usually obtained by linear methods in spectral space. However, these methods have the drawback that speakers with similar voices cannot be distinguished, because the characteristics of their voices are also similar in spectral space. To overcome this difficulty with linear methods, we propose to use the correlation exponent in the nonlinear space as a new feature vector for speaker identification among persons with similar voices. We show that our proposed method surprisingly reduces the error rate of the speaker identification system for speakers with similar voices.

  5. Self-similar solutions for implosion and reflection of coalesced shocks in a plasma : spherical and cylindrical geometries

    International Nuclear Information System (INIS)

    Chavda, L.K.

    1978-01-01

    Approximate analytic solutions to the self-similar equations of gas dynamics for a plasma, treated as an ideal gas with specific heat ratio γ=5/3 are obtained for the implosion and subsequent reflection of various types of shock sequences in spherical and cylindrical geometries. This is based on the lowest-order polynomial approximation in the reduced fluid velocity, for a suitable nonlinear function of the sound velocity and the fluid velocity. However, the method developed here is powerful enough to be extended analytically to higher order polynomial approximations, to obtain successive approximations to the exact self-similar solutions. Also obtained, for the first time, are exact asymptotic solutions, in analytic form, for the reflected shocks. Criteria are given that may enable one to make a choice between the two geometries for maximising compression or temperature of the gas. These solutions should be useful in the study of inertial confinement of a plasma. (author)

  6. Two genes with similarity to bacterial response regulators are rapidly and specifically induced by cytokinin in Arabidopsis

    Science.gov (United States)

    Brandstatter, I.; Kieber, J. J.; Evans, M. L. (Principal Investigator)

    1998-01-01

    Cytokinins are central regulators of plant growth and development, but little is known about their mode of action. By using differential display, we identified a gene, IBC6 (for induced by cytokinin), from etiolated Arabidopsis seedlings, that is induced rapidly by cytokinin. The steady state level of IBC6 mRNA was elevated within 10 min by the exogenous application of cytokinin, and this induction did not require de novo protein synthesis. IBC6 was not induced by other plant hormones or by light. A second Arabidopsis gene with a sequence highly similar to IBC6 was identified. This IBC7 gene also was induced by cytokinin, although with somewhat slower kinetics and to a lesser extent. The pattern of expression of the two genes was similar, with higher expression in leaves, rachises, and flowers and lower transcript levels in roots and siliques. Sequence analysis revealed that IBC6 and IBC7 are similar to the receiver domain of bacterial two-component response regulators. This homology, coupled with previously published work on the CKI1 histidine kinase homolog, suggests that these proteins may play a role in early cytokinin signaling.

  7. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that, despite this reduction of information, LSE allows meaningful information to be extracted for the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided, and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
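
    A block-wise entropy of this kind might be computed as in the sketch below. It assumes non-overlapping blocks and single-base frequencies; the abstract does not give the exact definition, so the function name `local_shannon_entropy`, the block layout and the choice of symbols are assumptions for illustration only.

```python
import math
from collections import Counter


def local_shannon_entropy(sequence, block_size):
    """Return the Shannon entropy (in bits) of each non-overlapping block."""
    entropies = []
    # Step through complete blocks only; a trailing partial block is ignored.
    for start in range(0, len(sequence) - block_size + 1, block_size):
        block = sequence[start:start + block_size]
        counts = Counter(block)
        # H = sum over symbols of p * log2(1 / p), with p the symbol frequency.
        h = sum((c / block_size) * math.log2(block_size / c) for c in counts.values())
        entropies.append(h)
    return entropies


mixed = local_shannon_entropy("ACGTAAAA", 4)   # a varied block, then a uniform one
repeat = local_shannon_entropy("ATATATAT", 4)  # a simple tandem repeat
print(mixed, repeat)
```

    In this toy input the tandem repeat gives the same value in every block, which is the kind of low fluctuation the abstract associates with repetitive regions.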

  8. On self-similar Tolman models

    International Nuclear Information System (INIS)

    Maharaj, S.D.

    1988-01-01

    The self-similar spherically symmetric solutions of the Einstein field equation for the case of dust are identified. These form a subclass of the Tolman models. These self-similar models contain the solution recently presented by Chi [J. Math. Phys. 28, 1539 (1987)], thereby refuting the claim of having found a new solution to the Einstein field equations

  9. Mining Diagnostic Assessment Data for Concept Similarity

    Science.gov (United States)

    Madhyastha, Tara; Hunt, Earl

    2009-01-01

    This paper introduces a method for mining multiple-choice assessment data for similarity of the concepts represented by the multiple choice responses. The resulting similarity matrix can be used to visualize the distance between concepts in a lower-dimensional space. This gives an instructor a visualization of the relative difficulty of concepts…

  10. Similarity indices I: what do they measure

    International Nuclear Information System (INIS)

    Johnston, J.W.

    1976-11-01

    A method for estimating the effects of environmental effusions on ecosystems is described. The characteristics of 25 similarity indices used in studies of ecological communities were investigated. The type of data structure, to which these indices are frequently applied, was described as consisting of vectors of measurements on attributes (species) observed in a set of samples. A general similarity index was characterized as the result of a two-step process defined on a pair of vectors. In the first step an attribute similarity score is obtained for each attribute by comparing the attribute values observed in the pair of vectors. The result is a vector of attribute similarity scores. These are combined in the second step to arrive at the similarity index. The operation in the first step was characterized as a function, g, defined on pairs of attribute values. The second operation was characterized as a function, F, defined on the vector of attribute similarity scores from the first step. Usually, F was a simple sum or weighted sum of the attribute similarity scores. It is concluded that similarity indices should not be used as the test statistic to discriminate between two ecological communities
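
    The two-step construction described in this abstract can be written down directly. The sketch below keeps the abstract's names g and F but fills them with illustrative choices (a min/max ratio and a simple mean) that are assumptions of this example, not necessarily among the 25 indices studied.

```python
def similarity_index(x, y, g, F):
    """Two-step index: g scores each attribute pair, F combines the scores."""
    if len(x) != len(y):
        raise ValueError("both samples must list the same attributes")
    attribute_scores = [g(a, b) for a, b in zip(x, y)]  # step 1: per-attribute scores
    return F(attribute_scores)                          # step 2: combine into one index


def min_max_ratio(a, b):
    """Illustrative g: ratio of the smaller to the larger value (1.0 if both are 0)."""
    largest = max(a, b)
    return 1.0 if largest == 0 else min(a, b) / largest


def mean(scores):
    """Illustrative F: an equally weighted sum of the attribute scores."""
    return sum(scores) / len(scores)


sample_a = [10, 0, 5]   # hypothetical species counts in one sample
sample_b = [5, 0, 5]    # hypothetical species counts in another sample
index = similarity_index(sample_a, sample_b, min_max_ratio, mean)
print(index)
```

    Swapping in a different g or a weighted F changes the index without changing the overall structure, which is the point of the characterization.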

  11. Measuring transferring similarity via local information

    Science.gov (United States)

    Yin, Likang; Deng, Yong

    2018-05-01

    Recommender systems have developed along with web science, and how to measure the similarity between users is crucial for collaborative filtering recommendation. Many efficient models (e.g., the Pearson coefficient) have been proposed to measure the direct correlation. However, direct correlation measures are greatly affected by the sparsity of the dataset. In other words, they would present an inauthentic similarity if two users have very few commonly selected objects. Transferring similarity overcomes this drawback by considering their common neighbors (i.e., the intermediates). Yet transferring similarity also has a drawback, since it can only provide an interval of similarity. To overcome these limitations, we propose the Belief Transferring Similarity (BTS) model. The contributions of the BTS model are: (1) it addresses the sparsity of the dataset by considering high-order similarity; (2) it transforms an uncertain interval into a certain state based on fuzzy systems theory; (3) it is able to combine the transferring similarity of different intermediates using an information fusion method. Finally, we compare the BTS model with nine different link prediction methods in nine different networks, and we also illustrate the convergence property and efficiency of the BTS model.
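
    The sparsity problem mentioned for direct correlation measures can be seen in a small sketch of the Pearson coefficient restricted to commonly selected objects. This illustrates only the baseline measure, not the BTS model; the function name, the dictionary layout and the ratings are hypothetical.

```python
import math


def pearson_on_common(ratings_u, ratings_v):
    """Pearson correlation over objects rated by both users.

    Returns (coefficient or None, number of common objects).
    """
    common = sorted(set(ratings_u) & set(ratings_v))
    n = len(common)
    if n < 2:
        return None, n                       # correlation is undefined
    xs = [ratings_u[k] for k in common]
    ys = [ratings_v[k] for k in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None, n                       # a constant rater has no correlation
    return cov / math.sqrt(var_x * var_y), n


user_u = {"a": 5, "b": 1, "c": 4, "d": 2}    # hypothetical ratings
user_v = {"a": 4, "b": 2, "x": 5}            # shares only "a" and "b" with user_u
coefficient, overlap = pearson_on_common(user_u, user_v)
print(coefficient, overlap)
```

    With only two objects in common the coefficient is forced to an extreme value, which is why a similarity based on so little overlap can be misleading.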

  12. On distributional assumptions and whitened cosine similarities

    DEFF Research Database (Denmark)

    Loog, Marco

    2008-01-01

    Recently, an interpretation of the whitened cosine similarity measure as a Bayes decision rule was proposed (C. Liu, "The Bayes Decision Rule Induced Similarity Measures," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1086-1090, June 2007). This communication makes th...

  13. Self-Similar Traffic In Wireless Networks

    OpenAIRE

    Jerjomins, R.; Petersons, E.

    2005-01-01

    Many studies have shown that traffic in Ethernet and other wired networks is self-similar. This paper reveals that wireless network traffic is also self-similar and long-range dependent by analyzing a large amount of data captured from a wireless router.

  14. Similarity Structure of Wave-Collapse

    DEFF Research Database (Denmark)

    Rypdal, Kristoffer; Juul Rasmussen, Jens; Thomsen, Kenneth

    1985-01-01

    Similarity transformations of the cubic Schrödinger equation (CSE) are investigated. The transformations are used to remove the explicit time variation in the CSE and reduce it to differential equations in the spatial variables only. Two different methods for similarity reduction are employed and...

  15. Similarity indices I: what do they measure.

    Energy Technology Data Exchange (ETDEWEB)

    Johnston, J.W.

    1976-11-01

    A method for estimating the effects of environmental effusions on ecosystems is described. The characteristics of 25 similarity indices used in studies of ecological communities were investigated. The type of data structure, to which these indices are frequently applied, was described as consisting of vectors of measurements on attributes (species) observed in a set of samples. A general similarity index was characterized as the result of a two-step process defined on a pair of vectors. In the first step an attribute similarity score is obtained for each attribute by comparing the attribute values observed in the pair of vectors. The result is a vector of attribute similarity scores. These are combined in the second step to arrive at the similarity index. The operation in the first step was characterized as a function, g, defined on pairs of attribute values. The second operation was characterized as a function, F, defined on the vector of attribute similarity scores from the first step. Usually, F was a simple sum or weighted sum of the attribute similarity scores. It is concluded that similarity indices should not be used as the test statistic to discriminate between two ecological communities.

  16. Self-similar continued root approximants

    International Nuclear Information System (INIS)

    Gluzman, S.; Yukalov, V.I.

    2012-01-01

    A novel method of summing asymptotic series is advanced. Such series repeatedly arise when employing perturbation theory in powers of a small parameter for complicated problems of condensed matter physics, statistical physics, and various applied problems. The method is based on the self-similar approximation theory involving self-similar root approximants. The constructed self-similar continued roots extrapolate asymptotic series to finite values of the expansion parameter. The self-similar continued roots contain, as a particular case, continued fractions and Padé approximants. A theorem on the convergence of the self-similar continued roots is proved. The method is illustrated by several examples from condensed-matter physics.

  17. Correlation between social proximity and mobility similarity.

    Science.gov (United States)

    Fan, Chao; Liu, Yiding; Huang, Junming; Rong, Zhihai; Zhou, Tao

    2017-09-20

    Human behaviors exhibit ubiquitous correlations in many aspects, such as individual and collective levels, temporal and spatial dimensions, and content, social and geographical layers. With rich Internet data on online behaviors becoming available, exploring human mobility similarity from the perspective of social network proximity has attracted academic interest. Existing analysis shows a strong correlation between online social proximity and offline mobility similarity; namely, mobile records between friends are significantly more similar than between strangers, and those between friends with common neighbors are even more similar. We argue for the importance of the number and diversity of common friends, with the counterintuitive finding that the number of common friends has no positive impact on mobility similarity while the diversity plays a key role, which disagrees with previous studies. Our analysis provides a novel view for better understanding the coupling between human online and offline behaviors, and will help model and predict human behaviors based on social proximity.

  18. Scalar Similarity for Relaxed Eddy Accumulation Methods

    Science.gov (United States)

    Ruppert, Johannes; Thomas, Christoph; Foken, Thomas

    2006-07-01

    The relaxed eddy accumulation (REA) method allows the measurement of trace gas fluxes when no fast sensors are available for eddy covariance measurements. The flux parameterisation used in REA is based on the assumption of scalar similarity, i.e., similarity of the turbulent exchange of two scalar quantities. In this study changes in scalar similarity between carbon dioxide, sonic temperature and water vapour were assessed using scalar correlation coefficients and spectral analysis. The influence on REA measurements was assessed by simulation. The evaluation is based on observations over grassland, irrigated cotton plantation and spruce forest. Scalar similarity between carbon dioxide, sonic temperature and water vapour showed a distinct diurnal pattern and change within the day. Poor scalar similarity was found to be linked to dissimilarities in the energy contained in the low frequency part of the turbulent spectra ( definition.

  19. Surf similarity and solitary wave runup

    DEFF Research Database (Denmark)

    Fuhrman, David R.; Madsen, Per A.

    2008-01-01

    The notion of surf similarity in the runup of solitary waves is revisited. We show that the surf similarity parameter for solitary waves may be effectively reduced to the beach slope divided by the offshore wave height to depth ratio. This clarifies its physical interpretation relative to a previous parameterization, which was not given in an explicit form. Good coherency with experimental (breaking) runup data is preserved with this simpler parameter. A recasting of analytical (nonbreaking) runup expressions for sinusoidal and solitary waves additionally shows that they contain identical functional dependence on their respective surf similarity parameters. Important equivalencies in the runup of sinusoidal and solitary waves are thus revealed.
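    The reduced parameter described in this abstract (beach slope divided by the offshore wave height to depth ratio) can be sketched in a few lines. This is only an illustration: the function name, argument names, and example values are assumptions, and any normalisation constants used in the paper itself are not reproduced here.

```python
def solitary_surf_similarity(beach_slope: float, wave_height: float, depth: float) -> float:
    """Beach slope divided by the offshore wave-height-to-depth ratio.

    All names are hypothetical; the abstract only states the ratio in words.
    """
    if wave_height <= 0 or depth <= 0:
        raise ValueError("wave_height and depth must be positive")
    relative_height = wave_height / depth  # offshore wave height to depth ratio
    return beach_slope / relative_height


# Illustrative values only: a 1:20 beach and a 0.1 m wave in 1 m of water.
xi = solitary_surf_similarity(beach_slope=1 / 20, wave_height=0.1, depth=1.0)
print(xi)
```

    A steeper beach or a smaller relative wave height both increase the value of the parameter.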

  20. Similarity in Bilateral Isolated Internal Orbital Fractures.

    Science.gov (United States)

    Chen, Hung-Chang; Cox, Jacob T; Sanyal, Abanti; Mahoney, Nicholas R

    2018-04-13

    In evaluating patients sustaining bilateral isolated internal orbital fractures, the authors have observed both similar fracture locations and also similar expansion of orbital volumes. In this study, we aim to investigate if there is a propensity for the 2 orbits to fracture in symmetrically similar patterns when sustaining similar trauma. A retrospective chart review was performed studying all cases at our institution of bilateral isolated internal orbital fractures involving the medial wall and/or the floor at the time of presentation. The similarity of the bilateral fracture locations was evaluated using Fisher's exact test. The bilateral expanded orbital volumes were analyzed using the Wilcoxon signed-rank test to assess for orbital volume similarity. Twenty-four patients with bilateral internal orbital fractures were analyzed for fracture location similarity. Seventeen patients (70.8%) had 100% concordance in the orbital subregion fractured, and the association between the right and the left orbital fracture subregion locations was statistically significant (P < 0.0001). Fifteen patients were analyzed for orbital volume similarity. The average orbital cavity volume was 31.2 ± 3.8 cm³ on the right and 32.0 ± 3.7 cm³ on the left. There was a statistically significant difference between right and left orbital cavity volumes (P = 0.0026). The data from this study suggest that an individual who suffers isolated bilateral internal orbital fractures has a statistically significant similarity in the location of their orbital fractures. However, there does not appear to be statistically significant similarity in the expansion of the orbital volumes in these patients.

  1. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    OpenAIRE

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology that explores the capacities of computers to gain knowledge from biological data, bioinformatics. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, which is a way of arranging the sequences of DNA, RNA or protein to identify the region of similarity and relationship between se...

  2. Protein secondary structure prediction for a single-sequence using hidden semi-Markov models

    Directory of Open Access Journals (Sweden)

    Borodovsky Mark

    2006-03-01

    Background: The accuracy of protein secondary structure prediction has been improving steadily towards the 88% estimated theoretical limit. There are two types of prediction algorithms: single-sequence prediction algorithms imply that information about other (homologous) proteins is not available, while algorithms of the second type imply that information about homologous proteins is available, and use it intensively. The single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of protein secondary structure prediction from a single sequence is not as high as when the additional evolutionary information is present. Results: In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize on different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those for BSPSS as well as for PSIPRED, tested under the single-sequence condition. Conclusions: We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results are obtained under cross-validation conditions using a dataset with no pair of sequences having significant sequence similarity. As new sequences are added to the database it is possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to the improvement of function prediction for orphan proteins inscrutable

  3. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: (1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and (2) the high

  4. Globalisation and Higher Education

    NARCIS (Netherlands)

    Marginson, Simon; van der Wende, Marijk

    2007-01-01

    Economic and cultural globalisation has ushered in a new era in higher education. Higher education was always more internationally open than most sectors because of its immersion in knowledge, which never showed much respect for juridical boundaries. In global knowledge economies, higher education

  5. Measure of Node Similarity in Multilayer Networks.

    Directory of Open Access Journals (Sweden)

    Anders Mollgaard

    The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data on face-to-face contacts. We find that even strongly connected individuals are not more similar with respect to basic personality traits than randomly chosen pairs of individuals. In contrast, several socio-demographic variables have a significant degree of similarity. We further observe that similarity might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dissimilarity for the nodes connected by the strongest links. We finally analyze the overlap between layers in the network for different levels of acquaintanceships.

  6. Notions of similarity for computational biology models

    KAUST Repository

    Waltemath, Dagmar

    2016-03-21

    Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.

  7. Trajectory similarity join in spatial networks

    KAUST Repository

    Shang, Shuo

    2017-09-07

    The matching of similar pairs of objects, called similarity join, is a fundamental functionality in data management. We consider the case of trajectory similarity join (TS-Join), where the objects are trajectories of vehicles moving in road networks. Thus, given two sets of trajectories and a threshold θ, the TS-Join returns all pairs of trajectories from the two sets with similarity above θ. This join targets applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide a purposeful definition of similarity. To enable efficient TS-Join processing on large sets of trajectories, we develop search space pruning techniques and take into account the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer algorithm. For each trajectory, the algorithm first finds similar trajectories. Then it merges the results to achieve a final result. The algorithm exploits an upper bound on the spatiotemporal similarity and a heuristic scheduling strategy for search space pruning. The algorithm's per-trajectory searches are independent of each other and can be performed in parallel, and the merging has constant cost. An empirical study with real data offers insight into the performance of the algorithm and demonstrates that it is capable of outperforming a well-designed baseline algorithm by an order of magnitude.
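    The two-phase shape of the threshold join described above (independent per-trajectory searches, then a merge) can be illustrated with a naive baseline. This is a sketch under stated assumptions, not the authors' algorithm: trajectories are modelled as lists of road-network vertex IDs, the Jaccard overlap of visited vertices stands in for the paper's spatiotemporal similarity, and no pruning bounds or scheduling are implemented. All function names are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Trajectory = Sequence[Hashable]  # assumed: a sequence of road-network vertex IDs
Match = Tuple[str, str, float]   # (id in P, id in Q, similarity)


def jaccard(p: Trajectory, q: Trajectory) -> float:
    """Stand-in similarity (an assumption): overlap of visited vertices."""
    a, b = set(p), set(q)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def search_one(pid: str, p: Trajectory, Q: Dict[str, Trajectory],
               theta: float, sim: Callable[[Trajectory, Trajectory], float]) -> List[Match]:
    """Phase 1 for a single trajectory: scan Q and keep pairs above theta."""
    matches = []
    for qid, q in Q.items():
        score = sim(p, q)
        if score > theta:
            matches.append((pid, qid, score))
    return matches


def ts_join(P: Dict[str, Trajectory], Q: Dict[str, Trajectory], theta: float,
            sim: Callable[[Trajectory, Trajectory], float] = jaccard) -> List[Match]:
    # Phase 1: each search depends only on its own trajectory, so these calls
    # could be distributed across worker processes.
    partial_results = [search_one(pid, p, Q, theta, sim) for pid, p in P.items()]
    # Phase 2: merge the per-trajectory results into one answer set.
    return [match for part in partial_results for match in part]


P = {"p1": [1, 2, 3, 4], "p2": [7, 8]}
Q = {"q1": [2, 3, 4, 5], "q2": [8, 9]}
print(ts_join(P, Q, theta=0.5))
```

    Because `search_one` never reads the results of another search, the list comprehension in phase 1 could be replaced by a process pool without changing the output.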

  8. The baryonic self similarity of dark matter

    International Nuclear Information System (INIS)

    Alard, C.

    2014-01-01

    Cosmological simulations indicate that dark matter halos have specific self-similar properties. However, the halo similarity is affected by the baryonic feedback. By using momentum-driven winds as a model to represent the baryon feedback, an equilibrium condition is derived which directly implies the emergence of a new type of similarity. The new self-similar solution has constant acceleration at a reference radius for both dark matter and baryons. This model receives strong support from the observations of galaxies. The new self-similar properties imply that the total acceleration at larger distances is scale-free, the transition between the dark matter and baryon dominated regimes occurs at a constant acceleration, and the maximum amplitude of the velocity curve at larger distances is proportional to $M^{1/4}$. These results demonstrate that this self-similar model is consistent with the basics of modified Newtonian dynamics (MOND) phenomenology. In agreement with the observations, the coincidence between the self-similar model and MOND breaks at the scale of clusters of galaxies. Some numerical experiments show that the behavior of the density near the origin is closely approximated by an Einasto profile.

  9. Notions of similarity for computational biology models

    KAUST Repository

    Waltemath, Dagmar; Henkel, Ron; Hoehndorf, Robert; Kacprowski, Tim; Knuepfer, Christian; Liebermeister, Wolfram

    2016-01-01

    Computational models used in biology are rapidly increasing in complexity, size, and numbers. To build such large models, researchers need to rely on software tools for model retrieval, model combination, and version control. These tools need to be able to quantify the differences and similarities between computational models. However, depending on the specific application, the notion of similarity may greatly vary. A general notion of model similarity, applicable to various types of models, is still missing. Here, we introduce a general notion of quantitative model similarities, survey the use of existing model comparison methods in model building and management, and discuss potential applications of model comparison. To frame model comparison as a general problem, we describe a theoretical approach to defining and computing similarities based on different model aspects. Potentially relevant aspects of a model comprise its references to biological entities, network structure, mathematical equations and parameters, and dynamic behaviour. Future similarity measures could combine these model aspects in flexible, problem-specific ways in order to mimic users' intuition about model similarity, and to support complex model searches in databases.

  10. Immunoinformatics and Similarity Analysis of House Dust Mite Tropomyosin

    Directory of Open Access Journals (Sweden)

    Mohammad Mehdi Ranjbar

    2015-10-01

    Background: Dermatophagoides farinae and Dermatophagoides pteronyssinus are house dust mites (HDMs) that cause severe asthma and allergic symptoms. Tropomyosin plays an important role in the immune and allergic reactions to HDMs. Here, tropomyosin from Dermatophagoides spp. was comprehensively screened in silico for its allergenicity, antigenicity and similarity/conservation. Materials and Methods: The amino acid sequences of tropomyosin from D. farinae, D. pteronyssinus and other mites were retrieved. We aligned the sequences, evaluated conserved and variable regions along them, constructed their phylogenetic tree and estimated overall mean distances. This was followed by prediction of linear B-cell epitopes using different approaches, together with in silico evaluation of IgE epitope allergenicity (by SVMc, IgE epitope, ARPs BLAST, MAST and hybrid methods). Finally, the results of the different approaches were compared. Results: Alignment revealed near-complete identity between D. farinae and D. pteronyssinus members, and close similarity among Dermatophagoides spp. Most of the variation among mite tropomyosins was located approximately at amino acids 23 to 80, 108 to 120, 142 to 153 and 220 to 230. The topology of the tree showed close relationships among mites in tropomyosin protein sequence, although the sequences of D. farinae, D. pteronyssinus and Psoroptes ovis are more similar to each other and clustered together. Dermanyssus gallinae (AC: Q2WBI0) is less closely related to the other mites, being located in a separate branch. Hydrophilicity and flexibility plots revealed that many parts of this protein have the potential to be hydrophilic and flexible. Surface accessibility analysis indicated 7 different epitopes. Beta-turns in this protein occur with high probability in the middle part and at its two termini. Analysis with the Kolaskar and Tongaonkar method indicated 11 immunogenic epitopes between amino acids 7-16. 
From

  11. Query-dependent banding (QDB) for faster RNA similarity searches.

    Directory of Open Access Journals (Sweden)

    Eric P Nawrocki

    2007-03-01

    When searching sequence databases for RNAs, it is desirable to score both primary sequence and RNA secondary structure similarity. Covariance models (CMs) are probabilistic models well-suited for RNA similarity search applications. However, the computational complexity of CM dynamic programming alignment algorithms has limited their practical application. Here we describe an acceleration method called query-dependent banding (QDB), which uses the probabilistic query CM to precalculate regions of the dynamic programming lattice that have negligible probability, independently of the target database. We have implemented QDB in the freely available Infernal software package. QDB reduces the average case time complexity of CM alignment from $LN^{2.4}$ to $LN^{1.3}$ for a query RNA of N residues and a target database of L residues, resulting in a 4-fold speedup for typical RNA queries. Combined with other improvements to Infernal, including informative mixture Dirichlet priors on model parameters, benchmarks also show increased sensitivity and specificity resulting from improved parameterization.

  12. A self-similar hierarchy of the Korean stock market

    Science.gov (United States)

    Lim, Gyuchang; Min, Seungsik; Yoo, Kun-Woo

    2013-01-01

    A scaling analysis is performed on market values of stocks listed on Korean stock exchanges such as the KOSPI and the KOSDAQ. Different from previous studies on price fluctuations, market capitalizations are dealt with in this work. First, we show that the sum of the two stock exchanges shows a clear rank-size distribution, i.e., Zipf's law, just as each separate one does. Second, by abstracting Zipf's law as a γ-sequence, we define a self-similar hierarchy consisting of many levels, with the numbers of firms at each level forming a geometric sequence. We also use two exponential functions to describe the hierarchy and derive a scaling law from them. Lastly, we propose a self-similar hierarchical process and perform an empirical analysis on our data set. Based on our findings, we argue that all money invested in the stock market is distributed in a hierarchical way and that a slight difference exists between the two exchanges.

  13. A Similarity Search Using Molecular Topological Graphs

    Directory of Open Access Journals (Sweden)

    Yoshifumi Fukunishi

    2009-01-01

    A molecular similarity measure has been developed using molecular topological graphs and atomic partial charges. Two kinds of topological graphs were used. One is the ordinary adjacency matrix and the other is a matrix which represents the minimum path length between two atoms of the molecule. The ordinary adjacency matrix is suitable to compare the local structures of molecules such as functional groups, and the other matrix is suitable to compare the global structures of molecules. The combination of these two matrices gave a similarity measure. This method was applied to in silico drug screening, and the results showed that it was effective as a similarity measure.
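    The two matrices named in this abstract can be built for a small molecular graph as follows. This is a minimal sketch: the atom indexing, the bond list, and both function names are assumptions made for illustration, and the way the paper combines the matrices with partial charges into a similarity score is not reproduced.

```python
from collections import deque
from typing import List, Sequence, Tuple


def adjacency_matrix(n_atoms: int, bonds: Sequence[Tuple[int, int]]) -> List[List[int]]:
    """Symmetric 0/1 matrix: entry [i][j] is 1 when atoms i and j are bonded."""
    A = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        A[i][j] = 1
        A[j][i] = 1
    return A


def path_length_matrix(A: List[List[int]]) -> List[List[int]]:
    """Minimum number of bonds between every pair of atoms (breadth-first search).

    Pairs with no connecting path keep the value -1.
    """
    n = len(A)
    D = [[-1] * n for _ in range(n)]
    for start in range(n):
        D[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if A[u][v] and D[start][v] < 0:
                    D[start][v] = D[start][u] + 1
                    queue.append(v)
    return D


# Hypothetical heavy-atom skeleton: a central atom (index 1) bonded to three others.
bonds = [(0, 1), (1, 2), (1, 3)]
A = adjacency_matrix(4, bonds)
D = path_length_matrix(A)
for row in D:
    print(row)
```

    The adjacency matrix only records direct bonds, which is why it reflects local structure, while the path-length matrix relates every atom to every other atom and so reflects the overall shape of the graph.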

  14. Similarity-based pattern analysis and recognition

    CERN Document Server

    Pelillo, Marcello

    2013-01-01

    This accessible text/reference presents a coherent overview of the emerging field of non-Euclidean similarity learning. The book presents a broad range of perspectives on similarity-based pattern analysis and recognition methods, from purely theoretical challenges to practical, real-world applications. The coverage includes both supervised and unsupervised learning paradigms, as well as generative and discriminative models. Topics and features: explores the origination and causes of non-Euclidean (dis)similarity measures, and how they influence the performance of traditional classification alg

  15. New similarity of triangular fuzzy number and its application.

    Science.gov (United States)

    Zhang, Xixiang; Ma, Weimin; Chen, Liping

    2014-01-01

    The similarity of triangular fuzzy numbers is an important metric for their application. There exist several approaches to measure the similarity of triangular fuzzy numbers. However, some of them are apt to yield large values. To make the similarity well distributed, a new method, SIAM (Shape's Indifferent Area and Midpoint), for measuring the similarity of triangular fuzzy numbers is put forward, which takes the shape's indifferent area and midpoint of two triangular fuzzy numbers into consideration. Comparison with other similarity measurements shows the effectiveness of the proposed method. Then, it is applied to collaborative filtering recommendation to measure users' similarity. A collaborative filtering case is used to illustrate users' similarity based on cloud model and triangular fuzzy number; the result indicates that users' similarity based on triangular fuzzy number can obtain better discrimination. Finally, a simulated collaborative filtering recommendation system is developed which uses cloud model and triangular fuzzy number to express users' comprehensive evaluation on items, and the result shows that the accuracy of collaborative filtering recommendation based on triangular fuzzy number is higher.

  16. HYPOTHESIS TESTING WITH THE SIMILARITY INDEX

    Science.gov (United States)

    Multilocus DNA fingerprinting methods have been used extensively to address genetic issues in wildlife populations. Hypotheses concerning population subdivision and differing levels of diversity can be addressed through the use of the similarity index (S), a band-sharing coeffic...
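    A band-sharing coefficient of the kind mentioned here is often computed as twice the number of shared bands divided by the total number of bands in the two fingerprints. The sketch below assumes that common form; the truncated abstract does not state the exact definition of S used in the record, and the function name and band labels are hypothetical.

```python
from typing import Hashable, Iterable


def band_sharing_similarity(bands_a: Iterable[Hashable], bands_b: Iterable[Hashable]) -> float:
    """Assumed form: S = 2 * n_ab / (n_a + n_b).

    n_ab is the number of bands present in both fingerprints;
    n_a and n_b are the numbers of bands scored in each individual.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        raise ValueError("at least one fingerprint must contain a band")
    return 2 * len(a & b) / (len(a) + len(b))


# Bands identified by illustrative fragment-size labels.
individual_1 = {"2.1kb", "3.4kb", "5.0kb", "7.2kb"}
individual_2 = {"5.0kb", "7.2kb", "8.8kb", "9.5kb"}
print(band_sharing_similarity(individual_1, individual_2))
```

    Under this form S ranges from 0 (no shared bands) to 1 (identical banding patterns).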

  17. On self-similarity of crack layer

    Science.gov (United States)

    Botsis, J.; Kunin, B.

    1987-01-01

    The crack layer (CL) theory of Chudnovsky (1986), based on principles of thermodynamics of irreversible processes, employs a crucial hypothesis of self-similarity. The self-similarity hypothesis states that the value of the damage density at a point x of the active zone at a time t coincides with that at the corresponding point in the initial (t = 0) configuration of the active zone, the correspondence being given by a time-dependent affine transformation of the space variables. In this paper, the implications of the self-similarity hypothesis for quasi-static CL propagation are investigated using polystyrene as a model material and examining the evolution of damage distribution along the trailing edge, which is approximated by a straight segment perpendicular to the crack path. The results support the self-similarity hypothesis adopted by the CL theory.

  18. Bilateral Trade Flows and Income Distribution Similarity

    Science.gov (United States)

    2016-01-01

    Current models of bilateral trade neglect the effects of income distribution. This paper addresses the issue by accounting for non-homothetic consumer preferences and hence investigating the role of income distribution in the context of the gravity model of trade. A theoretically justified gravity model is estimated for disaggregated trade data (Dollar volume is used as dependent variable) using a sample of 104 exporters and 108 importers for 1980–2003 to achieve two main goals. We define and calculate new measures of income distribution similarity and empirically confirm that greater similarity of income distribution between countries implies more trade. Using distribution-based measures as a proxy for demand similarities in gravity models, we find consistent and robust support for the hypothesis that countries with more similar income-distributions trade more with each other. The hypothesis is also confirmed at disaggregated level for differentiated product categories. PMID:27137462

  19. Discovering Music Structure via Similarity Fusion

    DEFF Research Database (Denmark)

    Automatic methods for music navigation and music recommendation exploit the structure in the music to carry out a meaningful exploration of the “song space”. To get a satisfactory performance from such systems, one should incorporate as much information about song similarity as possible; however... semantics”, in such a way that all observed similarities can be satisfactorily explained using the latent semantics. Therefore, one can think of these semantics as the real structure in music, in the sense that they can explain the observed similarities among songs. The suitability of the PLSA model for representing music structure is studied in a simplified scenario consisting of 4412 songs and two similarity measures among them. The results suggest that the PLSA model is a useful framework to combine different sources of information, and provides a reasonable space for song representation.

  20. Abundance estimation of spectrally similar minerals

    CSIR Research Space (South Africa)

    Debba, Pravesh

    2009-07-01

    This paper evaluates a spectral unmixing method for estimating the partial abundance of spectrally similar minerals in complex mixtures. The method requires formulation of a linear function of individual spectra of individual minerals. The first...

  1. Lagrangian-similarity diffusion-deposition model

    International Nuclear Information System (INIS)

    Horst, T.W.

    1979-01-01

    A Lagrangian-similarity diffusion model has been incorporated into the surface-depletion deposition model. This model predicts vertical concentration profiles far downwind of the source that agree with those of a one-dimensional gradient-transfer model

  2. Discovering Music Structure via Similarity Fusion

    DEFF Research Database (Denmark)

    Arenas-García, Jerónimo; Parrado-Hernandez, Emilio; Meng, Anders

    Automatic methods for music navigation and music recommendation exploit the structure in the music to carry out a meaningful exploration of the “song space”. To get a satisfactory performance from such systems, one should incorporate as much information about song similarity as possible; however... semantics”, in such a way that all observed similarities can be satisfactorily explained using the latent semantics. Therefore, one can think of these semantics as the real structure in music, in the sense that they can explain the observed similarities among songs. The suitability of the PLSA model for representing music structure is studied in a simplified scenario consisting of 4412 songs and two similarity measures among them. The results suggest that the PLSA model is a useful framework to combine different sources of information, and provides a reasonable space for song representation.

  3. Outsourced similarity search on metric data assets

    KAUST Repository

    Yiu, Man Lung

    2012-02-01

    This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying it to the service provider for similarity queries on the transformed data. Our techniques provide interesting trade-offs between query cost and accuracy. They are then further extended to offer an intuitive privacy guarantee. Empirical studies with real data demonstrate that the techniques are capable of offering privacy while enabling efficient and accurate processing of similarity queries.

  4. Similarity search processing. Paralelization and indexing technologies.

    Directory of Open Access Journals (Sweden)

    Eder Dos Santos

    2015-08-01

    This Scientific-Technical Report addresses similarity search and the implementation of metric structures in parallel environments. It also presents the state of the art related to similarity search on metric structures and parallelism technologies. Comparative analyses are also proposed, seeking to identify the behavior of a set of metric spaces and metric structures on multicore-based and GPU-based processing platforms.

  5. Parallel trajectory similarity joins in spatial networks

    KAUST Repository

    Shang, Shuo

    2018-04-04

    The matching of similar pairs of objects, called similarity join, is a fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above θ. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight into the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.
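    The difference between the two joins lies in the selection rule: the k-TS-Join keeps the k best-scoring pairs instead of filtering by a threshold. A brute-force sketch of that rule follows. It is an illustration only: the Jaccard overlap of visited vertices is a stand-in for the paper's spatiotemporal similarity, every pair is scored without any pruning bounds, and the function names are hypothetical.

```python
import heapq
from itertools import product
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Trajectory = Sequence[Hashable]  # assumed: a sequence of road-network vertex IDs


def jaccard(p: Trajectory, q: Trajectory) -> float:
    """Stand-in similarity (an assumption): overlap of visited vertices."""
    a, b = set(p), set(q)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def top_k_join(P: Dict[str, Trajectory], Q: Dict[str, Trajectory], k: int,
               sim: Callable[[Trajectory, Trajectory], float] = jaccard
               ) -> List[Tuple[float, str, str]]:
    """Return the k highest-scoring (similarity, id in P, id in Q) triples."""
    scored = (
        (sim(p, q), pid, qid)
        for (pid, p), (qid, q) in product(P.items(), Q.items())
    )
    # heapq.nlargest keeps only k candidates in memory while scanning all pairs.
    return heapq.nlargest(k, scored)


P = {"p1": [1, 2, 3, 4], "p2": [7, 8]}
Q = {"q1": [2, 3, 4, 5], "q2": [8, 9]}
for score, pid, qid in top_k_join(P, Q, k=2):
    print(pid, qid, round(score, 3))
```

    Unlike the threshold join, the result size is fixed at k (or fewer when there are fewer than k pairs), so no similarity cut-off has to be chosen in advance.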

  6. Parallel trajectory similarity joins in spatial networks

    KAUST Repository

    Shang, Shuo; Chen, Lisi; Wei, Zhewei; Jensen, Christian S.; Zheng, Kai; Kalnis, Panos

    2018-01-01

    The matching of similar pairs of objects, called similarity join, is a fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), including a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold θ, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above θ. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join that rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the mergings have constant cost. An empirical study with real data offers insight into the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.
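
    To make the difference between the two joins concrete, here is a minimal brute-force sketch. It is not the pruned, parallel algorithm of the paper: `toy_similarity` (Jaccard overlap of visited vertices) is a hypothetical stand-in for the spatiotemporal similarity, and the function names are illustrative assumptions.

```python
import heapq
from itertools import product


def toy_similarity(t1, t2):
    """Hypothetical stand-in similarity: Jaccard overlap of visited vertices."""
    a, b = set(t1), set(t2)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tb_ts_join(P, Q, theta, sim=toy_similarity):
    """Threshold-based join: all index pairs (i, j) with similarity above theta."""
    return [
        (i, j)
        for (i, p), (j, q) in product(enumerate(P), enumerate(Q))
        if sim(p, q) > theta
    ]


def k_ts_join(P, Q, k, sim=toy_similarity):
    """Top-k join: the k most similar index pairs; no threshold is needed."""
    scored = (
        (sim(p, q), i, j)
        for (i, p), (j, q) in product(enumerate(P), enumerate(Q))
    )
    return [(i, j) for _, i, j in heapq.nlargest(k, scored)]


# Trajectories are given here as lists of road-network vertex ids.
P = [[1, 2, 3], [7, 8]]
Q = [[1, 2, 3, 4], [8, 9], [5]]
threshold_pairs = tb_ts_join(P, Q, theta=0.5)
top_pairs = k_ts_join(P, Q, k=2)
```

    Each row of the outer loop (one trajectory of P against all of Q) is independent of the others, which is the property the abstract relies on for parallel execution.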

  7. Are calanco landforms similar to river basins?

    Science.gov (United States)

    Caraballo-Arias, N A; Ferro, V

    2017-12-15

    In the past, badlands have often been considered ideal field laboratories for studying landscape evolution because of their geometrical similarity to larger fluvial systems. For a given hydrological process, no scientific proof exists that badlands can be considered a model of river basin prototypes. In this paper the measurements carried out on 45 Sicilian calanchi, a type of badlands that appears as a small-scale hydrographic unit, are used to establish their morphological similarity with river systems whose data are available in the literature. At first the geomorphological similarity is studied by identifying the dimensionless groups, which can assume the same value or a scaled one in a fixed ratio, representing drainage basin shape, stream network and relief properties. Then, for each property, the dimensionless groups are calculated for the investigated calanchi and the river basins and their corresponding scale ratio is evaluated. The applicability of Hack's, Horton's and Melton's laws for establishing similarity criteria is also tested. The analysis leads to the conclusion that a quantitative morphological similarity between calanco landforms and river basins can be established using commonly applied dimensionless groups. In particular, the analysis showed that i) calanchi and river basins have a geometrically similar shape with respect to the parameters Rf and Re, with a scale factor close to 1, ii) calanchi and river basins are similar with respect to the bifurcation and length ratios (λ=1), iii) for the investigated calanchi the Melton number assumes values less than that (0.694) corresponding to the river case, and a scale ratio ranging from 0.52 to 0.78 can be used, iv) calanchi and river basins have similar mean relief ratio values (λ=1.13) and v) calanchi present active geomorphic processes and therefore fall in a more juvenile stage with respect to river basins. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. An accurate and rapid continuous wavelet dynamic time warping algorithm for unbalanced global mapping in nanopore sequencing

    KAUST Repository

    Han, Renmin

    2017-12-24

    Long-reads, point-of-care, and PCR-free are the promises brought by nanopore sequencing. Among various steps in nanopore data analysis, the global mapping between the raw electrical current signal sequence and the expected signal sequence from the pore model serves as the key building block to base calling, reads mapping, variant identification, and methylation detection. However, the ultra-long reads of nanopore sequencing and an order of magnitude difference in the sampling speeds of the two sequences make the classical dynamic time warping (DTW) and its variants infeasible to solve the problem. Here, we propose a novel multi-level DTW algorithm, cwDTW, based on continuous wavelet transforms with different scales of the two signal sequences. Our algorithm starts from low-resolution wavelet transforms of the two sequences, such that the transformed sequences are short and have similar sampling rates. Then the peaks and nadirs of the transformed sequences are extracted to form feature sequences with similar lengths, which can be easily mapped by the original DTW. Our algorithm then recursively projects the warping path from a lower-resolution level to a higher-resolution one by building a context-dependent boundary and enabling a constrained search for the warping path in the latter. Comprehensive experiments on two real nanopore datasets on human and on Pandoraea pnomenusa, as well as two benchmark datasets from previous studies, demonstrate the efficiency and effectiveness of the proposed algorithm. In particular, cwDTW can almost always generate warping paths that are very close to the original DTW, which are remarkably more accurate than the state-of-the-art methods including FastDTW and PrunedDTW. Meanwhile, on the real nanopore datasets, cwDTW is about 440 times faster than FastDTW and 3000 times faster than the original DTW. Our program is available at https://github.com/realbigws/cwDTW.
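
    For reference, the classical DTW that cwDTW is compared against can be sketched as below. This is the textbook quadratic-time dynamic program with an absolute-difference cost (an assumption made here for illustration), not the multi-level wavelet algorithm of the paper.

```python
def dtw(x, y):
    """Classical DTW on two 1-D sequences; returns (total cost, warping path)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = minimal cost of aligning x[:i] with y[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])

    # Backtrack from the end to recover one optimal warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return D[n][m], path


total_cost, warping_path = dtw([1, 2, 3], [1, 2, 2, 3])
```

    The table has (n+1)·(m+1) cells, which is why the abstract describes plain DTW as infeasible for ultra-long reads with very different sampling rates.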

  9. A software pipeline for processing and identification of fungal ITS sequences

    Directory of Open Access Journals (Sweden)

    Kristiansson Erik

    2009-01-01

    Background: Fungi from environmental samples are typically identified to species level through DNA sequencing of the nuclear ribosomal internal transcribed spacer (ITS) region for use in BLAST-based similarity searches in the International Nucleotide Sequence Databases. These searches are time-consuming and regularly require a significant amount of manual intervention and complementary analyses. We here present software – in the form of an identification pipeline for large sets of fungal ITS sequences – developed to automate the BLAST process and several additional analysis steps. The performance of the pipeline was evaluated on a dataset of 350 ITS sequences from fungi growing as epiphytes on building material. Results: The pipeline was written in Perl and uses a local installation of NCBI-BLAST for the similarity searches of the query sequences. The variable subregion ITS2 of the ITS region is extracted from the sequences and used for additional searches of higher sensitivity. Multiple alignments of each query sequence and its closest matches are computed, and query sequences sharing at least 50% of their best matches are clustered to facilitate the evaluation of hypothetically conspecific groups. The pipeline proved to speed up the processing, as well as enhance the resolution, of the evaluation dataset considerably, and the fungi were found to belong chiefly to the Ascomycota, with Penicillium and Aspergillus as the two most common genera. The ITS2 was found to indicate a different taxonomic affiliation than did the complete ITS region for 10% of the query sequences, though this figure is likely to vary with the taxonomic scope of the query sequences. Conclusion: The present software readily assigns large sets of fungal query sequences to their respective best matches in the international sequence databases and places them in a larger biological context. The output is highly structured to be easy to process, although it still needs

  10. Multi-scale structural similarity index for motion detection

    Directory of Open Access Journals (Sweden)

    M. Abdel-Salam Nasr

    2017-07-01

    The most recent approach for measuring image quality is the structural similarity index (SSI). This paper presents a novel algorithm based on the multi-scale structural similarity index (MS-SSIM) for motion detection in videos. The MS-SSIM approach is based on modeling of image luminance, contrast and structure at multiple scales. The MS-SSIM has resulted in much better performance than the single-scale SSI approach, but at the cost of relatively lower processing speed. The major advantages of the presented algorithm are its higher detection accuracy and its quasi-real-time processing speed.

  11. Cloning, sequencing, and sequence analysis of two novel plasmids from the thermophilic anaerobic bacterium Anaerocellum thermophilum

    DEFF Research Database (Denmark)

    Clausen, Anders; Mikkelsen, Marie Just; Schrøder, I.

    2004-01-01

    The nucleotide sequence of two novel plasmids isolated from the extreme thermophilic anaerobic bacterium Anaerocellum thermophilum DSM6725 (A. thermophilum), growing optimally at 70 °C, has been determined. pBAS2 was found to be a 3653 bp plasmid with a GC content of 43%, and the sequence re...... with highest similarity to DNA repair protein from Campylobacter jejuni (25% aa). Orf34 showed similarity to sigma factors with highest similarity (28% aa) to the sporulation specific Sigma factor, Sigma 28(K) from Bacillus thuringiensis....

  12. Identifying mechanistic similarities in drug responses

    KAUST Repository

    Zhao, C.

    2012-05-15

    Motivation: In early drug development, it would be beneficial to be able to identify those dynamic patterns of gene response that indicate that drugs targeting a particular gene will be likely or not to elicit the desired response. One approach would be to quantitate the degree of similarity between the responses that cells show when exposed to drugs, so that consistencies in the regulation of cellular response processes that produce success or failure can be more readily identified. Results: We track drug response using fluorescent proteins as transcription activity reporters. Our basic assumption is that drugs inducing very similar alteration in transcriptional regulation will produce similar temporal trajectories on many of the reporter proteins and hence be identified as having similarities in their mechanisms of action (MOA). The main body of this work is devoted to characterizing similarity in temporal trajectories/signals. To do so, we must first identify the key points that determine mechanistic similarity between two drug responses. Directly comparing points on the two signals is unrealistic, as it cannot handle delays and speed variations on the time axis. Hence, to capture the similarities between reporter responses, we develop an alignment algorithm that is robust to noise, time delays and is able to find all the contiguous parts of signals centered about a core alignment (reflecting a core mechanism in drug response). Applying the proposed algorithm to a range of real drug experiments shows that the result agrees well with the prior drug MOA knowledge. © The Author 2012. Published by Oxford University Press. All rights reserved.

  13. Semantic similarity between ontologies at different scales

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Qingpeng; Haglin, David J.

    2016-04-01

    In the past decade, existing and new knowledge and datasets have been encoded in different ontologies for semantic web and biomedical research. Ontologies are often very large in terms of the number of concepts and relationships, which makes the analysis of ontologies and the represented knowledge graph computationally expensive and time-consuming. As the ontologies of various semantic web and biomedical applications usually show explicit hierarchical structures, it is interesting to explore the trade-offs between ontological scales and preservation/precision of results when we analyze ontologies. This paper presents the first effort to examine this idea by studying the relationship between scaling biomedical ontologies at different levels and the semantic similarity values. We evaluate the semantic similarity between three Gene Ontology slims (Plant, Yeast, and Candida, among which the latter two belong to the same kingdom, Fungi) using four popular measures commonly applied to biomedical ontologies (Resnik, Lin, Jiang-Conrath, and SimRel). The results of this study demonstrate that with proper selection of scaling levels and similarity measures, we can significantly reduce the size of ontologies without losing substantial detail. In particular, the performance of Jiang-Conrath and Lin is more reliable and stable than that of the other two in this experiment, as evidenced by (a) their consistently showing that Yeast and Candida are more similar (as compared to Plant) at different scales, and (b) small deviations of the similarity values after excluding a majority of nodes from several lower scales. This study provides a deeper understanding of the application of semantic similarity to biomedical ontologies, and sheds light on how to choose appropriate semantic similarity measures for biomedical engineering.
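
    The four measures named above are usually defined through information content, IC(c) = -log p(c), evaluated at the two terms and at their most informative common ancestor (MICA). The sketch below uses these commonly stated forms; the abstract does not give formulas, so the exact variants used in the study are an assumption, and the probabilities in the example are made up.

```python
import math


def ic(p):
    """Information content of a term with annotation probability p (0 < p <= 1)."""
    return -math.log(p)


def resnik(p_mica):
    """Resnik: information content of the most informative common ancestor."""
    return ic(p_mica)


def lin(p1, p2, p_mica):
    """Lin: shared information normalised by the information of both terms.

    Undefined (division by zero) when both terms are the root (p1 = p2 = 1).
    """
    return 2 * ic(p_mica) / (ic(p1) + ic(p2))


def jiang_conrath_distance(p1, p2, p_mica):
    """Jiang-Conrath distance; a similarity is often taken as 1 / (1 + distance)."""
    return ic(p1) + ic(p2) - 2 * ic(p_mica)


def sim_rel(p1, p2, p_mica):
    """SimRel: Lin weighted by how specific the common ancestor is."""
    return lin(p1, p2, p_mica) * (1 - p_mica)


# Hypothetical probabilities for two terms and their common ancestor.
p1, p2, p_mica = 0.1, 0.2, 0.5
scores = {
    "resnik": resnik(p_mica),
    "lin": lin(p1, p2, p_mica),
    "jc_distance": jiang_conrath_distance(p1, p2, p_mica),
    "sim_rel": sim_rel(p1, p2, p_mica),
}
```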

  14. Similarity of satellite DNA properties in the order Rodentia

    Energy Technology Data Exchange (ETDEWEB)

    Mazrimas, J A; Hatch, F T

    1977-09-01

    We have characterized satellite DNAs from 9 species of kangaroo rat (Dipodomys) and have shown that the HS-α and HS-β satellites, where present, are nearly identical in all species as to melting transition midpoint (Tm), and density in neutral CsCl, alkaline CsCl, and Cs₂SO₄-Ag⁺ gradients. However, the MS satellites exist in two internally similar classes. The satellite DNAs from three other rodents were characterized (densities listed are in neutral CsCl). The pocket gopher, Thomomys bottae, contains Th-α (1.713 g/ml) and Th-β (1.703 g/ml). The guinea pig (Cavia porcellus) contains Ca-α, Ca-β, and Ca-γ at densities of 1.706 g/ml, 1.704 g/ml, and 1.704 g/ml, respectively. The antelope ground squirrel (Ammospermophilus harrisi) contains Am-α, 1.708 g/ml, Am-β, 1.717 g/ml, and Am-γ, 1.707 g/ml. The physical and chemical properties of the alpha-satellites from the above four rodents representing four different families in two suborders of Rodentia were compared. They show nearly identical Tm, nucleoside composition of single strands, and single strand densities in alkaline CsCl. Similar comparisons on the second or third satellite DNAs from these rodents also indicate a close relationship to each other. Thus the high degree of similarity of satellite sequences found in such a diverse group of rodents suggests a cellular function that is subject to natural selection, and implies that these sequences have been conserved over a considerable span of evolutionary time since the divergence of these rodents about 50 million years ago.

  15. Similarity of satellite DNA properties in the order Rodentia

    Energy Technology Data Exchange (ETDEWEB)

    Mazrimas, J A; Hatch, F T

    1977-09-01

    Satellite DNAs from 9 species of kangaroo rat (Dipodomys) have been characterized, showing that the HS-α and HS-β satellites, where present, are nearly identical in all species as to melting transition midpoint (Tm), and density in neutral CsCl, alkaline CsCl, and Cs₂SO₄-Ag⁺ gradients. However, the MS satellites exist in two internally similar classes. The satellite DNAs from three other rodents were characterized (densities listed are in neutral CsCl). The pocket gopher, Thomomys bottae, contains Th-α (1.713 g/ml) and Th-β (1.703 g/ml). The guinea pig (Cavia porcellus) contains Ca-α, Ca-β and Ca-γ at densities of 1.706 g/ml, 1.704 g/ml and 1.704 g/ml, respectively. The antelope ground squirrel (Ammospermophilus harrisi) contains Am-α, 1.708 g/ml, Am-β, 1.717 g/ml, and Am-γ, 1.707 g/ml. The physical and chemical properties of the alpha-satellites from the above four rodents representing four different families in two suborders of Rodentia were compared. They show nearly identical Tm, nucleoside composition of single strands, and single strand densities in alkaline CsCl. Similar comparisons on the second or third satellite DNAs from these rodents also indicate a close relationship to each other. Thus the high degree of similarity of satellite sequences found in such a diverse group of rodents suggests a cellular function that is subject to natural selection, and implies that these sequences have been conserved over a considerable span of evolutionary time since the divergence of these rodents about 50 million years ago.

  16. Cloning and sequence analysis of sucrose phosphate synthase gene from varieties of Pennisetum species.

    Science.gov (United States)

    Li, H C; Lu, H B; Yang, F Y; Liu, S J; Bai, C J; Zhang, Y W

    2015-03-31

    Sucrose phosphate synthase (SPS) is an enzyme used by higher plants for sucrose synthesis. In this study, three primer sets were designed on the basis of known SPS sequences from maize (GenBank: NM_001112224.1) and sugarcane (GenBank: JN584485.1), and five novel SPS genes were identified by RT-PCR from the genomes of Pennisetum spp (the hybrid P. americanum x P. purpureum, P. purpureum Schum., P. purpureum Schum. cv. Red, P. purpureum Schum. cv. Taiwan, and P. purpureum Schum. cv. Mott). The cloned sequences showed 99.9% identity and 80-88% similarity to the SPS sequences of other plants. The SPS gene of hybrid Pennisetum had one nucleotide and four amino acid polymorphisms compared to the other four germplasms, and cluster analysis was performed to assess genetic diversity in this species. Additional characterization of the SPS gene product can potentially allow Pennisetum to be exploited as a biofuel source.

  17. Protecting genomic sequence anonymity with generalization lattices.

    Science.gov (United States)

    Malin, B A

    2005-01-01

    Current genomic privacy technologies assume the identity of genomic sequence data is protected if personal information, such as demographics, is obscured, removed, or encrypted. While demographic features can directly compromise an individual's identity, recent research demonstrates such protections are insufficient because sequence data itself is susceptible to re-identification. To counteract this problem, we introduce an algorithm for anonymizing a collection of person-specific DNA sequences. The technique is termed DNA lattice anonymization (DNALA), and is based upon the formal privacy protection schema of k-anonymity. Under this model, it is impossible to observe or learn features that distinguish one genetic sequence from k-1 other entries in a collection. To maximize information retained in protected sequences, we incorporate a concept generalization lattice to learn the distance between two residues in a single nucleotide region. The lattice provides the most similar generalized concept for two residues (e.g. adenine and guanine are both purines). The method is tested and evaluated with several publicly available human population datasets ranging in size from 30 to 400 sequences. Our findings imply the anonymization schema is feasible for the protection of sequence privacy. The DNALA method is the first computational disclosure control technique for general DNA sequences. Given the computational nature of the method, guarantees of anonymity can be formally proven. There is room for improvement and validation, though this research provides the groundwork from which future researchers can construct genomics anonymization schemas tailored to specific data-sharing scenarios.
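
    The lattice idea can be illustrated with a deliberately small example. The three-level lattice below (base, then purine R or pyrimidine Y, then N for any base) uses standard IUPAC codes but is an assumption made for illustration; the lattice and distance actually used by DNALA may differ, and the function names are hypothetical.

```python
PURINES = {"A", "G"}       # generalise to IUPAC code R
PYRIMIDINES = {"C", "T"}   # generalise to IUPAC code Y


def generalize(a, b):
    """Return (most specific common concept, lattice level) for two bases."""
    if a == b:
        return a, 0                      # identical: nothing is lost
    if {a, b} <= PURINES:
        return "R", 1                    # both purines
    if {a, b} <= PYRIMIDINES:
        return "Y", 1                    # both pyrimidines
    return "N", 2                        # only the top concept covers both


def generalize_sequences(s, t):
    """Generalise two aligned, equal-length sequences position by position.

    Returns the shared generalised sequence and the summed lattice levels,
    which serve here as a simple distance between the two sequences.
    """
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [generalize(a, b) for a, b in zip(s, t)]
    return "".join(c for c, _ in pairs), sum(level for _, level in pairs)


shared, distance = generalize_sequences("ACGT", "GCTT")
```

    Publishing `shared` in place of both inputs makes the two sequences indistinguishable from each other, which corresponds to the k = 2 case of the k-anonymity requirement described above.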

  18. Sequences for Student Investigation

    Science.gov (United States)

    Barton, Jeffrey; Feil, David; Lartigue, David; Mullins, Bernadette

    2004-01-01

    We describe two classes of sequences that give rise to accessible problems for undergraduate research. These problems may be understood with virtually no prerequisites and are well suited for computer-aided investigation. The first sequence is a variation of one introduced by Stephen Wolfram in connection with his study of cellular automata. The…

  19. Measure of Node Similarity in Multilayer Networks

    DEFF Research Database (Denmark)

    Møllgaard, Anders; Zettler, Ingo; Dammeyer, Jesper

    2016-01-01

    The weight of links in a network is often related to the similarity of the nodes. Here, we introduce a simple tunable measure for analysing the similarity of nodes across different link weights. In particular, we use the measure to analyze homophily in a group of 659 freshman students at a large...... university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data...... might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dis-similarity for the nodes connected by the strongest...

  20. Universal self-similarity of propagating populations.

    Science.gov (United States)

    Eliazar, Iddo; Klafter, Joseph

    2010-07-01

    This paper explores the universal self-similarity of propagating populations. The following general propagation model is considered: particles are randomly emitted from the origin of a d-dimensional Euclidean space and propagate randomly and independently of each other in space; all particles share a statistically common--yet arbitrary--motion pattern; each particle has its own random propagation parameters--emission epoch, motion frequency, and motion amplitude. The universally self-similar statistics of the particles' displacements and first passage times (FPTs) are analyzed: statistics which are invariant with respect to the details of the displacement and FPT measurements and with respect to the particles' underlying motion pattern. Analysis concludes that the universally self-similar statistics are governed by Poisson processes with power-law intensities and by the Fréchet and Weibull extreme-value laws.

  1. Universal self-similarity of propagating populations

    Science.gov (United States)

    Eliazar, Iddo; Klafter, Joseph

    2010-07-01

    This paper explores the universal self-similarity of propagating populations. The following general propagation model is considered: particles are randomly emitted from the origin of a d -dimensional Euclidean space and propagate randomly and independently of each other in space; all particles share a statistically common—yet arbitrary—motion pattern; each particle has its own random propagation parameters—emission epoch, motion frequency, and motion amplitude. The universally self-similar statistics of the particles’ displacements and first passage times (FPTs) are analyzed: statistics which are invariant with respect to the details of the displacement and FPT measurements and with respect to the particles’ underlying motion pattern. Analysis concludes that the universally self-similar statistics are governed by Poisson processes with power-law intensities and by the Fréchet and Weibull extreme-value laws.

  2. Trajectory similarity join in spatial networks

    KAUST Repository

    Shang, Shuo; Chen, Lisi; Wei, Zhewei; Jensen, Christian S.; Zheng, Kai; Kalnis, Panos

    2017-01-01

    With these applications in mind, we provide a purposeful definition of similarity. To enable efficient TS-Join processing on large sets of trajectories, we develop search space pruning techniques and take into account the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer algorithm. For each trajectory, the algorithm first finds similar trajectories. Then it merges the results to achieve a final result. The algorithm exploits an upper bound on the spatiotemporal similarity and a heuristic scheduling strategy for search space pruning. The algorithm's per-trajectory searches are independent of each other and can be performed in parallel, and the merging has constant cost. An empirical study with real data offers insight into the performance of the algorithm and demonstrates that it is capable of outperforming a well-designed baseline algorithm by an order of magnitude.

  3. Phonological similarity in working memory span tasks.

    Science.gov (United States)

    Chow, Michael; Macnamara, Brooke N; Conway, Andrew R A

    2016-08-01

    In a series of four experiments, we explored what conditions are sufficient to produce a phonological similarity facilitation effect in working memory span tasks. By using the same set of memoranda, but differing the secondary-task requirements across experiments, we showed that a phonological similarity facilitation effect is dependent upon the semantic relationship between the memoranda and the secondary-task stimuli, and is robust to changes in the representation, ordering, and pool size of the secondary-task stimuli. These findings are consistent with interference accounts of memory (Brown, Neath, & Chater, Psychological Review, 114, 539-576, 2007; Oberauer, Lewandowsky, Farrell, Jarrold, & Greaves, Psychonomic Bulletin & Review, 19, 779-819, 2012), whereby rhyming stimuli provide a form of categorical similarity that allows distractors to be excluded from retrieval at recall.

  4. Unveiling Music Structure Via PLSA Similarity Fusion

    DEFF Research Database (Denmark)

    Arenas-García, Jerónimo; Meng, Anders; Petersen, Kaare Brandt

    2007-01-01

    Nowadays there is an increasing interest in developing methods for building music recommendation systems. In order to get a satisfactory performance from such a system, one needs to incorporate as much information about song similarity as possible; however, how to do so is not obvious. In this p...... observed similarities can be satisfactorily explained using the latent semantics. Additionally, this approach significantly simplifies the song retrieval phase, leading to a more practical system implementation. The suitability of the PLSA model for representing music structure is studied in a simplified...

  5. Sequence History Update Tool

    Science.gov (United States)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable saving of time and effort. With the Sequence History Update Tool, what previously took minutes is now done in less than 30 seconds, and the tool provides a more accurate archival record of the sequence commanding for MRO.

  6. Higher Education and Inequality

    Science.gov (United States)

    Brown, Roger

    2018-01-01

    After climate change, rising economic inequality is the greatest challenge facing the advanced Western societies. Higher education has traditionally been seen as a means to greater equality through its role in promoting social mobility. But with increased marketisation higher education now not only reflects the forces making for greater inequality…

  7. Higher Education in California

    Science.gov (United States)

    Public Policy Institute of California, 2016

    2016-01-01

    Higher education enhances Californians' lives and contributes to the state's economic growth. But population and education trends suggest that California is facing a large shortfall of college graduates. Addressing this shortfall will require strong gains for groups that have been historically underrepresented in higher education. Substantial…

  8. Reimagining Christian Higher Education

    Science.gov (United States)

    Hulme, E. Eileen; Groom, David E., Jr.; Heltzel, Joseph M.

    2016-01-01

    The challenges facing higher education continue to mount. The shifting of the U.S. ethnic and racial demographics, the proliferation of advanced digital technologies and data, and the move from traditional degrees to continuous learning platforms have created an unstable environment to which Christian higher education must adapt in order to remain…

  9. Happiness in Higher Education

    Science.gov (United States)

    Elwick, Alex; Cannizzaro, Sara

    2017-01-01

    This paper investigates the higher education literature surrounding happiness and related notions: satisfaction, despair, flourishing and well-being. It finds that there is a real dearth of literature relating to profound happiness in higher education: much of the literature using the terms happiness and satisfaction interchangeably as if one were…

  10. Gender and Higher Education

    Science.gov (United States)

    Bank, Barbara J., Ed.

    2011-01-01

    This comprehensive, encyclopedic review explores gender and its impact on American higher education across historical and cultural contexts. Challenging recent claims that gender inequities in U.S. higher education no longer exist, the contributors--leading experts in the field--reveal the many ways in which gender is embedded in the educational…

  11. Divide and conquer: enriching environmental sequencing data.

    Directory of Open Access Journals (Sweden)

    Anne Bergeron

    2007-09-01

    In environmental sequencing projects, a mix of DNA from a whole microbial community is fragmented and sequenced, with one of the possible goals being to reconstruct partial or complete genomes of members of the community. In communities with high diversity of species, a significant proportion of the sequences do not overlap any other fragment in the sample. This problem will arise not only in situations with a relatively even distribution of many species, but also when the community in a particular environment is routinely dominated by the same few species. In the former case, no genomes may be assembled at all, while in the latter case a few dominant species in an environment will always be sequenced at high coverage to the detriment of coverage of the greater number of sparse species. Here we show that, with the same global sequencing effort, separating the species into two or more sub-communities prior to sequencing can yield a much higher proportion of sequences that can be assembled. We first use the Lander-Waterman model to show that, if the expected percentage of singleton sequences is higher than 25%, then, under the uniform distribution hypothesis, splitting the community is always a wise choice. We then construct simulated microbial communities to show that the results hold for highly non-uniform distributions. We also show that, for the distributions considered in the experiments, it is possible to estimate quite accurately the relative diversity of the two sub-communities. Given the fact that several methods exist to split microbial communities based on physical properties such as size, density, surface biochemistry, or optical properties, we strongly suggest that groups involved in environmental sequencing, and expecting high diversity, consider splitting their communities in order to maximize the information content of their sequencing effort.
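
    As a rough illustration of the quantity discussed above, the Lander-Waterman model gives the expected fraction of reads that overlap no other read as exp(-2cσ), where c = NL/G is the coverage and σ = 1 - T/L accounts for the minimum detectable overlap T. The sketch below applies that formula per species; the equal-genome-length assumption, the function names, and the example numbers are illustrative and are not taken from the paper, whose own derivation may differ.

```python
import math


def singleton_fraction(n_reads, read_len, genome_len, min_overlap=0):
    """Expected fraction of reads overlapping no other read (Lander-Waterman)."""
    coverage = n_reads * read_len / genome_len
    sigma = 1 - min_overlap / read_len
    return math.exp(-2 * coverage * sigma)


def community_singleton_fraction(n_reads, read_len, genome_len, abundances):
    """Singleton fraction for a mixed community.

    Assumes every species has the same genome length and that `abundances`
    are the fractions of reads drawn from each species (summing to 1).
    """
    return sum(
        p * singleton_fraction(n_reads * p, read_len, genome_len)
        for p in abundances
    )


# Same sequencing effort, one species versus ten equally abundant species.
single = community_singleton_fraction(10_000, 500, 1_000_000, [1.0])
diverse = community_singleton_fraction(10_000, 500, 1_000_000, [0.1] * 10)
```

    With these made-up numbers the single-species sample has almost no singletons, while the ten-species sample leaves roughly a third of its reads unassembled, which is the situation the abstract describes for highly diverse communities.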

  12. Human Chromosome 7: DNA Sequence and Biology

    OpenAIRE

    Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.

    2003-01-01

    DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate gene...

  13. Similarity joins in relational database systems

    CERN Document Server

    Augsten, Nikolaus

    2013-01-01

    State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems. We start out by discussing the properties of strings and trees, and identify the edit distance as the de facto standard for comparing complex objects. Since the edit distance is computationally expensive, token-based distances have been introduced to speed up edit distance comput
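
    For readers unfamiliar with the edit distance referred to above, here is a minimal sketch of the standard dynamic-programming (Levenshtein) computation for strings, with unit costs for insertion, deletion and substitution. It is a textbook formulation and is not taken from the book; the function name is an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))
```

    The table has len(a) × len(b) cells, so the cost is quadratic in the string lengths. That is the expense the abstract alludes to when it mentions cheaper token-based distances.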

  14. Outsourced Similarity Search on Metric Data Assets

    DEFF Research Database (Denmark)

    Yiu, Man Lung; Assent, Ira; Jensen, Christian S.

    2012-01-01

    This paper considers a cloud computing setting in which similarity querying of metric data is outsourced to a service provider. The data is to be revealed only to trusted users, not to the service provider or anyone else. Users query the server for the most similar data objects to a query example... Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise confidential. Given this setting, the paper presents techniques that transform the data prior to supplying...

  15. Measure of Node Similarity in Multilayer Networks

    DEFF Research Database (Denmark)

    Møllgaard, Anders; Zettler, Ingo; Dammeyer, Jesper

    2016-01-01

    university. Our analysis is based on data obtained using smartphones equipped with custom data collection software, complemented by questionnaire-based data. The network of social contacts is represented as a weighted multilayer network constructed from different channels of telecommunication as well as data...... might be present in one layer of the multilayer network and simultaneously be absent in the other layers. For a variable such as gender, our measure reveals a transition from similarity between nodes connected with links of relatively low weight to dissimilarity for the nodes connected by the strongest...

  16. Cultural similarity and adjustment of expatriate academics

    DEFF Research Database (Denmark)

    Selmer, Jan; Lauring, Jakob

    2009-01-01

    The findings of a number of recent empirical studies of business expatriates, using different samples and methodologies, seem to support the counter-intuitive proposition that cultural similarity may be as difficult to adjust to as cultural dissimilarity. However, it is not obvious...... and non-EU countries. Results showed that although the perceived cultural similarity between host and home country for the two groups of investigated respondents was different, there was neither any difference in their adjustment nor in the time it took for them to become proficient. Implications...

  17. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    International Nuclear Information System (INIS)

    Szybka, Malgorzata; Kordek, Radzislaw; Zakrzewska, Magdalena; Rieske, Piotr; Pasz-Walczak, Grazyna; Kulczycka-Wojdala, Dominika; Zawlik, Izabela; Stawski, Robert; Jesionek-Kupnicka, Dorota; Liberski, Pawel P

    2009-01-01

    Recently published data showed discrepancies between P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. We found P53 gene mutations in 16 cases (15 missense and 1 nonsense). Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation) in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of P53 gene. Additionally, for heterozygous mutations, unknown mechanism(s) causing selective overproduction of the mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis

  18. Quality of Higher Education

    DEFF Research Database (Denmark)

    Zou, Yihuan

    Quality in higher education was not invented in recent decades – universities have always possessed mechanisms for assuring the quality of their work. The rising concern over quality is closely related to the changes in higher education and its social context. Among others, the most conspicuous changes are the massive expansion, diversification and increased cost in higher education, and new mechanisms of accountability initiated by the state. With these changes the traditional internally enacted academic quality-keeping has been given an important external dimension – quality assurance, which... is about constructing a more inclusive understanding of quality in higher education through combining the macro, meso and micro levels, i.e. from the perspectives of national policy, higher education institutions as organizations in society, individual teaching staff and students. It covers both...

  19. Whole-genome sequencing of veterinary pathogens

    DEFF Research Database (Denmark)

    Ronco, Troels

    -electrophoresis and single-locus sequencing has been widely used to characterize such types of veterinary pathogens. However, DNA sequencing techniques have become fast and cost effective in recent years and whole-genome sequencing data provide a much higher discriminative power and reproducibility than any...... genetic background. This indicates that dairy cows can be natural carriers of S. aureus subtypes that in certain cases lead to CM. A group of isolates that mostly belonged to ST151 carried three pathogenicity islands that were primarily found in this group. The prevalence of resistance genes was generally...

  20. Long-term oil contamination causes similar changes in microbial communities of two distinct soils.

    Science.gov (United States)

    Liao, Jingqiu; Wang, Jie; Jiang, Dalin; Wang, Michael Cai; Huang, Yi

    2015-12-01

    Since total petroleum hydrocarbons (TPH) are toxic and persistent in environments, studying the impact of oil contamination on microbial communities in different soils is vital to oil production engineering, effective soil management and pollution control. This study analyzed the impact of oil contamination on the structure, activity and function in carbon metabolism of microbial communities of Chernozem soil from Daqing oil field and Cinnamon soil from Huabei oil field through both culture-dependent techniques and a culture-independent technique, pyrosequencing. Results revealed that pristine microbial communities in these two soils presented disparate patterns, where Cinnamon soil showed higher abundance of alkane, polycyclic aromatic hydrocarbon (PAH) and TPH degraders, number of cultivable microbes, bacterial richness, bacterial biodiversity, and stronger microbial activity and function in carbon metabolism than Chernozem soil. This suggested that complicated properties of microbes and soils resulted in the difference in soil microbial patterns. However, the changes of microbial communities caused by oil contamination were similar in respect of two dominant phenomena. Firstly, the microbial community structures were greatly changed, with higher abundance, higher bacterial biodiversity, occurrence of Candidate_division_BRC1 and TAO6, disappearance of BD1-5 and Candidate_division_OD1, dominance of Streptomyces, higher percentage of hydrocarbon-degrading groups, and lower percentage of nitrogen-transforming groups. Secondly, microbial activity and function in carbon metabolism were significantly enhanced. Based on the characteristics of microbial communities in the two soils, an appropriate strategy for in situ bioremediation was provided for each oil field. This research underscored the usefulness of combining culture-dependent techniques and next-generation sequencing techniques both to unravel the microbial patterns and understand the ecological impact of

  1. Planarian homeobox genes: cloning, sequence analysis, and expression.

    Science.gov (United States)

    Garcia-Fernàndez, J; Baguñà, J; Saló, E

    1991-01-01

    Freshwater planarians (Platyhelminthes, Turbellaria, and Tricladida) are acoelomate, triploblastic, unsegmented, and bilaterally symmetrical organisms that are mainly known for their ample power to regenerate a complete organism from a small piece of their body. To identify potential pattern-control genes in planarian regeneration, we have isolated two homeobox-containing genes, Dth-1 and Dth-2 [Dugesia (Girardia) tigrina homeobox], by using degenerate oligonucleotides corresponding to the most conserved amino acid sequence from helix-3 of the homeodomain. Dth-1 and Dth-2 homeodomains are closely related (68% at the nucleotide level and 78% at the protein level) and show the conserved residues characteristic of the homeodomains identified to date. Similarity with most homeobox sequences is low (30-50%), except with Drosophila NK homeodomains (80-82% with NK-2) and the rodent TTF-1 homeodomain (77-87%). Some unusual amino acid residues specific to NK-2, TTF-1, Dth-1, and Dth-2 can be observed in the recognition helix (helix-3) and may define a family of homeodomains. The deduced amino acid sequences from the cDNAs contain, in addition to the homeodomain, other domains also present in various homeobox-containing genes. The expression of both genes, detected by Northern blot analysis, appears slightly higher in cephalic regions than in the rest of the intact organism, while a slight increase is detected in the central period (5 days) of regeneration. PMID: 1714599

  2. Nuclear markers reveal that inter-lake cichlids' similar morphologies do not reflect similar genealogy.

    Science.gov (United States)

    Kassam, Daud; Seki, Shingo; Horic, Michio; Yamaoka, Kosaku

    2006-08-01

    The apparent inter-lake morphological similarity among East African Great Lakes' cichlid species/genera has left evolutionary biologists asking whether such similarity is due to sharing of a common ancestor or mere convergent evolution. In order to answer this question, we first used Geometric Morphometrics, GM, to quantify morphological similarity and then used Amplified Fragment Length Polymorphism, AFLP, to determine if similar morphologies imply shared ancestry or convergent evolution. GM revealed that not all presumed morphologically similar pairs were indeed similar, and the dendrogram generated from AFLP data indicated distinct clusters corresponding to each lake and not inter-lake morphologically similar pairs. Such results imply that the morphological similarity is due to convergent evolution and not shared ancestry. The congruency of GM and AFLP generated dendrograms implies that GM is capable of picking up phylogenetic signal, and thus GM can be a potential tool in phylogenetic systematics.

  3. Self-similar slip distributions on irregular shaped faults

    Science.gov (United States)

    Herrero, A.; Murphy, S.

    2018-06-01

    We propose a strategy to place a self-similar slip distribution on a complex fault surface that is represented by an unstructured mesh. This is possible by applying a strategy based on the composite source model, which uses a hierarchical set of asperities, each with its own slip function that depends on the distance from the asperity centre. Central to this technique is the efficient, accurate computation of distance between two points on the fault surface. This is known as the geodetic distance problem. We propose a method to compute the distance across complex non-planar surfaces based on a corollary of the Huygens' principle. The difference between this method and the sample-based algorithms that precede it is the use of a curved front at a local level to calculate the distance. This technique produces a highly accurate computation of the distance as the curvature of the front is linked to the distance from the source. Our local scheme is based on a sequence of two trilaterations, producing a robust algorithm which is highly precise. We test the strategy on a planar surface in order to assess its ability to keep the self-similarity properties of a slip distribution. We also present a synthetic self-similar slip distribution on a real slab topography for a M8.5 event. This method for computing distance may be extended to the estimation of first arrival times in both complex 3D surfaces and 3D volumes.

  4. Concept similarity in publications precedes cross-disciplinary collaboration.

    Science.gov (United States)

    Post, Andrew R; Harrison, James H

    2008-11-06

    Innovative science frequently occurs as a result of cross-disciplinary collaboration, the importance of which is reflected by recent NIH funding initiatives that promote communication and collaboration. If shared research interests between collaborators are important for the formation of collaborations, methods for identifying these shared interests across scientific domains could potentially reveal new and useful collaboration opportunities. MEDLINE represents a comprehensive database of collaborations and research interests, as reflected by article co-authors and concept content. We analyzed six years of citations using information retrieval based methods to compute articles' conceptual similarity, and found that articles by basic and clinical scientists who later collaborated had significantly higher average similarity than articles by similar scientists who did not collaborate. Refinement of these methods and characterization of found conceptual overlaps could allow automated discovery of collaboration opportunities that are currently missed.

  5. Protein structure similarity from principle component correlation analysis

    Directory of Open Access Journals (Sweden)

    Chou James

    2006-01-01

    Full Text Available Abstract Background: Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. Results: We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. Conclusion: The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum
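
    The RMSD baseline that the abstract contrasts with PCC can be sketched as follows. This minimal example assumes the two coordinate lists are already optimally superimposed and atom-matched; it performs no superposition itself and does not implement the PCC method. The function name is an assumption.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates that are assumed to be superimposed already."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates: every atom is displaced by 1 unit along x.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(rmsd(a, b))
```

    Because the measure averages per-atom displacements, two structures with the same topology but different overall shapes can have a large RMSD. That is the limitation the abstract describes.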

  6. Clustering biomolecular complexes by residue contacts similarity

    NARCIS (Netherlands)

    Garcia Lopes Maia Rodrigues, João; Trellet, Mikaël; Schmitz, Christophe; Kastritis, Panagiotis; Karaca, Ezgi; Melquiond, Adrien S J; Bonvin, Alexandre M J J

    Inaccuracies in computational molecular modeling methods are often counterweighed by brute-force generation of a plethora of putative solutions. These are then typically sieved via structural clustering based on similarity measures such as the root mean square deviation (RMSD) of atomic positions.

  7. Similarity principles for equipment qualification by experience

    International Nuclear Information System (INIS)

    Kana, D.D.; Pomerening, D.J.

    1988-07-01

    A methodology is developed for seismic qualification of nuclear plant equipment by applying similarity principles to existing experience data. Experience data are available from previous qualifications by analysis or testing, or from actual earthquake events. Similarity principles are defined in terms of excitation, equipment physical characteristics, and equipment response. Physical similarity is further defined in terms of a critical transfer function for response at a location on a primary structure, whose response can be assumed directly related to ultimate fragility of the item under elevated levels of excitation. Procedures are developed for combining experience data into composite specifications for qualification of equipment that can be shown to be physically similar to the reference equipment. Other procedures are developed for extending qualifications beyond the original specifications under certain conditions. Some examples for application of the procedures and verification of them are given for certain cases that can be approximated by a two degree of freedom simple primary/secondary system. Other examples are based on use of actual test data available from previous qualifications. Relationships of the developments with other previously-published methods are discussed. The developments are intended to elaborate on the rather broad revised guidelines developed by the IEEE 344 Standards Committee for equipment qualification in new nuclear plants. However, the results also contribute to filling a gap that exists between the IEEE 344 methodology and that previously developed by the Seismic Qualification Utilities Group. The relationship of the results to safety margin methodology is also discussed. (author)

  8. 7 CFR 51.1997 - Similar type.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 Similar type. Section 51.1997 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE (Standards, Inspections, Marketing Practices), DEPARTMENT OF AGRICULTURE REGULATIONS AND STANDARDS UNDER THE AGRICULTURAL MARKETING ACT OF 1946...

  9. Efficient Similarity Retrieval in Music Databases

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Jensen, Christian Søndergaard

    2006-01-01

    Audio music is increasingly becoming available in digital form, and the digital music collections of individuals continue to grow. Addressing the need for effective means of retrieving music from such collections, this paper proposes new techniques for content-based similarity search. Each music...

  10. Similarity search of business process models

    NARCIS (Netherlands)

    Dumas, M.; García-Bañuelos, L.; Dijkman, R.M.

    2009-01-01

    Similarity search is a general class of problems in which a given object, called a query object, is compared against a collection of objects in order to retrieve those that most closely resemble the query object. This paper reviews recent work on an instance of this class of problems, where the

  11. Evaluating gender similarities and differences using metasynthesis.

    Science.gov (United States)

    Zell, Ethan; Krizan, Zlatan; Teeter, Sabrina R

    2015-01-01

    Despite the common lay assumption that males and females are profoundly different, Hyde (2005) used data from 46 meta-analyses to demonstrate that males and females are highly similar. Nonetheless, the gender similarities hypothesis has remained controversial. Since Hyde's provocative report, there has been an explosion of meta-analytic interest in psychological gender differences. We utilized this enormous collection of 106 meta-analyses and 386 individual meta-analytic effects to reevaluate the gender similarities hypothesis. Furthermore, we employed a novel data-analytic approach called metasynthesis (Zell & Krizan, 2014) to estimate the average difference between males and females and to explore moderators of gender differences. The average, absolute difference between males and females across domains was relatively small (d = 0.21, SD = 0.14), with the majority of effects being either small (46%) or very small (39%). Magnitude of differences fluctuated somewhat as a function of the psychological domain (e.g., cognitive variables, social and personality variables, well-being), but remained largely constant across age, culture, and generations. These findings provide compelling support for the gender similarities hypothesis, but also underscore conditions under which gender differences are most pronounced. PsycINFO Database Record (c) 2015 APA, all rights reserved.

  12. Measuring structural similarity in large online networks.

    Science.gov (United States)

    Shi, Yongren; Macy, Michael

    2016-09-01

    Structural similarity based on bipartite graphs can be used to detect meaningful communities, but the networks have been tiny compared to massive online networks. Scalability is important in applications involving tens of millions of individuals with highly skewed degree distributions. Simulation analysis holding underlying similarity constant shows that two widely used measures - Jaccard index and cosine similarity - are biased by the distribution of out-degree in web-scale networks. However, an alternative measure, the Standardized Co-incident Ratio (SCR), is unbiased. We apply SCR to members of Congress, musical artists, and professional sports teams to show how massive co-following on Twitter can be used to map meaningful affiliations among cultural entities, even in the absence of direct connections to one another. Our results show how structural similarity can be used to map cultural alignments and demonstrate the potential usefulness of social media data in the study of culture, politics, and organizations across the social and behavioral sciences. Copyright © 2016 Elsevier Inc. All rights reserved.
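
    The two measures that the abstract reports as biased, the Jaccard index and cosine similarity, can be sketched for co-following data as below, where each entity is represented by the set of accounts that follow it. The account identifiers are hypothetical, and the Standardized Co-incident Ratio is not implemented because the abstract does not define it.

```python
import math

def jaccard(a: set, b: set) -> float:
    """Shared followers divided by the size of the combined follower set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(a: set, b: set) -> float:
    """Cosine similarity of the binary follower vectors of two entities."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

# Hypothetical follower sets for two entities.
followers_x = {"u1", "u2", "u3"}
followers_y = {"u2", "u3", "u4"}
print(jaccard(followers_x, followers_y), cosine(followers_x, followers_y))
```

    Both scores depend only on overlap and set sizes, which gives some intuition for why highly skewed degree distributions could distort them.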

  13. Phonological Similarity in American Sign Language.

    Science.gov (United States)

    Hildebrandt, Ursula; Corina, David

    2002-01-01

    Investigates deaf and hearing subjects' ratings of American Sign Language (ASL) signs to assess whether linguistic experience shapes judgments of sign similarity. Findings are consistent with linguistic theories that posit movement and location as core structural elements of syllable structure in ASL. (Author/VWL)

  14. Structural similarity and category-specificity

    DEFF Research Database (Denmark)

    Gerlach, Christian; Law, Ian; Paulson, Olaf B

    2004-01-01

    It has been suggested that category-specific recognition disorders for natural objects may reflect that natural objects are more structurally (visually) similar than artefacts and therefore more difficult to recognize following brain damage. On this account one might expect a positive relationshi...

  15. Music Retrieval based on Melodic Similarity

    NARCIS (Netherlands)

    Typke, R.

    2007-01-01

    This thesis introduces a method for measuring melodic similarity for notated music such as MIDI files. This music search algorithm views music as sets of notes that are represented as weighted points in the two-dimensional space of time and pitch. Two point sets can be compared by calculating how

  16. Measurement of Similarity in Academic Contexts

    Directory of Open Access Journals (Sweden)

    Omid Mahian

    2017-06-01

    Full Text Available We propose some reflections, comments and suggestions about the measurement of similar and matched content in scientific papers and documents, and the need to develop appropriate tools and standards for an ethically fair and equitable treatment of authors.

  17. Appropriate Similarity Measures for Author Cocitation Analysis

    NARCIS (Netherlands)

    N.J.P. van Eck (Nees Jan); L. Waltman (Ludo)

    2007-01-01

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of

  18. Similarity of Experience and Empathy in Preschoolers.

    Science.gov (United States)

    Barnett, Mark A.

    The present study examined the role of similarity of experience in young children's affective reactions to others. Some preschoolers played one of two games (Puzzle Board or Buckets) and were informed that they had either failed or succeeded; others merely observed the games being played and were given no evaluative feedback. Subsequently, each…

  19. Cultural Similarities and Differences on Idiom Translation

    Institute of Scientific and Technical Information of China (English)

    黄频频; 陈于全

    2010-01-01

    Both English and Chinese abound with idioms. Idioms are an important part of the language and culture of a society. English and Chinese idioms carved with cultural characteristics account for a great part in translation. This paper studies the translation of idioms concerning their cultural similarities, cultural differences and translation principles.

  20. Learning by similarity in coordination problems

    Czech Academy of Sciences Publication Activity Database

    Steiner, Jakub; Stewart, C.

    No. 324 (2007), pp. 1-40. ISSN 1211-3298. R&D Projects: GA MŠk LC542. Institutional research plan: CEZ:AV0Z70850503. Keywords: similarity; learning; case-based reasoning. Subject RIV: AH - Economics. http://www.cerge-ei.cz/pdf/wp/Wp324.pdf

  1. Outsourced similarity search on metric data assets

    KAUST Repository

    Yiu, Man Lung; Assent, Ira; Jensen, Christian Søndergaard; Kalnis, Panos

    2012-01-01

    for the most similar data objects to a query example. Outsourcing offers the data owner scalability and a low initial investment. The need for privacy may be due to the data being sensitive (e.g., in medicine), valuable (e.g., in astronomy), or otherwise

  2. Protein-protein interaction network-based detection of functionally similar proteins within species.

    Science.gov (United States)

    Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

    2012-07-01

    Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identifying these proteins is important for understanding biological functions, the evolution of protein families, the progression of co-evolution, convergent evolution, and other phenomena that cannot be studied by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks using graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that, sometimes, it is difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.
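
    One plausible reading of the "1-hop method" for splitting an interaction graph is sketched below: for each protein, take the protein, its direct interaction partners, and the interactions among them. Whether the authors also keep edges between neighbours is not stated in the abstract, so the induced-subgraph choice, the function name and the protein labels are all assumptions. The modified shortest path comparison is not reproduced.

```python
def one_hop_subgraph(edges, center):
    """Return (nodes, edges) of the subgraph induced by `center` and its
    direct neighbours in an undirected interaction network."""
    neighbors = set()
    for u, v in edges:
        if u == center and v != center:
            neighbors.add(v)
        elif v == center and u != center:
            neighbors.add(u)
    nodes = neighbors | {center}
    # Keep every interaction whose two endpoints both lie in the 1-hop node set;
    # sorting each pair makes (u, v) and (v, u) the same undirected edge.
    sub_edges = {tuple(sorted((u, v))) for u, v in edges
                 if u in nodes and v in nodes and u != v}
    return nodes, sub_edges

# Hypothetical interaction list.
ppi = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
print(one_hop_subgraph(ppi, "A"))
```

    Each protein then has its own small subgraph, which is the unit that a subgraph-comparison step would operate on.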

  3. Higher dimensional loop quantum cosmology

    International Nuclear Information System (INIS)

    Zhang, Xiangdong

    2016-01-01

    Loop quantum cosmology (LQC) is the symmetric sector of loop quantum gravity. In this paper, we generalize the structure of loop quantum cosmology to the theories with arbitrary spacetime dimensions. The isotropic and homogeneous cosmological model in n + 1 dimensions is quantized by the loop quantization method. Interestingly, we find that the underlying quantum theories are divided into two qualitatively different sectors according to spacetime dimensions. The effective Hamiltonian and modified dynamical equations of n + 1 dimensional LQC are obtained. Moreover, our results indicate that the classical big bang singularity is resolved in arbitrary spacetime dimensions by a quantum bounce. We also briefly discuss the similarities and differences between the n + 1 dimensional model and the 3 + 1 dimensional one. Our model serves as a first example of higher dimensional loop quantum cosmology and offers the possibility to investigate quantum gravity effects in higher dimensional cosmology. (orig.)

  4. Extending the Similarity-Attraction Effect : The effects of When-Similarity in mediated communication

    NARCIS (Netherlands)

    Kaptein, M.C.; Castaneda, D.; Fernandez, N.; Nass, C.

    2014-01-01

    The feeling of connectedness experienced in computer-mediated relationships can be explained by the similarity-attraction effect (SAE). Though SAE is well established in psychology, the effects of some types of similarity have not yet been explored. In 2 studies, we demonstrate similarity-attraction

  5. Higher English for CFE

    CERN Document Server

    Bridges, Ann; Mitchell, John

    2015-01-01

    A brand new edition of the former Higher English: Close Reading, completely revised and updated for the new Higher element (Reading for Understanding, Analysis and Evaluation) - worth 30% of marks in the final exam! We are working with SQA to secure endorsement for this title. Written by two highly experienced authors this book shows you how to practice for the Reading for Understanding, Analysis and Evaluation section of the new Higher English exam. This book introduces the terms and concepts that lie behind success and offers guidance on the interpretation of questions and targeting answer

  6. Predicting drug-target interaction for new drugs using enhanced similarity measures and super-target clustering.

    Science.gov (United States)

    Shi, Jian-Yu; Yiu, Siu-Ming; Li, Yiming; Leung, Henry C M; Chin, Francis Y L

    2015-07-15

    Predicting drug-target interaction using computational approaches is an important step in drug discovery and repositioning. To predict whether there will be an interaction between a drug and a target, most existing methods identify similar drugs and targets in the database. The prediction is then made based on the known interactions of these drugs and targets. This idea is promising. However, there are two shortcomings that have not yet been addressed appropriately. Firstly, most of the methods only use 2D chemical structures and protein sequences to measure the similarity of drugs and targets respectively. However, this information may not fully capture the characteristics determining whether a drug will interact with a target. Secondly, there are very few known interactions, i.e. many interactions are "missing" in the database. Existing approaches are biased towards known interactions and have no good solutions to handle possibly missing interactions which affect the accuracy of the prediction. In this paper, we enhance the similarity measures to include non-structural (and non-sequence-based) information and introduce the concept of a "super-target" to handle the problem of possibly missing interactions. Based on evaluations on real data, we show that our similarity measure is better than the existing measures and our approach is able to achieve higher accuracy than the two best existing algorithms, WNN-GIP and KBMF2K. Our approach is available at http://web.hku.hk/∼liym1018/projects/drug/drug.html or http://www.bmlnwpu.org/us/tools/PredictingDTI_S2/METHODS.html. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. AlignMe—a membrane protein sequence alignment web server

    Science.gov (United States)

    Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

    2014-01-01

    We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425

  8. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing, resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well-to-read distance) combines high resolution of DNA sequencing fragments with optimized run times, yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples × 500 bases/sample × 2 ten-hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical)) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of α-thio-dNTPs in the labeling reaction increases signal intensity two- to three-fold.
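
    The throughput figure quoted in this abstract can be reproduced with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and the value of four lanes per sample is an assumption inferred from the stated 64 lanes and 16 samples, not something the abstract says outright.

```python
def bases_per_day(samples_per_run: int, bases_per_sample: int, runs_per_day: int) -> int:
    """Daily throughput as samples x read length x runs."""
    return samples_per_run * bases_per_sample * runs_per_day


LANES = 64
LANES_PER_SAMPLE = 4  # assumption: inferred from 64 lanes yielding 16 samples

samples = LANES // LANES_PER_SAMPLE
daily = bases_per_day(samples, bases_per_sample=500, runs_per_day=2)
print(samples, daily)
```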

  9. HIV Sequence Compendium 2015

    Energy Technology Data Exchange (ETDEWEB)

    Foley, Brian Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas Kenneth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Cristian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Pennsylvania, Philadelphia, PA (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette Tina Marie [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-10-05

    This compendium is an annual printed summary of the data contained in the HIV sequence database. We try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2015. Hence, though it is published in 2015 and called the 2015 Compendium, its contents correspond to the 2014 curated alignments on our website. The number of sequences in the HIV database is still increasing. In total, at the end of 2014, there were 624,121 sequences in the HIV Sequence Database, an increase of 7% since the previous year. This is the first year that the number of new sequences added to the database has decreased compared to the previous year. The number of near complete genomes (>7000 nucleotides) increased to 5834 by end of 2014. However, as in previous years, the compendium alignments contain only a fraction of these. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  10. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito

    2007-09-01

    Abstract Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
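
    The partitioning idea described in this abstract can be sketched as a dynamic program. The code below is a minimal illustration, not the authors' algorithm: the function name, the +1/-1 scoring scheme and the restriction to forward-strand placements (no reverse complement) are all assumptions made for brevity. It returns only the best score, so it keeps two rows of the table; storing the full table for a traceback would need the O(|s| × |t| × N) memory mentioned in the abstract.

```python
def optimal_n_map_score(s: str, t: str, n_parts: int, match: int = 1, mismatch: int = -1):
    """Best total gapless score when s is cut into exactly n_parts contiguous
    parts and each part is laid, ungapped and forward-oriented, somewhere on t.
    Hypothetical simplification: reverse-complement placements are ignored."""
    if not 1 <= n_parts <= len(s) or not t:
        raise ValueError("need 1 <= n_parts <= len(s) and a non-empty t")
    neg = float("-inf")
    lt = len(t)
    # d_prev[k][j]: best score for the prefix of s read so far, split into k
    # parts, with its last character sitting on t[j].
    d_prev = [[neg] * lt for _ in range(n_parts + 1)]
    # best_prev[k]: maximum of d_prev[k] over all positions j.
    best_prev = [neg] * (n_parts + 1)
    best_prev[0] = 0  # empty prefix, zero parts
    for ch in s:
        d_cur = [[neg] * lt for _ in range(n_parts + 1)]
        best_cur = [neg] * (n_parts + 1)
        for k in range(1, n_parts + 1):
            for j in range(lt):
                sub = match if ch == t[j] else mismatch
                extend = d_prev[k][j - 1] if j > 0 else neg  # continue the current part
                start = best_prev[k - 1]                      # open a new part at t[j]
                cand = max(extend, start)
                if cand > neg:
                    d_cur[k][j] = cand + sub
            best_cur[k] = max(d_cur[k])
        d_prev, best_prev = d_cur, best_cur
    return best_prev[n_parts]


# A shuffled pair: one part cannot align well, two parts recover both blocks.
print(optimal_n_map_score("CCCAAA", "AAACCC", 1))
print(optimal_n_map_score("CCCAAA", "AAACCC", 2))
```

    The three nested loops give the O(|s| × |t| × N) running time quoted in the abstract.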

  11. Planning for Higher Education.

    Science.gov (United States)

    Lindstrom, Caj-Gunnar

    1984-01-01

    Decision processes for strategic planning for higher education institutions are outlined using these parameters: institutional goals and power structure, organizational climate, leadership attitudes, specific problem type, and problem-solving conditions and alternatives. (MSE)

  12. Advert for higher education

    OpenAIRE

    N.V. Provozin; А.S. Teletov

    2011-01-01

    The article discusses the features of advertising for a higher education institution. It analyzes the results of marketing research on students' choice of institutions and further study. Principles of the advertising campaign are set out on three levels: the university, the faculty, and the separate department.

  13. The Colliding Beams Sequencer

    International Nuclear Information System (INIS)

    Johnson, D.E.; Johnson, R.P.

    1989-01-01

    The Colliding Beam Sequencer (CBS) is a computer program used to operate the pbar-p Collider by synchronizing the applications programs and simulating the activities of the accelerator operators during filling and storage. The Sequencer acts as a meta-program, running otherwise stand alone applications programs, to do the set-up, beam transfers, acceleration, low beta turn on, and diagnostics for the transfers and storage. The Sequencer and its operational performance will be described along with its special features which include a periodic scheduler and command logger. 14 refs., 3 figs

  14. Phylogenetic Trees From Sequences

    Science.gov (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.
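
    As an illustration of how sequence divergence can feed a distance-based method, the sketch below computes the proportion of differing sites between two aligned sequences and applies the Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3). The choice of the Jukes-Cantor model and the function names are assumptions made for this example; the abstract does not say which models the chapter covers.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    skipping columns where either sequence has a gap ('-')."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance; infinite once p reaches saturation (0.75)."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jukes_cantor(p))
```

    The corrected distance is always at least as large as the raw proportion, because it accounts for multiple substitutions at the same site.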

  15. On higher derivative gravity

    International Nuclear Information System (INIS)

    Accioly, A.J.

    1987-01-01

    A possible classical route leading towards a general relativity theory with higher derivatives, starting, in a sense, from first principles, is analysed. A completely causal vacuum solution with the symmetries of the Goedel universe is obtained in the framework of this higher-derivative gravity. This very peculiar and rare result is the first known vacuum solution of the fourth-order gravity theory that is not a solution of the corresponding Einstein's equations.(Author) [pt

  16. Higher Spins & Strings

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    The conjectured relation between higher spin theories on anti de-Sitter (AdS) spaces and weakly coupled conformal field theories is reviewed. I shall then outline the evidence in favour of a concrete duality of this kind, relating a specific higher spin theory on AdS3 to a family of 2d minimal model CFTs. Finally, I shall explain how this relation fits into the framework of the familiar stringy AdS/CFT correspondence.

  17. Recalling visual serial order for verbal sequences

    NARCIS (Netherlands)

    Logie, R.H.; Saito, S.; Morita, A.; Varma, S.; Norris, D.

    2016-01-01

    We report three experiments in which participants performed written serial recall of visually presented verbal sequences with items varying in visual similarity. In Experiments 1 and 2 native speakers of Japanese recalled visually presented Japanese Kanji characters. In Experiment 3, native speakers

  18. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  19. Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

    Science.gov (United States)

    Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

    2009-03-31

    The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. Therefore, our
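
    The region lengths reported in this abstract are mutually consistent, which a short check makes explicit. The sketch assumes the usual quadripartite layout of a chloroplast genome (one LSC, one SSC and two IR copies); the variable names are hypothetical, and the Ranunculus length is derived here from the stated 4,797 bp difference rather than quoted from the abstract.

```python
# Region lengths in bp, as given in the abstract.
lsc, ssc, ir = 88_326, 18_382, 26_608

# Assumed quadripartite structure: LSC + SSC + two inverted-repeat copies.
total = lsc + ssc + 2 * ir
print(total)

# Implied Ranunculus cp genome length, from the stated 4,797 bp difference.
ranunculus_length = total - 4_797
print(ranunculus_length)
```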

  20. Popularity versus similarity in growing networks

    Science.gov (United States)

    Krioukov, Dmitri; Papadopoulos, Fragkiskos; Kitsak, Maksim; Serrano, Mariangeles; Boguna, Marian

    2012-02-01

    Preferential attachment is a powerful mechanism explaining the emergence of scaling in growing networks. If new connections are established preferentially to more popular nodes in a network, then the network is scale-free. Here we show that not only popularity but also similarity is a strong force shaping the network structure and dynamics. We develop a framework where new connections, instead of preferring popular nodes, optimize certain trade-offs between popularity and similarity. The framework admits a geometric interpretation, in which preferential attachment emerges from local optimization processes. As opposed to preferential attachment, the optimization framework accurately describes large-scale evolution of technological (Internet), social (web of trust), and biological (E.coli metabolic) networks, predicting the probability of new links in them with a remarkable precision. The developed framework can thus be used for predicting new links in evolving networks, and provides a different perspective on preferential attachment as an emergent phenomenon.

  1. Similarity, trust in institutions, affect, and populism

    DEFF Research Database (Denmark)

    Scholderer, Joachim; Finucane, Melissa L.

    -based evaluations are fundamental to human information processing, they can contribute significantly to other judgments (such as the risk, cost-effectiveness, trustworthiness) of the same stimulus object. Although deliberation and analysis are certainly important in some decision-making circumstances, reliance...... on affect is a quicker, easier, and a more efficient way of navigating in a complex and uncertain world. Hence, many theorists give affect a direct and primary role in motivating behavior. Taken together, the results provide uncannily strong support for the value-similarity hypothesis, strengthening...... types of information about gene technology. The materials were attributed to different institutions. The results indicated that participants' trust in an institution was a function of the similarity between the position advocated in the materials and participants' own attitudes towards gene technology...

  2. Contingency and similarity in response selection.

    Science.gov (United States)

    Prinz, Wolfgang

    2018-05-09

    This paper explores issues of task representation in choice reaction time tasks. How is it possible, and what does it take, to represent such a task in a way that enables a performer to do the task in line with the prescriptions entailed in the instructions? First, a framework for task representation is outlined which combines the implementation of task sets and their use for performance with different kinds of representational operations (pertaining to feature compounds for event codes and code assemblies for task sets, respectively). Then, in a second step, the framework is itself embedded in the bigger picture of the classical debate on the roles of contingency and similarity for the formation of associations. The final conclusion is that both principles are needed and that the operation of similarity at the level of task sets requires and presupposes the operation of contingency at the level of event codes. Copyright © 2018 The Author. Published by Elsevier Inc. All rights reserved.

  3. Similarity and Modeling in Science and Engineering

    CERN Document Server

    Kuneš, Josef

    2012-01-01

    The present text sets itself in relief to other titles on the subject in that it addresses the means and methodologies versus a narrow specific-task oriented approach. Concepts and their developments which evolved to meet the changing needs of applications are addressed. This approach provides the reader with a general tool-box to apply to their specific needs. Two important tools are presented: dimensional analysis and the similarity analysis methods. The fundamental point of view, enabling one to sort all models, is that of information flux between a model and an original expressed by the similarity and abstraction. Each chapter includes original examples and applications. In this respect, the models can be divided into several groups. The following models are dealt with separately by chapter: mathematical and physical models, physical analogues, deterministic, stochastic, and cybernetic computer models. The mathematical models are divided into asymptotic and phenomenological models. The phenomenological m...

  4. Similarity solutions for phase-change problems

    Science.gov (United States)

    Canright, D.; Davis, S. H.

    1989-01-01

    A modification of Ivantsov's (1947) similarity solutions is proposed which can describe phase-change processes which are limited by diffusion. The method has application to systems that have n-components and possess cross-diffusion and Soret and Dufour effects, along with convection driven by density discontinuities at the two-phase interface. Local thermal equilibrium is assumed at the interface. It is shown that analytic solutions are possible when the material properties are constant.

  5. Stochastic self-similar and fractal universe

    International Nuclear Information System (INIS)

    Iovane, G.; Laserra, E.; Tortoriello, F.S.

    2004-01-01

    The structures formation of the Universe appears as if it were a classically self-similar random process at all astrophysical scales. An agreement is demonstrated for the present hypotheses of segregation with a size of astrophysical structures by using a comparison between quantum quantities and astrophysical ones. We present the observed segregated Universe as the result of a fundamental self-similar law, which generalizes the Compton wavelength relation. It appears that the Universe has a memory of its quantum origin as suggested by R. Penrose with respect to quasi-crystal. A more accurate analysis shows that the present theory can be extended from the astrophysical to the nuclear scale by using generalized (stochastically) self-similar random process. This transition is connected to the relevant presence of the electromagnetic and nuclear interactions inside the matter. In this sense, the presented rule is correct from a subatomic scale to an astrophysical one. We discuss the near full agreement at organic cell scale and human scale too. Consequently the Universe, with its structures at all scales (atomic nucleus, organic cell, human, planet, solar system, galaxy, clusters of galaxy, super clusters of galaxy), could have a fundamental quantum reason. In conclusion, we analyze the spatial dimensions of the objects in the Universe as well as space-time dimensions. The result is that it seems we live in an El Naschie's E-infinity Cantorian space-time; so we must seriously start considering fractal geometry as the geometry of nature, a type of arena where the laws of physics appear at each scale in a self-similar way as advocated long ago by the Swedish school of astrophysics

  6. Similarity-based Polymorphic Shellcode Detection

    Directory of Open Access Journals (Sweden)

    Denis Yurievich Gamayunov

    2013-02-01

    In this work, a method for polymorphic shellcode detection based on a set of known shellcodes is proposed. The main idea of the method is to sequentially apply deobfuscating transformations to the analyzed data and then to recognize similarity with malware samples. The method has been tested on sets of shellcodes generated using Metasploit Framework v.4.1.0 and PELock Obfuscator and shows 87% precision with a zero false-positive rate.

  7. Quasi-Similarity Model of Synthetic Jets

    Czech Academy of Sciences Publication Activity Database

    Tesař, Václav; Kordík, Jozef

    2009-01-01

    Vol. 149, No. 2 (2009), pp. 255-265 ISSN 0924-4247 R&D Projects: GA AV ČR IAA200760705; GA ČR GA101/07/1499 Institutional research plan: CEZ:AV0Z20760514 Keywords: jets * synthetic jets * similarity solution Subject RIV: BK - Fluid Dynamics Impact factor: 1.674, year: 2009 http://www.sciencedirect.com

  8. Multidimensional Scaling Visualization using Parametric Similarity Indices

    OpenAIRE

    Machado, J. A. Tenreiro; Lopes, António M.; Galhano, A.M.

    2015-01-01

    In this paper, we apply multidimensional scaling (MDS) and parametric similarity indices (PSI) in the analysis of complex systems (CS). Each CS is viewed as a dynamical system, exhibiting an output time-series to be interpreted as a manifestation of its behavior. We start by adopting a sliding window to sample the original data into several consecutive time periods. Second, we define a given PSI for tracking pieces of data. We then compare the windows for different values of the parameter, an...

  9. The fluid similarity of the boiling crisis

    International Nuclear Information System (INIS)

    Katsaounis, A.

    1986-01-01

    Most of the measurements related to the boiling crisis have, until now, been undertaken for a wide parameter variation in the water, and were mainly related to the water-cooled reactor. This article investigates, whether or how the measuring results can be transferred to other fluids. Derived dimensionless similarity figures and those taken from literature are verified by measurements from complex geometries in water and freon 12. (orig.) [de

  10. The fluid similarity of the boiling crisis

    International Nuclear Information System (INIS)

    Katsaounis, A.

    1987-01-01

    Most of the measurements related to the boiling crisis have, until now, been undertaken for a wide parameter variation in the water, and were mainly related to the water-cooled reactor. This article investigates, whether or how the measuring results can be transferred to other fluids. Derived dimensionless similarity figures and those taken from literature are verified by measurements from complex geometries in water and freon 12. (orig./GL) [de

  11. Brand name confusion: Subjective and objective measures of orthographic similarity.

    Science.gov (United States)

    Burt, Jennifer S; McFarlane, Kimberley A; Kelly, Sarah J; Humphreys, Michael S; Weatherall, Kimberlee; Burrell, Robert G

    2017-09-01

    Determining brand name similarity is vital in areas of trademark registration and brand confusion. Students rated the orthographic (spelling) similarity of word pairs (Experiments 1, 2, and 4) and brand name pairs (Experiment 5). Similarity ratings were consistently higher when words shared beginnings rather than endings, whereas shared pronunciation of the stressed vowel had small and less consistent effects on ratings. In Experiment 3 a behavioral task confirmed the similarity of shared beginnings in lexical processing. Specifically, in a task requiring participants to decide whether 2 words presented in the clear (a probe and a later target) were the same or different, a masked prime word preceding the target shortened response latencies if it shared its initial 3 letters with the target. The ratings of students for word and brand name pairs were strongly predicted by metrics of orthographic similarity from the visual word identification literature based on the number of shared letters and their relative positions. The results indicate a potential use for orthographic metrics in brand name registration and trademark law. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  12. Semantic Similarity between Web Documents Using Ontology

    Science.gov (United States)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-06-01

    The World Wide Web is a source of information structured as interlinked web pages. However, extracting significant information with the assistance of a search engine is highly challenging, because web information is written mainly in natural language and is intended for individual human readers. Several efforts have been made at computing semantic similarity between documents using words, concepts and concept relationships, but the available outcomes still do not meet user requirements. This paper proposes a novel technique for computing semantic similarity between documents that takes into account not only the concepts available in documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, the document ontologies are compared to find their semantic similarity by taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results compared to the existing techniques.

  13. Semantic Similarity between Web Documents Using Ontology

    Science.gov (United States)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-03-01

    The World Wide Web is a source of information structured as interlinked web pages. However, extracting significant information with the assistance of a search engine is highly challenging, because web information is written mainly in natural language and is intended for individual human readers. Several efforts have been made at computing semantic similarity between documents using words, concepts and concept relationships, but the available outcomes still do not meet user requirements. This paper proposes a novel technique for computing semantic similarity between documents that takes into account not only the concepts available in documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary containing concept records. Each such record is made up of the probable words which represent a given concept. Finally, the document ontologies are compared to find their semantic similarity by taking into account the relationships among concepts. Relevant concepts and relations between the concepts have been explored by capturing author and user intention. The proposed semantic analysis technique provides improved results compared to the existing techniques.

  14. Exploration of noncoding sequences in metagenomes.

    Directory of Open Access Journals (Sweden)

    Fabián Tobar-Tosse

    Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identification of ORFs and their functional categories are the most common methods for association between functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore, some metagenome regulatory or structural information could be discarded. In this work we analyzed 23 whole metagenomes, including coding and noncoding sequences, using the following sequence patterns: (G+C) content, Codon Usage (Cd), Trinucleotide Usage (Tn), and functional assignments for ORF prediction. Herein, we present evidence of a high proportion of noncoding sequences discarded in common similarity-based methods in metagenomics, and the kind of relevant information present in those. We found a high density of trinucleotide repeat sequences (TRS) in noncoding sequences, with a regulatory and adaptive function for metagenome communities. We present associations between trinucleotide values and gene function, where metagenome clustering correlates with microorganism adaptations and kinds of metagenomes. We propose here that noncoding sequences have relevant information to describe metagenomes that could be considered in a whole metagenome analysis in order to improve their organization, classification protocols, and their relation with the environment.
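
    Two of the sequence patterns named in this abstract, (G+C) content and trinucleotide usage, are simple to compute from a raw sequence. The sketch below is only illustrative: the function names are hypothetical, and counting trinucleotides over overlapping windows while skipping ambiguous bases is an assumption, since the abstract does not specify how the authors counted them.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def trinucleotide_usage(seq: str) -> dict:
    """Relative frequency of each trinucleotide over overlapping windows.
    Assumption: windows containing anything other than A, C, G, T are skipped."""
    seq = seq.upper()
    windows = (seq[i:i + 3] for i in range(len(seq) - 2))
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}


print(gc_content("GGCCAATT"))
print(trinucleotide_usage("ACGACG"))
```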

  15. Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm.

    Science.gov (United States)

    Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee

    2014-02-01

    Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method, named 'Subgrouping Automata' (SA), was constructed. First, the tree analysis module analyzes the structure of the tree and calculates all possible subgroups at each node. The sequence similarity analysis module calculates the average sequence similarity for all subgroups at each node. The representative sequence generation module finds a representative sequence using profile analysis and self-scoring for each subgroup. For all nodes, average sequence similarities are calculated, and 'Subgrouping Automata' searches for the node showing the statistically maximum increase in sequence similarity using Student's t-value. The node showing the maximum t-value, which gives the most significant difference in average sequence similarity between two adjacent nodes, is determined to be the optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.

  16. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2011-10-01

    Background: Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding, and whether barcoding of BACs is dispensable. Results: Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion: Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  17. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  18. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  19. Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment

    Directory of Open Access Journals (Sweden)

    Daniels Noah M

    2012-10-01

    Background: The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results: We present Formatt, a multiple structure alignment program based on the purely geometric Matt multiple structure alignment program, that also takes into account sequence similarity when constructing alignments. We show that Formatt outperforms Matt and other popular structure alignment programs on the popular HOMSTRAD benchmark. For the SABMark twilight zone benchmark set that captures more remote homology, Formatt and Matt outperform other programs; depending on the choice of embedded sequence aligner, Formatt produces either better sequence and structural alignments with a smaller core size than Matt, or similarly sized alignments with better sequence similarity, for a small cost in average RMSD. Conclusions: Considering sequence information as well as purely geometric information seems to improve the quality of multiple structure alignments, though defining what constitutes the best alignment when sequence and structural measures would suggest different alignments remains a difficult open question.

  20. Dynamic Sequence Assignment.

    Science.gov (United States)

    1983-12-01

    D-136 548 DYNAMIC SEQUENCE ASSIGNMENT (U). ADVANCED INFORMATION AND DECISION SYSTEMS, MOUNTAIN VIEW, CA. C A O'REILLY ET AL. UNCLASSIFIED. DEC 83. AI/DS...I ADVANCED INFORMATION & DECISION SYSTEMS, Mountain View, CA 94040. Unclassified. SECURITY CLASSIFICATION OF THIS PAGE. REPORT...reviews some important heuristic algorithms developed for faster solution of the sequence assignment problem. 3.1. DYNAMIC PROGRAMMING FORMULATION FOR

  1. HIV Sequence Compendium 2010

    Energy Technology Data Exchange (ETDEWEB)

    Kuiken, Carla [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Foley, Brian [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Christian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Alabama, Tuscaloosa, AL (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2010-12-31

    This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2010. Hence, though it is called the 2010 Compendium, its contents correspond to the 2009 curated alignments on our website. The number of sequences in the HIV database is still increasing exponentially. In total, at the time of printing, there were 339,306 sequences in the HIV Sequence Database, an increase of 45% since last year. The number of near complete genomes (>7000 nucleotides) increased to 2576 by end of 2009, reflecting a smaller increase than in previous years. However, as in previous years, the compendium alignments contain only a small fraction of these. Included in the alignments are a small number of sequences representing each of the subtypes and the more prevalent circulating recombinant forms (CRFs) such as 01 and 02, as well as a few outgroup sequences (group O and N and SIV-CPZ). Of the rarer CRFs we included one representative each. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. Reprints are available from our website in the form of both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  2. General LTE Sequence

    OpenAIRE

    Billal, Masum

    2015-01-01

    In this paper, we have characterized sequences which maintain the same property described in the Lifting the Exponent Lemma. The Lifting the Exponent Lemma is a very powerful tool in olympiad number theory and has recently become very popular. We generalize it to all sequences that maintain a similar property, i.e. if p^{\alpha} || a_k and p^{\beta} || n, then p^{\alpha+\beta} || a_{nk}.
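
    The stated property can be checked numerically on the classical case. The sketch below assumes the sequence a_k = x^k - y^k with an odd prime p dividing x - y and dividing neither x nor y, which is the standard setting of the Lifting the Exponent Lemma. The values p = 7, x = 10, y = 3 and the function names are illustrative choices, not taken from the paper.

```python
def p_adic_valuation(n: int, p: int) -> int:
    """Largest e such that p**e divides n (n must be nonzero)."""
    if n == 0:
        raise ValueError("valuation of 0 is undefined")
    n = abs(n)
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def a(k: int, x: int, y: int) -> int:
    """Illustrative sequence a_k = x**k - y**k."""
    return x**k - y**k


def lte_property_holds(p: int, x: int, y: int, max_k: int, max_n: int) -> bool:
    """Check v_p(a_{nk}) == v_p(a_k) + v_p(n) for 1 <= k <= max_k, 1 <= n <= max_n."""
    for k in range(1, max_k + 1):
        alpha = p_adic_valuation(a(k, x, y), p)
        for n in range(1, max_n + 1):
            beta = p_adic_valuation(n, p)
            if p_adic_valuation(a(n * k, x, y), p) != alpha + beta:
                return False
    return True


if __name__ == "__main__":
    # 7 divides 10 - 3, 7 is odd, and 7 divides neither 10 nor 3.
    print(lte_property_holds(p=7, x=10, y=3, max_k=5, max_n=50))
```

    Here p^{\alpha} || a_k corresponds to `p_adic_valuation(a(k, x, y), p) == alpha`. The check covers only the tested range and is not a proof.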

  3. Fuel Class Higher Alcohols

    KAUST Repository

    Sarathy, Mani

    2016-08-17

    This chapter focuses on the production and combustion of alcohol fuels with four or more carbon atoms, which we classify as higher alcohols. It assesses the feasibility of utilizing various C4-C8 alcohols as fuels for internal combustion engines. Utilizing higher-molecular-weight alcohols as fuels requires careful analysis of their fuel properties. ASTM standards provide fuel property requirements for spark-ignition (SI) and compression-ignition (CI) engines such as the stability, lubricity, viscosity, and cold filter plugging point (CFPP) properties of blends of higher alcohols. Important combustion properties that are studied include laminar and turbulent flame speeds, flame blowout/extinction limits, ignition delay under various mixing conditions, and gas-phase and particulate emissions. The chapter focuses on the combustion of higher alcohols in reciprocating SI and CI engines and discusses higher alcohol performance in SI and CI engines. Finally, the chapter identifies the sources, production pathways, and technologies currently being pursued for production of some fuels, including n-butanol, iso-butanol, and n-octanol.

  4. Pairwise Sequence Alignment Library

    Energy Technology Data Exchange (ETDEWEB)

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: reference implementations of all known vectorized sequence alignment approaches; implementations of Smith-Waterman (SW), semi-global (SG), and Needleman-Wunsch (NW) sequence alignment algorithms; implementations across all modern CPU instruction sets including AVX2 and KNC; and language interfaces for C/C++ and Python.
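
    The data dependencies mentioned above are easiest to see in a scalar reference version of Smith-Waterman. The sketch below is plain Python and does not use the parasail API. The linear gap penalty and the match/mismatch scores are simplifying assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b (scalar, linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell depends on its upper-left, upper, and left neighbours.
            # The dependence on curr[j - 1] (same row) is what makes a naive
            # vectorization along the row difficult.
            curr[j] = max(0,
                          prev[j - 1] + sub,
                          prev[j] + gap,
                          curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

    Striped and scan-based SIMD layouts reorganize the evaluation order of these cells to work around the same-row dependence. This scalar version only shows the recurrence they have to respect.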

  5. Constraint Satisfaction Inference : Non-probabilistic Global Inference for Sequence Labelling

    NARCIS (Netherlands)

    Canisius, S.V.M.; van den Bosch, A.; Daelemans, W.; Basili, R.; Moschitti, A.

    2006-01-01

    We present a new method for performing sequence labelling based on the idea of using a machine-learning classifier to generate several possible output sequences, and then applying an inference procedure to select the best sequence among those. Most sequence labelling methods following a similar

  6. Higher spin gauge theories

    CERN Document Server

    Henneaux, Marc; Vasiliev, Mikhail A

    2017-01-01

    Symmetries play a fundamental role in physics. Non-Abelian gauge symmetries are the symmetries behind theories for massless spin-1 particles, while the reparametrization symmetry is behind Einstein's gravity theory for massless spin-2 particles. In supersymmetric theories these particles can be connected also to massless fermionic particles. Does Nature stop at spin-2 or can there also be massless higher spin theories? In the past strong indications have been given that such theories do not exist. However, in recent times ways to evade those constraints have been found and higher spin gauge theories have been constructed. With the advent of the AdS/CFT duality correspondence even stronger indications have been given that higher spin gauge theories play an important role in fundamental physics. All these issues were discussed at an international workshop in Singapore in November 2015 where the leading scientists in the field participated. This volume presents an up-to-date, detailed overview of the theories i...

  7. INTERNATIONALIZATION IN HIGHER EDUCATION

    Directory of Open Access Journals (Sweden)

    Catalina Crisan-Mitra

    2016-03-01

    Internationalization of higher education is one of the key trends of development. There are several approaches to achieving competitiveness and performance in higher education, and international academic mobility, student exchange programs and partnerships are some of the aspects that can play a significant role in this process. This paper examines students' perceptions along two main directions: first, master's students' expectations regarding how an internationalized master's program should be organized and should function, and second, the degree of satisfaction of the beneficiaries of internationalized master's programs at Babeș-Bolyai University. The article is based on empirical qualitative research conducted among students of an internationalized master's program at the Faculty of Economics and Business Administration. This research can be considered a useful example for those concerned with increasing the quality of higher education, and the conclusions drawn have both theoretical and, especially, practical relevance.

  8. Quality of Higher Education

    DEFF Research Database (Denmark)

    Zou, Yihuan; Zhao, Yingsheng; Du, Xiangyun

    . This transformation involves a broad scale of change at individual level, organizational level, and societal level. In this change process in higher education, staff development remains one of the key elements for university innovation and at the same time demands a systematic and holistic approach.......This paper starts with a critical approach to reflect on the current practice of quality assessment and assurance in higher education. This is followed by a proposal that in response to the global challenges for improving the quality of higher education, universities should take active actions...... of change by improving the quality of teaching and learning. From a constructivist perspective of understanding education and learning, this paper also discusses why and how universities should give more weight to learning and change the traditional role of teaching to an innovative approach of facilitation...

  9. Molecular diagnosis of Usher syndrome: application of two different next generation sequencing-based procedures.

    Directory of Open Access Journals (Sweden)

    Danilo Licastro

    Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of which were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our result highlights the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need of a strong coverage and to the correct interpretation of sequence variations with a non obvious, pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH that may be also caused by the presence of additional causative genes yet to be identified.

  10. Molecular Diagnosis of Usher Syndrome: Application of Two Different Next Generation Sequencing-Based Procedures

    Science.gov (United States)

    Licastro, Danilo; Mutarelli, Margherita; Peluso, Ivana; Neveling, Kornelia; Wieskamp, Nienke; Rispoli, Rossella; Vozzi, Diego; Athanasakis, Emmanouil; D'Eustacchio, Angela; Pizzo, Mariateresa; D'Amico, Francesca; Ziviello, Carmela; Simonelli, Francesca; Fabretto, Antonella; Scheffer, Hans; Gasparini, Paolo; Banfi, Sandro; Nigro, Vincenzo

    2012-01-01

    Usher syndrome (USH) is a clinically and genetically heterogeneous disorder characterized by visual and hearing impairments. Clinically, it is subdivided into three subclasses with nine genes identified so far. In the present study, we investigated whether the currently available Next Generation Sequencing (NGS) technologies are already suitable for molecular diagnostics of USH. We analyzed a total of 12 patients, most of which were negative for previously described mutations in known USH genes upon primer extension-based microarray genotyping. We enriched the NGS template either by whole exome capture or by Long-PCR of the known USH genes. The main NGS sequencing platforms were used: SOLiD for whole exome sequencing, Illumina (Genome Analyzer II) and Roche 454 (GS FLX) for the Long-PCR sequencing. Long-PCR targeting was more efficient with up to 94% of USH gene regions displaying an overall coverage higher than 25×, whereas whole exome sequencing yielded a similar coverage for only 50% of those regions. Overall this integrated analysis led to the identification of 11 novel sequence variations in USH genes (2 homozygous and 9 heterozygous) out of 18 detected. However, at least two cases were not genetically solved. Our result highlights the current limitations in the diagnostic use of NGS for USH patients. The limit for whole exome sequencing is linked to the need of a strong coverage and to the correct interpretation of sequence variations with a non obvious, pathogenic role, whereas the targeted approach suffers from the high genetic heterogeneity of USH that may be also caused by the presence of additional causative genes yet to be identified. PMID:22952768

  11. Reputation in Higher Education

    DEFF Research Database (Denmark)

    Martensen, Anne; Grønholdt, Lars

    2005-01-01

    leaders of higher education institutions to set strategic directions and support their decisions in an effort to create even better study programmes with a better reputation. Finally, managerial implications and directions for future research are discussed.Keywords: Reputation, image, corporate identity......The purpose of this paper is to develop a reputation model for higher education programmes, provide empirical evidence for the model and illustrate its application by using Copenhagen Business School (CBS) as the recurrent case. The developed model is a cause-and-effect model linking image...

  12. Reputation in Higher Education

    DEFF Research Database (Denmark)

    Plewa, Carolin; Ho, Joanne; Conduit, Jodie

    2016-01-01

    Reputation is critical for institutions wishing to attract and retain students in today's competitive higher education setting. Drawing on the resource based view and configuration theory, this research proposes that Higher Education Institutions (HEIs) need to understand not only the impact...... of independent resources but of resource configurations when seeking to achieve a strong, positive reputation. Utilizing fuzzy set qualitative comparative analysis (fsQCA), the paper provides insight into different configurations of resources that HEIs can utilize to build their reputation within their domestic...

  13. Navigating in higher education

    DEFF Research Database (Denmark)

    Thingholm, Hanne Balsby; Reimer, David; Keiding, Tina Bering

    This report is based on the questionnaire survey Navigating in Higher Education (NiHE), which comprises responses from 1410 bachelor's students and 283 teachers across nine degree programmes at Aarhus University: Educational Science, History, Nordic Language and Literature, Informati......

  14. Higher osmium beryllide

    International Nuclear Information System (INIS)

    Matyushenko, N.N.; Verkhorobin, L.F.; Serykh, V.P.; Pugachev, N.S.

    1982-01-01

    Results of an experimental determination of the composition and crystal structure of the new beryllide OsBe₁₂ are presented. The beryllide is observed to be in equilibrium with Os₂Be₁₇ (in the range of 90-92% Be) and the α-Be phase (in the range of 93-99% Be). The structure of OsBe₁₂ is similar to the structures of the known beryllides Os₂Be₁₇ and Os₃Be₁₇

  15. Higher Education: A Time for Triage?

    Science.gov (United States)

    Lagowski, J. J.

    1995-10-01

    Higher education faces unprecedented challenges. The confluence of changing economic and demographic trends; new patterns of federal and state spending; more explicit expectations by students and their families for affordable, accessible education; and heightened scrutiny by those who claim a legitimate interest in higher education is inescapably altering the environment in which this system operates. Higher education will never again be as it was before. Further, many believe that tinkering around the margins is no longer an adequate response to the new demands. Fundamental change is deemed necessary to meet the challenge of this melange of pressures. A number of commentators have observed that political and corporate America have responded to their challenges by instituting a fundamental restructuring of those institutions. The medical community is also in the midst of a similar basic restructuring of the health care delivery system in this country. Now it's education's turn. People are questioning the historically expressed mission of higher education. They make the claim that we cost too much, spend carelessly, teach poorly, plan myopically, and when questioned, act defensively. Educational administrators, from department chairs up, are confronted with the task of simultaneously reforming and cutting back. They have no choice. They must establish politically sophisticated priority settings and effect a hard-nosed reallocation of resources in a social environment where competing public needs have equivalent, or stronger, emotional pulls. Triage in a medical context involves confronting an emergency in which the demand for attention far outstrips available assistance by establishing a sequence of care in which one key individual orchestrates the application of harsh priorities which have been designed to maximize the number of survivors. In recent years, the decisions that have been made in some centers of higher education bear a striking similarity. 
The literature

  16. Emergent self-similarity of cluster coagulation

    Science.gov (United States)

    Pushkin, Dmtiri O.

    A wide variety of nonequilibrium processes, such as coagulation of colloidal particles, aggregation of bacteria into colonies, coalescence of rain drops, bond formation between polymerization sites, and formation of planetesimals, fall under the rubric of cluster coagulation. We predict emergence of self-similar behavior in such systems when they are 'forced' by an external source of the smallest particles. The corresponding self-similar coagulation spectra prove to be power laws. Starting from the classical Smoluchowski coagulation equation, we identify the conditions required for emergence of self-similarity and show that the power-law exponent value for a particular coagulation mechanism depends on the homogeneity index of the corresponding coagulation kernel only. Next, we consider the current wave of mergers of large American banks as an 'unorthodox' application of coagulation theory. We predict that the bank size distribution has a propensity to become a power law, and verify our prediction in a statistical study of the available economic data. We conclude this chapter by discussing the economically significant phenomenon of capital condensation and predicting emergence of power-law distributions in other economic and social data. Finally, we turn to the apparent semblance between cluster coagulation and turbulence and conclude that it is not accidental: both of these processes are instances of nonlinear cascades. This class of processes also includes river network formation models, certain force-chain models in granular mechanics, fragmentation due to collisional cascades, percolation, and growing random networks. We characterize a particular cascade by three indices and show that the resulting power-law spectrum exponent depends on the index values only. The ensuing algebraic formula is remarkable for its simplicity.
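
    A minimal numerical illustration of a forced coagulation spectrum is sketched below. It assumes the simplest case only: the discrete Smoluchowski equation with a constant kernel K (homogeneity index zero) and a source J of monomers. Setting the time derivative to zero gives n_1 = J/(K N) and n_k = (1/(2N)) Σ_{i+j=k} n_i n_j for k ≥ 2, with total number density N = sqrt(2J/K). The abstract does not state the general exponent formula, so none is reproduced here. The function names and parameter values are hypothetical.

```python
import math


def steady_state_spectrum(k_max: int, kernel: float = 1.0, source: float = 0.5) -> list[float]:
    """Steady-state cluster densities n[1..k_max] for a constant kernel
    with a monomer source (index 0 of the returned list is unused)."""
    total = math.sqrt(2.0 * source / kernel)  # N = sqrt(2J/K)
    n = [0.0] * (k_max + 1)
    n[1] = source / (kernel * total)          # loss of monomers balances the source
    for k in range(2, k_max + 1):
        gain = sum(n[i] * n[k - i] for i in range(1, k))
        n[k] = gain / (2.0 * total)           # gain from mergers balances loss
    return n


def local_exponent(n: list[float], k: int) -> float:
    """Slope of log n_k versus log k, estimated between sizes k and 2k."""
    return math.log(n[2 * k] / n[k]) / math.log(2.0)


if __name__ == "__main__":
    spectrum = steady_state_spectrum(1000)
    print(f"local exponent near k = 500: {local_exponent(spectrum, 500):.4f}")
```

    For this constant-kernel case the generating function of the spectrum is N - sqrt(N^2 - 2Jz/K), whose square-root singularity implies a k^{-3/2} tail. The printed local exponent should therefore be close to -1.5.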

  17. Spherically symmetric self-similar universe

    Energy Technology Data Exchange (ETDEWEB)

    Dyer, C C [Toronto Univ., Ontario (Canada)

    1979-10-01

    A spherically symmetric self-similar dust-filled universe is considered as a simple model of a hierarchical universe. Observable differences between the model in parabolic expansion and the corresponding homogeneous Einstein-de Sitter model are considered in detail. It is found that an observer at the centre of the distribution has a maximum observable redshift and can in principle see arbitrarily large blueshifts. It is found to yield an observed density-distance law different from that suggested by the observations of de Vaucouleurs. The use of these solutions as central objects for Swiss-cheese vacuoles is discussed.

  18. Image magnification based on similarity analogy

    International Nuclear Information System (INIS)

    Chen Zuoping; Ye Zhenglin; Wang Shuxun; Peng Guohua

    2009-01-01

    To address the high time complexity of the decoding phase in traditional image enlargement methods based on fractal coding, this paper proposes a novel image magnification algorithm with iteration-free decoding, which uses the similarity analogy between an image and its zoomed-out and zoomed-in versions. A new pixel selection technique is also presented to further improve the performance of the proposed method. Furthermore, by combining some existing fractal zooming techniques, an efficient image magnification algorithm is obtained, which provides image quality as good as the state of the art while greatly decreasing the time complexity of the decoding phase.

  19. Similar on the Inside (pre-grinding)

    Science.gov (United States)

    2004-01-01

    This approximate true-color image taken by the panoramic camera on the Mars Exploration Rover Opportunity shows the rock called 'Pilbara' located in the small crater dubbed 'Fram.' The rock appears to be dotted with the same 'blueberries,' or spherules, found at 'Eagle Crater.' Opportunity drilled into this rock with its rock abrasion tool. After analyzing the hole with the rover's scientific instruments, scientists concluded that Pilbara has a similar chemical make-up, and thus watery past, to rocks studied at Eagle Crater. This image was taken with the panoramic camera's 480-, 530- and 600-nanometer filters.

  20. Similar on the Inside (post-grinding)

    Science.gov (United States)

    2004-01-01

    This approximate true-color image taken by the panoramic camera on the Mars Exploration Rover Opportunity shows the hole drilled into the rock called 'Pilbara,' which is located in the small crater dubbed 'Fram.' Opportunity drilled into this rock with its rock abrasion tool. The rock appears to be dotted with the same 'blueberries,' or spherules, found at 'Eagle Crater.' After analyzing the hole with the rover's scientific instruments, scientists concluded that Pilbara has a similar chemical make-up, and thus watery past, to rocks studied at Eagle Crater. This image was taken with the panoramic camera's 480-, 530- and 600-nanometer filters.

  1. Self-similar magnetohydrodynamic boundary layers

    Energy Technology Data Exchange (ETDEWEB)

    Nunez, Manuel; Lastra, Alberto, E-mail: mnjmhd@am.uva.e [Departamento de Analisis Matematico, Universidad de Valladolid, 47005 Valladolid (Spain)

    2010-10-15

    The boundary layer created by parallel flow in a magnetized fluid of high conductivity is considered in this paper. Under appropriate boundary conditions, self-similar solutions analogous to the ones studied by Blasius for the hydrodynamic problem may be found. It is proved that for these to be stable, the size of the Alfven velocity at the outer flow must be smaller than the flow velocity, a fact that has a ready physical explanation. The process by which the transverse velocity and the thickness of the layer grow with the size of the Alfven velocity is detailed.

  2. Self-similar magnetohydrodynamic boundary layers

    International Nuclear Information System (INIS)

    Nunez, Manuel; Lastra, Alberto

    2010-01-01

    The boundary layer created by parallel flow in a magnetized fluid of high conductivity is considered in this paper. Under appropriate boundary conditions, self-similar solutions analogous to the ones studied by Blasius for the hydrodynamic problem may be found. It is proved that for these to be stable, the size of the Alfven velocity at the outer flow must be smaller than the flow velocity, a fact that has a ready physical explanation. The process by which the transverse velocity and the thickness of the layer grow with the size of the Alfven velocity is detailed.

  3. [Similarity system theory to evaluate similarity of chromatographic fingerprints of traditional Chinese medicine].

    Science.gov (United States)

    Liu, Yongsuo; Meng, Qinghua; Jiang, Shumin; Hu, Yuzhu

    2005-03-01

    The similarity evaluation of fingerprints is one of the most important problems in the quality control of traditional Chinese medicine (TCM). Similarity measures used to evaluate the similarity of the common peaks in the chromatograms of TCM are discussed. Comparative studies were carried out among the correlation coefficient, the cosine of the angle and an improved extent similarity method using simulated data and experimental data. The correlation coefficient and the cosine of the angle are not sensitive to differences between the data sets, and they remain insensitive even after normalization. According to the similarity system theory, an improved extent similarity method was proposed. The improved extent similarity is more sensitive to differences between the data sets than the correlation coefficient and the cosine of the angle. Moreover, unlike log-transformation, it does not require changing the character of the data sets. The improved extent similarity can be used to evaluate the similarity of the chromatographic fingerprints of TCM.
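
    The insensitivity of the two classical measures can be illustrated with a short sketch. The peak areas below are made-up numbers, not data from the paper. The improved extent similarity itself is not implemented, because the abstract does not give its formula.

```python
import math


def cosine_similarity(x: list[float], y: list[float]) -> float:
    """Cosine of the angle between two peak-area vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def correlation_coefficient(x: list[float], y: list[float]) -> float:
    """Pearson correlation: the cosine of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


if __name__ == "__main__":
    # Hypothetical common-peak areas of a reference fingerprint.
    reference = [120.0, 45.0, 300.0, 80.0, 15.0]
    # A sample in which every peak has only half the reference area.
    half_strength = [0.5 * area for area in reference]
    print(cosine_similarity(reference, half_strength))
    print(correlation_coefficient(reference, half_strength))
```

    Both measures come out at (numerically) 1 for the half-strength sample, even though every component is present at half the reference amount. A measure intended for quality control has to detect this kind of difference.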

  4. Sequence determinants of human microsatellite variability

    Directory of Open Access Journals (Sweden)

    Jakobsson Mattias

    2009-12-01

    Background: Microsatellite loci are frequently used in genomic studies of DNA sequence repeats and in population studies of genetic variability. To investigate the effect of sequence properties of microsatellites on their level of variability, we have analyzed genotypes at 627 microsatellite loci in 1,048 worldwide individuals from the HGDP-CEPH cell line panel together with the DNA sequences of these microsatellites in the human RefSeq database. Results: Calibrating PCR fragment lengths in individual genotypes by using the RefSeq sequence enabled us to infer repeat number in the HGDP-CEPH dataset and to calculate the mean number of repeats (as opposed to the mean PCR fragment length), under the assumption that differences in PCR fragment length reflect differences in the numbers of repeats in the embedded repeat sequences. We find the mean and maximum numbers of repeats across individuals to be positively correlated with heterozygosity. The size and composition of the repeat unit of a microsatellite are also important factors in predicting heterozygosity, with tetra-nucleotide repeat units high in G/C content leading to higher heterozygosity. Finally, we find that microsatellites containing more separate sets of repeated motifs generally have higher heterozygosity. Conclusions: These results suggest that sequence properties of microsatellites have a significant impact in determining the features of human microsatellite variability.
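
    The calibration step and the heterozygosity measure can be sketched as follows. The sketch assumes, as the abstract does, that a difference in PCR fragment length reflects a whole-number difference in repeat count. It also assumes the standard expected heterozygosity 1 - Σ p_i², since the abstract does not say which estimator was used. All names and numbers are illustrative.

```python
from collections import Counter


def infer_repeat_number(fragment_length: int, ref_fragment_length: int,
                        ref_repeats: int, unit_size: int) -> float:
    """Repeat count of an allele, calibrated against a reference sequence
    whose fragment length and repeat count are both known."""
    return ref_repeats + (fragment_length - ref_fragment_length) / unit_size


def expected_heterozygosity(alleles: list[float]) -> float:
    """1 minus the sum of squared allele frequencies at one locus."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Hypothetical tetranucleotide locus: the reference fragment is 200 bp
    # long and contains 12 repeats.
    fragment_lengths = [200, 200, 204, 208]
    repeats = [infer_repeat_number(length, 200, 12, 4) for length in fragment_lengths]
    print(repeats)
    print(sum(repeats) / len(repeats))       # mean number of repeats
    print(expected_heterozygosity(repeats))
```

    Working in repeat counts rather than fragment lengths makes loci with different flanking-sequence lengths directly comparable.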

  5. [Genome similarity of Baikal omul and sig].

    Science.gov (United States)

    Bychenko, O S; Sukhanova, L V; Ukolova, S S; Skvortsov, T A; Potapov, V K; Azhikina, T L; Sverdlov, E D

    2009-01-01

    Two members of the Baikal sig family, a lake sig (Coregonus lavaretus baicalensis Dybovsky) and omul (C. autumnalis migratorius Georgi), are close relatives that diverged from the same ancestor 10-20 thousand years ago. In this work, we studied genomic polymorphism of these two fish species. The method of subtraction hybridization (SH) did not reveal the presence of extended sequences in the sig genome and their absence in the omul genome. All the fragments found by SH corresponded to polymorphous noncoding genome regions varying in mononucleotide substitutions and short deletions. Many of them are mapped close to genes of the immune system and have regions identical to the Tc-1-like transposons abundant among fish, whose transcription activity may affect the expression of adjacent genes. Thus, we showed for the first time that genetic differences between Baikal sig family members are extremely small and cannot be revealed by the SH method. This is another endorsement of the hypothesis on the close relationship between Baikal sig and omul and their evolutionarily recent divergence from a common ancestor.

  6. Exploring Higher Thinking.

    Science.gov (United States)

    Conover, Willis M.

    1992-01-01

    Maintains that the social studies reform movement includes a call for the de-emphasis of rote memory and more attention to the development of higher-order thinking skills. Discusses the "thinking tasks" concept derived from the work of Hilda Taba and asserts that the tasks can be used with almost any social studies topic. (CFR)

  7. Higher-Order Hierarchies

    DEFF Research Database (Denmark)

    Ernst, Erik

    2003-01-01

    This paper introduces the notion of higher-order inheritance hierarchies. They are useful because they provide well-known benefits of object-orientation at the level of entire hierarchies, benefits which are not available with current approaches. Three facets must be addressed: First, it must be po...

  8. Inflation from higher dimensions

    International Nuclear Information System (INIS)

    Shafi, Q.

    1987-01-01

    We argue that an inflationary phase in the very early universe is related to the transition from a higher dimensional to a four-dimensional universe. We present details of a previously considered model which gives sufficient inflation without fine tuning of parameters. (orig.)

  9. Higher Education Funding Formulas.

    Science.gov (United States)

    McKeown-Moak, Mary P.

    1999-01-01

    One of the most critical components of the college or university chief financial officer's job is budget planning, especially using formulas. A discussion of funding formulas looks at advantages, disadvantages, and types of formulas used by states in budgeting for higher education, and examines how chief financial officers can position the campus…

  10. Liberty and Higher Education.

    Science.gov (United States)

    Thompson, Dennis F.

    1989-01-01

    John Stuart Mill's principle of liberty is discussed with the view that it needs to be revised to guide moral judgments in higher education. Three key elements need to be modified: the action that is constrained; the constraint on the action; and the agent whose action is constrained. (MLW)

  11. Fuel Class Higher Alcohols

    KAUST Repository

    Sarathy, Mani

    2016-01-01

    This chapter focuses on the production and combustion of alcohol fuels with four or more carbon atoms, which we classify as higher alcohols. It assesses the feasibility of utilizing various C4-C8 alcohols as fuels for internal combustion engines

  12. Evaluation in Higher Education

    Science.gov (United States)

    Bognar, Branko; Bungic, Maja

    2014-01-01

    One of the means of transforming classroom experience is by conducting action research with students. This paper reports about the action research with university students. It has been carried out within a semester of the course "Methods of Upbringing". Its goal has been to improve evaluation of higher education teaching. Different forms…

  13. Higher-level Innovization

    DEFF Research Database (Denmark)

    Bandaru, Sunith; Tutum, Cem Celal; Deb, Kalyanmoy

    2011-01-01

    we introduce the higher-level innovization task through an application of a manufacturing process simulation for the Friction Stir Welding (FSW) process where commonalities among two different Pareto-optimal fronts are analyzed. Multiple design rules are simultaneously deciphered from each front...

  14. Benchmarking for Higher Education.

    Science.gov (United States)

    Jackson, Norman, Ed.; Lund, Helen, Ed.

    The chapters in this collection explore the concept of benchmarking as it is being used and developed in higher education (HE). Case studies and reviews show how universities in the United Kingdom are using benchmarking to aid in self-regulation and self-improvement. The chapters are: (1) "Introduction to Benchmarking" (Norman Jackson…

  15. Creativity in Higher Education

    Science.gov (United States)

    Gaspar, Drazena; Mabic, Mirela

    2015-01-01

    The paper presents results of research related to perception of creativity in higher education made by the authors at the University of Mostar from Bosnia and Herzegovina. This research was based on a survey conducted among teachers and students at the University. The authors developed two types of questionnaires, one for teachers and the other…

  16. California's Future: Higher Education

    Science.gov (United States)

    Johnson, Hans

    2015-01-01

    California's higher education system is not keeping up with the changing economy. Projections suggest that the state's economy will continue to need more highly educated workers. In 2025, if current trends persist, 41 percent of jobs will require at least a bachelor's degree and 36 percent will require some college education short of a bachelor's…

  17. Cyberbullying in Higher Education

    Science.gov (United States)

    Minor, Maria A.; Smith, Gina S.; Brashen, Henry

    2013-01-01

    Bullying has extended beyond the schoolyard into online forums in the form of cyberbullying. Cyberbullying is a growing concern due to the effect on its victims. Current studies focus on grades K-12; however, cyberbullying has entered the world of higher education. The focus of this study was to identify the existence of cyberbullying in higher…

  18. Gait Recognition Using Image Self-Similarity

    Directory of Open Access Journals (Sweden)

    Chiraz BenAbdelkader

    2004-04-01

    Gait is one of the few biometrics that can be measured at a distance, and is hence useful for passive surveillance as well as biometric applications. Gait recognition research is still in its infancy, however, and we have yet to solve the fundamental issue of finding gait features which at once have sufficient discrimination power and can be extracted robustly and accurately from low-resolution video. This paper describes a novel gait recognition technique based on the image self-similarity of a walking person. We contend that the similarity plot encodes a projection of gait dynamics. It is also correspondence-free, robust to segmentation noise, and works well with low-resolution video. The method is tested on multiple data sets of varying sizes and degrees of difficulty. Performance is best for fronto-parallel viewpoints, whereby a recognition rate of 98% is achieved for a data set of 6 people, and 70% for a data set of 54 people.
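
A self-similarity plot of an image sequence can be sketched as a matrix of pairwise frame differences. The abstract does not state which image distance is used, so the sum of absolute pixel differences below is an assumption, and the tiny 2x2 frames are toy data standing in for cropped images of a walking person.

```python
def frame_distance(a, b):
    """Sum of absolute pixel differences between two equally sized frames."""
    return sum(abs(p - q)
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))

def self_similarity_plot(frames):
    """Matrix S where S[i][j] is the distance between frame i and frame j."""
    n = len(frames)
    return [[frame_distance(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]

# Toy sequence that repeats every two frames, like a crude periodic gait.
pose_a = [[0, 1], [1, 0]]
pose_b = [[1, 0], [0, 1]]
frames = [pose_a, pose_b, pose_a, pose_b]

S = self_similarity_plot(frames)
# S is symmetric with a zero diagonal; the periodicity of the motion shows up
# as zero entries two frames apart (S[0][2], S[1][3]).
```

Because the matrix is built only from frame-to-frame comparisons, no point correspondences between body parts are needed, which is consistent with the correspondence-free property claimed above.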

  19. Self-similarity in applied superconductivity

    International Nuclear Information System (INIS)

    Dresner, Lawrence

    1981-09-01

    Self-similarity is a descriptive term applying to a family of curves. It means that the family is invariant to a one-parameter group of affine (stretching) transformations. The property of self-similarity has been exploited in a wide variety of problems in applied superconductivity, namely, (i) transient distribution of the current among the filaments of a superconductor during charge-up, (ii) steady distribution of current among the filaments of a superconductor near the current leads, (iii) transient heat transfer in superfluid helium, (iv) transient diffusion in cylindrical geometry (important in studying the growth rate of the reacted layer in A15 materials), (v) thermal expulsion of helium from quenching cable-in-conduit conductors, (vi) eddy current heating of irregular plates by slow, ramped fields, and (vii) the specific heat of type-II superconductors. Most, but not all, of the applications involve differential equations, both ordinary and partial. The novel methods explained in this report should prove of great value in other fields, just as they already have done in applied superconductivity. (author)

  20. Phonological similarity effect in complex span task.

    Science.gov (United States)

    Camos, Valérie; Mora, Gérôme; Barrouillet, Pierre

    2013-01-01

    The aim of our study was to test the hypothesis that two systems are involved in verbal working memory; one is specifically dedicated to the maintenance of phonological representations through verbal rehearsal while the other would maintain multimodal representations through attentional refreshing. This theoretical framework predicts that phonologically related phenomena such as the phonological similarity effect (PSE) should occur when the domain-specific system is involved in maintenance, but should disappear when concurrent articulation hinders its use. Impeding maintenance in the domain-general system by a concurrent attentional demand should impair recall performance without affecting PSE. In three experiments, we manipulated the concurrent articulation and the attentional demand induced by the processing component of complex span tasks in which participants had to maintain lists of either similar or dissimilar words. Confirming our predictions, PSE affected recall performance in complex span tasks. Although both the attentional demand and the articulatory requirement of the concurrent task impaired recall, only the induction of an articulatory suppression during maintenance made the PSE disappear. These results suggest a duality in the systems devoted to verbal maintenance in the short term, constraining models of working memory.

  1. Popularity versus similarity in growing networks.

    Science.gov (United States)

    Papadopoulos, Fragkiskos; Kitsak, Maksim; Serrano, M Ángeles; Boguñá, Marián; Krioukov, Dmitri

    2012-09-27

    The principle that 'popularity is attractive' underlies preferential attachment, which is a common explanation for the emergence of scaling in growing networks. If new connections are made preferentially to more popular nodes, then the resulting distribution of the number of connections possessed by nodes follows power laws, as observed in many real networks. Preferential attachment has been directly validated for some real networks (including the Internet), and can be a consequence of different underlying processes based on node fitness, ranking, optimization, random walks or duplication. Here we show that popularity is just one dimension of attractiveness; another dimension is similarity. We develop a framework in which new connections optimize certain trade-offs between popularity and similarity, instead of simply preferring popular nodes. The framework has a geometric interpretation in which popularity preference emerges from local optimization. As opposed to preferential attachment, our optimization framework accurately describes the large-scale evolution of technological (the Internet), social (trust relationships between people) and biological (Escherichia coli metabolic) networks, predicting the probability of new links with high precision. The framework that we have developed can thus be used for predicting new links in evolving networks, and provides a different perspective on preferential attachment as an emergent phenomenon.
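
One way to picture a popularity-similarity trade-off is the growth rule sketched below. The specifics are assumptions for illustration and are not stated in the abstract: popularity is proxied by birth order (older nodes are more popular), similarity by angular distance on a circle, and each new node links to the `m` existing nodes with the smallest product of the two.

```python
import math
import random

def angular_distance(a, b):
    """Shortest angular separation between two angles, in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def grow_network(n, m, seed=0):
    """Grow a network of n nodes; node t links to the m older nodes s
    that minimise s * angular_distance(s, t) (assumed trade-off)."""
    rng = random.Random(seed)
    angles = []          # angles[s - 1] is the angle of node s
    edges = []
    for t in range(1, n + 1):
        theta = rng.uniform(0, 2 * math.pi)
        older = sorted(range(1, t),
                       key=lambda s: s * angular_distance(angles[s - 1], theta))
        for s in older[:m]:
            edges.append((s, t))
        angles.append(theta)
    return edges

edges = grow_network(50, 2)
```

Under this rule an old node can lose out to a younger one that happens to be much closer in angle, which is the sense in which popularity is only one dimension of attractiveness.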

  2. Predicting the performance of fingerprint similarity searching.

    Science.gov (United States)

    Vogt, Martin; Bajorath, Jürgen

    2011-01-01

    Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
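
The divergence step can be sketched as follows. The sketch treats each fingerprint bit as an independent Bernoulli variable and adds a pseudocount to avoid taking the logarithm of zero; both choices, and the toy fingerprints, are assumptions made here for illustration rather than details given in the abstract.

```python
import math

def bit_frequencies(fingerprints, pseudo=0.5):
    """Smoothed frequency with which each bit position is set."""
    n = len(fingerprints)
    width = len(fingerprints[0])
    return [(sum(fp[i] for fp in fingerprints) + pseudo) / (n + 2 * pseudo)
            for i in range(width)]

def kl_divergence(p, q):
    """Sum over bit positions of the Bernoulli Kullback-Leibler divergence KL(p_i || q_i)."""
    total = 0.0
    for pi, qi in zip(p, q):
        total += pi * math.log(pi / qi)
        total += (1 - pi) * math.log((1 - pi) / (1 - qi))
    return total

# Toy 4-bit fingerprints (hypothetical data).
reference_set = [[1, 1, 0, 0], [1, 1, 0, 1]]
database = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]

p_ref = bit_frequencies(reference_set)
p_db = bit_frequencies(database)
divergence = kl_divergence(p_ref, p_db)
```

A divergence near zero would mean the reference compounds look like the database background, the situation in which the abstract says active compounds cannot be retrieved.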

  3. Competitiveness - higher education

    Directory of Open Access Journals (Sweden)

    Labas Istvan

    2016-03-01

    The European Union plays an important role in both education and training. The member states themselves are responsible for organizing and operating their education and training systems, while EU policy aims to support their efforts and to find solutions to common challenges. Education is the key to making our future as sustainable as possible, and a highly qualified workforce is the key to development, advancement and innovation. The competitiveness of higher education institutions is increasingly valued in the national economy. In recent years, the operating frameworks of higher education systems have been completely transformed. The number of applicants is continuously decreasing in some European countries, so only those institutions that can minimize the loss of students will “survive” this shortfall. In this process, the factors that shape the competitiveness of these budgetary institutions are important for their survival. The more competitive a higher education institution is, the more likely students are to want to continue their studies there, and the better its chances of survival compared with institutions lagging behind in the competition. The aim of this treatise is to present the current situation and main data of EU higher education and to examine its performance: to what extent it fulfils the strategy for smart, sustainable and inclusive growth set out in the Europe 2020 programme. The treatise is based on an analysis of statistical data.

  4. Structural similarities between prokaryotic and eukaryotic 5S ribosomal RNAs

    International Nuclear Information System (INIS)

    Welfle, H.; Boehm, S.; Damaschun, G.; Fabian, H.; Gast, K.; Misselwitz, R.; Mueller, J.J.; Zirwer, D.; Filimonov, V.V.; Venyaminov, S.Yu.; Zalkova, T.N.

    1986-01-01

    5S RNAs from rat liver and E. coli have been studied by diffuse X-ray and dynamic light scattering and by infrared and Raman spectroscopy. Identical structures at a resolution of 1 nm can be deduced from the comparison of the experimental X-ray scattering curves and electron distance distribution functions and from the agreement of the shape parameters. A flat shape model with a compact central region and two protruding arms was derived. Double helical stems are eleven-fold helices with a mean base pair distance of 0.28 nm. The number of base pairs (26 GC, 9 AU for E. coli; 27 GC, 9 AU for rat liver) and the degree of base stacking are the same within the experimental error. A very high regularity in the ribophosphate backbone is indicated for both 5S RNAs. The observed structural similarity and the consensus secondary structure pattern derived from comparative sequence analyses suggest the conclusion that prokaryotic and eukaryotic 5S RNAs are in general very similar with respect to their fundamental structural features. (author)

  5. GIS: a comprehensive source for protein structure similarities.

    Science.gov (United States)

    Guerler, Aysam; Knapp, Ernst-Walter

    2010-07-01

    A web service for analysis of protein structures that are sequentially or non-sequentially similar was generated. Recently, the non-sequential structure alignment algorithm GANGSTA+ was introduced. GANGSTA+ can detect non-sequential structural analogs for proteins stated to possess novel folds. Since GANGSTA+ ignores the polypeptide chain connectivity of secondary structure elements (i.e. alpha-helices and beta-strands), it is able to detect structural similarities also between proteins whose sequences were reshuffled during evolution. GANGSTA+ was applied in an all-against-all comparison on the ASTRAL40 database (SCOP version 1.75), which consists of >10,000 protein domains yielding about 55 × 10^6 possible protein structure alignments. Here, we provide the resulting protein structure alignments as a public web-based service, named GANGSTA+ Internet Services (GIS). We also allow users to browse the ASTRAL40 database of protein structures with GANGSTA+ relative to an externally given protein structure using different constraints to select specific results. GIS allows us to analyze protein structure families according to the SCOP classification scheme. Additionally, users can upload their own protein structures for pairwise protein structure comparison, alignment against all protein structures of the ASTRAL40 database (SCOP version 1.75) or symmetry analysis. GIS is publicly available at http://agknapp.chemie.fu-berlin.de/gplus.

  6. On fuzzy semantic similarity measure for DNA coding.

    Science.gov (United States)

    Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin

    2016-02-01

    A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms.

  7. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.; Bonny, Talal; Salama, Khaled N.

    2012-01-01

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.
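
The length-based split can be sketched as below. The function name is hypothetical, and interpreting the splitting ratio as the fraction of sequences (by count) sent to the CPU is an assumption; the abstract only says that the ratio is predetermined and that the CPU portion holds the shorter sequences.

```python
def split_database(sequences, cpu_ratio):
    """Split database sequences into a CPU portion and a GPU portion.

    Sequences are ordered by length so that the CPU receives the shortest
    ones; `cpu_ratio` is the assumed fraction of sequences given to the CPU.
    """
    ordered = sorted(sequences, key=len)
    cut = round(len(ordered) * cpu_ratio)
    return ordered[:cut], ordered[cut:]

# Toy database of four sequences, split evenly.
database = ["ACGT", "AC", "ACGTAC", "A"]
cpu_part, gpu_part = split_database(database, 0.5)
```

Each portion would then be aligned against the query on its own device, giving the two alignment scores mentioned in the abstract.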

  8. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.

    2012-01-26

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.

  9. Similarity Analysis of Cable Insulations by Chemical Test

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Jong Seog [Central Research Institute of Korea Hydro and Nuclear Power Co., Daejeon (Korea, Republic of)

    2013-10-15

    This experiment found that the FT-IR test for material composition and the TGA test for aging trend are applicable to similarity analysis of cable materials. OIT is recommended as an option if TGA does not show a good trend. Qualification of a new insulation on the basis of the EQ report of an old insulation should, for conservatism, require that the new insulation have a higher activation energy than the old one. In old nuclear power plants, it is easy to find black cable with no marking of cable information such as manufacturer, material name and voltage. If a type test is required to qualify these cables, how should a representative cable be selected? How can the similarity of these cables be determined? If a manufacturer qualified a cable for nuclear power plants more than a decade ago and the composition of the cable material has since been changed to a similar one, is it acceptable to use the old EQ report for recently manufactured cable? FT-IR is a well-known method for determining the similarity of cable materials. Infrared analysis is an easy tool for comparing the compositions of materials, but it is not suitable for comparing their aging trends. A study of similarity analysis of cable insulation by chemical tests is described herein. To study a similarity evaluation method for polymer materials, FT-IR, TGA and OIT tests were performed on two cable insulations (old and new) supplied by the same manufacturer. FT-IR gives good results for comparing material compositions, while TGA and OIT give good results for comparing the aging characteristics of materials.

  10. Similarity Analysis of Cable Insulations by Chemical Test

    International Nuclear Information System (INIS)

    Kim, Jong Seog

    2013-01-01

    This experiment found that the FT-IR test for material composition and the TGA test for aging trend are applicable to similarity analysis of cable materials. OIT is recommended as an option if TGA does not show a good trend. Qualification of a new insulation on the basis of the EQ report of an old insulation should, for conservatism, require that the new insulation have a higher activation energy than the old one. In old nuclear power plants, it is easy to find black cable with no marking of cable information such as manufacturer, material name and voltage. If a type test is required to qualify these cables, how should a representative cable be selected? How can the similarity of these cables be determined? If a manufacturer qualified a cable for nuclear power plants more than a decade ago and the composition of the cable material has since been changed to a similar one, is it acceptable to use the old EQ report for recently manufactured cable? FT-IR is a well-known method for determining the similarity of cable materials. Infrared analysis is an easy tool for comparing the compositions of materials, but it is not suitable for comparing their aging trends. A study of similarity analysis of cable insulation by chemical tests is described herein. To study a similarity evaluation method for polymer materials, FT-IR, TGA and OIT tests were performed on two cable insulations (old and new) supplied by the same manufacturer. FT-IR gives good results for comparing material compositions, while TGA and OIT give good results for comparing the aging characteristics of materials.

  11. Vere-Jones' self-similar branching model

    International Nuclear Information System (INIS)

    Saichev, A.; Sornette, D.

    2005-01-01

    Motivated by its potential application to earthquake statistics as well as for its intrinsic interest in the theory of branching processes, we study the exactly self-similar branching process introduced recently by Vere-Jones. This model extends the ETAS class of conditional self-excited branching point-processes of triggered seismicity by removing the problematic need for a minimum (as well as maximum) earthquake size. To make the theory convergent without the need for the usual ultraviolet and infrared cutoffs, the distribution of magnitudes m' of daughters of first-generation of a mother of magnitude m has two branches, m' < m with exponent β-d and m' > m with exponent β+d, where β and d are two positive parameters. We investigate the condition and nature of the subcritical, critical, and supercritical regime in this and in an extended version interpolating smoothly between several models. We predict that the distribution of magnitudes of events triggered by a mother of magnitude m over all generations has also two branches, m' < m with exponent β-h and m' > m with exponent β+h, with h=d√(1-s), where s is the fraction of triggered events. This corresponds to a renormalization of the exponent d into h by the hierarchy of successive generations of triggered events. For a significant part of the parameter space, the distribution of magnitudes over a full catalog summed over an average steady flow of spontaneous sources (immigrants) reproduces the distribution of the spontaneous sources with a single branch and is blind to the exponents β,d of the distribution of triggered events. Since the distribution of earthquake magnitudes is usually obtained with catalogs including many sequences, we conclude that the two branches of the distribution of aftershocks are not directly observable and the model is compatible with real seismic catalogs. In summary, the exactly self-similar Vere-Jones model provides an attractive new approach to model triggered seismicity, which alleviates delicate questions on the role of
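
As a small worked example of the relation h = d√(1-s) quoted in the abstract, the sketch below evaluates the renormalized exponent for illustrative parameter values. The function name and the numbers are hypothetical and are not taken from the paper.

```python
import math

def renormalized_exponent(d, s):
    """Exponent h = d * sqrt(1 - s), where s is the fraction of triggered events."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s is a fraction and must lie in [0, 1]")
    return d * math.sqrt(1.0 - s)

# Illustrative values: d = 0.5 and three quarters of all events triggered.
h = renormalized_exponent(0.5, 0.75)   # 0.5 * sqrt(0.25) = 0.25
```

With no triggered events (s = 0) the exponent is unchanged, h = d, and h shrinks towards zero as s approaches 1.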

  12. Similar or different?: the importance of similarities and differences for support between siblings

    NARCIS (Netherlands)

    Voorpostel, M.; van der Lippe, T.; Dykstra, P.A.; Flap, H.

    2007-01-01

    Using a large-scale Dutch national sample (N = 7,126), the authors examine the importance of similarities and differences in the sibling dyad for the provision of support. Similarities are assumed to enhance attraction and empathy; differences are assumed to be related to different possibilities for

  13. Similar or Different? The Importance of Similarities and Differences for Support Between Siblings

    NARCIS (Netherlands)

    Voorpostel, Marieke; Lippe, Tanja van der; Dykstra, Pearl A.; Flap, Henk

    2007-01-01

    Using a large-scale Dutch national sample (N = 7,126), the authors examine the importance of similarities and differences in the sibling dyad for the provision of support. Similarities are assumed to enhance attraction and empathy; differences are assumed to be related to different possibilities for

  14. Suspect filler similarity in eyewitness lineups: a literature review and a novel methodology.

    Science.gov (United States)

    Fitzgerald, Ryan J; Oriet, Chris; Price, Heather L

    2015-02-01

    Eyewitness lineups typically contain a suspect (guilty or innocent) and fillers (known innocents). The degree to which fillers should resemble the suspect is a complex issue that has yet to be resolved. Previously, researchers have voiced concern that eyewitnesses would be unable to identify their target from a lineup containing highly similar fillers; however, our literature review suggests highly similar fillers have only rarely been shown to have this effect. To further examine the effect of highly similar fillers on lineup responses, we used morphing software to create fillers of moderately high and very high similarity to the suspect. When the culprit was in the lineup, a higher correct identification rate was observed in moderately high similarity lineups than in very high similarity lineups. When the culprit was absent, similarity did not yield a significant effect on innocent suspect misidentification rates. However, the correct rejection rate in the moderately high similarity lineup was 20% higher than in the very high similarity lineup. When choosing rates were controlled by calculating identification probabilities for only those who made a selection from the lineup, culprit identification rates as well as innocent suspect misidentification rates were significantly higher in the moderately high similarity lineup than in the very high similarity lineup. Thus, very high similarity fillers yielded costs and benefits. Although our research suggests that selecting the most similar fillers available may adversely affect correct identification rates, we recommend additional research using fillers obtained from police databases to corroborate our findings.

  15. Higher-Order and Symbolic Computation

    DEFF Research Database (Denmark)

    Danvy, Olivier; Mason, Ian

    2008-01-01

    a series of implementations that properly account for multiple invocations of the derivative-taking operator. In "Adapting Functional Programs to Higher-Order Logic," Scott Owens and Konrad Slind present a variety of examples of termination proofs of functional programs written in HOL proof systems. Since......-calculus programs, historically. The analysis determines the possible locations of ambients and mirrors the temporal sequencing of actions in the structure of types....

  16. Dynamical similarity of geomagnetic field reversals.

    Science.gov (United States)

    Valet, Jean-Pierre; Fournier, Alexandre; Courtillot, Vincent; Herrero-Bervera, Emilio

    2012-10-04

    No consensus has been reached so far on the properties of the geomagnetic field during reversals or on the main features that might reveal its dynamics. A main characteristic of the reversing field is a large decrease in the axial dipole and the dominant role of non-dipole components. Other features strongly depend on whether they are derived from sedimentary or volcanic records. Only thermal remanent magnetization of lava flows can capture faithful records of a rapidly varying non-dipole field, but, because of episodic volcanic activity, sequences of overlying flows yield incomplete records. Here we show that the ten most detailed volcanic records of reversals can be matched in a very satisfactory way, under the assumption of a common duration, revealing common dynamical characteristics. We infer that the reversal process has remained unchanged, with the same time constants and durations, at least since 180 million years ago. We propose that the reversing field is characterized by three successive phases: a precursory event, a 180° polarity switch and a rebound. The first and third phases reflect the emergence of the non-dipole field with large-amplitude secular variation. They are rarely both recorded at the same site owing to the rapidly changing field geometry and last for less than 2,500 years. The actual transit between the two polarities does not last longer than 1,000 years and might therefore result from mechanisms other than those governing normal secular variation. Such changes are too brief to be accurately recorded by most sediments.

  17. Dynamic programming algorithms for biological sequence comparison.

    Science.gov (United States)

    Pearson, W R; Miller, W

    1992-01-01

    Efficient dynamic programming algorithms are available for a broad class of protein and DNA sequence comparison problems. These algorithms require computer time proportional to the product of the lengths of the two sequences being compared [O(N²)] but require memory space proportional only to the sum of these lengths [O(N)]. Although the requirement for O(N²) time limits use of the algorithms to the largest computers when searching protein and DNA sequence databases, many other applications of these algorithms, such as calculation of distances for evolutionary trees and comparison of a new sequence to a library of sequence profiles, are well within the capabilities of desktop computers. In particular, the results of library searches with rapid searching programs, such as FASTA or BLAST, should be confirmed by performing a rigorous optimal alignment. Whereas rapid methods do not overlook significant sequence similarities, FASTA limits the number of gaps that can be inserted into an alignment, so that a rigorous alignment may extend the alignment substantially in some cases. BLAST does not allow gaps in the local regions that it reports; a calculation that allows gaps is very likely to extend the alignment substantially. Although a Monte Carlo evaluation of the statistical significance of a similarity score with a rigorous algorithm is much slower than the heuristic approach used by the RDF2 program, the dynamic programming approach should take less than 1 hr on a 386-based PC or desktop Unix workstation. For descriptive purposes, we have limited our discussion to methods for calculating similarity scores and distances that use gap penalties of the form g = rk. Nevertheless, programs for the more general case (g = q+rk) are readily available. Versions of these programs that run either on Unix workstations, IBM-PC class computers, or the Macintosh can be obtained from either of the authors.
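    The time and memory behaviour described in this record can be illustrated with a minimal sketch. It is not the authors' program: the function name and the match, mismatch and r values are assumptions chosen for illustration. It computes a global similarity score with a gap penalty of the form g = rk while keeping only two rows of the dynamic programming matrix, so memory grows linearly with sequence length although time still grows with the product of the lengths.

```python
def similarity_score(a, b, match=1, mismatch=-1, r=2):
    """Global similarity score with linear gap penalty g = r*k.

    Hypothetical scoring values; only two rows of the DP matrix are
    kept, so memory is O(len(b)) while time is O(len(a) * len(b)).
    """
    # Row 0: aligning an empty prefix of `a` against j symbols of `b`.
    prev = [-r * j for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning i symbols of `a` against an empty prefix.
        curr = [-r * i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + s,  # align a[i-1] with b[j-1]
                prev[j] - r,      # gap in b
                curr[j - 1] - r,  # gap in a
            )
        prev = curr
    return prev[-1]
```

    Recovering the alignment itself, rather than only its score, in linear space needs an additional divide-and-conquer step that this sketch leaves out.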

  18. Numerical study of similarity in prototype and model pumped turbines

    International Nuclear Information System (INIS)

    Li, Z J; Wang, Z W; Bi, H L

    2014-01-01

    A similarity study of prototype and model pumped turbines is performed by numerical simulation, and the partial discharge case is analysed in detail. It is found that in the RSI (rotor-stator interaction) region, where the flow is convectively accelerated with minor flow separation, a high level of similarity in flow patterns and pressure fluctuation appears, with the relative pressure fluctuation amplitude of the model turbine slightly higher than that of the prototype turbine. In the runner, where the flow is convectively accelerated with severe separation, similarity fades substantially due to the different topology of flow separation and vortex formation brought about by the distinct Reynolds numbers of the two turbines. In the draft tube, where the flow is diffusively decelerated, similarity weakens owing to different vortex rope formation influenced by the Reynolds number. It is noted that the pressure fluctuation amplitude and characteristic frequency of the model turbine are larger than those of the prototype turbine. The differences in pressure fluctuation characteristics are discussed theoretically through the dimensionless Navier-Stokes equation. The above conclusions are all based on simulation without regard to the penstock response and resonance.

  19. Similarity of Ferrosilicon Submerged Arc Furnaces With Different Geometrical Parameters

    Directory of Open Access Journals (Sweden)

    Machulec B.

    2017-12-01

    In order to determine the reasons for the unsatisfactory production output of one of the 12 MVA furnaces, a comparative analysis with a furnace of higher power that showed a markedly better production output was performed. For comparison of ferrosilicon furnaces with different geometrical parameters and transformer powers, the theory of physical similarity was applied. Geometrical, electrical and thermal parameters of the reaction zones are included in the comparative analysis. For furnaces with different geometrical parameters, it is important to ensure the same temperature conditions of the reaction zones. Due to diverse mechanisms of heat generation, different criteria for determination of thermal and electrical similarity for the upper and lower reaction zones were assumed, contrary to other publications. The parameter c3 (Westly) was assumed as the similarity criterion for the upper furnace zones, where heat is generated as a result of resistive heating, while the parameter J1 (Jaccard) was assumed as the similarity criterion for the lower furnace zones, where heat is generated due to arc radiation.

  20. Clinical evaluation of further-developed MRCP sequences in comparison with standard MRCP sequences

    International Nuclear Information System (INIS)

    Hundt, W.; Scheidler, J.; Reiser, M.; Petsch, R.

    2002-01-01

    The purpose of this study was the comparison of technically improved single-shot magnetic resonance cholangiopancreatography (MRCP) sequences with standard single-shot rapid acquisition with relaxation enhancement (RARE) and half-Fourier acquired single-shot turbo spin-echo (HASTE) sequences in evaluating the normal and abnormal biliary duct system. The bile duct system of 45 patients was prospectively investigated on a 1.5-T MRI system. The investigation was performed with RARE and HASTE MR cholangiography sequences with standard and high spatial resolutions, and with a delayed-echo half-Fourier RARE (HASTE) sequence. Findings of the improved MRCP sequences were compared with the standard MRCP sequences. The level of confidence in assessing the diagnosis was divided into five groups. The Wilcoxon signed-rank test at a level of p<0.05 was applied. In 15 patients no pathology was found. The MRCP showed stenoses of the bile duct system in 10 patients and choledocholithiasis and cholecystolithiasis in 16 patients. In 12 patients a dilatation of the bile duct system was found. Comparison of the low- and high spatial resolution sequences and the short and long TE times of the half-Fourier RARE (HASTE) sequence revealed no statistically significant differences regarding accuracy of the examination. The diagnostic confidence level in assessing normal or pathological findings for the high-resolution RARE and half-Fourier RARE (HASTE) was significantly better than for the standard sequences. For the delayed-echo half-Fourier RARE (HASTE) sequence no statistically significant difference was seen. The high-resolution RARE and half-Fourier RARE (HASTE) sequences had a higher confidence level, but there was no significant difference in diagnosis in terms of detection and assessment of pathological changes in the biliary duct system compared with standard sequences. (orig.)

  1. Radiosensitivity of higher plants

    International Nuclear Information System (INIS)

    Feng Zhijie

    1992-11-01

    General views on the radiosensitivity of higher plants are reviewed from published references. The radiosensitivity varies with species, varieties and organs or tissues. The main factors determining the radiosensitivity in different species are nucleus volume, chromosome volume, DNA content and endogenous compounds. The self-repair ability of DNA damage and the chemical groups of biological molecules, such as the -SH thiohydroxy groups of proteins, are the main factors determining the radiosensitivity in different varieties. Moisture, oxygen, temperature, radiosensitizers and protectors are important external factors for radiosensitivity. Both the multiple target model and the Chadwick-Leenhouts model are ideal mathematical models for describing the radiosensitivity of higher plants, and the latter has clearer biological significance.

  2. Higher Education Language Policy

    DEFF Research Database (Denmark)

    Lauridsen, Karen M.

    2013-01-01

    Summary of recommendations: HEIs are encouraged, within the framework of their own societal context, mission, vision and strategies, to develop the aims and objectives of a Higher Education Language Policy (HELP) that allows them to implement these strategies. In this process, they may want......: As the first step in a Higher Education Language Policy, HEIs should determine the relative status and use of the languages employed in the institution, taking into consideration the answers to the following questions: • What is/are the official language(s) of the HEI? • What is/are the language...... and the level of internationalisation the HEI has or wants to have, and as a direct implication of that, what are the language proficiency levels expected from the graduates of these programmes? • Given the profile of the HEI and its educational strategies, which language components are to be offered within...

  3. Sequence to Sequence - Video to Text

    Science.gov (United States)

    2015-12-11

    by centering x and y flow values around 128 and multiplying by a scalar such that flow values fall between 0 and 255. We also calculate the flow magni...as for MPII-MD. 4.2. Evaluation Metrics Quantitative evaluation of the models is performed using the METEOR [7] metric which was originally pro...candidate reference sentences. METEOR compares exact token matches, stemmed tokens, paraphrase matches, as well as semantically similar matches
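    The flow scaling mentioned at the start of this snippet can be sketched as follows. The snippet is truncated, so the function name, the scale factor of 16 and the clipping step are assumptions rather than the authors' settings; the sketch only shows one way to centre a signed flow component on 128 and keep it inside the 0 to 255 byte range.

```python
def flow_to_byte(value, scale=16.0):
    """Map one signed optical-flow component to an integer in [0, 255].

    `scale` is a hypothetical multiplier; zero flow maps to 128 and
    values that would leave the byte range are clipped.
    """
    shifted = 128.0 + scale * value
    return int(min(255, max(0, round(shifted))))


def flow_field_to_bytes(flow_x, flow_y, scale=16.0):
    """Apply the same mapping to paired lists of x and y flow values."""
    return (
        [flow_to_byte(v, scale) for v in flow_x],
        [flow_to_byte(v, scale) for v in flow_y],
    )
```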

  4. Similarity problems and completely bounded maps

    CERN Document Server

    Pisier, Gilles

    2001-01-01

    These notes revolve around three similarity problems, appearing in three different contexts, but all dealing with the space B(H) of all bounded operators on a complex Hilbert space H. The first one deals with group representations, the second one with C*-algebras and the third one with the disc algebra. We describe them in detail in the introduction which follows. This volume is devoted to the background necessary to understand these three problems, to the solutions that are known in some special cases and to numerous related concepts, results, counterexamples or extensions which their investigation has generated. While the three problems seem different, it is possible to place them in a common framework using the key concept of "complete boundedness", which we present in detail. Using this notion, the three problems can all be formulated as asking whether "boundedness" implies "complete boundedness" for linear maps satisfying certain additional algebraic identities. Two chapters have been added on the HALMO...

  5. Image Steganalysis with Binary Similarity Measures

    Directory of Open Access Journals (Sweden)

    Kharrazi Mehdi

    2005-01-01

    We present a novel technique for steganalysis of images that have been subjected to embedding by steganographic algorithms. The seventh and eighth bit planes in an image are used for the computation of several binary similarity measures. The basic idea is that the correlation between the bit planes as well as the binary texture characteristics within the bit planes will differ between a stego image and a cover image. These telltale marks are used to construct a classifier that can distinguish between stego and cover images. We also provide experimental results using some of the latest steganographic algorithms. The proposed scheme is found to have complementary performance vis-à-vis Farid's scheme in that they outperform each other in alternate embedding techniques.
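    As a rough illustration of comparing two bit planes, the sketch below extracts the two lowest-order planes of a small grayscale image and computes the fraction of positions where they agree. This simple matching coefficient is a stand-in and is not claimed to be one of the measures used in the paper; the function names are hypothetical, and treating the seventh and eighth planes as the two least significant bits is an assumption about the paper's numbering.

```python
def bit_plane(pixels, bit):
    """Return the given bit plane of a 2-D list of 8-bit pixel values.

    `bit` counts from the least significant bit (bit=0 is the LSB).
    """
    return [[(p >> bit) & 1 for p in row] for row in pixels]


def agreement(plane_a, plane_b):
    """Fraction of positions at which two binary planes hold the same value."""
    total = 0
    same = 0
    for row_a, row_b in zip(plane_a, plane_b):
        for x, y in zip(row_a, row_b):
            total += 1
            same += int(x == y)
    return same / total


# Tiny hypothetical image: compare the two lowest-order planes.
image = [[0, 1], [2, 3]]
lsb = bit_plane(image, 0)
second = bit_plane(image, 1)
score = agreement(lsb, second)
```

    A classifier along the lines described in the record would take several such statistics as features; that step is not shown here.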

  6. A Lithium Vapor Box Divertor Similarity Experiment

    Science.gov (United States)

    Cohen, Robert A.; Emdee, Eric D.; Goldston, Robert J.; Jaworski, Michael A.; Schwartz, Jacob A.

    2017-10-01

    A lithium vapor box divertor offers an alternate means of managing the extreme power density of divertor plasmas by leveraging gaseous lithium to volumetrically extract power. The vapor box divertor is a baffled slot with liquid lithium coated walls held at temperatures which increase toward the divertor floor. The resulting vapor pressure differential drives gaseous lithium from hotter chambers into cooler ones, where the lithium condenses and returns. A similarity experiment was devised to investigate the advantages offered by a vapor box divertor design. We discuss the design, construction, and early findings of the vapor box divertor experiment including vapor can construction, power transfer calculations, joint integrity tests, and thermocouple data logging. Heat redistribution of an incident plasma-based heat flux from a typical linear plasma device is also presented. This work supported by DOE Contract No. DE-AC02-09CH11466 and The Princeton Environmental Institute.

  7. Correct Bayesian and frequentist intervals are similar

    International Nuclear Information System (INIS)

    Atwood, C.L.

    1986-01-01

    This paper argues that Bayesians and frequentists will normally reach numerically similar conclusions, when dealing with vague data or sparse data. It is shown that both statistical methodologies can deal reasonably with vague data. With sparse data, in many important practical cases Bayesian interval estimates and frequentist confidence intervals are approximately equal, although with discrete data the frequentist intervals are somewhat longer. This is not to say that the two methodologies are equally easy to use: The construction of a frequentist confidence interval may require new theoretical development. Bayesian methods typically require numerical integration, perhaps over many variables. Also, Bayesians can easily fall into the trap of over-optimism about their amount of prior knowledge. But in cases where both intervals are found correctly, the two intervals are usually not very different. (orig.)

  8. Formulation of similarity porous media systems

    International Nuclear Information System (INIS)

    Anderson, R.M.; Ford, W.T.; Ruttan, A.; Strauss, M.J.

    1982-01-01

    The mathematical formulation of the Porous Media System (PMS) describing two-phase, immiscible, compressible fluid flow in linear, homogeneous porous media is reviewed and expanded. It is shown that families of common vertex, coaxial parabolas and families of parallel lines are the only families of curves on which solutions of the PMS may be constant. A coordinate transformation is used to change the partial differential equations of the PMS to a system of ordinary differential equations, referred to as a similarity Porous Media System (SPMS), in which the independent variable denotes movement from curve to curve in a selected family of curves. Properties of solutions of the first boundary value problem are developed for the SPMS

  9. Contextual Factors for Finding Similar Experts

    DEFF Research Database (Denmark)

    Hofmann, Katja; Balog, Krisztian; Bogers, Toine

    2010-01-01

    -seeking models, are rarely taken into account. In this article, we extend content-based expert-finding approaches with contextual factors that have been found to influence human expert finding. We focus on a task of science communicators in a knowledge-intensive environment, the task of finding similar experts......, given an example expert. Our approach combines expertise-seeking and retrieval research. First, we conduct a user study to identify contextual factors that may play a role in the studied task and environment. Then, we design expert retrieval models to capture these factors. We combine these with content......-based retrieval models and evaluate them in a retrieval experiment. Our main finding is that while content-based features are the most important, human participants also take contextual factors into account, such as media experience and organizational structure. We develop two principled ways of modeling...

  10. Complete genome sequence of a novel pestivirus from sheep.

    Science.gov (United States)

    Becher, Paul; Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

    2012-10-01

    We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species.

  11. Complete Genome Sequence of a Novel Pestivirus from Sheep

    OpenAIRE

    Becher, Paul; Schmeiser, Stefanie; Oguzoglu, Tuba Cigdem; Postel, Alexander

    2012-01-01

    We report here the complete genome sequence of pestivirus strain Aydin/04-TR, which is the prototype of a group of similar viruses currently present in sheep and goats in Turkey. Sequence data from this virus showed that it clusters separately from the established and previously proposed tentative pestivirus species.

  12. GENETIC POLYMORPHISM IN GYMNODINIUM GALATHEANUM CHLOROPLAST DNA SEQUENCES AND DEVELOPMENT OF A MOLECULAR DETECTION ASSAY. (R827084)

    Science.gov (United States)

    Nuclear and chloroplast-encoded small subunit ribosomal DNA sequences were obtained from several strains of the toxic dinoflagellate Gymnodinium galatheanum. Phylogenetic analyses and comparison of sequences indicate that the chloroplast sequences show a higher degree of se...

  13. Sequence embedding for fast construction of guide trees for multiple sequence alignment

    LENUS (Irish Health Repository)

    Blackshields, Gordon

    2010-05-14

    Background The most widely used multiple sequence alignment methods require sequences to be clustered as an initial step. Most sequence clustering methods require a full distance matrix to be computed between all pairs of sequences. This requires memory and time proportional to N² for N sequences. When N grows larger than 10,000 or so, this becomes increasingly prohibitive and can form a significant barrier to carrying out very large multiple alignments. Results In this paper, we have tested variations on a class of embedding methods that have been designed for clustering large numbers of complex objects where the individual distance calculations are expensive. These methods involve embedding the sequences in a space where the similarities within a set of sequences can be closely approximated without having to compute all pair-wise distances. Conclusions We show how this approach greatly reduces computation time and memory requirements for clustering large numbers of sequences and demonstrate the quality of the clusterings by benchmarking them as guide trees for multiple alignment. Source code is available for download from http://www.clustal.org/mbed.tgz.
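    The embedding idea in this record can be sketched in a few lines. This is not the code distributed at the URL above: the k-mer distance, the choice of reference sequences and all function names are assumptions made for illustration. Each sequence is represented by its distances to a small set of t reference sequences, so only N × t distance calculations are needed instead of all N(N-1)/2 pairs, and cheap vector distances then approximate the sequence similarities.

```python
def kmer_set(seq, k=2):
    """Set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=2):
    """Stand-in sequence distance: 1 minus the Jaccard index of k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def embed(sequences, references, k=2):
    """Embed each sequence as its vector of distances to the references."""
    return [[kmer_distance(s, r, k) for r in references] for s in sequences]


def euclidean(u, v):
    """Euclidean distance between two embedded vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5


# Hypothetical toy input: two similar sequences and one unrelated one.
seqs = ["ACGTACGT", "ACGTACGA", "TTTTGGGG"]
refs = [seqs[0], seqs[2]]
vecs = embed(seqs, refs)
```

    The embedded vectors can then be passed to any standard clustering routine to build a guide tree; that step is omitted here.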

  14. Main sequence mass loss

    International Nuclear Information System (INIS)

    Brunish, W.M.; Guzik, J.A.; Willson, L.A.; Bowen, G.

    1987-01-01

    It has been hypothesized that variable stars may experience mass loss, driven, at least in part, by oscillations. The class of stars discussed here is the δ Scuti variables. These are variable stars with masses between about 1.2 and 2.25 M☉, lying on or very near the main sequence. According to this theory, high rotation rates enhance the rate of mass loss, so main sequence stars born in this mass range would have a range of mass loss rates, depending on their initial rotation velocity and the amplitude of the oscillations. The stars would evolve rapidly down the main sequence until (at about 1.25 M☉) a surface convection zone began to form. The presence of this convective region would slow the rotation, perhaps allowing magnetic braking to occur, and thus sharply reduce the mass loss rate. 7 refs

  15. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of common carp genome. The first survey of common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding region of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on zebrafish genome which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once we have physical map of common carp. Conclusion BAC end sequences are great resources for the first genome wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3

  16. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Science.gov (United States)

    2011-01-01

    Background Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of common carp genome. The first survey of common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding region of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on zebrafish genome which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once we have physical map of common carp. Conclusion BAC end sequences are great resources for the first genome wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3,100 microsyntenies, covering over 50% of

  17. Mass extrapolation of quarks and leptons to higher generations

    Energy Technology Data Exchange (ETDEWEB)

    Barik, N [Utkal Univ., Bhubaneswar (India). Dept. of Physics

    1981-05-01

    An empirical mass formula is tested for the basic fermion sequences of charged quarks and leptons. This relation is a generalization of Barut's mass formula for the lepton sequence (e, μ, τ, ...). It is found that successful mass extrapolation to the third and possibly to other higher generations (N > 2) can be obtained with the first and second generation masses as inputs, which predicts the top quark mass m_t to be around 20 GeV. This also leads to the mass ratios between members of two different sequences (i) and (i') corresponding to the same higher generations (N > 2).
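    For orientation, the sketch below evaluates Barut's lepton mass formula as it is commonly quoted, m_N = m_e [1 + (3/2) α⁻¹ Σ n⁴] with the sum running over n = 0..N. It is not the generalized quark and lepton relation tested in this record, whose exact form the abstract does not give. The constants are rounded textbook values and the function name is hypothetical.

```python
ALPHA_INV = 137.036  # inverse fine-structure constant (rounded)
M_E = 0.511          # electron mass in MeV (rounded)


def barut_lepton_mass(generation):
    """Barut's lepton mass formula, as commonly quoted (result in MeV).

    generation = 0 gives the electron, 1 the muon, 2 the tau.
    This is the original lepton formula, not the generalization
    discussed in the record above.
    """
    quartic_sum = sum(n ** 4 for n in range(generation + 1))
    return M_E * (1.0 + 1.5 * ALPHA_INV * quartic_sum)


masses = [barut_lepton_mass(n) for n in range(3)]
```

    With these inputs the formula gives roughly 105.5 MeV for the muon and roughly 1786 MeV for the tau.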

  18. Mass extrapolation of quarks and leptons to higher generations

    International Nuclear Information System (INIS)

    Barik, N.

    1981-01-01

    An empirical mass formula is tested for the basic fermion sequences of charged quarks and leptons. This relation is a generalization of Barut's mass formula for the lepton sequence (e, μ, τ, ...). It is found that successful mass extrapolation to the third and possibly to other higher generations (N > 2) can be obtained with the first and second generation masses as inputs, which predicts the top quark mass m_t to be around 20 GeV. This also leads to the mass ratios between members of two different sequences (i) and (i') corresponding to the same higher generations (N > 2). (author)

  19. Electricity sequence control

    International Nuclear Information System (INIS)

    Shin, Heung Ryeol

    2010-03-01

    The book covers an introduction to control systems, including classification and control signals; an introduction to electrical switches, such as push-button switches and inductive and capacitive detection sensors; control machinery and solenoid valves; the representation of sequences and types of electrical circuits using diagrams, time charts, markings and terms; logic circuits such as YES, NO, AND, OR and equivalence logic; basic electrical circuits; electrical sequence control; added conditions; special program control covering program selection and jumps; motor control; additional circuits such as repeat circuits and pause circuits in a conveyor; and safety regulations and rules on the classification of electrical accidents and protective devices for insulation.

  20. Next-generation sequencing

    DEFF Research Database (Denmark)

    Rieneck, Klaus; Bak, Mads; Jønson, Lars

    2013-01-01

    , Illumina); several millions of PCR sequences were analyzed. RESULTS: The results demonstrated the feasibility of diagnosing the fetal KEL1 or KEL2 blood group from cell-free DNA purified from maternal plasma. CONCLUSION: This method requires only one primer pair, and the large amount of sequence...... information obtained allows well for statistical analysis of the data. This general approach can be integrated into current laboratory practice and has numerous applications. Besides DNA-based predictions of blood group phenotypes, platelet phenotypes, or sickle cell anemia, and the determination of zygosity...

  1. Improvement of training set structure in fusion data cleaning using Time-Domain Global Similarity method

    International Nuclear Information System (INIS)

    Liu, J.; Lan, T.; Qin, H.

    2017-01-01

    Traditional data cleaning identifies dirty data by classifying original data sequences, which is a class-imbalanced problem since the proportion of incorrect data is much less than the proportion of correct ones for most diagnostic systems in Magnetic Confinement Fusion (MCF) devices. When using machine learning algorithms to classify diagnostic data based on class-imbalanced training set, most classifiers are biased towards the major class and show very poor classification rates on the minor class. By transforming the direct classification problem about original data sequences into a classification problem about the physical similarity between data sequences, the class-balanced effect of Time-Domain Global Similarity (TDGS) method on training set structure is investigated in this paper. Meanwhile, the impact of improved training set structure on data cleaning performance of TDGS method is demonstrated with an application example in EAST POlarimetry-INTerferometry (POINT) system.

  2. Identification of structural similarities between putative transmission proteins of Polymyxa and Spongospora transmitted bymoviruses and furoviruses.

    Science.gov (United States)

    Dessens, J T; Meyer, M

    1996-01-01

    Comparison of amino acid sequence and hydropathy profiles shows conserved, structural similarities between the capsid readthrough protein of potato mop top virus (transmitted by Spongospora subterranea) and furovirus and bymovirus proteins implicated in transmission by Polymyxa spp. This suggests that these proteins have a common ancestry and are involved in a common biological process: virus transmission by plasmodiophorid fungi.

  3. Collaborative Filtering Recommendation on Users' Interest Sequences.

    Directory of Open Access Journals (Sweden)

    Weijie Cheng

    As an important factor for improving recommendations, time information has been introduced to model users' dynamic preferences in many papers. However, the sequence of users' behaviour is rarely studied in recommender systems. Due to the users' unique behavior evolution patterns and personalized interest transitions among items, users' similarity in sequential dimension should be introduced to further distinguish users' preferences and interests. In this paper, we propose a new collaborative filtering recommendation method based on users' interest sequences (IS) that rank users' ratings or other online behaviors according to the timestamps when they occurred. This method extracts the semantics hidden in the interest sequences by the length of users' longest common sub-IS (LCSIS) and the count of users' total common sub-IS (ACSIS). Then, these semantics are utilized to obtain users' IS-based similarities and, further, to refine the similarities acquired from traditional collaborative filtering approaches. With these updated similarities, transition characteristics and dynamic evolution patterns of users' preferences are considered. Our new proposed method was compared with state-of-the-art time-aware collaborative filtering algorithms on datasets MovieLens, Flixster and Ciao. The experimental results validate that the proposed recommendation method is effective and outperforms several existing algorithms in the accuracy of rating prediction.
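    The longest-common-sub-sequence idea in this record can be sketched as below. The sketch rests on assumptions: an interest sequence is taken to be a user's items ordered by timestamp, and a common sub-IS is treated as an ordinary (not necessarily contiguous) common subsequence, which may differ from the paper's exact definition. The function names are hypothetical, and the count of all common sub-IS (ACSIS) is not covered.

```python
def interest_sequence(events):
    """Order a user's (timestamp, item) events by time and keep the items."""
    return [item for _, item in sorted(events)]


def lcs_length(a, b):
    """Length of the longest common subsequence of two item sequences.

    Standard dynamic programming, keeping only one previous row.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Hypothetical users with timestamped item interactions.
user_u = interest_sequence([(3, "C"), (1, "A"), (2, "B"), (4, "D")])
user_v = interest_sequence([(1, "B"), (2, "A"), (3, "C"), (4, "D")])
shared = lcs_length(user_u, user_v)
```

    A length like this could be normalised by the sequence lengths before it is combined with a rating-based similarity; the record does not say how the authors do this.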

  4. Collaborative Filtering Recommendation on Users' Interest Sequences.

    Science.gov (United States)

    Cheng, Weijie; Yin, Guisheng; Dong, Yuxin; Dong, Hongbin; Zhang, Wansong

    2016-01-01

    As an important factor for improving recommendations, time information has been introduced to model users' dynamic preferences in many papers. However, the sequence of users' behaviour is rarely studied in recommender systems. Due to the users' unique behavior evolution patterns and personalized interest transitions among items, users' similarity in sequential dimension should be introduced to further distinguish users' preferences and interests. In this paper, we propose a new collaborative filtering recommendation method based on users' interest sequences (IS) that rank users' ratings or other online behaviors according to the timestamps when they occurred. This method extracts the semantics hidden in the interest sequences by the length of users' longest common sub-IS (LCSIS) and the count of users' total common sub-IS (ACSIS). Then, these semantics are utilized to obtain users' IS-based similarities and, further, to refine the similarities acquired from traditional collaborative filtering approaches. With these updated similarities, transition characteristics and dynamic evolution patterns of users' preferences are considered. Our new proposed method was compared with state-of-the-art time-aware collaborative filtering algorithms on datasets MovieLens, Flixster and Ciao. The experimental results validate that the proposed recommendation method is effective and outperforms several existing algorithms in the accuracy of rating prediction.

  5. Collaborative Filtering Recommendation on Users’ Interest Sequences

    Science.gov (United States)

    Cheng, Weijie; Yin, Guisheng; Dong, Yuxin; Dong, Hongbin; Zhang, Wansong

    2016-01-01

    As an important factor for improving recommendations, time information has been introduced to model users’ dynamic preferences in many papers. However, the sequence of users’ behaviour is rarely studied in recommender systems. Due to the users’ unique behavior evolution patterns and personalized interest transitions among items, users’ similarity in sequential dimension should be introduced to further distinguish users’ preferences and interests. In this paper, we propose a new collaborative filtering recommendation method based on users’ interest sequences (IS) that rank users’ ratings or other online behaviors according to the timestamps when they occurred. This method extracts the semantics hidden in the interest sequences by the length of users’ longest common sub-IS (LCSIS) and the count of users’ total common sub-IS (ACSIS). Then, these semantics are utilized to obtain users’ IS-based similarities and, further, to refine the similarities acquired from traditional collaborative filtering approaches. With these updated similarities, transition characteristics and dynamic evolution patterns of users’ preferences are considered. Our new proposed method was compared with state-of-the-art time-aware collaborative filtering algorithms on datasets MovieLens, Flixster and Ciao. The experimental results validate that the proposed recommendation method is effective and outperforms several existing algorithms in the accuracy of rating prediction. PMID:27195787

  6. Higher Education in Scandinavia

    DEFF Research Database (Denmark)

    Nielsen, Jørgen Lerche; Andreasen, Lars Birch

    2015-01-01

    Higher education systems around the world have been undergoing fundamental changes through the last 50 years from more narrow self-sustaining universities for the elite and into mass universities, where new groups of students have been recruited and the number of students enrolled has increased dramatically. In adjusting to the role of being a mass educational institution, universities have been challenged on how to cope with external pressures, such as forces of globalization and international markets, increased national and international competition for students and research grants, increased... an impact on the educational systems in Scandinavia, and what possible futures can be envisioned?

  7. Higher engineering mathematics

    CERN Document Server

    John Bird

    2014-01-01

    A practical introduction to the core mathematics principles required at higher engineering level. John Bird's approach to mathematics, based on numerous worked examples and interactive problems, is ideal for vocational students that require an advanced textbook. Theory is kept to a minimum, with the emphasis firmly placed on problem-solving skills, making this a thoroughly practical introduction to the advanced mathematics engineering that students need to master. The extensive and thorough topic coverage makes this an ideal text for upper level vocational courses. Now in

  8. Comparison of MRI pulse sequences for investigation of lesions of the cervical spinal cord

    International Nuclear Information System (INIS)

    Campi, A.; Pontesilli, S.; Gerevini, S.; Scotti, G.

    2000-01-01

    Small spinal cord lesions, even if clinically significant, can be overlooked on MRI due to the low sensitivity of some pulse sequences. We compared T2-weighted fast (FSE) and conventional (CSE) spin-echo and short-tau inversion-recovery (STIR)-FSE sequences to evaluate their sensitivity to and specificity for lesions of different types. We compared the three sequences in MRI of 57 patients with cervical spinal symptoms. The image sets were assessed by two of us individually for final diagnosis, lesion detectability and image quality. Both readers arrived at the same final diagnoses with all sequences, differentiating four groups of patients. Group 1 (30 patients, 53 %) had a final diagnosis of multiple sclerosis (MS). Demyelinating lesions were better seen on STIR-FSE images, on which the number of lesions was significantly higher than on FSE, while the FSE and CSE images showed approximately equal numbers of lesions; additional lesions were found in 9 patients. The contrast-to-noise ratio (CNR) of 17 demyelinating lesions was significantly higher on STIR-FSE images than with the other sequences. Group 2 comprised 19 patients (33 %) with cervical pain, 15 of whom had disc protrusion or herniation: herniated discs were equally well delineated with all sequences, with better myelographic effect on FSE. In five patients with intrinsic spinal cord abnormalities, the conspicuity and demarcation of the lesions were similar with STIR-FSE and FSE. Group 3 comprised 4 patients (7 %) with acute myelopathy of unknown aetiology. In two patients, STIR-FSE gave better demarcation of lesions and in one a questionable additional lesion. Group 4 comprised 4 patients (7 %) with miscellaneous final diagnoses. STIR-FSE had high sensitivity to demyelinating lesions, can be considered quite specific and should be included in spinal MRI for assessment of suspected demyelinating disease. (orig.)

  9. BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

    Directory of Open Access Journals (Sweden)

    Jiang Hualiang

    2010-01-01

    Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are only constrained in a sub-pocket. This fingerprint based similarity measurement was also validated on a known binding site dataset by comparing with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After conducting the database searching, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of hit list from database searching.
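
    The combination of a fingerprint comparison with a Z-score can be sketched as follows. The abstract does not specify the BSSF fingerprint or its scoring function, so this sketch swaps in a generic Tanimoto coefficient on bit sets and a plain Z-score over the database similarities. The site names, bit indices and function names are all hypothetical.

```python
from statistics import mean, pstdev


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def z_scores(query, database):
    """Z-score of each entry's similarity to the query, relative to the
    distribution of similarities over the whole database."""
    sims = {name: tanimoto(query, fp) for name, fp in database.items()}
    values = list(sims.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {name: 0.0 for name in sims}
    return {name: (s - mu) / sigma for name, s in sims.items()}


# Hypothetical binding-site fingerprints (illustrative bit indices only).
database = {
    "site_a": {1, 2, 3, 4},
    "site_b": {1, 2, 3, 9},
    "site_c": {7, 8},
    "site_d": {5, 6, 7},
}
query = {1, 2, 3, 4}

scores = z_scores(query, database)
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

    Ranking by Z-score rather than raw similarity expresses each hit relative to the background of the database, which is the general motivation for a statistical scoring scheme of this kind.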

  10. Similarity queries for temporal toxicogenomic expression profiles.

    Directory of Open Access Journals (Sweden)

    Adam A Smith

    2008-07-01

    We present an approach for answering similarity queries about gene expression time series that is motivated by the task of characterizing the potential toxicity of various chemicals. Our approach involves two key aspects. First, our method employs a novel alignment algorithm based on time warping. Our time warping algorithm has several advantages over previous approaches. It allows the user to impose fairly strong biases on the form that the alignments can take, and it permits a type of local alignment in which the entirety of only one series has to be aligned. Second, our method employs a relaxed spline interpolation to predict expression responses for unmeasured time points, such that the spline does not necessarily exactly fit every observed point. We evaluate our approach using expression time series from the Edge toxicology database. Our experiments show the value of using spline representations for sparse time series. More significantly, they show that our time warping method provides more accurate alignments and classifications than previous standard alignment methods for time series.
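
    For orientation, the basic idea of time warping can be sketched with the textbook dynamic time warping (DTW) recurrence. This is the standard global form, not the authors' algorithm: it has none of the alignment biases, local alignment or spline interpolation described above. The series values are made up for illustration.

```python
def dtw_distance(s, t):
    """Textbook dynamic time warping cost between two numeric series."""
    inf = float("inf")
    n, m = len(s), len(t)
    # cost[i][j] = cheapest alignment of s[:i] with t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch t
                                 cost[i][j - 1],      # stretch s
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


query = [0.0, 1.0, 2.0, 1.0, 0.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]  # same shape, half speed
flat = [2.0, 2.0, 2.0, 2.0, 2.0]

print(dtw_distance(query, slow))  # 0.0
print(dtw_distance(query, flat))  # 6.0
```

    The first result shows why warping suits expression time series sampled at different rates: a response with the same shape but a slower time course aligns at zero cost.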

  11. Humans and mice express similar olfactory preferences.

    Directory of Open Access Journals (Sweden)

    Nathalie Mandairon

    In humans, the pleasantness of odors is a major contributor to social relationships and food intake. Smells evoke attraction and repulsion responses, reflecting the hedonic value of the odorant. While olfactory preferences are known to be strongly modulated by experience and learning, it has been recently suggested that, in humans, the pleasantness of odors may be partly explained by the physicochemical properties of the odorant molecules themselves. If odor hedonic value is indeed predetermined by odorant structure, then it could be hypothesized that other species will show similar odor preferences to humans. Combining behavioral and psychophysical approaches, we here show that odorants rated as pleasant by humans were also those which, behaviorally, mice investigated longer and human subjects sniffed longer, thereby revealing for the first time a component of olfactory hedonic perception conserved across species. Consistent with this, we further show that odor pleasantness rating in humans and investigation time in mice were both correlated with the physicochemical properties of the molecules, suggesting that olfactory preferences are indeed partly engraved in the physicochemical structure of the odorant. That odor preferences are shared between mammal species and are guided by physicochemical features of odorant stimuli strengthens the view that odor preference is partially predetermined. These findings open up new perspectives for the study of the neural mechanisms of hedonic perception.

  12. Different-but-Similar Judgments by Bumblebees

    Directory of Open Access Journals (Sweden)

    Vicki Xu

    2016-08-01

    This study examines picture perception in an invertebrate. Two questions regarding possible picture-object correspondence are addressed for bumblebees (Bombus impatiens): (1) Do bees perceive the difference between an object and its corresponding picture even when they have not been trained to do so? (2) Do they also perceive the similarity? Twenty bees from each of four colonies underwent discrimination training of stimuli placed in a radial maze. Bees were trained to discriminate between two objects (artificial flowers) in one group and between photos of those objects in another. Subsequent testing on unrewarding stimuli revealed, for both groups, a significant discrimination between the object and its photo: discrimination training was not necessary for bees to detect a difference between corresponding objects and pictures. We obtained not only object-to-picture transfer, as in previous research, but also the reverse: picture-to-object transfer. In the absence of the rewarding object, its photo, though never seen before by the bees, was accepted as a substitute. The reverse was also true. Bumblebees treated pictures as “different-but-similar” without having been trained to do so, which is in turn useful in floral categorization.

  13. Block generators for the similarity renormalization group

    Energy Technology Data Exchange (ETDEWEB)

    Huether, Thomas; Roth, Robert [TU Darmstadt (Germany)

    2016-07-01

    The Similarity Renormalization Group (SRG) is a powerful tool to improve convergence behavior of many-body calculations using NN and 3N interactions from chiral effective field theory. The SRG method decouples high and low-energy physics, through a continuous unitary transformation implemented via a flow equation approach. The flow is determined by a generator of choice. This generator governs the decoupling pattern and, thus, the improvement of convergence, but it also induces many-body interactions. Through the design of the generator we can optimize the balance between convergence and induced forces. We explore a new class of block generators that restrict the decoupling to the high-energy sector and leave the diagonalization in the low-energy sector to the many-body method. In this way one expects a suppression of induced forces. We analyze the induced many-body forces and the convergence behavior in light and medium-mass nuclei in No-Core Shell Model and In-Medium SRG calculations.

  14. State and Mafia, Differences and Similarities

    Directory of Open Access Journals (Sweden)

    Alfano Vincenzo

    2015-02-01

    The purpose of this article is to investigate the differences and, if any, the similarities between the modern State and mafia criminal organizations. In particular, starting from their definitions, I will try to find the differences between State and mafia, and then focus on the operational aspects of the functioning of these two organizations, with specific reference to the impact that both these human constructs have on citizens’ existences, and especially on citizens’ economic lives. All this in order to understand whether it is possible to identify an objective difference, besides morals, between taxation by the modern State and extortion by criminal organizations. With this of course I do not want to argue that the mafia is in any way justifiable or absolvable, nor that it is better than the State. However, I want to investigate whether there is a real, logical reason why the State should be considered by the citizens more desirable than the criminal organizations oppressing Southern Italy, from a strictly logical point of view and not from the point of view of ethics and morality.

  15. Genetic and 'cultural' similarity in wild chimpanzees.

    Science.gov (United States)

    Langergraber, Kevin E; Boesch, Christophe; Inoue, Eiji; Inoue-Murayama, Miho; Mitani, John C; Nishida, Toshisada; Pusey, Anne; Reynolds, Vernon; Schubert, Grit; Wrangham, Richard W; Wroblewski, Emily; Vigilant, Linda

    2011-02-07

    The question of whether animals possess 'cultures' or 'traditions' continues to generate widespread theoretical and empirical interest. Studies of wild chimpanzees have featured prominently in this discussion, as the dominant approach used to identify culture in wild animals was first applied to them. This procedure, the 'method of exclusion,' begins by documenting behavioural differences between groups and then infers the existence of culture by eliminating ecological explanations for their occurrence. The validity of this approach has been questioned because genetic differences between groups have not explicitly been ruled out as a factor contributing to between-group differences in behaviour. Here we investigate this issue directly by analysing genetic and behavioural data from nine groups of wild chimpanzees. We find that the overall levels of genetic and behavioural dissimilarity between groups are highly and statistically significantly correlated. Additional analyses show that only a very small number of behaviours vary between genetically similar groups, and that there is no obvious pattern as to which classes of behaviours (e.g. tool-use versus communicative) have a distribution that matches patterns of between-group genetic dissimilarity. These results indicate that genetic dissimilarity cannot be eliminated as playing a major role in generating group differences in chimpanzee behaviour.

  16. Multidimensional Scaling Visualization Using Parametric Similarity Indices

    Directory of Open Access Journals (Sweden)

    J. A. Tenreiro Machado

    2015-03-01

    In this paper, we apply multidimensional scaling (MDS) and parametric similarity indices (PSI) in the analysis of complex systems (CS). Each CS is viewed as a dynamical system, exhibiting an output time-series to be interpreted as a manifestation of its behavior. We start by adopting a sliding window to sample the original data into several consecutive time periods. Second, we define a given PSI for tracking pieces of data. We then compare the windows for different values of the parameter, and we generate the corresponding MDS maps of ‘points’. Third, we use Procrustes analysis to linearly transform the MDS charts for maximum superposition and to build a global MDS map of “shapes”. This final plot captures the time evolution of the phenomena and is sensitive to the PSI adopted. The generalized correlation, the Minkowski distance and four entropy-based indices are tested. The proposed approach is applied to the Dow Jones Industrial Average stock market index and the Europe Brent Spot Price FOB time-series.
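
    The first two steps, windowing the series and comparing windows with a parametric index, can be sketched with the Minkowski distance, where the exponent p plays the role of the parameter. The window width, step and data below are hypothetical, and the MDS and Procrustes stages are deliberately left out; the resulting matrix is only the kind of input an MDS routine would take.

```python
def sliding_windows(series, width, step):
    """Split a series into consecutive windows of equal width."""
    return [series[i:i + width]
            for i in range(0, len(series) - width + 1, step)]


def minkowski(u, v, p):
    """Minkowski distance of order p between two equal-length windows."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def distance_matrix(windows, p):
    """Pairwise window distances for one value of the parameter p."""
    return [[minkowski(u, v, p) for v in windows] for u in windows]


series = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]  # made-up values
windows = sliding_windows(series, width=2, step=2)

for p in (1, 2):
    print(p, distance_matrix(windows, p))
```

    Repeating the computation for several values of p gives one distance matrix, and hence one map, per parameter value, which is what makes the final plot sensitive to the index adopted.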

  17. Exploring similarities among many species distributions

    Science.gov (United States)

    Simmerman, Scott; Wang, Jingyuan; Osborne, James; Shook, Kimberly; Huang, Jian; Godsoe, William; Simons, Theodore R.

    2012-01-01

    Collecting species presence data and then building models to predict species distribution has been long practiced in the field of ecology for the purpose of improving our understanding of species relationships with each other and with the environment. Due to limitations of computing power as well as limited means of using modeling software on HPC facilities, past species distribution studies have been unable to fully explore diverse data sets. We build a system that can, for the first time to our knowledge, leverage HPC to support effective exploration of species similarities in distribution as well as their dependencies on common environmental conditions. Our system can also compute and reveal uncertainties in the modeling results enabling domain experts to make informed judgments about the data. Our work was motivated by and centered around data collection efforts within the Great Smoky Mountains National Park that date back to the 1940s. Our findings present new research opportunities in ecology and produce actionable field-work items for biodiversity management personnel to include in their planning of daily management activities.

  18. Similarities and differences in vapor explosion criteria

    International Nuclear Information System (INIS)

    Cronenberg, A.W.

    1978-01-01

    An overview of recent ideas pertaining to vapor explosion criteria indicates that, in a general sense, a consensus of opinion is emerging on the conditions applicable to explosive vaporization. Experimental and theoretical work has led a number of investigators to the formulation of such conditions, which are quite similar in many respects, although the quantitative details of the model formulation of such conditions are somewhat different. All model concepts are consistent in that an initial period of stable film boiling, separating molten fuel from coolant, is considered necessary (at least for large-scale interactions and efficient intermixing), with subsequent breakdown of film boiling due to pressure and/or thermal effects, followed by intimate fuel-coolant contact and a rapid vaporization process which is sufficient to cause shock pressurization. Differences arise as to the conditions for and the energetics associated with film boiling destabilization and the mode and energetics of fragmentation and intermixing. However, the principal area of difference seems to be the question of what constitutes the requisite condition(s) for rapid vapor production to cause shock pressurization.

  19. PHOG analysis of self-similarity in aesthetic images

    Science.gov (United States)

    Amirshahi, Seyed Ali; Koch, Michael; Denzler, Joachim; Redies, Christoph

    2012-03-01

    non-aesthetic categories of monochrome images. The aesthetic image datasets comprise a large variety of artworks of Western provenance. Other man-made aesthetically pleasing images, such as comics, cartoons and mangas, were also studied. For comparison, a database of natural scene photographs is used, as well as datasets of photographs of plants, simple objects and faces that are in general of low aesthetic value. As expected, natural scenes exhibit the highest degree of PHOG self-similarity. Images of artworks also show high self-similarity values, followed by cartoons, comics and mangas. On average, other (non-aesthetic) image categories are less self-similar in the PHOG analysis. A measure of scale-invariant self-similarity (PHOG) allows a good separation of the different aesthetic and non-aesthetic image categories. Our results provide further support for the notion that, like complex natural scenes, images of artworks display a higher degree of self-similarity across different scales of resolution than other image categories. Whether the high degree of self-similarity is the basis for the perception of beauty in both complex natural scenery and artworks remains to be investigated.

  20. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  1. THE RHIC SEQUENCER

    International Nuclear Information System (INIS)

    VAN ZEIJTS, J.; DOTTAVIO, T.; FRAK, B.; MICHNOFF, R.

    2001-01-01

    The Relativistic Heavy Ion Collider (RHIC) has a high level asynchronous time-line driven by a controlling program called the "Sequencer". Most high-level magnet and beam related issues are orchestrated by this system. The system also plays an important role in coordinated data acquisition and saving. We present the program, operator interface, operational impact and experience.

  2. Twin anemia polycythemia sequence

    NARCIS (Netherlands)

    Slaghekke, Femke

    2014-01-01

    In this thesis we describe that Twin Anemia Polycythemia Sequence (TAPS) is a form of chronic feto-fetal transfusion in monochorionic (identical) twins based on a small amount of blood transfusion through very small anastomoses. For the antenatal diagnosis of TAPS, Middle Cerebral Artery – Peak

  3. simple sequence repeat (SSR)

    African Journals Online (AJOL)

    In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 linkage groups of adzuki bean were evaluated for transferability to mungbean and related Vigna spp. 41 markers amplified characteristic bands in at least one Vigna species. The transferability percentage across the genotypes ranged ...

  4. Towards higher intensities

    CERN Multimedia

    CERN Bulletin

    2010-01-01

    Over the past 2 weeks, commissioning of the machine protection system has advanced significantly, opening up the possibility of higher intensity collisions at 3.5 TeV. The intensity has been increased from 2 bunches of 10^10 protons to 6 bunches of 2 × 10^10 protons. Luminosities of 6 × 10^28 cm^-2 s^-1 have been achieved at the start of fills, a factor of 60 higher than those provided for the first collisions on 30 March. [Figure: The recent increase in LHC luminosity as recorded by the experiments. Graph courtesy of the experiments and M. Ferro-Luzzi.] To increase the luminosity further, the commissioning crews are now trying to push up the intensity of the individual proton bunches. After the successful injection of nominal intensity bunches containing 1.1 × 10^11 protons, collisions were subsequently achieved at 450 GeV with these intensities. However, half-way through the first ramping of these nominal intensity bunches to 3.5 TeV on 15 May, a beam instability was observed, leading to partial beam loss...

  5. Similarly shaped letters evoke similar colors in grapheme-color synesthesia.

    Science.gov (United States)

    Brang, David; Rouw, Romke; Ramachandran, V S; Coulson, Seana

    2011-04-01

    Grapheme-color synesthesia is a neurological condition in which viewing numbers or letters (graphemes) results in the concurrent sensation of color. While the anatomical substrates underlying this experience are well understood, little research to date has investigated factors influencing the particular colors associated with particular graphemes or how synesthesia occurs developmentally. A recent suggestion of such an interaction has been proposed in the cascaded cross-tuning (CCT) model of synesthesia, which posits that in synesthetes connections between grapheme regions and color area V4 participate in a competitive activation process, with synesthetic colors arising during the component-stage of grapheme processing. This model more directly suggests that graphemes sharing similar component features (lines, curves, etc.) should accordingly activate more similar synesthetic colors. To test this proposal, we created and regressed synesthetic color-similarity matrices for each of 52 synesthetes against a letter-confusability matrix, an unbiased measure of visual similarity among graphemes. Results of synesthetes' grapheme-color correspondences indeed revealed that more similarly shaped graphemes corresponded with more similar synesthetic colors, with stronger effects observed in individuals with more intense synesthetic experiences (projector synesthetes). These results support the CCT model of synesthesia, implicate early perceptual mechanisms as driving factors in the elicitation of synesthetic hues, and further highlight the relationship between conceptual and perceptual factors in this phenomenon. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Asteroid clusters similar to asteroid pairs

    Science.gov (United States)

    Pravec, P.; Fatka, P.; Vokrouhlický, D.; Scheeres, D. J.; Kušnirák, P.; Hornoch, K.; Galád, A.; Vraštil, J.; Pray, D. P.; Krugly, Yu. N.; Gaftonyuk, N. M.; Inasaridze, R. Ya.; Ayvazian, V. R.; Kvaratskhelia, O. I.; Zhuzhunadze, V. T.; Husárik, M.; Cooney, W. R.; Gross, J.; Terrell, D.; Világi, J.; Kornoš, L.; Gajdoš, Š.; Burkhonov, O.; Ehgamberdiev, Sh. A.; Donchev, Z.; Borisov, G.; Bonev, T.; Rumyantsev, V. V.; Molotov, I. E.

    2018-04-01

    We studied the membership, size ratio and rotational properties of 13 asteroid clusters consisting of between 3 and 19 known members that are on similar heliocentric orbits. By backward integrations of their orbits, we confirmed their cluster membership and estimated times elapsed since separation of the secondaries (the smaller cluster members) from the primary (i.e., cluster age) that are between 10^5 and a few 10^6 years. We ran photometric observations for all the cluster primaries and a sample of secondaries and we derived their accurate absolute magnitudes and rotation periods. We found that 11 of the 13 clusters follow the same trend of primary rotation period vs mass ratio as asteroid pairs that was revealed by Pravec et al. (2010). We generalized the model of the post-fission system for asteroid pairs by Pravec et al. (2010) to a system of N components formed by rotational fission and we found excellent agreement between the data for the 11 asteroid clusters and the prediction from the theory of their formation by rotational fission. The two exceptions are the high-mass ratio (q > 0.7) clusters of (18777) Hobson and (22280) Mandragora for which a different formation mechanism is needed. Two candidate mechanisms for formation of more than one secondary by rotational fission were published: the secondary fission process proposed by Jacobson and Scheeres (2011) and a cratering collision event onto a nearly critically rotating primary proposed by Vokrouhlický et al. (2017). It will have to be revealed from future studies which of the clusters were formed by one or the other process. To that point, we found certain further interesting properties and features of the asteroid clusters that place constraints on the theories of their formation, among them the most intriguing being the possibility of a cascade disruption for some of the clusters.

  7. Expanding the boundaries of local similarity analysis.

    Science.gov (United States)

    Durno, W Evan; Hanson, Niels W; Konwar, Kishori M; Hallam, Steven J

    2013-01-01

    Pairwise comparison of time series data for both local and time-lagged relationships is a computationally challenging problem relevant to many fields of inquiry. The Local Similarity Analysis (LSA) statistic identifies the existence of local and lagged relationships, but determining significance through a p-value has been algorithmically cumbersome due to an intensive permutation test, shuffling rows and columns and repeatedly calculating the statistic. Furthermore, this p-value is calculated with the assumption of normality -- a statistical luxury dissociated from most real world datasets. To improve the performance of LSA on big datasets, an asymptotic upper bound on the p-value calculation was derived without the assumption of normality. This change in the bound calculation markedly improved computational speed from O(pm²n) to O(m²n), where p is the number of permutations in a permutation test, m is the number of time series, and n is the length of each time series. The bounding process is implemented as a computationally efficient software package, FASTLSA, written in C and optimized for threading on multi-core computers, improving its practical computation time. We computationally compare our approach to previous implementations of LSA, demonstrate broad applicability by analyzing time series data from public health, microbial ecology, and social media, and visualize resulting networks using the Cytoscape software. The FASTLSA software package expands the boundaries of LSA allowing analysis on datasets with millions of co-varying time series. Mapping metadata onto force-directed graphs derived from FASTLSA allows investigators to view correlated cliques and explore previously unrecognized network relationships. The software is freely available for download at: http://www.cmde.science.ubc.ca/hallam/fastLSA/.
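
    The cost of the permutation test can be made concrete with a small sketch. The scoring function below follows a commonly described dynamic-programming form of a local similarity score with a bounded time lag; it is an assumption for illustration and is not taken from FASTLSA, whose asymptotic bound is not reproduced here. The function names, data and parameters are hypothetical.

```python
import random


def local_similarity(x, y, max_delay):
    """Best-scoring local, possibly lagged, run of co-variation between x and y.
    Tracks positive and negative association separately, within |i - j| <= max_delay."""
    n = len(x)
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - max_delay), min(n, i + max_delay) + 1):
            prod = x[i - 1] * y[j - 1]
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            best = max(best, pos[i][j], neg[i][j])
    return best / n


def permutation_p_value(x, y, max_delay, permutations, seed=0):
    """Naive significance test: the statistic is recomputed once per shuffle,
    which is where the extra factor p in the running time comes from."""
    rng = random.Random(seed)
    observed = local_similarity(x, y, max_delay)
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if local_similarity(shuffled, y, max_delay) >= observed:
            hits += 1
    return hits / permutations


x = [0.5, 1.0, -1.0, 0.5, 1.5, -0.5]
y = [0.0, 0.5, 1.0, -1.0, 0.5, 1.5]  # x delayed by one time step

print(local_similarity(x, y, max_delay=0))
print(local_similarity(x, y, max_delay=1))
print(permutation_p_value(x, y, max_delay=1, permutations=200))
```

    Allowing a lag of one step raises the score because y is a delayed copy of x. Replacing the shuffle loop with an analytic bound on the p-value removes the repeated recomputation, which is the kind of saving the abstract reports.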

  8. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  9. Almost convergence of triple sequences

    OpenAIRE

    Ayhan Esi; M.Necdet Catalbas

    2013-01-01

    In this paper we introduce and study the concepts of almost convergence and almost Cauchy for triple sequences. We show that the set of almost convergent triple sequences of 0's and 1's is of the first category and also almost every triple sequence of 0's and 1's is not almost convergent. Keywords: almost convergence, P-convergent, triple sequence.

  10. A few Smarandache Integer Sequences

    OpenAIRE

    Ibstedt, Henry

    2010-01-01

    This paper deals with the analysis of a few Smarandache Integer Sequences which first appeared in Properties of the Numbers, F. Smarandache, University of Craiova Archives, 1975. The first four sequences are recurrence generated sequences while the last three are concatenation sequences.

  11. Teaching at higher levels

    Science.gov (United States)

    1998-11-01

    Undergraduate physics programmes for the 21st century were under discussion at a recent event held in Arlington, USA, open to two or three members of the physics faculties of universities from across the whole country. The conference was organized by the American Association of Physics Teachers with co-sponsorship from the American Institute of Physics, the American Physical Society and Project Kaleidoscope. Among the various aims were to learn about physics departments that have successfully revitalized their undergraduate physics programmes with innovative introductory physics courses and multi-track majors programmes. Engineers and life scientists were to be asked directly how physics programmes can better serve their students, and business leaders would be speaking on how physics departments can help to prepare their students for the diverse careers that they will eventually follow. It was planned to highlight ways that departments could fulfil their responsibilities towards trainee teachers, to identify the resources needed for revitalizing a department's programme, and to develop guidelines and recommendations for a funding programme to support collaborative efforts among physics departments for carrying out the enhancements required. More details about the conference can be found on the AAPT website (see http://www.aapt.org/programs/rupc.html). Meanwhile the UK's Higher Education Funding Council has proposed a two-pronged approach to the promotion of high quality teaching and learning, as well as widening participation in higher education from 1999-2000. A total of £60m should be available to support these initiatives by the year 2001-2002. As part of this scheme the Council will invite bids from institutions to support individual academics in enhancing learning and teaching, as well as in recognition of individual excellence. As with research grants, such awards would enable staff to pursue activities such as the development of teaching materials

  12. Advanced Models and Algorithms for Self-Similar IP Network Traffic Simulation and Performance Analysis

    Science.gov (United States)

    Radev, Dimitar; Lokshina, Izabella

    2010-11-01

    The paper examines self-similar (or fractal) properties of real communication network traffic data over a wide range of time scales. These self-similar properties are very different from the properties of traditional models based on Poisson and Markov-modulated Poisson processes. Advanced fractal models of sequential generators and fixed-length sequence generators, and efficient algorithms used to simulate the self-similar behavior of IP network traffic data, are developed and applied. Numerical examples are provided, and simulation results are obtained and analyzed.

  13. Higher Order Mode Fibers

    DEFF Research Database (Denmark)

    Israelsen, Stine Møller

    This PhD thesis considers higher order modes (HOMs) in optical fibers, including their excitation and characteristics. Within the last decades, HOMs have been applied for space multiplexing in optical communications, group velocity dispersion management, and sensing, among others......-radial polarization as opposed to the linear polarization of the LP0X modes. The effect is investigated numerically in a double cladding fiber with an outer air cladding using a full vectorial mode solver. Experimentally, the bowtie modes are excited using a long period grating and their free space characteristics...... and polarization state are investigated. For this fiber, the onset of the bowtie effect is shown numerically to be LP011. The characteristics usually associated with Bessel-like modes, such as long diffraction-free length and self-healing, are shown to be conserved despite the lack of azimuthal symmetry...

  14. Spiky higher genus strings

    International Nuclear Information System (INIS)

    Ambjoern, J.; Bellini, A.; Johnston, D.

    1990-10-01

    It is clear from both the non-perturbative and perturbative approaches to two-dimensional quantum gravity that a new strong coupling regime is setting in at d=1, independent of the genus of the worldsheet being considered. It has been suggested that a Kosterlitz-Thouless (KT) phase transition in the Liouville theory is the cause of this behaviour. However, it has recently been pointed out that the XY model, which displays a KT transition on the plane and the sphere, is always in the strong coupling, disordered phase on a surface of constant negative curvature. A higher genus worldsheet can be represented as a fundamental region on just such a surface, which might seem to suggest that the KT picture predicts a strong coupling region for arbitrary d, contradicting the known results. We resolve the apparent paradox. (orig.)

  15. Allele Re-sequencing Technologies

    DEFF Research Database (Denmark)

    Byrne, Stephen; Farrell, Jacqueline Danielle; Asp, Torben

    2013-01-01

    The development of next-generation sequencing technologies has made sequencing an affordable approach for detection of genetic variations associated with various traits. However, the cost of whole genome re-sequencing still remains too high to be feasible for many plant species with large...... alternative to whole genome re-sequencing to identify causative genetic variations in plants. One challenge, however, will be efficient bioinformatics strategies for data handling and analysis from the increasing amount of sequence information....

  16. Amplification and chromosomal dispersion of human endogenous retroviral sequences

    International Nuclear Information System (INIS)

    Steele, P.E.; Martin, M.A.; Rabson, A.B.; Bryan, T.; O'Brien, S.J.

    1986-01-01

    Endogenous retroviral sequences have undergone amplification events involving both viral and flanking cellular sequences. The authors cloned members of an amplified family of full-length endogenous retroviral sequences. Genomic blotting, employing a flanking cellular DNA probe derived from a member of this family, revealed a similar array of reactive bands in both humans and chimpanzees, indicating that an amplification event involving retroviral and associated cellular DNA sequences occurred before the evolutionary separation of these two primates. Southern analyses of restricted somatic cell hybrid DNA preparations suggested that endogenous retroviral segments are widely dispersed in the human genome and that amplification and dispersion events may be linked.

  17. Heuristics for multiobjective multiple sequence alignment.

    Science.gov (United States)

    Abbasi, Maryam; Paquete, Luís; Pereira, Francisco B

    2016-07-15

    Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by current software packages are highly dependent on the parameter settings, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may introduce an undesirable bias in further steps of the analysis and yield overly simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives the practitioner further information about the similarity of the sequences, from which she can analyse and choose the most plausible alignment. We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show...
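
    The hypervolume indicator mentioned in this abstract can be illustrated for the two-objective case. The sketch below is a generic textbook-style computation, not the authors' code: it assumes each alignment is summarized as a (substitution score, number of gaps) pair, negates the gap count so that both objectives are maximized, and measures the area dominated by the non-dominated set relative to a reference point that every point is assumed to dominate. The names `pareto_front` and `hypervolume_2d` and the sample numbers are hypothetical.

```python
def pareto_front(points):
    """Return the non-dominated points when both coordinates are maximized."""
    front = []
    # Visit points by first objective (descending); keep a point only if it
    # strictly improves the best second objective seen so far.
    for p in sorted(set(points), key=lambda q: (-q[0], -q[1])):
        if not front or p[1] > front[-1][1]:
            front.append(p)
    return front

def hypervolume_2d(points, ref):
    """Area dominated by the front and bounded below by the reference point.

    Assumes ref is dominated by every point in the input.
    """
    area = 0.0
    prev_f2 = ref[1]
    for f1, f2 in pareto_front(points):  # f1 decreasing, f2 increasing
        area += (f1 - ref[0]) * (f2 - prev_f2)
        prev_f2 = f2
    return area

# Hypothetical (substitution score, gap count) pairs for four alignments.
alignments = [(100, 12), (90, 8), (95, 12), (80, 5)]
score_vectors = [(score, -gaps) for score, gaps in alignments]
volume = hypervolume_2d(score_vectors, ref=(0, -20))
```

    A larger hypervolume means the returned set of alignments covers more of the trade-off between substitution score and gap count, which is why it can serve as a single quality figure for a whole set of solutions.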

  18. Comparison of the Diversity of Basidiomycetes from Dead Wood of the Manchurian fir (Abies holophylla) as Evaluated by Fruiting Body Collection, Mycelial Isolation, and 454 Sequencing.

    Science.gov (United States)

    Jang, Yeongseon; Jang, Seokyoon; Min, Mihee; Hong, Joo-Hyun; Lee, Hanbyul; Lee, Hwanhwi; Lim, Young Woon; Kim, Jae-Jin

    2015-10-01

    In this study, three different methods (fruiting body collection, mycelial isolation, and 454 sequencing) were implemented to determine the diversity of wood-inhabiting basidiomycetes from dead Manchurian fir (Abies holophylla). The three methods recovered similar species richness (26 species from fruiting bodies, 32 species from mycelia, and 32 species from 454 sequencing), but Fisher's alpha, Shannon-Wiener, and Simpson's diversity indices of the fungal communities indicated that fruiting body collection and mycelial isolation displayed higher diversity than 454 sequencing. In total, 75 wood-inhabiting basidiomycetes were detected. The most frequently observed species were Heterobasidion orientale (fruiting body collection), Bjerkandera adusta (mycelial isolation), and Trichaptum fusco-violaceum (454 sequencing). Only two species, Hymenochaete yasudae and Hypochnicium karstenii, were detected by all three methods. This result indicated that Manchurian fir harbors a diverse basidiomycetous fungal community and that, for complete estimation of fungal diversity, multiple methods should be used. Further studies are required to understand their ecology in the context of forest ecosystems.
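
    Two of the indices named in this abstract are simple to compute from per-species counts, which helps explain how methods with similar richness can still differ in diversity. The sketch below uses the standard Shannon-Wiener formula H' = -sum(p_i ln p_i) and the Gini-Simpson form 1 - sum(p_i^2); which Simpson variant the study used is not stated in the abstract, so that choice, the function names, and the sample counts are assumptions.

```python
import math

def shannon_wiener(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over observed species."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_diversity(counts):
    """Gini-Simpson index 1 - sum(p_i^2); an assumed variant of Simpson's index."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical counts: same richness (3 species), different evenness.
even_community = [10, 10, 10]
skewed_community = [28, 1, 1]
```

    With these made-up counts both communities have three species, yet the even community scores higher on both indices, mirroring how equal richness can coexist with unequal diversity.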

  19. Detecting change in stochastic sound sequences.

    Directory of Open Access Journals (Sweden)

    Benjamin Skerritt-Davis

    2018-05-01

    Our ability to parse our acoustic environment relies on the brain's capacity to extract statistical regularities from surrounding sounds. Previous work in regularity extraction has predominantly focused on the brain's sensitivity to predictable patterns in sound sequences. However, natural sound environments are rarely completely predictable, often containing some level of randomness, yet the brain is able to effectively interpret its surroundings by extracting useful information from stochastic sounds. It has been previously shown that the brain is sensitive to the marginal lower-order statistics of sound sequences (i.e., mean and variance). In this work, we investigate the brain's sensitivity to higher-order statistics describing temporal dependencies between sound events through a series of change detection experiments, where listeners are asked to detect changes in randomness in the pitch of tone sequences. Behavioral data indicate listeners collect statistical estimates to process incoming sounds, and a perceptual model based on Bayesian inference shows a capacity in the brain to track higher-order statistics. Further analysis of individual subjects' behavior indicates an important role of perceptual constraints in listeners' ability to track these sensory statistics with high fidelity. In addition, the inference model facilitates analysis of neural electroencephalography (EEG) responses, anchoring the analysis relative to the statistics of each stochastic stimulus. This reveals both a deviance response and a change-related disruption in phase of the stimulus-locked response that follow the higher-order statistics. These results shed light on the brain's ability to process stochastic sound sequences.

  20. Nucleotide Sequences and Comparison of Two Large Conjugative Plasmids from Different Campylobacter species

    National Research Council Canada - National Science Library

    Batchelor, Roger A; Pearson, Bruce M; Friis, Lorna M; Guerry, Patricia; Wells, Jerry M

    2004-01-01

    .... Both plasmids are mosaic in structure, having homologues of genes found in a variety of different commensal and pathogenic bacteria, but nevertheless, showed striking similarities in DNA sequence...