WorldWideScience

Sample records for randomly selected sequences

  1. Rapid selection of accessible and cleavable sites in RNA by Escherichia coli RNase P and random external guide sequences

    OpenAIRE

    Lundblad, Eirik W.; Xiao, Gaoping; Ko, Jae-hyeong; Altman, Sidney

    2008-01-01

    A method of inhibiting the expression of particular genes by using external guide sequences (EGSs) has been improved in its rapidity and specificity. Random EGSs that have 14-nt random sequences are used in the selection procedure for an EGS that attacks the mRNA for a gene in a particular location. A mixture of the random EGSs, the particular target RNA, and RNase P is used in the diagnostic procedure, which, after completion, is analyzed in a gel with suitable control lanes. Within a few ho...

  2. Rapid selection of accessible and cleavable sites in RNA by Escherichia coli RNase P and random external guide sequences.

    Science.gov (United States)

    Lundblad, Eirik W; Xiao, Gaoping; Ko, Jae-Hyeong; Altman, Sidney

    2008-02-19

    A method of inhibiting the expression of particular genes by using external guide sequences (EGSs) has been improved in its rapidity and specificity. Random EGSs that have 14-nt random sequences are used in the selection procedure for an EGS that attacks the mRNA for a gene in a particular location. A mixture of the random EGSs, the particular target RNA, and RNase P is used in the diagnostic procedure, which, after completion, is analyzed in a gel with suitable control lanes. Within a few hours, the procedure is complete. The action of EGSs designed by an older method is compared with EGSs designed by the random EGS method on mRNAs from two bacterial pathogens.

  3. absolutely regular random sequences

    Directory of Open Access Journals (Sweden)

    Michel Harel

    1996-01-01

    In this paper, the central limit theorems for the density estimator and for the integrated square error are proved for the case when the underlying sequence of random variables is nonstationary. Applications to Markov processes and ARMA processes are provided.

  4. Expressed sequence tags of randomly selected cDNA clones from Eucalyptus globulus-Pisolithus tinctorius ectomycorrhiza.

    Science.gov (United States)

    Tagu, D; Martin, F

    1995-01-01

    Random sequencing of cDNA clones from Eucalyptus globulus-Pisolithus tinctorius ectomycorrhizal tissues was carried out to generate expressed sequence tags (ESTs). Database comparisons revealed that 42% of the cDNAs corresponded to previously sequenced genes. These ESTs represent efficient molecular markers to analyze changes in gene expression during the formation of the ectomycorrhizal symbiosis.

  5. Sequence-Based Prediction of RNA-Binding Proteins Using Random Forest with Minimum Redundancy Maximum Relevance Feature Selection

    Directory of Open Access Journals (Sweden)

    Xin Ma

    2015-01-01

    The prediction of RNA-binding proteins is one of the most challenging problems in computational biology. Although some studies have investigated this problem, the accuracy of prediction is still not sufficient. In this study, a highly accurate method was developed to predict RNA-binding proteins from amino acid sequences using random forests with the minimum redundancy maximum relevance (mRMR) method, followed by incremental feature selection (IFS). We incorporated conjoint triad features and three novel features: binding propensity (BP), nonbinding propensity (NBP), and evolutionary information combined with physicochemical properties (EIPP). The results showed that these novel features have important roles in improving the performance of the predictor. Using the mRMR-IFS method, our predictor achieved the best performance (86.62% accuracy and 0.737 Matthews correlation coefficient). High prediction accuracy and successful prediction performance suggested that our method can be a useful approach to identify RNA-binding proteins from sequence information.

  6. Blocked Randomization with Randomly Selected Block Sizes

    Directory of Open Access Journals (Sweden)

    Jimmy Efird

    2010-12-01

    When planning a randomized clinical trial, careful consideration must be given to how participants are selected for various arms of a study. Selection and accidental bias may occur when participants are not assigned to study groups with equal probability. A simple random allocation scheme is a process by which each participant has equal likelihood of being assigned to treatment versus referent groups. However, by chance an unequal number of individuals may be assigned to each arm of the study and thus decrease the power to detect statistically significant differences between groups. Block randomization is a commonly used technique in clinical trial design to reduce bias and achieve balance in the allocation of participants to treatment arms, especially when the sample size is small. This method increases the probability that each arm will contain an equal number of individuals by sequencing participant assignments by block. Yet still, the allocation process may be predictable, for example, when the investigator is not blind and the block size is fixed. This paper provides an overview of blocked randomization and illustrates how to avoid selection bias by using random block sizes.
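
The idea in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `blocked_randomization`, the candidate block sizes (2, 4, 6), and the arm labels are assumptions chosen for illustration. Each block contains every arm equally often, and the size of each block is drawn at random so that the end of a block is harder to anticipate.

```python
import random


def blocked_randomization(n_participants, block_sizes=(2, 4, 6),
                          arms=("treatment", "referent"), seed=None):
    """Allocate participants to arms using randomly sized permuted blocks.

    Hypothetical helper for illustration; block sizes must be multiples
    of the number of arms so that every full block is balanced.
    """
    if any(size % len(arms) != 0 for size in block_sizes):
        raise ValueError("each block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        size = rng.choice(block_sizes)             # block size chosen at random
        block = list(arms) * (size // len(arms))   # balanced block
        rng.shuffle(block)                         # random order within the block
        allocation.extend(block)
    # The final block may be cut short, so a small imbalance can remain.
    return allocation[:n_participants]


if __name__ == "__main__":
    for i, arm in enumerate(blocked_randomization(12, seed=42), start=1):
        print(i, arm)
```

Because only the last block can be truncated, the difference in arm sizes in this sketch never exceeds half of the largest block size.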

  7. Blocked randomization with randomly selected block sizes.

    Science.gov (United States)

    Efird, Jimmy

    2011-01-01

    When planning a randomized clinical trial, careful consideration must be given to how participants are selected for various arms of a study. Selection and accidental bias may occur when participants are not assigned to study groups with equal probability. A simple random allocation scheme is a process by which each participant has equal likelihood of being assigned to treatment versus referent groups. However, by chance an unequal number of individuals may be assigned to each arm of the study and thus decrease the power to detect statistically significant differences between groups. Block randomization is a commonly used technique in clinical trial design to reduce bias and achieve balance in the allocation of participants to treatment arms, especially when the sample size is small. This method increases the probability that each arm will contain an equal number of individuals by sequencing participant assignments by block. Yet still, the allocation process may be predictable, for example, when the investigator is not blind and the block size is fixed. This paper provides an overview of blocked randomization and illustrates how to avoid selection bias by using random block sizes.

  8. Permutation Entropy for Random Binary Sequences

    Directory of Open Access Journals (Sweden)

    Lingfeng Liu

    2015-12-01

    In this paper, we generalize the permutation entropy (PE) measure to binary sequences, which is based on Shannon’s entropy, and theoretically analyze this measure for random binary sequences. We deduce the theoretical value of PE for random binary sequences, which can be used to measure the randomness of binary sequences. We also reveal the relationship between this PE measure and other randomness measures, such as Shannon’s entropy and Lempel–Ziv complexity. The results show that PE is consistent with these two measures. Furthermore, we use PE as one of the randomness measures to evaluate the randomness of chaotic binary sequences.

  9. 32 CFR 1624.1 - Random selection procedures for induction.

    Science.gov (United States)

    2010-07-01

    ... 32 National Defense 6 2010-07-01 2010-07-01 false Random selection procedures for induction. 1624... SYSTEM INDUCTIONS § 1624.1 Random selection procedures for induction. (a) The Director of Selective Service shall from time to time establish a random selection sequence for induction by a drawing to be...

  10. Sequence based prediction of DNA-binding proteins based on hybrid feature selection using random forest and Gaussian naïve Bayes.

    Directory of Open Access Journals (Sweden)

    Wangchao Lou

    Developing an efficient method for determination of the DNA-binding proteins, due to their vital roles in gene regulation, is becoming highly desired since it would be invaluable to advance our understanding of protein functions. In this study, we proposed a new method for the prediction of the DNA-binding proteins, by performing the feature rank using random forest and the wrapper-based feature selection using forward best-first search strategy. The features comprise information from primary sequence, predicted secondary structure, predicted relative solvent accessibility, and position specific scoring matrix. The proposed method, called DBPPred, used Gaussian naïve Bayes as the underlying classifier since it outperformed five other classifiers, including decision tree, logistic regression, k-nearest neighbor, support vector machine with polynomial kernel, and support vector machine with radial basis function. As a result, the proposed DBPPred yields the highest average accuracy of 0.791 and average MCC of 0.583 according to the five-fold cross validation with ten runs on the training benchmark dataset PDB594. Subsequently, blind tests on the independent dataset PDB186 were performed by the proposed model trained on the entire PDB594 dataset and by five other existing methods (including iDNA-Prot, DNA-Prot, DNAbinder, DNABIND and DBD-Threader); the proposed DBPPred yielded the highest accuracy of 0.769, MCC of 0.538, and AUC of 0.790. The independent tests performed by the proposed DBPPred on a large non-DNA binding protein dataset and two RNA binding protein datasets also showed improved or comparable quality when compared with the relevant prediction methods. Moreover, we observed that the majority of the features selected by the proposed method are statistically significantly different between the mean feature values of the DNA-binding and the non DNA-binding proteins. All of the experimental results indicate that

  11. Improving randomness characterization through Bayesian model selection.

    Science.gov (United States)

    Díaz Hernández Rojas, Rafael; Solís, Aldo; Angulo Martínez, Alí M; U'Ren, Alfred B; Hirsch, Jorge G; Marsili, Matteo; Pérez Castillo, Isaac

    2017-06-08

    Random number generation plays an essential role in technology with important applications in areas ranging from cryptography to Monte Carlo methods, and other probabilistic algorithms. All such applications require high-quality sources of random numbers, yet effective methods for assessing whether a source produces truly random sequences are still missing. Current methods either do not rely on a formal description of randomness (NIST test suite) on the one hand, or are inapplicable in principle (the characterization derived from the Algorithmic Theory of Information), on the other, for they require testing all the possible computer programs that could produce the sequence to be analysed. Here we present a rigorous method that overcomes these problems based on Bayesian model selection. We derive analytic expressions for a model's likelihood which is then used to compute its posterior distribution. Our method proves to be more rigorous than NIST's suite and Borel-Normality criterion and its implementation is straightforward. We applied our method to an experimental device based on the process of spontaneous parametric downconversion to confirm it behaves as a genuine quantum random number generator. As our approach relies on Bayesian inference our scheme transcends individual sequence analysis, leading to a characterization of the source itself.

  12. Randomized selection on the GPU

    Energy Technology Data Exchange (ETDEWEB)

    Monroe, Laura Marie [Los Alamos National Laboratory; Wendelberger, Joanne R [Los Alamos National Laboratory; Michalak, Sarah E [Los Alamos National Laboratory

    2011-01-13

    We implement here a fast and memory-sparing probabilistic top N selection algorithm on the GPU. To our knowledge, this is the first direct selection in the literature for the GPU. The algorithm proceeds via a probabilistic-guess-and-check process searching for the Nth element. It always gives a correct result and always terminates. The use of randomization reduces the amount of data that needs heavy processing, and so reduces the average time required for the algorithm. Probabilistic Las Vegas algorithms of this kind are a form of stochastic optimization and can be well suited to more general parallel processors with limited amounts of fast memory.
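
The Las Vegas property described here (random choices affect running time but never correctness) can be illustrated on the CPU with a short Python sketch. This is not the authors' GPU algorithm: it is ordinary randomized quickselect, and the name `randomized_select` is an assumption made for illustration. A random pivot acts as the "guess", and counting the elements above and equal to it is the "check".

```python
import random


def randomized_select(values, n, rng=None):
    """Return the n-th largest element of values (n=1 gives the maximum).

    Illustrative CPU sketch of a Las Vegas selection: the pivot is random,
    but the result is always exact and the loop always terminates because
    the candidate set shrinks on every pass.
    """
    if not 1 <= n <= len(values):
        raise ValueError("n must be between 1 and len(values)")
    rng = rng or random.Random()
    items = list(values)
    while True:
        pivot = rng.choice(items)                        # probabilistic guess
        larger = [v for v in items if v > pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if n <= len(larger):                             # answer lies above the pivot
            items = larger
        elif n <= len(larger) + equal_count:             # the pivot is the answer
            return pivot
        else:                                            # answer lies below the pivot
            n -= len(larger) + equal_count
            items = [v for v in items if v < pivot]


def top_n(values, n, rng=None):
    """Return the n largest elements in descending order (ties included once each)."""
    threshold = randomized_select(values, n, rng)
    above = sorted((v for v in values if v > threshold), reverse=True)
    return above + [threshold] * (n - len(above))
```

Only the elements on one side of the pivot are examined again, which is the sense in which randomization reduces the amount of data that needs further processing.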

  13. Sequential selection of random vectors under a sum constraint

    OpenAIRE

    Stanke, Mario

    2004-01-01

    We observe a sequence X1,X2,...,Xn of independent and identically distributed coordinatewise nonnegative d-dimensional random vectors. When a vector is observed it can either be selected or rejected but once made this decision is final. In each coordinate the sum of the selected vectors must not exceed a given constant. The problem is to find a selection policy that maximizes the expected number of selected vectors. For a general absolutely continuous distribution of t...

  14. Evaluation of Genome Sequencing Quality in Selected Plant Species Using Expressed Sequence Tags

    Science.gov (United States)

    Shangguan, Lingfei; Han, Jian; Kayesh, Emrul; Sun, Xin; Zhang, Changqing; Pervaiz, Tariq; Wen, Xicheng; Fang, Jinggui

    2013-01-01

    Background With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing enabling more and more plants to be subjected to genome sequencing. Despite this, genome sequence qualities of multiple plants have not been evaluated. Methodology/Principal Finding Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is presented by the ratio of chromosome size and genome size (or between scaffold size and genome size), which ranged from 55.31% to nearly 100%. The accuracy of genome sequence was presented by the ratio between matched EST and selected ESTs where 52.93% ∼ 98.28% and 89.02% ∼ 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analysis of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influences genome sequence assembly. Conclusion The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published. PMID:23922843

  15. Evaluation of genome sequencing quality in selected plant species using expressed sequence tags.

    Directory of Open Access Journals (Sweden)

    Lingfei Shangguan

    BACKGROUND: With the completion of genome sequencing projects for more than 30 plant species, large volumes of genome sequences have been produced and stored in online databases. Advancements in sequencing technologies have reduced the cost and time of whole genome sequencing enabling more and more plants to be subjected to genome sequencing. Despite this, genome sequence qualities of multiple plants have not been evaluated. METHODOLOGY/PRINCIPAL FINDING: Integrity and accuracy were calculated to evaluate the genome sequence quality of 32 plants. The integrity of a genome sequence is presented by the ratio of chromosome size and genome size (or between scaffold size and genome size), which ranged from 55.31% to nearly 100%. The accuracy of genome sequence was presented by the ratio between matched EST and selected ESTs where 52.93% ∼ 98.28% and 89.02% ∼ 98.85% of the randomly selected clean ESTs could be mapped to chromosome and scaffold sequences, respectively. According to the integrity, accuracy and other analysis of each plant species, thirteen plant species were divided into four levels. Arabidopsis thaliana, Oryza sativa and Zea mays had the highest quality, followed by Brachypodium distachyon, Populus trichocarpa, Vitis vinifera and Glycine max, Sorghum bicolor, Solanum lycopersicum and Fragaria vesca, and Lotus japonicus, Medicago truncatula and Malus × domestica in that order. Assembling the scaffold sequences into chromosome sequences should be the primary task for the remaining nineteen species. Low GC content and repeat DNA influences genome sequence assembly. CONCLUSION: The quality of plant genome sequences was found to be lower than envisaged and thus the rapid development of genome sequencing projects as well as research on bioinformatics tools and the algorithms of genome sequence assembly should provide increased processing and correction of genome sequences that have already been published.

  16. Nonlinear deterministic structures and the randomness of protein sequences

    CERN Document Server

    Huang Yan Zhao

    2003-01-01

    To clarify the randomness of protein sequences, we make a detailed analysis of a set of typical protein sequences representing each structural class by using a nonlinear prediction method. No deterministic structures are found in these protein sequences, which implies that they behave as random sequences. We also give an explanation for the controversial results obtained in previous investigations.

  17. Random selection of Borel sets

    Directory of Open Access Journals (Sweden)

    Bernd Günther

    2010-10-01

    A theory of random Borel sets is presented, based on dyadic resolutions of compact metric spaces. The conditional expectation of the intersection of two independent random Borel sets is investigated. An example based on an embedding of Sierpinski’s universal curve into the space of Borel sets is given.

  18. Sequence determinants in human polyadenylation site selection

    Directory of Open Access Journals (Sweden)

    Gautheret Daniel

    2003-02-01

    Abstract Background Differential polyadenylation is a widespread mechanism in higher eukaryotes producing mRNAs with different 3' ends in different contexts. This involves several alternative polyadenylation sites in the 3' UTR, each with its specific strength. Here, we analyze the vicinity of human polyadenylation signals in search of patterns that would help discriminate strong and weak polyadenylation sites, or true sites from randomly occurring signals. Results We used human genomic sequences to retrieve the region downstream of polyadenylation signals, usually absent from cDNA or mRNA databases. Analyzing 4956 EST-validated polyadenylation sites and their -300/+300 nt flanking regions, we clearly visualized the upstream (USE) and downstream (DSE) sequence elements, both characterized by U-rich (not GU-rich) segments. The presence of a USE and a DSE is the main feature distinguishing true polyadenylation sites from randomly occurring A(A/U)UAAA hexamers. While USEs are indifferently associated with strong and weak poly(A) sites, DSEs are more conspicuous near strong poly(A) sites. We then used the region encompassing the hexamer and DSE as a training set for poly(A) site identification by the ERPIN program and achieved a prediction specificity of 69 to 85% for a sensitivity of 56%. Conclusion The availability of complete genomes and large EST sequence databases now permits large-scale observation of polyadenylation sites. Both U-rich sequences flanking both sides of poly(A) signals contribute to the definition of "true" sites. However, the downstream U-rich sequences may also play an enhancing role. Based on this information, poly(A) site prediction accuracy was moderately but consistently improved compared to the best previously available algorithm.

  19. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    Science.gov (United States)

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Casual mutations and natural selection have driven the evolution of protein amino acid sequences that we observe at present in nature. The question of which is the dominant force in protein evolution still lacks an unambiguous answer. Casual mutations tend to randomize protein sequences while, in order to have the correct functionality, one expects that selection mechanisms impose rigid constraints on amino acid sequences. Moreover, one also has to consider that the space of all possible amino acid sequences is so astonishingly large that it could be reasonable to have a well tuned amino acid sequence indistinguishable from a random one. In order to study the possibility of discriminating between random and natural amino acid sequences, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. Copyright © 2015 Elsevier Ltd. All rights reserved.
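
One simple way to build random controls that keep the length and amino acid composition of each natural protein is to shuffle its residues. The Python sketch below shows that approach; it is an assumption for illustration, since the abstract does not state how the random sequences were generated, and the name `shuffled_controls` and the example sequences are hypothetical. Ten shuffles per natural sequence would reproduce the 1047 to 10,470 ratio mentioned above.

```python
import random
from collections import Counter


def shuffled_controls(natural_sequences, copies=10, seed=None):
    """Return `copies` shuffled versions of each input sequence.

    Shuffling keeps the exact length and residue counts of the original
    while destroying the residue order. Hypothetical helper, not the
    procedure used in the paper.
    """
    rng = random.Random(seed)
    controls = []
    for seq in natural_sequences:
        for _ in range(copies):
            residues = list(seq)
            rng.shuffle(residues)          # permute residues in place
            controls.append("".join(residues))
    return controls


if __name__ == "__main__":
    natural = ["MKTAYIAKQR", "GSHMLEDPVA"]   # made-up example sequences
    for control in shuffled_controls(natural, copies=2, seed=0):
        print(control, dict(Counter(control)))
```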

  20. Simulations Using Random-Generated DNA and RNA Sequences

    Science.gov (United States)

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
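
The kind of exercise described here can be sketched in a few lines of Python. This is not the original BASIC program; the function names and the uniform base probabilities are assumptions made for illustration. The sketch generates a random DNA sequence and derives the complementary strand, the base composition, and the triplet codon counts.

```python
import random
from collections import Counter

# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def random_dna(length, seed=None):
    """Generate a random DNA sequence with equal probability for each base."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


def complement(seq):
    """Return the complementary strand, written in the same direction."""
    return "".join(COMPLEMENT[base] for base in seq)


def base_composition(seq):
    """Return the fraction of each base in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}


def codon_counts(seq):
    """Count non-overlapping triplets, reading from the first base."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


if __name__ == "__main__":
    dna = random_dna(30, seed=1977)
    print("sequence:   ", dna)
    print("complement: ", complement(dna))
    print("composition:", base_composition(dna))
    print("codons:     ", dict(codon_counts(dna)))
```

An RNA version would only need `U` in place of `T` in the alphabet and the pairing table.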

  1. Species selection and random drift in macroevolution.

    Science.gov (United States)

    Chevin, Luis-Miguel

    2016-03-01

    Species selection resulting from trait-dependent speciation and extinction is increasingly recognized as an important mechanism of phenotypic macroevolution. However, the recent bloom in statistical methods quantifying this process faces a scarcity of dynamical theory for their interpretation, notably regarding the relative contributions of deterministic versus stochastic evolutionary forces. I use simple diffusion approximations of birth-death processes to investigate how the expected and random components of macroevolutionary change depend on phenotype-dependent speciation and extinction rates, as can be estimated empirically. I show that the species selection coefficient for a binary trait, and selection differential for a quantitative trait, depend not only on differences in net diversification rates (speciation minus extinction), but also on differences in species turnover rates (speciation plus extinction), especially in small clades. The randomness in speciation and extinction events also produces a species-level equivalent to random genetic drift, which is stronger for higher turnover rates. I then show how microevolutionary processes including mutation, organismic selection, and random genetic drift cause state transitions at the species level, allowing comparison of evolutionary forces across levels. A key parameter that would be needed to apply this theory is the distribution and rate of origination of new optimum phenotypes along a phylogeny. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.

  2. Entropy and long-range correlations in random symbolic sequences

    CERN Document Server

    Melnik, S S

    2014-01-01

    The goal of this paper is to develop an estimate for the entropy of random long-range correlated symbolic sequences with elements belonging to a finite alphabet. As a plausible model, we use the high-order additive stationary ergodic Markov chain. Supposing that the correlations between random elements of the chain are weak we express the differential entropy of the sequence by means of the symbolic pair correlation function. We also examine an algorithm for estimating the differential entropy of finite symbolic sequences. We show that the entropy contains two contributions, the correlation and fluctuation ones. The obtained analytical results are used for numerical evaluation of the entropy of written English texts and DNA nucleotide sequences. The developed theory opens the way for constructing a more consistent and sophisticated approach to describe the systems with strong short- and weak long-range correlations.

  3. Humans cannot consciously generate random numbers sequences: Polemic study.

    Science.gov (United States)

    Figurska, Małgorzata; Stańczyk, Maciej; Kulesza, Kamil

    2008-01-01

    It is widely believed that randomness exists in Nature. In fact, such an assumption underlies many scientific theories and is embedded in the foundations of quantum mechanics. Assuming that this hypothesis is valid, one can use natural phenomena, like radioactive decay, to generate random numbers. Today, computers are capable of generating so-called pseudorandom numbers. Such series of numbers are only seemingly random (bias in the randomness quality can be observed). The question of whether people can produce random numbers has been investigated by many scientists in recent years. The paper "Humans can consciously generate random numbers sequences..." published recently in Medical Hypotheses made claims that were in many ways contrary to the state of the art; it also stated far-reaching hypotheses. So, we decided to repeat the experiments reported, with special care being taken of proper laboratory procedures. Here, we present the results and discuss possible implications in computer and other sciences.

  4. Classification of periodic, chaotic and random sequences using ...

    Indian Academy of Sciences (India)

    ...ities of different datasets. Entropy cannot differentiate between chaotic and random sequences, while ApEn and LZ cannot distinguish between weak and strong chaos. Figure 1: 95% confidence interval for mean LZ complexity of 50 samples of length 20 using four bins. Pramana – J. Phys., Vol. 84, No. 3, March 2015 ...

  5. An Integrated Cutting Tool Selection & Operation Sequencing Method

    NARCIS (Netherlands)

    Rho, H.M.; Geelink, R.; van 't Erve, A.H.; Kals, H.J.J.

    1992-01-01

    Within the PART CAPP system, the selection of an optimum operation sequence is related to the modules which perform the machining method and cutting tool selection. This study analyzes the technical and economical aspects of operation sequencing and presents a method which is capable of generating

  6. Selective learning enabled by intention to learn in sequence learning.

    Science.gov (United States)

    Miyawaki, Kaori

    2012-01-01

    This study investigated whether a target sequence that people intend to learn is learned selectively when it is interleaved with another (non-target) sequence. Three experiments used a serial reaction time task in which different spatial and color stimuli occurred alternately. Each of the two interleaved sequences had structural regularity. Participants in an intentional learning group were instructed to learn the target (spatial) sequence whereas those in an incidental learning group were not. In Experiments 1 and 2 spatial and color sequences were correlated. Results showed that the intentional group learned the spatial sequence better than the incidental group and learned it independently of the color sequence, whereas the incidental group learned the two sequences as a combined sequence. In Experiment 3 the sequences were uncorrelated. Results showed that the intentional group was no longer superior in learning the spatial sequence. Findings indicate that the intention to learn a target sequence enables selective learning of it only when it is correlated with a non-target sequence.

  7. G-quadruplex aptamer selection using capillary electrophoresis-LED-induced fluorescence and Illumina sequencing.

    Science.gov (United States)

    Ric, Audrey; Ecochard, Vincent; Iacovoni, Jason S; Boutonnet, Audrey; Ginot, Frédéric; Ong-Meang, Varravaddheay; Poinsot, Véréna; Paquereau, Laurent; Couderc, François

    2018-03-01

    One of the major difficulties that arises when selecting aptamers containing a G-quadruplex is the correct amplification of the ssDNA sequence. Can aptamers containing a G-quadruplex be selected from a degenerate library using non-equilibrium capillary electrophoresis (CE) of equilibrium mixtures (NECEEM) along with high-throughput Illumina sequencing? In this article, we present some mismatches of the G-quadruplex T29 aptamer specific to thrombin, which was PCR amplified and sequenced by Illumina sequencing. Then, we show the proportionality between the number of sequenced molecules of T29 added to the library and the number of sequences obtained in Illumina sequencing, and we find that T29 sequences from this aptamer can be detected in a random library of ssDNA after the sample is fractionated by NECEEM, amplified by PCR, and sequenced. Treatment of the data by the counting of double-stranded DNA T29 sequences containing a maximum of two mismatches reveals a good correlation with the enrichment factor (f_E). This factor is the ratio of the number of aptamer sequences found in the collected complex sample divided by the total number of sequencing reads (aptamer and non-aptamer) plus the quantity of T29 molecules (spiked into a DNA library) injected into CE.

  8. A method for selecting cis-acting regulatory sequences that respond to small molecule effectors

    Directory of Open Access Journals (Sweden)

    Allas Ülar

    2010-08-01

    Abstract Background Several cis-acting regulatory sequences functioning at the level of mRNA or nascent peptide and specifically influencing transcription or translation have been described. These regulatory elements often respond to specific chemicals. Results We have developed a method that allows us to select cis-acting regulatory sequences that respond to diverse chemicals. The method is based on the β-lactamase gene containing a random sequence inserted into the beginning of the ORF. Several rounds of selection are used to isolate sequences that suppress β-lactamase expression in response to the compound under study. We have isolated sequences that respond to erythromycin, troleandomycin, chloramphenicol, meta-toluate and homoserine lactone. By introducing synonymous and non-synonymous mutations we have shown that at least in the case of erythromycin the sequences act at the peptide level. We have also tested the cross-activities of the constructs and found that in most cases the sequences respond most strongly to the compound on which they were isolated. Conclusions Several selected peptides showed ligand-specific changes in amino acid frequencies, but no consensus motif could be identified. This is consistent with previous observations on natural cis-acting peptides, showing that it is often impossible to demonstrate a consensus. Applying the currently developed method on a larger scale, by selecting and comparing an extended set of sequences, might allow the sequence rules underlying the activity of cis-acting regulatory peptides to be identified.

  9. Do natural proteins differ from random sequences polypeptides? Natural vs. random proteins classification using an evolutionary neural network.

    Directory of Open Access Journals (Sweden)

    Davide De Lucrezia

    Full Text Available Are extant proteins the exquisite result of natural selection or are they random sequences slightly edited by evolution? This question has puzzled biochemists for a long time, and several groups have addressed this issue by comparing natural protein sequences to completely random ones, coming to contradicting conclusions. Previous works in the literature focused on the analysis of primary structure in an attempt to identify a possible signature of evolutionary editing. Conversely, in this work we compare a set of 762 natural proteins with an average length of 70 amino acids and an equal number of completely random ones of comparable length on the basis of their structural features. We use an ad hoc Evolutionary Neural Network Algorithm (ENNA) in order to assess whether and to what extent natural proteins are edited from random polypeptides, employing 11 different structure-related variables (i.e. net charge, volume, surface area, coil, alpha helix, beta sheet, percentage of coil, percentage of alpha helix, percentage of beta sheet, percentage of secondary structure and surface hydrophobicity). The ENNA algorithm is capable of correctly distinguishing natural proteins from random ones with an accuracy of 94.36%. Furthermore, we study the structural features of 32 random polypeptides misclassified as natural ones to unveil any structural similarity to natural proteins. Results show that random proteins misclassified by the ENNA algorithm exhibit a significant fold similarity to portions or subdomains of extant proteins at atomic resolution. Altogether, our results suggest that natural proteins are significantly edited from random polypeptides and that evolutionary editing can be readily detected by analyzing structural features. Furthermore, we also show that the ENNA, employing simple structural descriptors, can predict whether a protein chain is natural or random.

  10. Sequence Selection and Performance in DS/CDMA Systems

    Directory of Open Access Journals (Sweden)

    Jefferson Santos Ambrosio

    2016-03-01

    Full Text Available In this work, key concepts of code division multiple access (CDMA) communication systems are discussed. The impact of sequence selection on the performance and capacity of direct sequence CDMA (DS/CDMA) systems was investigated under AWGN and increasing system loading, as well as under multiple-antenna channels.

  11. Peptide based diagnostics: are random-sequence peptides more useful than tiling proteome sequences?

    Science.gov (United States)

    Navalkar, Krupa Arun; Johnston, Stephan Albert; Stafford, Phillip

    2015-02-01

    Diagnostics using peptide ligands have been available for decades. However, their adoption in diagnostics has been limited, not because of poor sensitivity but in many cases due to diminished specificity. Numerous reports suggest that protein-based rather than peptide-based disease detection is more specific. We examined two different approaches to peptide-based diagnostics using Coccidioides (aka Valley Fever) as the disease model. Although the pathogen was discovered more than a century ago, a highly sensitive diagnostic remains unavailable. We present a case study where two different approaches to diagnosing Valley Fever were used: first, overlapping Valley Fever epitopes representing immunodominant Coccidioides antigens were tiled using a microarray format of presynthesized peptides. Second, a set of random sequence peptides identified using a 10,000 peptide immunosignaturing microarray was compared for sensitivity and specificity. The scientific hypothesis tested was that actual epitope peptides from Coccidioides would provide sufficient sensitivity and specificity as a diagnostic. Results demonstrated that random sequence peptides exhibited higher accuracy when classifying different stages of Valley Fever infection vs. epitope peptides. The epitope peptide array did provide better performance than the existing immunodiffusion array, but when directly compared to the random sequence peptides, reported lower overall accuracy. This study suggests that there are competing aspects of antibody recognition that involve conservation of pathogen sequence and aspects of mimotope recognition and amino acid substitutions. These factors may prove critical when developing the next generation of high-performance immunodiagnostics. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.

    Directory of Open Access Journals (Sweden)

    Sophia S Liu

    2016-11-01

    Full Text Available The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a Python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
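    A rough sketch of the idea described in this abstract, not the NullSeq package or its API: amino acids are drawn from a given composition, and each synonymous codon is weighted by exp(lam * GC count), which is the exponential-family form a maximum-entropy constraint on GC content leads to. The function name `random_cds` and the parameter `lam` are assumptions made for illustration; a real tool would solve for the value of `lam` that reaches a requested GC content, whereas here it is supplied directly.

```python
import math
import random
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)),
                       AMINO_ACIDS))

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def gc_count(codon):
    """Number of G or C bases in a codon (0 to 3)."""
    return sum(base in "GC" for base in codon)


def random_cds(aa_freqs, n_codons, lam=0.0, rng=None):
    """Sample a coding sequence (hypothetical helper, not NullSeq's API).

    aa_freqs: dict mapping one-letter amino acids to relative frequencies.
    lam: GC bias; 0 gives uniform synonymous codon usage, positive values
         favour GC-rich codons, negative values favour AT-rich codons.
    """
    rng = rng or random.Random()
    residues = list(aa_freqs)
    weights = [aa_freqs[r] for r in residues]
    codons = []
    for aa in rng.choices(residues, weights=weights, k=n_codons):
        options = SYNONYMS[aa]
        bias = [math.exp(lam * gc_count(c)) for c in options]
        codons.append(rng.choices(options, weights=bias, k=1)[0])
    return "".join(codons)
```

    Comparing the GC fraction of sequences drawn with a large positive and a large negative `lam` shows the bias at work while the encoded amino acid composition stays governed by `aa_freqs`.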

  13. Chemical rationale for selection of isolates for genome sequencing

    DEFF Research Database (Denmark)

    Rank, Christian; Larsen, Thomas Ostenfeld; Frisvad, Jens Christian

    The advances in gene sequencing will in the near future enable researchers to affordably acquire the full genomes of handpicked isolates. We here present a method to evaluate the chemical potential of an entire species and select representatives for genome sequencing. The selection criteria for new...... strains to be sequenced can be manifold, but for studying the functional phenotype, using a metabolome based approach offers a cheap and rapid assessment of critical strains to cover the chemical diversity. We have applied this methodology on the complex A. flavus/A. oryzae group. Though these two species...... are in principle identical, they represent two different phenotypes. This is clearly presented through a correspondence analysis of selected extrolites, in which the subtle chemical differences are visually dispersed. The results point to a handful of strains, which, if sequenced, will likely enhance our...

  14. Random Sequence for Optimal Low-Power Laser Generated Ultrasound

    Science.gov (United States)

    Vangi, D.; Virga, A.; Gulino, M. S.

    2017-08-01

    Low-power laser generated ultrasounds are lately gaining importance in the research world, thanks to the possibility of investigating the structural integrity of a mechanical component through a non-contact, Non-Destructive Testing (NDT) procedure. The ultrasounds are, however, very low in amplitude, making it necessary to use pre-processing and post-processing operations on the signals to detect them. The cross-correlation technique is used in this work, meaning that a random signal must be used as laser input. For this purpose, a highly random and simple-to-create code called T sequence, capable of enhancing the ultrasound detectability, is introduced (not previously available at the state of the art). Several important parameters which characterize the T sequence can influence the process: the number of pulses Npulses, the pulse duration δ and the distance between pulses dpulses. A Finite Element (FE) model of a 3 mm steel disk was initially developed to analytically study the longitudinal ultrasound generation mechanism and the obtainable outputs. Later, experimental tests showed that the T sequence is highly flexible for ultrasound detection purposes, making it optimal to use high Npulses and δ but low dpulses. In the end, apart from describing all phenomena that arise in the low-power laser generation process, the results of this study are also important for setting up an effective NDT procedure using this technology.
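    The cross-correlation idea behind this abstract can be illustrated with a minimal sketch. It does not reproduce the T sequence, whose construction is not given here; a generic random on/off pulse train stands in for it, and the delay, gain and noise level are arbitrary assumptions chosen so that the echo is weaker than the noise.

```python
import numpy as np


def simulate_received(excitation, delay, gain, noise_std, rng):
    """Toy received signal: an attenuated, delayed copy of the excitation in noise."""
    received = rng.normal(0.0, noise_std, size=len(excitation) + 2 * delay)
    received[delay:delay + len(excitation)] += gain * excitation
    return received


def estimate_delay(excitation, received):
    """Return the lag (in samples) at which the cross-correlation peaks."""
    reference = excitation - excitation.mean()   # remove the DC baseline
    corr = np.correlate(received, reference, mode="full")
    # In "full" mode the first output sample corresponds to lag -(len(reference) - 1).
    lags = np.arange(-(len(reference) - 1), len(received))
    return int(lags[np.argmax(corr)])


rng = np.random.default_rng(0)
excitation = rng.integers(0, 2, size=4000).astype(float)   # random on/off pulses
received = simulate_received(excitation, delay=300, gain=0.1,
                             noise_std=0.2, rng=rng)
estimated = estimate_delay(excitation, received)
```

    Because the random excitation has a sharply peaked autocorrelation, the correlation sum grows with the number of pulses while the noise contribution grows only with its square root, which is consistent with the abstract's preference for a high number of pulses.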

  15. Random-breakage mapping method applied to human DNA sequences

    Science.gov (United States)

    Lobrich, M.; Rydberg, B.; Cooper, P. K.; Chatterjee, A. (Principal Investigator)

    1996-01-01

    The random-breakage mapping method [Game et al. (1990) Nucleic Acids Res., 18, 4453-4461] was applied to DNA sequences in human fibroblasts. The methodology involves NotI restriction endonuclease digestion of DNA from irradiated cells, followed by pulsed-field gel electrophoresis, Southern blotting and hybridization with DNA probes recognizing the single copy sequences of interest. The Southern blots show a band for the unbroken restriction fragments and a smear below this band due to radiation induced random breaks. This smear pattern contains two discontinuities in intensity at positions that correspond to the distance of the hybridization site to each end of the restriction fragment. By analyzing the positions of those discontinuities we confirmed the previously mapped position of the probe DXS1327 within a NotI fragment on the X chromosome, thus demonstrating the validity of the technique. We were also able to position the probes D21S1 and D21S15 with respect to the ends of their corresponding NotI fragments on chromosome 21. A third chromosome 21 probe, D21S11, has previously been reported to be close to D21S1, although an uncertainty about a second possible location existed. Since both probes D21S1 and D21S11 hybridized to a single NotI fragment and yielded a similar smear pattern, this uncertainty is removed by the random-breakage mapping method.

  16. Animal selection for whole genome sequencing by quantifying the unique contribution of homozygous haplotypes sequenced

    Science.gov (United States)

    Major whole genome sequencing projects promise to identify rare and causal variants within livestock species; however, the efficient selection of animals for sequencing remains a major problem within these surveys. The goal of this project was to develop a library of high accuracy genetic variants f...

  17. The genealogy of sequences containing multiple sites subject to strong selection in a subdivided population.

    Science.gov (United States)

    Nordborg, Magnus; Innan, Hideki

    2003-03-01

    A stochastic model for the genealogy of a sample of recombining sequences containing one or more sites subject to selection in a subdivided population is described. Selection is incorporated by dividing the population into allelic classes and then conditioning on the past sizes of these classes. The past allele frequencies at the selected sites are thus treated as parameters rather than as random variables. The purpose of the model is not to investigate the dynamics of selection, but to investigate effects of linkage to the selected sites on the genealogy of the surrounding chromosomal region. This approach is useful for modeling strong selection, when it is natural to parameterize the past allele frequencies at the selected sites. Several models of strong balancing selection are used as examples, and the effects on the pattern of neutral polymorphism in the chromosomal region are discussed. We focus in particular on the statistical power to detect balancing selection when it is present.

  18. PRIMITIVE MATRICES AND GENERATORS OF PSEUDO RANDOM SEQUENCES OF GALOIS

    Directory of Open Access Journals (Sweden)

    A. Beletsky

    2014-04-01

    Full Text Available In the theory and practice of cryptographic information protection, one of the key problems is forming binary pseudo-random sequences (PRS) of maximum length with acceptable statistical characteristics. PRS generators are usually implemented by a linear shift register (LSR) of maximum period with linear feedback [1]. In this paper we extend the concept of the LSR, assuming that each of its ranks (memory cells) can be in one of several conditions. We call such registers "generalized linear shift registers." The research goal is to develop algorithms for constructing generalized Galois and Fibonacci matrices of order n over the field, which uniquely determine both the structure of the corresponding generalized LSR of order n with maximal period and the Galois PRS generators of maximum length built on them. The article thus addresses the formation of primitive generalized Fibonacci and Galois matrices of arbitrary order over the prime field. The synthesis of the matrices is based on irreducible polynomials and on primitive elements of the extension field generated by the polynomial. Methods for constructing conjugated primitive Galois and Fibonacci matrices are suggested. The possibilities of using such matrices to construct generalized generators of Galois pseudo-random sequences are discussed.
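    For orientation, the ordinary binary case that this abstract generalizes can be sketched as a Galois-configuration shift register. The code below is a textbook-style illustration only, not the authors' matrix construction; the function names are hypothetical, and the feedback masks in the usage note correspond to the primitive polynomials x^4 + x^3 + 1 and x^3 + x^2 + 1.

```python
def galois_lfsr_states(mask, seed=1):
    """Yield the states of a right-shifting binary Galois LFSR for one full cycle.

    mask: feedback mask; its top bit must be set so the update is invertible
          and the register is guaranteed to return to the seed.
    seed: non-zero starting state (the all-zero state is a fixed point).
    """
    state = seed
    while True:
        yield state
        lsb = state & 1          # bit shifted out this step (the output bit)
        state >>= 1
        if lsb:
            state ^= mask        # apply feedback taps only when a 1 is shifted out
        if state == seed:
            return


def period(mask, seed=1):
    """Length of the state cycle that contains `seed`."""
    return sum(1 for _ in galois_lfsr_states(mask, seed))


def output_bits(mask, seed=1):
    """One period of the output sequence (least significant bit of each state)."""
    return [state & 1 for state in galois_lfsr_states(mask, seed)]
```

    With the 4-bit mask `0b1100` the register visits every non-zero state once before repeating, which is what "maximum period" means for a binary register of that width.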

  19. Theoretical study of polymeric mixtures with different sequence statistics. I. Ising class: Linear random copolymers with different statistical sequences and ternary blends of linear random copolymers with homopolymers

    Energy Technology Data Exchange (ETDEWEB)

    Qi, Shuyan [Department of Chemical Engineering, Department of Chemistry, and Materials Science Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720 (United States); Chakraborty, Arup K. [Department of Chemical Engineering, Department of Chemistry, and Materials Science Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720 (United States)

    2000-01-15

    We derive a Landau free energy functional for polymeric mixtures containing components with different sequence statistics. We then apply this general field theory to two mixtures that belong to the Ising universality class: mixtures of two different linear random copolymers, and ternary systems of linear random copolymers and two homopolymers. We discuss the instability conditions for the homogeneous state of these mixtures, and calculate the structure factors for different components in the homogeneous state. The structure factors show interesting features which can directly be compared with scattering experiments carried out with selectively deuterated samples. We also work out the eigenmodes representing the least stable concentration fluctuations for these mixtures. The nature of these concentration fluctuations provides information regarding the ordered phases and the kinetic pathways that lead to them. We find various demixing modes for different characteristics of the two mixtures (e.g., average compositions, statistical correlation lengths, and volume fractions). (c) 2000 American Institute of Physics.

  20. Neural Networks that Learn Temporal Sequences by Selection

    Science.gov (United States)

    Dehaene, Stanislas; Changeux, Jean-Pierre; Nadal, Jean-Pierre

    1987-05-01

    A model for formal neural networks that learn temporal sequences by selection is proposed on the basis of observations on the acquisition of song by birds, on sequence-detecting neurons, and on allosteric receptors. The model relies on hypothetical elementary devices made up of three neurons, the synaptic triads, which yield short-term modification of synaptic efficacy through heterosynaptic interactions, and on a local Hebbian learning rule. The functional units postulated are mutually inhibiting clusters of synergic neurons and bundles of synapses. Networks formalized on this basis display capacities for passive recognition and for production of temporal sequences that may include repetitions. Introduction of the learning rule leads to the differentiation of sequence-detecting neurons and to the stabilization of ongoing temporal sequences. A network architecture composed of three layers of neuronal clusters is shown to exhibit active recognition and learning of time sequences by selection: the network spontaneously produces prerepresentations that are selected according to their resonance with the input percepts. Predictions of the model are discussed.

  1. Novel Zn2+-chelating peptides selected from a fimbria-displayed random peptide library

    DEFF Research Database (Denmark)

    Kjærgaard, Kristian; Schembri, Mark; Klemm, Per

    2001-01-01

    H adhesin. FimH is a component of the fimbrial organelle that can accommodate and display a diverse range of peptide sequences on the E. coli cell surface. In this study we have constructed a random peptide library in FimH. The library, consisting of approximately 40 million individual clones, was screened...... for peptide sequences that conferred on recombinant cells the ability to bind Zn2+. By serial selection, sequences that exhibited various degrees of binding affinity and specificity toward Zn2+ were enriched. None of the isolated sequences showed similarity to known Zn2+-binding proteins, indicating...

  2. Controlled response selection benefits explicit, but not implicit sequence learning

    NARCIS (Netherlands)

    Jiménez, L.; Deroost, N.; van den Broek, Egon; Clegg, B.A.; Abrahamse, E.L.

    2010-01-01

    In two experiments with the serial reaction time task the effect of response selection processes on sequence learning was examined by manipulating stimulus-response compatibility between training groups. In Experiment 1 participants were first trained with either compatible or incompatible

  3. In-Place Randomized Slope Selection

    DEFF Research Database (Denmark)

    Blunck, Henrik; Vahrenhold, Jan

    2006-01-01

    Slope selection is a well-known algorithmic tool used in the context of computing robust estimators for fitting a line to a collection P of n points in the plane. We demonstrate that it is possible to perform slope selection in expected O(n log n) time using only constant extra space in addition to...

  4. Random effect selection in generalised linear models

    DEFF Research Database (Denmark)

    Denwood, Matt; Houe, Hans; Forkman, Björn

    We analysed abattoir recordings of meat inspection codes with possible relevance to on-farm animal welfare in cattle. Random effects logistic regression models were used to describe individual-level data obtained from 461,406 cattle slaughtered in Denmark. Our results demonstrate that the largest ...

  5. Random Whole Metagenomic Sequencing for Forensic Discrimination of Soils

    Science.gov (United States)

    Khodakova, Anastasia S.; Smith, Renee J.; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations. PMID:25111003

  6. Random whole metagenomic sequencing for forensic discrimination of soils.

    Directory of Open Access Journals (Sweden)

    Anastasia S Khodakova

    Full Text Available Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.

  7. Equivalent Conditions of Complete Convergence for Weighted Sums of Sequences of Negatively Dependent Random Variables

    Directory of Open Access Journals (Sweden)

    Mingle Guo

    2012-01-01

    Full Text Available The complete convergence for weighted sums of sequences of negatively dependent random variables is investigated. By applying moment inequality and truncation methods, the equivalent conditions of complete convergence for weighted sums of sequences of negatively dependent random variables are established. These results not only extend the corresponding results obtained by Li et al. (1995), Gut (1993), and Liang (2000) to sequences of negatively dependent random variables, but also improve them.

  8. Classic selective sweeps revealed by massive sequencing in cattle.

    Directory of Open Access Journals (Sweden)

    Saber Qanbari

    2014-02-01

    Full Text Available Human driven selection during domestication and subsequent breed formation has likely left detectable signatures within the genome of modern cattle. The elucidation of these signatures of selection is of interest from the perspective of evolutionary biology, and for identifying domestication-related genes that ultimately may help to further genetically improve this economically important animal. To this end, we employed a panel of more than 15 million autosomal SNPs identified from re-sequencing of 43 Fleckvieh animals. We mainly applied two somewhat complementary statistics, the integrated Haplotype Homozygosity Score (iHS), reflecting primarily ongoing selection, and the Composite of Likelihood Ratio (CLR), having the most power to detect completed selection after fixation of the advantageous allele. We find 106 candidate selection regions, many of which harbor genes related to phenotypes relevant in domestication, such as coat coloring pattern, neurobehavioral functioning and sensory perception, including KIT, MITF, MC1R, NRG4, Erbb4, TMEM132D and TAS2R16, among others. To further investigate the relationship between genes with signatures of selection and genes identified in QTL mapping studies, we use a sample of 3062 animals to perform four genome-wide association analyses using appearance traits, body size and somatic cell count. We show that regions associated with coat coloring significantly (P<0.0001) overlap with the candidate selection regions, suggesting that the selection signals we identify are associated with traits known to be affected by selection during domestication. Results also provide further evidence regarding the complexity of the genetics underlying coat coloring in cattle.
This study illustrates the potential of population genetic approaches for identifying genomic regions affecting domestication-related phenotypes and further helps to identify specific regions targeted by selection during speciation, domestication and

  9. The evolution of proteins from random amino acid sequences: II. Evidence from the statistical distributions of the lengths of modern protein sequences.

    Science.gov (United States)

    White, S H

    1994-04-01

    This paper continues an examination of the hypothesis that modern proteins evolved from random heteropeptide sequences. In support of the hypothesis, White and Jacobs (1993, J Mol Evol 36:79-95) have shown that any sequence chosen randomly from a large collection of nonhomologous proteins has a 90% or better chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. The goal of the present study was to investigate the possibility that the random-origin hypothesis could explain the lengths of modern protein sequences without invoking specific mechanisms such as gene duplication or exon splicing. The sets of sequences examined were taken from the 1989 PIR database and consisted of 1,792 "super-family" proteins selected to have little sequence identity, 623 E. coli sequences, and 398 human sequences. The length distributions of the proteins could be described with high significance by either of two closely related probability density functions: The gamma distribution with parameter 2 or the distribution for the sum of two exponential random independent variables. A simple theory for the distributions was developed which assumes that (1) protoprotein sequences had exponentially distributed random independent lengths, (2) the length dependence of protein stability determined which of these protoproteins could fold into compact primitive proteins and thereby attain the potential for biochemical activity, (3) the useful protein sequences were preserved by the primitive genome, and (4) the resulting distribution of sequence lengths is reflected by modern proteins. The theory successfully predicts the two observed distributions which can be distinguished by the functional form of the dependence of protein stability on length. The theory leads to three interesting conclusions. First, it predicts that a tetra-nucleotide was the signal for primitive translation termination. 
This prediction is
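    For reference, the two length distributions named in this abstract have standard closed forms. The block below writes them out as a sketch, with the rate symbols \lambda, \lambda_1 and \lambda_2 introduced here as notation (the abstract does not give a parametrization).

```latex
% Gamma density with shape parameter 2 and rate \lambda
f_{\Gamma}(x) = \lambda^{2}\, x\, e^{-\lambda x}, \qquad x \ge 0

% Density of the sum of two independent exponential variables
% with distinct rates \lambda_1 \ne \lambda_2
f_{\Sigma}(x) = \frac{\lambda_1 \lambda_2}{\lambda_2 - \lambda_1}
                \left( e^{-\lambda_1 x} - e^{-\lambda_2 x} \right), \qquad x \ge 0
```

    In the limit \lambda_2 \to \lambda_1 = \lambda the second density reduces to the first, which is one way to see why the abstract calls the two functions closely related.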

  10. Design of Long Period Pseudo-Random Sequences from the Addition of m-Sequences over $\mathbb{F}_p$

    Directory of Open Access Journals (Sweden)

    Ren Jian

    2004-01-01

    Full Text Available Pseudo-random sequences with good correlation properties and large linear span are widely used in code division multiple access (CDMA) communication systems and cryptology for reliable and secure information transmission. In this paper, sequences with long period, large complexity, balanced statistics, and low cross-correlation are constructed from the addition of m-sequences with pairwise-prime linear spans (AMPLS). Using m-sequences as building blocks, the proposed method proves to be an efficient and flexible approach to constructing long period pseudo-random sequences with desirable properties from short period sequences. Applying the proposed method to $\mathbb{F}_2$, a signal set $\bigl((2^n-1)(2^m-1),\ (2^n+1)(2^m+1),\ (2^{(n+1)/2}+1)(2^{(m+1)/2}+1)\bigr)$ is constructed.
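    The period claim behind this construction can be checked on a toy case. The sketch below, with hypothetical helper names, adds two binary m-sequences of coprime degrees 2 and 3 (recurrences from x^2 + x + 1 and x^3 + x + 1) and measures the period of the sum; it illustrates only the period, not the correlation or linear-span results of the paper.

```python
def lfsr_sequence(taps, seed, length):
    """Terms of the binary recurrence s[t+n] = XOR of s[t+k] for k in taps.

    seed holds the first n terms; taps are offsets into the current window.
    """
    state = list(seed)
    out = []
    for _ in range(length):
        out.append(state[0])
        new_bit = 0
        for k in taps:
            new_bit ^= state[k]
        state = state[1:] + [new_bit]
    return out


def least_period(seq):
    """Smallest shift p under which the finite sequence agrees with itself."""
    n = len(seq)
    for p in range(1, n // 2 + 1):
        if seq[:n - p] == seq[p:]:
            return p
    return None


N = 84  # four periods of the combined sequence
a = lfsr_sequence(taps=(0, 1), seed=(0, 1), length=N)      # degree 2, period 3
b = lfsr_sequence(taps=(0, 1), seed=(0, 0, 1), length=N)   # degree 3, period 7
s = [x ^ y for x, y in zip(a, b)]                          # addition over F_2
```

    Since the component periods 3 and 7 are coprime, the sum repeats only after their product, which matches the first element of the signal-set triple for n = 2 and m = 3.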

  11. Distributions on unbounded moment spaces and random moment sequences

    OpenAIRE

    Dette, Holger; Nagel, Jan

    2012-01-01

    In this paper we define distributions on moment spaces corresponding to measures on the real line with an unbounded support. We identify these distributions as limiting distributions of random moment vectors defined on compact moment spaces and as distributions corresponding to random spectral measures associated with the Jacobi, Laguerre and Hermite ensemble from random matrix theory. For random vectors on the unbounded moment spaces we prove a central limit theorem where the centering vecto...

  12. Sequence selection by dynamical symmetry breaking in an autocatalytic binary polymer model

    Science.gov (United States)

    Fellermann, Harold; Tanaka, Shinpei; Rasmussen, Steen

    2017-12-01

    Template-directed replication of nucleic acids is at the essence of all living beings and a major milestone for any origin of life scenario. We present an idealized model of prebiotic sequence replication, where binary polymers act as templates for their autocatalytic replication, thereby serving as each other's reactants and products in an intertwined molecular ecology. Our model demonstrates how autocatalysis alters the qualitative and quantitative system dynamics in counterintuitive ways. Most notably, numerical simulations reveal a very strong intrinsic selection mechanism that favors the appearance of a few population structures with highly ordered and repetitive sequence patterns when starting from a pool of monomers. We demonstrate both analytically and through simulation how this "selection of the dullest" is caused by continued symmetry breaking through random fluctuations in the transient dynamics that are amplified by autocatalysis and eventually propagate to the population level. The impact of these observations on related prebiotic mathematical models is discussed.

  13. Sequence selection by dynamical symmetry breaking in an autocatalytic binary polymer model

    DEFF Research Database (Denmark)

    Fellermann, Harold; Tanaka, Shinpei; Rasmussen, Steen

    2017-01-01

    as each other's reactants and products in an intertwined molecular ecology. Our model demonstrates how autocatalysis alters the qualitative and quantitative system dynamics in counterintuitive ways. Most notably, numerical simulations reveal a very strong intrinsic selection mechanism that favors......Template-directed replication of nucleic acids is at the essence of all living beings and a major milestone for any origin of life scenario. We present an idealized model of prebiotic sequence replication, where binary polymers act as templates for their autocatalytic replication, thereby serving...... the appearance of a few population structures with highly ordered and repetitive sequence patterns when starting from a pool of monomers. We demonstrate both analytically and through simulation how this "selection of the dullest" is caused by continued symmetry breaking through random fluctuations...

  14. Delay line length selection in generating fast random numbers with a chaotic laser.

    Science.gov (United States)

    Zhang, Jianzhong; Wang, Yuncai; Xue, Lugang; Hou, Jiayin; Zhang, Beibei; Wang, Anbang; Zhang, Mingjiang

    2012-04-10

    The chaotic light signals generated by an external cavity semiconductor laser have been experimentally demonstrated as a source of fast random numbers. However, the photon round-trip time in the external cavity can cause periodicity in the random sequences. To overcome this, an exclusive-or operation is required on corresponding random bits in samples of the chaotic signal and its time-delayed signal from a chaotic laser. In this scheme, the proper selection of delay length is a key issue. By doing a large number of experiments and theoretically analyzing the interplay between the Runs test and the threshold value of the autocorrelation function, we find that when the delay time between the chaotic signal and its time-delayed signal is chosen as a delay at which the autocorrelation trace has a correlation coefficient of less than 0.007, streams of random numbers can be generated with verified randomness.
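
    The delay-selection idea in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: it works on an already digitized 0/1 stream rather than on the analog chaotic signal, and the helper names (`autocorrelation`, `pick_delay`, `xor_with_delay`) are hypothetical. Only the 0.007 threshold is taken from the abstract.

```python
def autocorrelation(bits, lag):
    """Sample autocorrelation coefficient of a 0/1 sequence at the given lag."""
    n = len(bits) - lag
    mean = sum(bits) / len(bits)
    var = sum((b - mean) ** 2 for b in bits) / len(bits)
    if var == 0:
        return 0.0  # constant sequence: no defined correlation, report 0
    cov = sum((bits[i] - mean) * (bits[i + lag] - mean) for i in range(n)) / n
    return cov / var


def pick_delay(bits, max_lag, threshold=0.007):
    """Return the first lag whose |autocorrelation| is below the threshold.

    The scan order (smallest lag first) is an assumption made for this sketch.
    Returns None when no lag up to max_lag qualifies.
    """
    for lag in range(1, max_lag + 1):
        if abs(autocorrelation(bits, lag)) < threshold:
            return lag
    return None


def xor_with_delay(bits, delay):
    """XOR each bit with the bit `delay` positions earlier in the stream."""
    return [bits[i] ^ bits[i - delay] for i in range(delay, len(bits))]
```

    A strictly periodic stream such as `[0, 1] * 50` has autocorrelation of magnitude 1 at every lag, so `pick_delay` returns `None` for it; that is the kind of periodicity the delay choice is meant to avoid.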

  15. A Novel Method for Increasing the Entropy of a Sequence of Independent, Discrete Random Variables

    Directory of Open Access Journals (Sweden)

    Mieczyslaw Jessa

    2015-10-01

    In this paper, we propose a novel method for increasing the entropy of a sequence of independent, discrete random variables with arbitrary distributions. The method uses an auxiliary table and a novel theorem that concerns the entropy of a sequence in which the elements are a bitwise exclusive-or sum of independent discrete random variables.

  16. Cryptographic pseudo-random sequences from the chaotic Hénon ...

    Indian Academy of Sciences (India)

    Pseudo-random number sequences are useful in many applications including Monte-Carlo simulation, spread spectrum ... a pseudo-random binary sequence from the two-dimensional chaotic Hénon map is explored. ... is the Hénon map, a two-dimensional discrete-time nonlinear dynamical system represented by the state ...
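
    As a rough illustration of how a binary sequence might be drawn from the Hénon map, the sketch below iterates the standard form of the map, x[n+1] = 1 - a*x[n]^2 + y[n], y[n+1] = b*x[n], with the classical parameters a = 1.4 and b = 0.3. The bit-extraction rule (thresholding x at zero), the starting point, and the burn-in length are assumptions made for this example; the record above is truncated and does not say which rule the paper uses. The output should not be treated as cryptographically secure.

```python
def henon_bits(n, x=0.0, y=0.0, a=1.4, b=0.3, burn_in=100):
    """Generate n bits by thresholding the x-coordinate of the Henon map.

    The threshold at 0.0 and the burn-in of 100 iterations are illustrative
    choices, not values taken from the paper.
    """
    bits = []
    for i in range(burn_in + n):
        # One step of the map; the tuple assignment uses the old x for the new y.
        x, y = 1.0 - a * x * x + y, b * x
        if i >= burn_in:
            bits.append(1 if x > 0.0 else 0)
    return bits


sample = henon_bits(32)
print("".join(str(bit) for bit in sample))
```

    Because the map is deterministic, the same starting point always reproduces the same sequence; in this sketch the initial state plays the role of the seed.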

  17. [Identification of APEC genes expressed in vivo by selective capture of transcribed sequences].

    Science.gov (United States)

    Chen, Xiang; Gao, Song; Wang, Xiao-quan; Jiao, Xin-an; Liu, Xiu-fan

    2007-06-01

    Direct screening of bacterial genes expressed during infection in the host is limited, because isolation of bacterial transcripts from host tissues necessitates separation from the abundance of host RNA. Selective capture of transcribed sequences (SCOTS) allows the selective capture of bacterial cDNA derived from infected tissues using hybridization to biotinylated bacterial genomic DNA. Avian pathogenic E. coli strain E037 (serogroup O78) was used in a chicken infection model to identify bacterial genes that are expressed in infected tissues. Three-week-old white leghorn specific-pathogen-free chickens were inoculated into the right thoracic air sac with a 0.1 mL suspension containing 10^7 CFU of APEC strain E037. Total RNA was isolated from infected tissues (pericardium and air sacs) 6 or 24 h postinfection and converted to cDNAs. By using the cDNA selection method of selective capture of transcribed sequences and enrichment for the isolation of pathogen-specific (non-pathogenic E. coli K-12 strain) transcripts, pathogen-specific cDNAs were identified. Randomly chosen cDNA clones derived from transcripts in the air sacs or pericardium were selected and sequenced. The clones, termed aec, contained numerous APEC-specific sequences. Among the 31 distinct aec clones, pathogen-specific clones contained sequences homologous to known and novel putative bacterial virulence gene products involved in adherence, iron transport, lipopolysaccharide (LPS) synthesis, plasmid replication and conjugation, putative phage encoded products, and gene products of unknown function. Overall, the current study provided a means to identify novel pathogen-specific genes expressed in vivo and insight regarding the global gene expression of a pathogenic E. coli strain in a natural animal host during the infectious process.

  18. Selectivity and sparseness in randomly connected balanced networks.

    Directory of Open Access Journals (Sweden)

    Cengiz Pehlevan

    Neurons in sensory cortex show stimulus selectivity and sparse population response, even in cases where no strong functionally specific structure in connectivity can be detected. This raises the question whether selectivity and sparseness can be generated and maintained in randomly connected networks. We consider a recurrent network of excitatory and inhibitory spiking neurons with random connectivity, driven by random projections from an input layer of stimulus selective neurons. In this architecture, the stimulus-to-stimulus and neuron-to-neuron modulation of total synaptic input is weak compared to the mean input. Surprisingly, we show that in the balanced state the network can still support high stimulus selectivity and sparse population response. In the balanced state, strong synapses amplify the variation in synaptic input and recurrent inhibition cancels the mean. Functional specificity in connectivity emerges due to the inhomogeneity caused by the generative statistical rule used to build the network. We further elucidate the mechanism behind and evaluate the effects of model parameters on population sparseness and stimulus selectivity. Network response to mixtures of stimuli is investigated. It is shown that a balanced state with unselective inhibition can be achieved with densely connected input to inhibitory population. Balanced networks exhibit the "paradoxical" effect: an increase in excitatory drive to inhibition leads to decreased inhibitory population firing rate. We compare and contrast selectivity and sparseness generated by the balanced network to randomly connected unbalanced networks. Finally, we discuss our results in light of experiments.

  19. Selection for gene junction sequences important for VSV transcription.

    Science.gov (United States)

    Hinzman, Edward E; Barr, John N; Wertz, Gail W

    2008-10-25

    The heptauridine tract at each gene end and intergenic region (IGR) at the gene junctions of vesicular stomatitis virus (VSV) have effects on synthesis of the downstream mRNA, independent of their respective roles in termination of the upstream mRNA. To investigate the role of the U tract and the IGR in downstream gene transcription, we altered the N/P gene junction of infectious VSV such that transcription levels would be affected and result in altered molar ratios of the N and P proteins, which are critical for optimal viral RNA replication. The changes included extended IGRs between the N and P genes and shortening the length of the heptauridine tract upstream of the P gene start. Viruses having various combinations of these changes were recovered from cDNA and selective pressure for efficient viral replication was applied by sequential passage in cell culture. The replicative ability and sequence at the altered intergenic junctions were monitored throughout the passages to compare the effects of the changes at the IGR and U tract. VSV variants with wild-type U tracts upstream of the P gene replicated to levels similar to wt VSV. Variants with shortened U tracts were reduced in their ability to replicate. With passage, populations emerged that replicated to higher levels. Sequence analysis revealed that mutations had been selected for in these populations that increased the length of the U tract. This correlated with an increase in abundance of P mRNA and protein to provide improved N:P protein molar ratios. Extended IGRs resulted in decreased downstream transcription but the effect was not as extensive as that caused by shortened U tracts. Extended IGRs were not selected against in 5 passages. Our results indicate that the size of the upstream gene end U tract is an important determinant of efficient downstream gene transcription in infectious virus.

  20. Fast, Randomized Join-Order Selection - Why Use Transformations?

    NARCIS (Netherlands)

    C.A. Galindo-Legaria; A.J. Pellenkoft (Jan); M.L. Kersten (Martin)

    1994-01-01

    We study the effectiveness of probabilistic selection of join-query evaluation plans, without reliance on tree transformation rules. Instead, each candidate plan is chosen uniformly at random from the space of valid evaluation orders. This leads to a transformation-free strategy where a

  1. The reliability of randomly selected final year pharmacy students in ...

    African Journals Online (AJOL)

    Employing ANOVA, factorial experimental analysis, and the theory of error, reliability studies were conducted on the assessment of the drug product chloroquine phosphate tablets. The G–Study employed equal numbers of the factors for uniform control, and involved three analysts (randomly selected final year Pharmacy ...

  2. Using machine learning for sequence-level automated MRI protocol selection in neuroradiology.

    Science.gov (United States)

    Brown, Andrew D; Marotta, Thomas R

    2017-10-27

    Incorrect imaging protocol selection can lead to important clinical findings being missed, contributing to both wasted health care resources and patient harm. We present a machine learning method for analyzing the unstructured text of clinical indications and patient demographics from magnetic resonance imaging (MRI) orders to automatically protocol MRI procedures at the sequence level. We compared 3 machine learning models - support vector machine, gradient boosting machine, and random forest - to a baseline model that predicted the most common protocol for all observations in our test set. The gradient boosting machine model significantly outperformed the baseline and demonstrated the best performance of the 3 models in terms of accuracy (95%), precision (86%), recall (80%), and Hamming loss (0.0487). This demonstrates the feasibility of automating sequence selection by applying machine learning to MRI orders. Automated sequence selection has important safety, quality, and financial implications and may facilitate improvements in the quality and safety of medical imaging service delivery. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
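
    The Hamming loss quoted above is the fraction of individual label decisions that are wrong, averaged over all samples and all labels. The sketch below shows that computation on a toy multi-label example; the label matrices are invented for illustration and have nothing to do with the study's data, and treating each MRI sequence as one binary label is an assumption about how the metric applies here.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots where the prediction differs from the truth.

    y_true and y_pred are equally shaped lists of 0/1 rows, one row per order
    and one column per candidate sequence (a hypothetical encoding).
    """
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("rows must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    if total == 0:
        raise ValueError("no labels to compare")
    return wrong / total


# Toy example: two orders, three candidate sequences each.
truth = [[1, 0, 1], [0, 1, 0]]
predicted = [[1, 1, 1], [0, 1, 1]]
print(hamming_loss(truth, predicted))
```

    Here two of the six label slots are wrong, so the loss is one third; a lower value means fewer per-sequence mistakes.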

  3. Local randomization in neighbor selection improves PRM roadmap quality

    KAUST Repository

    McMahon, Troy

    2012-10-01

    Probabilistic Roadmap Methods (PRMs) are one of the most used classes of motion planning methods. These sampling-based methods generate robot configurations (nodes) and then connect them to form a graph (roadmap) containing representative feasible pathways. A key step in PRM roadmap construction involves identifying a set of candidate neighbors for each node. Traditionally, these candidates are chosen to be the k-closest nodes based on a given distance metric. In this paper, we propose a new neighbor selection policy called LocalRand(k,K'), that first computes the K' closest nodes to a specified node and then selects k of those nodes at random. Intuitively, LocalRand attempts to benefit from random sampling while maintaining the higher levels of local planner success inherent to selecting more local neighbors. We provide a methodology for selecting the parameters k and K'. We perform an experimental comparison which shows that for both rigid and articulated robots, LocalRand results in roadmaps that are better connected than the traditional k-closest policy or a purely random neighbor selection policy. The cost required to achieve these results is shown to be comparable to k-closest. © 2012 IEEE.
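
    The neighbor selection policy described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes configurations are points in Euclidean space, uses a brute-force sort instead of a nearest-neighbor data structure, and the function name `local_rand` is hypothetical.

```python
import math
import random


def local_rand(node, nodes, k, k_prime, rng=random):
    """Pick k neighbors at random from the k_prime nodes closest to `node`.

    Euclidean distance is an assumption; any distance metric could be
    substituted in the sort key.
    """
    others = [n for n in nodes if n != node]
    closest = sorted(others, key=lambda n: math.dist(node, n))[:k_prime]
    return rng.sample(closest, min(k, len(closest)))


# Ten configurations on a line; choose 2 of the 4 nearest to the origin.
roadmap_nodes = [(float(i), 0.0) for i in range(10)]
print(local_rand((0.0, 0.0), roadmap_nodes, k=2, k_prime=4, rng=random.Random(0)))
```

    Setting `k_prime` equal to `k` reduces the sketch to the traditional k-closest policy, while setting it to the number of nodes gives purely random neighbor selection.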

  4. Selecting a phoneme-to-grapheme mapping: Random or weighted selection?

    Directory of Open Access Journals (Sweden)

    Binna Lee

    2015-05-01

    Our findings demonstrate that random selection underestimates MOA’s PG correspondences whereas weighted selection predicts higher PG correspondences than he produces. To explain his intermediate spelling performance on PPEs, we will test additional approaches to weighing the relative probability of PG mappings, including using log frequencies, separating consonant and vowel status, and considering the number of grapheme options in each phoneme.

  5. In vitro HIV-1 selective integration into the target sequence and decoy-effect of the modified sequence.

    Directory of Open Access Journals (Sweden)

    Tatsuaki Tsuruyama

    Although there have been a few reports that the HIV-1 genome can be selectively integrated into the genomic DNA of cultured host cells, the biochemistry of integration selectivity has not been fully understood. We modified the in vitro integration reaction protocol and developed a reaction system with higher efficiency. We used a substrate repeat, 5'-(GTCCCTTCCCAGT)n(ACTGGGAAGGGAC)n-3', and a modified sequence DNA ligated into a circular plasmid. CAGT and ACTG in the repeat units originated from the HIV-1 proviral genome ends. Following the incubation of the HIV-1 genome end cDNA and recombinant integrase for the formation of the pre-integration (PI) complex, substrate DNA was reacted with this complex. It was confirmed that the integration selectively occurred in the middle segment of the repeat sequence. In addition, integration frequency and selectivity were positively correlated with repeat number n. On the other hand, both frequency and selectivity decreased markedly when using sequences with deletion of CAGT in the middle position of the original target sequence. Moreover, on incubation with the deleted DNAs and original sequence, the integration efficiency and selectivity for the original target sequence were significantly reduced, which indicated interference effects by the deleted sequence DNAs. Efficiency and selectivity were also found to vary discontinuously with changes in manganese dichloride concentration in the reaction buffer, probably due to its influence on the secondary structure of substrate DNA. Finally, integrase was found to form oligomers on the binding site and substrate DNA formed a loop-like structure. In conclusion, there is a considerable selectivity in HIV-integration into the specified sequence; however, similar DNA sequences can interfere with the integration process, and it is therefore difficult for in vivo integration to occur selectively in the actual host genome DNA.

  6. Quantifying biodiversity and asymptotics for a sequence of random strings.

    Science.gov (United States)

    Koyano, Hitoshi; Kishino, Hirohisa

    2010-06-01

    We present a methodology for quantifying biodiversity at the sequence level by developing the probability theory on a set of strings. Further, we apply our methodology to the problem of quantifying the population diversity of microorganisms in several extreme environments and digestive organs and reveal the relation between microbial diversity and various environmental parameters.

  7. Selection for altruism through random drift in variable size populations.

    Science.gov (United States)

    Houchmandzadeh, Bahram; Vallade, Marcel

    2012-05-10

    Altruistic behavior is defined as helping others at a cost to oneself and a lowered fitness. The lower fitness implies that altruists should be selected against, which is in contradiction with their widespread presence in nature. Present models of selection for altruism (kin or multilevel) show that altruistic behaviors can have 'hidden' advantages if the 'common good' produced by altruists is restricted to some related or unrelated groups. These models are mostly deterministic, or assume a frequency dependent fitness. Evolutionary dynamics is a competition between deterministic selection pressure and stochastic events due to random sampling from one generation to the next. We show here that an altruistic allele extending the carrying capacity of the habitat can win by increasing the random drift of "selfish" alleles. In other words, the fixation probability of altruistic genes can be higher than that of selfish ones, even though altruists have a smaller fitness. Moreover, when populations are geographically structured, the altruists' advantage can be highly amplified and the fixation probability of selfish genes can tend toward zero. The above results are obtained both by numerical and analytical calculations. Analytical results are obtained in the limit of large populations. The theory we present does not involve kin or multilevel selection, but is based on the existence of random drift in variable size populations. The model is a generalization of the original Fisher-Wright and Moran models where the carrying capacity depends on the number of altruists.

  8. Selection for altruism through random drift in variable size populations

    Directory of Open Access Journals (Sweden)

    Houchmandzadeh Bahram

    2012-05-01

    Background: Altruistic behavior is defined as helping others at a cost to oneself and a lowered fitness. The lower fitness implies that altruists should be selected against, which is in contradiction with their widespread presence in nature. Present models of selection for altruism (kin or multilevel) show that altruistic behaviors can have ‘hidden’ advantages if the ‘common good’ produced by altruists is restricted to some related or unrelated groups. These models are mostly deterministic, or assume a frequency dependent fitness. Results: Evolutionary dynamics is a competition between deterministic selection pressure and stochastic events due to random sampling from one generation to the next. We show here that an altruistic allele extending the carrying capacity of the habitat can win by increasing the random drift of “selfish” alleles. In other words, the fixation probability of altruistic genes can be higher than that of selfish ones, even though altruists have a smaller fitness. Moreover, when populations are geographically structured, the altruists' advantage can be highly amplified and the fixation probability of selfish genes can tend toward zero. The above results are obtained both by numerical and analytical calculations. Analytical results are obtained in the limit of large populations. Conclusions: The theory we present does not involve kin or multilevel selection, but is based on the existence of random drift in variable size populations. The model is a generalization of the original Fisher-Wright and Moran models where the carrying capacity depends on the number of altruists.

  9. Inter simple sequence repeats (ISSR) and random amplified ...

    African Journals Online (AJOL)

    21 of 30 random amplified polymorphic DNA (RAPD) primers produced 220 reproducible bands with average of 10.47 bands per primer and 80.12% of polymorphism. OPR02 primer showed the highest number of effective allele (Ne), Shannon index (I) and genetic diversity (H). Some of the cultivars had specific bands, ...

  10. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    Science.gov (United States)

    2008-07-01

    ...sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. Dates covered: 6 Jul 08 – 11 Jul 08.

  11. Generation of Aptamers from A Primer-Free Randomized ssDNA Library Using Magnetic-Assisted Rapid Aptamer Selection

    Science.gov (United States)

    Tsao, Shih-Ming; Lai, Ji-Ching; Horng, Horng-Er; Liu, Tu-Chen; Hong, Chin-Yih

    2017-04-01

    Aptamers are oligonucleotides that can bind to specific target molecules. Most aptamers are generated using random libraries in the standard systematic evolution of ligands by exponential enrichment (SELEX). Each random library contains oligonucleotides with a randomized central region and two fixed primer regions at both ends. The fixed primer regions are necessary for amplifying target-bound sequences by PCR. However, these extra-sequences may cause non-specific bindings, which potentially interfere with good binding for random sequences. The Magnetic-Assisted Rapid Aptamer Selection (MARAS) is a newly developed protocol for generating single-strand DNA aptamers. No repeat selection cycle is required in the protocol. This study proposes and demonstrates a method to isolate aptamers for C-reactive proteins (CRP) from a randomized ssDNA library containing no fixed sequences at 5′ and 3′ termini using the MARAS platform. Furthermore, the isolated primer-free aptamer was sequenced and binding affinity for CRP was analyzed. The specificity of the obtained aptamer was validated using blind serum samples. The result was consistent with monoclonal antibody-based nephelometry analysis, which indicated that a primer-free aptamer has high specificity toward targets. MARAS is a feasible platform for efficiently generating primer-free aptamers for clinical diagnoses.

  12. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Directory of Open Access Journals (Sweden)

    Jinjian Jiang

    2017-07-01

    Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., using the information of solved protein structures. However, the number of solved protein structures is far smaller than that of sequences. Moreover, almost all of the predictors identified hotspots from the interfaces of protein complexes, seldom from the whole protein sequences. Therefore, determining hotspots from whole protein sequences by sequence information alone is urgent. To address the issue of hotspot predictions from the whole sequences of proteins, we proposed an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique was adopted to project the encoding instances into a reduced space. Then, several better random projections were obtained by training an IBk classifier based on the training dataset, which were thus applied to the test dataset. The ensemble of random projection classifiers is therefore obtained. Experimental results showed that although the performance of our method is not good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences.

  13. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Science.gov (United States)

    Jiang, Jinjian; Wang, Nian; Chen, Peng; Zheng, Chunhou; Wang, Bing

    2017-01-01

    Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., using the information of solved protein structures. However, the number of solved protein structures is far smaller than that of sequences. Moreover, almost all of the predictors identified hotspots from the interfaces of protein complexes, seldom from the whole protein sequences. Therefore, determining hotspots from whole protein sequences by sequence information alone is urgent. To address the issue of hotspot predictions from the whole sequences of proteins, we proposed an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique was adopted to project the encoding instances into a reduced space. Then, several better random projections were obtained by training an IBk classifier based on the training dataset, which were thus applied to the test dataset. The ensemble of random projection classifiers is therefore obtained. Experimental results showed that although the performance of our method is not good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences. PMID:28718782

  14. Interference-aware random beam selection for spectrum sharing systems

    KAUST Repository

    Abdallah, Mohamed M.

    2012-09-01

    Spectrum sharing systems have been introduced to alleviate the problem of spectrum scarcity by allowing secondary unlicensed networks to share the spectrum with primary licensed networks under acceptable interference levels to the primary users. In this paper, we develop interference-aware random beam selection schemes that provide enhanced throughput for the secondary link under the condition that the interference observed at the primary link is within a predetermined acceptable value. For a secondary transmitter equipped with multiple antennas, our schemes select a random beam, among a set of power-optimized orthogonal random beams, that maximizes the capacity of the secondary link while satisfying the interference constraint at the primary receiver for different levels of feedback information describing the interference level at the primary receiver. For the proposed schemes, we develop a statistical analysis for the signal-to-noise and interference ratio (SINR) statistics as well as the capacity of the secondary link. Finally, we present numerical results that study the effect of system parameters including number of beams and the maximum transmission power on the capacity of the secondary link attained using the proposed schemes. © 2012 IEEE.

  15. An empirical study of the complexity and randomness of prediction error sequences

    Science.gov (United States)

    Ratsaby, Joel

    2011-07-01

    We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter which is defined as the ratio between the lengths of the compressed to uncompressed files that contain the learner's decision rule. The results indicate that good learners have a low information density ρ while bad learners have a high ρ. Bad learners generate mistake sequences that are atypically complex or diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences and they include mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of [18] the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density ρ thereby deforming its randomness characteristics.
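
    The information density ρ defined in this abstract is a ratio of compressed to uncompressed length. The sketch below computes such a ratio for arbitrary byte strings. It is only an illustration: the abstract does not name a compressor, so the use of `zlib` here is an assumption, and the two inputs are synthetic stand-ins rather than actual decision-rule files.

```python
import random
import zlib


def information_density(data: bytes) -> float:
    """Ratio of compressed length to uncompressed length (zlib, level 9).

    zlib is a stand-in compressor chosen for this sketch.
    """
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, 9)) / len(data)


# A highly regular input compresses well, so its ratio is small.
regular = b"ab" * 5000

# Pseudo-random bytes barely compress, so their ratio is close to 1.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(10000))

print(information_density(regular))
print(information_density(noisy))
```

    Under this reading, a decision rule stored in a highly compressible file would have a low ρ, which the abstract associates with good learners.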

  16. Assessing randomness and complexity in human motion trajectories through analysis of symbolic sequences

    Directory of Open Access Journals (Sweden)

    Zhen Peng

    2014-03-01

    Complexity is a hallmark of intelligent behavior consisting both of regular patterns and random variation. To quantitatively assess the complexity and randomness of human motion, we designed a motor task in which we translated subjects' motion trajectories into strings of symbol sequences. In the first part of the experiment participants were asked to perform self-paced movements to create repetitive patterns, copy pre-specified letter sequences, and generate random movements. To investigate whether the degree of randomness can be manipulated, in the second part of the experiment participants were asked to perform unpredictable movements in the context of a pursuit game, where they received feedback from an online Bayesian predictor guessing their next move. We analyzed symbol sequences representing subjects' motion trajectories with five common complexity measures: predictability, compressibility, approximate entropy, Lempel-Ziv complexity, as well as effective measure complexity. We found that subjects’ self-created patterns were the most complex, followed by drawing movements of letters and self-paced random motion. We also found that participants could change the randomness of their behavior depending on context and feedback. Our results suggest that humans can adjust both complexity and regularity in different movement types and contexts and that this can be assessed with information-theoretic measures of the symbolic sequences generated from movement trajectories.

  17. Emulsion PCR: a high efficient way of PCR amplification of random DNA libraries in aptamer selection.

    Directory of Open Access Journals (Sweden)

    Keke Shao

    Aptamers are short RNA or DNA oligonucleotides which can bind with different targets. Typically, they are selected from a large number of random DNA sequence libraries. The main strategy to obtain aptamers is systematic evolution of ligands by exponential enrichment (SELEX). Low efficiency is one of the limitations of conventional PCR amplification of random DNA sequence libraries in aptamer selection, because of relatively low product yield and high by-product formation. Here, we developed emulsion PCR for aptamer selection. With this method, by-product formation decreased tremendously to an undetectable level, while product formation increased significantly. Our results indicated that by-products in conventional PCR amplification were from primer-product and product-product hybridization. In emulsion PCR, we can completely avoid product-product hybridization and avoid most primer-product hybridization if the conditions are optimized. In addition, it also showed that the molecule ratio of template to compartment was crucial to by-product formation efficiency in emulsion PCR amplification. Furthermore, the concentration of the Taq DNA polymerase in the emulsion PCR mixture had a significant impact on product formation efficiency. So, the results of our study indicated that emulsion PCR could improve the efficiency of SELEX.

  18. Computationally Efficient Chaotic Spreading Sequence Selection for Asynchronous DS-CDMA

    Directory of Open Access Journals (Sweden)

    Litviņenko Anna

    2017-12-01

    The choice of the spreading sequence for asynchronous direct-sequence code-division multiple-access (DS-CDMA) systems plays a crucial role in the mitigation of multiple-access interference. Considering the rich dynamics of chaotic sequences, their use for spreading allows overcoming the limitations of the classical spreading sequences. However, to ensure low cross-correlation between the sequences, careful selection must be performed. This paper presents a novel exhaustive search algorithm, which allows finding sets of chaotic spreading sequences of required length with a particularly low mutual cross-correlation. The efficiency of the search is verified by simulations, which show a significant advantage compared to non-selected chaotic sequences. Moreover, the impact of sequence length on the efficiency of the selection is studied.

  19. Unbiased split variable selection for random survival forests using maximally selected rank statistics.

    Science.gov (United States)

    Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas

    2017-04-15

    The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used. Copyright © 2017 John Wiley & Sons, Ltd.

  20. Exploring Multivariate Event Sequences Using Rules, Aggregations, and Selections.

    Science.gov (United States)

    Cappers, Bram C M; van Wijk, Jarke J

    2018-01-01

    Multivariate event sequences are ubiquitous: travel history, telecommunication conversations, and server logs are some examples. Besides standard properties such as type and timestamp, events often have other associated multivariate data. Current exploration and analysis methods either focus on the temporal analysis of a single attribute or the structural analysis of the multivariate data only. We present an approach where users can explore event sequences at multivariate and sequential level simultaneously by interactively defining a set of rewrite rules using multivariate regular expressions. Users can store resulting patterns as new types of events or attributes to interactively enrich or simplify event sequences for further investigation. In Eventpad we provide a bottom-up glyph-oriented approach for multivariate event sequence analysis by searching, clustering, and aligning them according to newly defined domain specific properties. We illustrate the effectiveness of our approach with real-world data sets including telecommunication traffic and hospital treatments.

  1. The signature of positive selection at randomly chosen loci.

    Science.gov (United States)

    Przeworski, Molly

    2002-03-01

    In Drosophila and humans, there are accumulating examples of loci with a significant excess of high-frequency-derived alleles or high levels of linkage disequilibrium, relative to a neutral model of a random-mating population of constant size. These are features expected after a recent selective sweep. Their prevalence suggests that positive directional selection may be widespread in both species. However, as I show here, these features do not persist long after the sweep ends: The high-frequency alleles drift to fixation and no longer contribute to polymorphism, while linkage disequilibrium is broken down by recombination. As a result, loci chosen without independent evidence of recent selection are not expected to exhibit either of these features, even if they have been affected by numerous sweeps in their genealogical history. How then can we explain the patterns in the data? One possibility is population structure, with unequal sampling from different subpopulations. Alternatively, positive selection may not operate as is commonly modeled. In particular, the rate of fixation of advantageous mutations may have increased in the recent past.

  2. Close association between paralogous multiple isomiRs and paralogous/orthologues miRNA sequences implicates dominant sequence selection across various animal species.

    Science.gov (United States)

    Guo, Li; Zhao, Yang; Zhang, Hui; Yang, Sheng; Chen, Feng

    2013-09-25

    MicroRNAs (miRNAs) are crucial negative regulators of gene expression at the post-transcriptional level. Next-generation sequencing technologies have identified a series of miRNA variants (named isomiRs). In this study, paralogous isomiR assemblies (from the miRNA locus) were systematically analyzed based on data acquired from deep sequencing data sets. Evolutionary analysis of paralogous (members in miRNA gene family in a specific species) and orthologues (across different animal species) miRNAs was also performed. The sequence diversity of paralogous isomiRs was found to be similar to the diversity of paralogous and orthologues miRNAs. Additionally, both isomiRs and paralogous/orthologues miRNAs were implicated in 5' and 3' ends (especially 3' ends), nucleotide substitutions, and insertions and deletions. Generally, multiple isomiRs can be produced from a single miRNA locus, but most of them had lower enrichment levels, and only several dominant isomiR sequences were detected. These dominant isomiR groups were always stable, and one of them would be selected as the most abundant miRNA sequence in specific animal species. Some isomiRs might be consistent to miRNA sequences in some species but not the other. Homologous miRNAs were often detected in similar isomiR repertoires, and showed similar expression patterns, while dominant isomiRs showed complex evolutionary patterns from miRNA sequences across the animal kingdom. These results indicate that the phenomenon of multiple isomiRs is not a random event, but rather the result of evolutionary pressures. The existence of multiple isomiRs enables different species to express advantageous sequences in different environments. Thus, dominant sequences emerge in response to functional and evolutionary pressures, allowing an organism to adapt to complex intra- and extra-cellular events. © 2013.

  3. INSTRUCTIONAL CONFERENCE ON THE THEORY OF STOCHASTIC PROCESSES: Controlled random sequences and Markov chains

    Science.gov (United States)

    Yushkevich, A. A.; Chitashvili, R. Ya

    1982-12-01

    Contents: Introduction. Chapter I. Foundations of the general theory of controlled random sequences and Markov chains with the expected reward criterion: § 1. Controlled random sequences, Markov chains, and models; § 2. Necessary and sufficient conditions for optimality; § 3. The Bellman equation for the value function and the existence of (ε-) optimal strategies. Chapter II. Some problems in the theory of controlled homogeneous Markov chains: § 4. Description of the solutions of the Bellman equation, a characterization of the value function, and the Bellman operator; § 5. Sufficiency of stationary strategies in homogeneous Markov models; § 6. The lexicographic Bellman equation. References.

  4. The use of whole sequence data for genomic selection in dairy cattle

    DEFF Research Database (Denmark)

    van den Berg, Irene

    2015-01-01

    This thesis investigated the use of whole sequence data for genomic selection in dairy cattle. From a simulation study it was concluded that sequence data can result in increases in reliability by using prediction markers close to the causative mutation rather than by using the full sequence...

  5. The algorithm of random length sequences synthesis for frame synchronization of digital television systems

    Directory of Open Access Journals (Sweden)

    Аndriy V. Sadchenko

    2015-12-01

    Full Text Available Digital television systems need to ensure that all digital signal processing operations are performed simultaneously and consistently. Frame synchronization is dictated by the need to match the phases of the transmitter and receiver so that the start of a frame can be identified. Long binary sequences with a good aperiodic autocorrelation function are often used as frame synchronization signals. Aim: This work is dedicated to the development of an algorithm for the synthesis of random-length sequences. Materials and Methods: The paper provides a comparative analysis of the known sequences that can currently be used for synchronization and identifies their advantages and disadvantages. It proposes an algorithm for the synthesis of binary synchronization sequences of random length with good autocorrelation properties, based on a noise generator with a uniform probability distribution. A semiconductor "white noise" generator is proposed as the source material for the synthesis of binary sequences with the desired properties. Results: A statistical analysis of the initial "white noise" realizations and of the synthesized frame synchronization sequences for digital television was conducted. The synthesized sequences were compared with known ones. The results show the benefits of the obtained sequences in comparison with known ones. The performed simulations confirm the obtained results. Conclusions: A search algorithm for binary synchronization sequences with desired autocorrelation properties was obtained. With this algorithm, the sequence can be made longer, without length limitations. The resulting sync sequence can be used for frame synchronization in modern digital communication systems, which will increase their efficiency and noise immunity.
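
    The abstract gives no algorithmic detail, so the sketch below only illustrates the general idea of searching noise-derived binary sequences for good aperiodic autocorrelation. It assumes a software pseudorandom generator in place of the hardware "white noise" source and uses the peak sidelobe level as the selection criterion; both choices, and all names, are assumptions rather than the authors' method.

```python
import random

def aperiodic_autocorrelation(seq):
    """Aperiodic autocorrelation of a +1/-1 sequence for lags 0..n-1."""
    n = len(seq)
    return [sum(seq[i] * seq[i + k] for i in range(n - k)) for k in range(n)]

def peak_sidelobe(seq):
    """Largest absolute autocorrelation value away from lag 0."""
    return max(abs(v) for v in aperiodic_autocorrelation(seq)[1:])

def synthesize_sync_sequence(length, trials, seed=0):
    """Draw random binary sequences and keep the one with the lowest peak sidelobe."""
    rng = random.Random(seed)
    best_seq, best_psl = None, None
    for _ in range(trials):
        # Threshold uniformly distributed samples at 0.5 to obtain chips.
        candidate = [1 if rng.random() >= 0.5 else -1 for _ in range(length)]
        psl = peak_sidelobe(candidate)
        if best_psl is None or psl < best_psl:
            best_seq, best_psl = candidate, psl
    return best_seq, best_psl

seq, psl = synthesize_sync_sequence(length=31, trials=2000)
print(psl, "".join("+" if c > 0 else "-" for c in seq))
```

    Because the candidates come from a generator rather than a fixed code family, the same loop works for any requested length; only the number of trials needed for a given sidelobe level changes.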

  6. Blind Measurement Selection: A Random Matrix Theory Approach

    KAUST Repository

    Elkhalil, Khalil

    2016-12-14

    This paper considers the problem of selecting a set of $k$ measurements from $n$ available sensor observations. The selected measurements should minimize a certain error function assessing the error in estimating a certain $m$ dimensional parameter vector. The exhaustive search inspecting each of the $\binom{n}{k}$ possible choices would require a very high computational complexity and as such is not practical for large $n$ and $k$. Alternative methods with low complexity have recently been investigated but their main drawbacks are that 1) they require perfect knowledge of the measurement matrix and 2) they need to be applied at the pace of change of the measurement matrix. To overcome these issues, we consider the asymptotic regime in which $k$, $n$ and $m$ grow large at the same pace. Tools from random matrix theory are then used to approximate in closed-form the most important error measures that are commonly used. The asymptotic approximations are then leveraged to select properly $k$ measurements exhibiting low values for the asymptotic error measures. Two heuristic algorithms are proposed: the first one merely consists in applying the convex optimization artifice to the asymptotic error measure. The second algorithm is a low-complexity greedy algorithm that attempts to look for a sufficiently good solution for the original minimization problem. The greedy algorithm can be applied to both the exact and the asymptotic error measures and can be thus implemented in blind and channel-aware fashions. We present two potential applications where the proposed algorithms can be used, namely antenna selection for uplink transmissions in large scale multi-user systems and sensor selection for wireless sensor networks. Numerical results are also presented and sustain the efficiency of the proposed blind methods in reaching the performances of channel-aware algorithms.
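
    To make the contrast between exhaustive and greedy selection concrete, here is a small sketch. It is not the paper's algorithm: it assumes a known measurement matrix (the channel-aware case), a parameter dimension of m = 2 so that the error measure has a closed form, and a small ridge term to keep the matrix invertible for very small subsets. The error measure trace((H^T H + ridge I)^-1) and all names are illustrative assumptions.

```python
import itertools
import math

def mse(rows, ridge=1e-3):
    """trace((H^T H + ridge*I)^-1) for 2-dimensional measurement rows (m = 2 assumed)."""
    a = sum(h[0] * h[0] for h in rows) + ridge
    d = sum(h[1] * h[1] for h in rows) + ridge
    b = sum(h[0] * h[1] for h in rows)
    return (a + d) / (a * d - b * b)

def exhaustive(H, k):
    """Inspect every one of the C(n, k) subsets."""
    return min(itertools.combinations(range(len(H)), k),
               key=lambda s: mse([H[i] for i in s]))

def greedy(H, k):
    """Add, one at a time, the measurement that lowers the error the most."""
    chosen = []
    for _ in range(k):
        rest = [i for i in range(len(H)) if i not in chosen]
        chosen.append(min(rest, key=lambda i: mse([H[j] for j in chosen + [i]])))
    return tuple(sorted(chosen))

H = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.5, 0.5), (0.2, 0.8), (1.0, 1.0)]
print(exhaustive(H, 3), greedy(H, 3))
print(math.comb(100, 20))  # number of subsets for n = 100, k = 20
```

    The greedy loop evaluates on the order of n·k candidate sets, whereas the exhaustive search grows as the binomial coefficient, which is what makes it impractical for large n and k.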

  7. Statistical inference of selection and divergence from a time-dependent Poisson random field model.

    Directory of Open Access Journals (Sweden)

    Amei Amei

    Full Text Available We apply a recently developed time-dependent Poisson random field model to aligned DNA sequences from two related biological species to estimate selection coefficients and divergence time. We use Markov chain Monte Carlo methods to estimate species divergence time and selection coefficients for each locus. The model assumes that the selective effects of non-synonymous mutations are normally distributed across genetic loci but constant within loci, and synonymous mutations are selectively neutral. In contrast with previous models, we do not assume that the individual species are at population equilibrium after divergence. Using a data set of 91 genes in two Drosophila species, D. melanogaster and D. simulans, we estimate the species divergence time $t_{\mathrm{div}} = 2.16\,N_e$ (or 1.68 million years, assuming the haploid effective population size $N_e = 6.45 \times 10^5$ years) and a mean selection coefficient per generation $\mu_\gamma = 1.98/N_e$. Although the average selection coefficient is positive, the magnitude of the selection is quite small. Results from numerical simulations are also presented as an accuracy check for the time-dependent model.

  8. Pediatric selective mutism therapy: a randomized controlled trial.

    Science.gov (United States)

    Esposito, Maria; Gimigliano, Francesca; Barillari, Maria R; Precenzano, Francesco; Ruberto, Maria; Sepe, Joseph; Barillari, Umberto; Gimigliano, Raffaele; Militerni, Roberto; Messina, Giovanni; Carotenuto, Marco

    2017-10-01

    Selective mutism (SM) is a rare disease in children coded by DSM-5 as an anxiety disorder. Despite the disabling nature of the disease, there is still no specific treatment. The aims of this study were to verify the efficacy of six-month standard psychomotor treatment and the positive changes in lifestyle, in a population of children affected by SM. Randomized controlled trial registered in the European Clinical Trials Registry (EuDract 2015-001161-36). University third level Centre (Child and Adolescent Neuropsychiatry Clinic). Study population was composed of 67 children in group A (psychomotricity treatment) (35 M, mean age 7.84±1.15) and 71 children in group B (behavioral and educational counseling) (37 M, mean age 7.75±1.36). Psychomotor treatment was administered by trained child therapists in residential settings three times per week. Each child was treated for the whole period by the same therapist and all the therapists shared the same protocol. The standard psychomotor session length is 45 minutes. At T0 and after 6 months (T1) of treatments, patients underwent a behavioral and SM severity assessment. To verify the effects of the psychomotor management, the Child Behavior Checklist questionnaire (CBCL) and Selective Mutism Questionnaire (SMQ) were administered to the parents. After 6 months of psychomotor treatment SM children showed a significant reduction among CBCL scores such as in social relations, anxious/depressed, social problems and total problems (P… psychomotricity a safe and efficacious therapy for pediatric selective mutism.

  9. Taxonomic evaluation of selected Ganoderma species and database sequence validation

    Directory of Open Access Journals (Sweden)

    Suldbold Jargalmaa

    2017-07-01

    Full Text Available Species in the genus Ganoderma include several ecologically important and pathogenic fungal species whose medicinal and economic value is substantial. Due to the highly similar morphological features within the Ganoderma, identification of species has relied heavily on DNA sequencing using BLAST searches, which are only reliable if the GenBank submissions are accurately labeled. In this study, we examined 113 specimens collected from 1969 to 2016 from various regions in Korea using morphological features and multigene analysis (internal transcribed spacer, translation elongation factor 1-α, and the second largest subunit of RNA polymerase II). These specimens were identified as four Ganoderma species: G. sichuanense, G. cf. adspersum, G. cf. applanatum, and G. cf. gibbosum. With the exception of G. sichuanense, these species were difficult to distinguish based solely on morphological features. However, phylogenetic analysis at three different loci yielded concordant phylogenetic information, and supported the four species distinctions with high bootstrap support. A survey of over 600 Ganoderma sequences available on GenBank revealed that 65% of sequences were either misidentified or ambiguously labeled. Here, we suggest corrected annotations for GenBank sequences based on our phylogenetic validation and provide updated global distribution patterns for these Ganoderma species.

  10. Synthesis of Speaker Facial Movement to Match Selected Speech Sequences

    Science.gov (United States)

    Scott, K. C.; Kagels, D. S.; Watson, S. H.; Rom, H.; Wright, J. R.; Lee, M.; Hussey, K. J.

    1994-01-01

    A system is described which allows for the synthesis of a video sequence of a realistic-appearing talking human head. A phonic based approach is used to describe facial motion; image processing rather than physical modeling techniques are used to create video frames.

  11. Taxonomic evaluation of selected Ganoderma species and database sequence validation

    Science.gov (United States)

    Jargalmaa, Suldbold; Eimes, John A.; Park, Myung Soo; Park, Jae Young; Oh, Seung-Yoon

    2017-01-01

    Species in the genus Ganoderma include several ecologically important and pathogenic fungal species whose medicinal and economic value is substantial. Due to the highly similar morphological features within the Ganoderma, identification of species has relied heavily on DNA sequencing using BLAST searches, which are only reliable if the GenBank submissions are accurately labeled. In this study, we examined 113 specimens collected from 1969 to 2016 from various regions in Korea using morphological features and multigene analysis (internal transcribed spacer, translation elongation factor 1-α, and the second largest subunit of RNA polymerase II). These specimens were identified as four Ganoderma species: G. sichuanense, G. cf. adspersum, G. cf. applanatum, and G. cf. gibbosum. With the exception of G. sichuanense, these species were difficult to distinguish based solely on morphological features. However, phylogenetic analysis at three different loci yielded concordant phylogenetic information, and supported the four species distinctions with high bootstrap support. A survey of over 600 Ganoderma sequences available on GenBank revealed that 65% of sequences were either misidentified or ambiguously labeled. Here, we suggest corrected annotations for GenBank sequences based on our phylogenetic validation and provide updated global distribution patterns for these Ganoderma species. PMID:28761785

  12. Optimizing Event Selection with the Random Grid Search

    Energy Technology Data Exchange (ETDEWEB)

    Bhat, Pushpalatha C. [Fermilab; Prosper, Harrison B. [Florida State U.; Sekmen, Sezen [Kyungpook Natl. U.; Stewart, Chip [Broad Inst., Cambridge

    2017-06-29

    The random grid search (RGS) is a simple, but efficient, stochastic algorithm to find optimal cuts that was developed in the context of the search for the top quark at Fermilab in the mid-1990s. The algorithm, and associated code, have been enhanced recently with the introduction of two new cut types, one of which has been successfully used in searches for supersymmetry at the Large Hadron Collider. The RGS optimization algorithm is described along with the recent developments, which are illustrated with two examples from particle physics. One explores the optimization of the selection of vector boson fusion events in the four-lepton decay mode of the Higgs boson and the other optimizes SUSY searches using boosted objects and the razor variables.
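
    The abstract does not spell out the procedure, so the following is a hedged sketch of one common reading of a random grid search: candidate cut points are taken from randomly chosen signal events, each candidate is applied as a set of one-sided cuts to both samples, and the cut with the best figure of merit is kept. The ">=" cut direction, the figure of merit s/sqrt(s+b), and the toy Gaussian samples are all assumptions made for illustration.

```python
import math
import random

def passes(event, cut):
    """One-sided box cut: every variable must be at least its cut value."""
    return all(v >= c for v, c in zip(event, cut))

def random_grid_search(signal, background, n_cuts, seed=0):
    """Try cut points sampled from the signal events; return the best (z, cut, s, b)."""
    rng = random.Random(seed)
    best = None
    for cut in rng.sample(signal, n_cuts):
        s = sum(passes(ev, cut) for ev in signal)
        b = sum(passes(ev, cut) for ev in background)
        z = s / math.sqrt(s + b)  # s >= 1 because the cut event passes its own cut
        if best is None or z > best[0]:
            best = (z, cut, s, b)
    return best

rng = random.Random(1)
signal = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(200)]
background = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
print(random_grid_search(signal, background, n_cuts=50))
```

    Sampling cut points from the signal itself concentrates the search where signal events actually lie, instead of spending evaluations on a regular grid over empty regions of the variable space.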

  13. Extreme Kinematics in Selected Hip Hop Dance Sequences.

    Science.gov (United States)

    Bronner, Shaw; Ojofeitimi, Sheyi; Woo, Helen

    2015-09-01

    Hip hop dance has many styles including breakdance (breaking), house, popping and locking, funk, streetdance, krumping, Memphis jookin', and voguing. These movements combine the complexity of dance choreography with the challenges of gymnastics and acrobatic movements. Despite high injury rates in hip hop dance, particularly in breakdance, to date there are no published biomechanical studies in this population. The purpose of this study was to compare representative hip hop steps found in breakdance (toprock and breaking) and house and provide descriptive statistics of the angular displacements that occurred in these sequences. Six expert female hip hop dancers performed three choreographed dance sequences, toprock, breaking, and house, to standardized music-based tempos. Hip, knee, and ankle kinematics were collected during sequences that were 18 to 30 sec long. Hip, knee, and ankle three-dimensional peak joint angles were compared in repeated measures ANOVAs with post hoc tests where appropriate (p… Hip hop maximal joint angles exceeded reported activities of daily living and high injury sports such as gymnastics. Hip hop dancers work at weight-bearing joint end ranges where muscles are at a functional disadvantage. These results may explain why lower extremity injury rates are high in this population.

  14. Affinity selection of Nipah and Hendra virus-related vaccine candidates from a complex random peptide library displayed on bacteriophage virus-like particles

    Energy Technology Data Exchange (ETDEWEB)

    Peabody, David S.; Chackerian, Bryce; Ashley, Carlee; Carnes, Eric; Negrete, Oscar

    2017-01-24

    The invention relates to virus-like particles of bacteriophage MS2 (MS2 VLPs) displaying peptide epitopes or peptide mimics of epitopes of Nipah Virus envelope glycoprotein that elicit an immune response against Nipah Virus upon vaccination of humans or animals. Affinity selection on Nipah Virus-neutralizing monoclonal antibodies using random sequence peptide libraries on MS2 VLPs selected peptides with sequence similarity to peptide sequences found within the envelope glycoprotein of Nipah itself, thus identifying the epitopes the antibodies recognize. The selected peptide sequences themselves are not necessarily identical in all respects to a sequence within Nipah Virus glycoprotein, and therefore may be referred to as epitope mimics. VLPs displaying these epitope mimics can serve as a vaccine. On the other hand, display of the corresponding wild-type sequence derived from Nipah Virus and corresponding to the epitope mapped by affinity selection may also be used as a vaccine.

  15. rMotifGen: random motif generator for DNA and protein sequences

    Directory of Open Access Journals (Sweden)

    Hardin C Timothy

    2007-08-01

    Full Text Available Abstract Background Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Results Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. Conclusion rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/.
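
    The general idea of planting a mutated motif instance in a random background sequence can be sketched as follows. This is not rMotifGen's code or interface: the uniform background, the single per-position mutation rate (used here instead of a substitution matrix), and the function names are all assumptions for illustration.

```python
import random

DNA = "ACGT"

def mutate(consensus, rate, rng):
    """Replace each base with a different base with probability `rate`."""
    return "".join(
        rng.choice([b for b in DNA if b != base]) if rng.random() < rate else base
        for base in consensus
    )

def plant_motif(length, consensus, rate, rng):
    """Return (sequence, start, instance): a random sequence with one planted motif instance."""
    background = [rng.choice(DNA) for _ in range(length)]
    start = rng.randrange(length - len(consensus) + 1)
    instance = mutate(consensus, rate, rng)
    background[start:start + len(instance)] = instance  # overwrite background bases
    return "".join(background), start, instance

rng = random.Random(42)
for _ in range(3):
    print(plant_motif(40, "TATAAT", 0.1, rng))
```

    Returning the planted position and instance alongside the sequence gives a ground truth against which a motif finder's predictions can be scored.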

  16. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Full Text Available Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is
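
    The three-way partitioning described above can be sketched in a few lines. This is not the authors' pipeline: the split fractions are arbitrary, and the "motif finder" is a deliberately simple stand-in that picks the single k-mer whose presence differs most between the two classes. All names are hypothetical.

```python
import random
from collections import Counter

def three_way_split(items, fractions=(0.3, 0.4, 0.3), seed=0):
    """Shuffle and split into discovery, training, and validation partitions."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_disc = int(fractions[0] * len(items))
    n_train = int(fractions[1] * len(items))
    return items[:n_disc], items[n_disc:n_disc + n_train], items[n_disc + n_train:]

def toy_motif_finder(discovery, k=3):
    """Stand-in finder: the k-mer whose presence count differs most between classes."""
    pos, neg = Counter(), Counter()
    for seq, label in discovery:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        (pos if label == 1 else neg).update(kmers)
    # Ties are broken alphabetically so the result is deterministic.
    return max(set(pos) | set(neg), key=lambda m: (abs(pos[m] - neg[m]), m))

def featurize(data, motif):
    """Turn each labeled sequence into (motif present?, label)."""
    return [(int(motif in seq), label) for seq, label in data]

data = [("ACTAGGGTCA", 1)] * 20 + [("ACTACTATCA", 0)] * 20
discovery, training, validation = three_way_split(data)
motif = toy_motif_finder(discovery)
print(motif, featurize(validation, motif)[:5])
```

    Keeping motif discovery in its own partition means the features are never chosen using the sequences later used to fit or assess the classifier, which is the point of the third partition.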

  17. Selective Knockdowns in Maize by Sequence-Specific Protein Aggregation.

    Science.gov (United States)

    Betti, Camilla; Schymkowitz, Joost; Rousseau, Frederic; Russinova, Eugenia

    2018-01-01

    Protein aggregation is determined by 5-15 amino acids peptides of the target protein sequence, so-called aggregation-prone regions (APRs) that specifically self-associate to form β-structured inclusions. The presence of APRs in a target protein can be predicted by a dedicated algorithm, such as TANGO. Synthetic aggregation-prone proteins are designed by expressing specific APRs fused to a fluorescent carrier for stability and visualization. Previously, the stable expression of these proteins in Zea mays (maize) has been demonstrated to induce aggregation of target proteins with specific localization, such as the starch-degrading enzyme α-glucan water dikinase, giving rise to plants displaying knockdown phenotypes. Here, we describe how to design synthetic aggregation-prone proteins to harness the sequence specificity of APRs to generate aggregation-associated phenotypes in a targeted manner and in different subcellular compartments. This method points toward the application of induced targeted aggregation as a useful tool to knock down protein functions in maize and to generate crops with improved traits.

  18. The Effect of Interference on Temporal Order Memory for Random and Fixed Sequences in Nondemented Older Adults

    Science.gov (United States)

    Tolentino, Jerlyn C.; Pirogovsky, Eva; Luu, Trinh; Toner, Chelsea K.; Gilbert, Paul E.

    2012-01-01

    Two experiments tested the effect of temporal interference on order memory for fixed and random sequences in young adults and nondemented older adults. The results demonstrate that temporal order memory for fixed and random sequences is impaired in nondemented older adults, particularly when temporal interference is high. However, temporal order…

  19. Deep Mutational Scanning: Library Construction, Functional Selection, and High-Throughput Sequencing.

    Science.gov (United States)

    Starita, Lea M; Fields, Stanley

    2015-08-03

    Deep mutational scanning is a highly parallel method that uses high-throughput sequencing to track changes in >10^5 protein variants before and after selection to measure the effects of mutations on protein function. Here we outline the stages of a deep mutational scanning experiment, focusing on the construction of libraries of protein sequence variants and the preparation of Illumina sequencing libraries. © 2015 Cold Spring Harbor Laboratory Press.
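
    One common way to turn before-and-after read counts into a per-variant effect is a log enrichment ratio normalized to wild type. The abstract does not state the scoring used, so the sketch below is an assumption: the log2 ratio of frequencies, the pseudocount, the variant names, and the counts are all illustrative.

```python
import math

def enrichment_scores(before, after, wild_type, pseudocount=0.5):
    """log2(frequency after / frequency before) for each variant, relative to wild type."""
    total_before = sum(before.values())
    total_after = sum(after.values())

    def log_ratio(variant):
        f_before = (before.get(variant, 0) + pseudocount) / total_before
        f_after = (after.get(variant, 0) + pseudocount) / total_after
        return math.log2(f_after / f_before)

    reference = log_ratio(wild_type)
    return {variant: log_ratio(variant) - reference for variant in before}

# Hypothetical read counts for three variants.
before = {"WT": 1000, "A2G": 1000, "L5P": 1000}
after = {"WT": 1000, "A2G": 2000, "L5P": 10}
for variant, score in enrichment_scores(before, after, "WT").items():
    print(variant, round(score, 2))
```

    Positive scores mark variants enriched by the selection relative to wild type and negative scores mark depleted ones; the pseudocount keeps variants that vanish after selection from producing an undefined logarithm.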

  20. Selective impairments of motor sequence learning in multiple sclerosis patients with minimal disability.

    Science.gov (United States)

    Tacchino, Andrea; Bove, Marco; Roccatagliata, Luca; Luigi Mancardi, Giovanni; Uccelli, Antonio; Bonzano, Laura

    2014-10-17

    Patients with Multiple Sclerosis (PwMS) with severe sensorimotor and cognitive deficits show reduced ability in motor sequence learning. Conversely, in PwMS with minimal disability (EDSS≤2), showing only subtle neurological impairments and no particular deficits in everyday life activities, motor sequence learning has been poorly addressed. Here, we investigated whether PwMS with minimal disability already show a specific impairment in motor sequence learning and which component of this process can be first affected in MS. We implemented a serial reaction time task based on thumb-to-finger opposition movements in response to visual stimuli. Each session included 14 blocks of 120 stimuli presented randomly or in ten repetitions of a 12-item sequence. Random (R) and sequence (S) blocks were temporally alternated (R1, R2, S1/S5, R3, S6/S10, R4). Random blocks were designed to evaluate the motor component; sequence blocks, beside the motor component, allowed to discriminate the procedural performance. Twenty-two PwMS and 22 control healthy subjects were asked to perform the task under implicit or explicit instructions (11 subjects for each experimental condition). PwMS with minimal disability improved motor performance in random blocks reducing response time with practice with a trend similar to control subjects, suggesting that short-term learning of simple motor tasks is nearly preserved at this disease stage. Conversely, they found difficulties in sequence-specific learning in implicit and explicit condition, with more pronounced impairment in the implicit condition. These findings could suggest an involvement of different circuits in implicit and explicit sequence learning that could deteriorate at different disease stages. Copyright © 2014 Elsevier B.V. All rights reserved.
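
    The block structure of the session (14 blocks of 120 stimuli in the order R1, R2, S1 to S5, R3, S6 to S10, R4, with sequence blocks made of ten repetitions of a 12-item sequence) can be written down directly. The sketch assumes four possible targets, one per finger opposed to the thumb, and uses a made-up 12-item sequence, since the abstract gives neither.

```python
import random

BLOCK_ORDER = (["R1", "R2"] + [f"S{i}" for i in range(1, 6)] + ["R3"]
               + [f"S{i}" for i in range(6, 11)] + ["R4"])

def build_session(sequence, n_targets=4, block_len=120, seed=0):
    """Return {block name: list of stimulus targets} for one session."""
    rng = random.Random(seed)
    session = {}
    for name in BLOCK_ORDER:
        if name.startswith("S"):
            # Sequence block: the fixed sequence repeated to fill the block.
            session[name] = list(sequence) * (block_len // len(sequence))
        else:
            # Random block: targets drawn independently and uniformly.
            session[name] = [rng.randrange(n_targets) for _ in range(block_len)]
    return session

hypothetical_sequence = [0, 2, 1, 3, 0, 1, 3, 2, 1, 0, 3, 2]  # placeholder, not from the study
session = build_session(hypothetical_sequence)
print(BLOCK_ORDER)
print(session["S1"][:12], session["R1"][:12])
```

    Comparing response times in S blocks against the neighbouring R blocks is what separates sequence-specific learning from general motor improvement in this kind of design.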

  1. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  2. Differentiating Schistosoma haematobium from Related Animal Schistosomes by PCR Amplifying Inter-Repeat Sequences Flanking Newly Selected Repeated Sequences

    OpenAIRE

    Abbasi, Ibrahim; HAMBURGER, JOSEPH; Kariuki, Curtis; Peter L Mungai; Muchiri, Eric M.; King, Charles H.

    2012-01-01

    In schistosomiasis elimination programs, successful discrimination of Schistosoma haematobium from the related animal Schistosoma parasites will be essential for accurate detection of human parasite transmission. Polymerase chain reaction assays employing primers from two newly selected repeated sequences, named Sh73 and Sh77, did not discriminate S. haematobium when amplifying Sh73-77 intra- or inter-repeats. However, amplification between Sh73 and the previously described DraI repeat exhibi...

  3. Evaluation of sequence variation and selection in the bindin locus of the red sea urchin, Strongylocentrotus franciscanus.

    Science.gov (United States)

    Debenham, P; Brzezinski, M A; Foltz, K R

    2000-11-01

    Recent evidence suggests that gamete recognition proteins may be subjected to directed evolutionary pressure that enhances sequence variability. We evaluated whether diversity enhancing selection is operating on a marine invertebrate fertilization protein by examining the intraspecific DNA sequence variation of a 273-base pair region located at the 5' end of the sperm bindin locus in 134 adult red sea urchins (Strongylocentrotus franciscanus). Bindin is a sperm recognition protein that mediates species-specific gamete interactions in sea urchins. The region of the bindin locus examined was found to be polymorphic with 14 alleles. Mean pairwise comparison of the 14 alleles indicates moderate sequence diversity (p-distance = 1.06). No evidence of diversity enhancing selection was found. It was not possible to reject the null hypothesis that the sequence variation observed in S. franciscanus bindin is a result of neutral evolution. Statistical evaluation of expected proportions of replacement and silent nucleotide substitutions, observed versus expected proportions of radical replacement substitutions, and conformance to the McDonald and Kreitman test of neutral evolution all indicate that random mutation followed by genetic drift created the polymorphisms observed in bindin. Observed frequencies were also highly similar to results expected for a neutrally evolving locus, suggesting that the polymorphism observed in the 5' region of S. franciscanus bindin is a result of neutral evolution.

  4. Evaluation and Selection of Best Priority Sequencing Rule in Job Shop Scheduling using Hybrid MCDM Technique

    Science.gov (United States)

    Kiran Kumar, Kalla; Nagaraju, Dega; Gayathri, S.; Narayanan, S.

    2017-05-01

    Priority Sequencing Rules provide the guidance for the order in which the jobs are to be processed at a workstation. The application of different priority rules in job shop scheduling gives different order of scheduling. More experimentation needs to be conducted before a final choice is made to know the best priority sequencing rule. Hence, a comprehensive method of selecting the right choice is essential in managerial decision making perspective. This paper considers seven different priority sequencing rules in job shop scheduling. For evaluation and selection of the best priority sequencing rule, a set of eight criteria are considered. The aim of this work is to demonstrate the methodology of evaluating and selecting the best priority sequencing rule by using hybrid multi criteria decision making technique (MCDM), i.e., analytical hierarchy process (AHP) with technique for order preference by similarity to ideal solution (TOPSIS). The criteria weights are calculated by using AHP whereas the relative closeness values of all priority sequencing rules are computed based on TOPSIS with the help of data acquired from the shop floor of a manufacturing firm. Finally, from the findings of this work, the priority sequencing rules are ranked from most important to least important. The comprehensive methodology presented in this paper is very much essential for the management of a workstation to choose the best priority sequencing rule among the available alternatives for processing the jobs with maximum benefit.
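
    The TOPSIS ranking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the decision matrix, the weights (which the paper derives with AHP) and the criterion names are hypothetical, and the sketch assumes every criterion column has at least one non-zero value and that the alternatives are not all identical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return the relative closeness (0..1) of each alternative (row)."""
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal best and worst values depend on whether larger is better.
    cols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)    # distance to the ideal solution
        d_worst = math.dist(row, worst)  # distance to the anti-ideal solution
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical data: three sequencing rules scored on mean flow time
# (a cost criterion) and jobs finished on time (a benefit criterion).
rules = ["rule A", "rule B", "rule C"]
matrix = [[10.0, 5.0], [20.0, 1.0], [15.0, 3.0]]
weights = [0.6, 0.4]          # assumed; AHP would supply these
benefit = [False, True]
ranking = sorted(zip(topsis(matrix, weights, benefit), rules), reverse=True)
print(ranking)
```

    A rule that is best on every criterion coincides with the ideal solution and scores 1, while a rule that is worst on every criterion scores 0.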

  5. No evidence that mRNAs have lower folding free energies than random sequences with the same dinucleotide distribution

    DEFF Research Database (Denmark)

    Workman, Christopher; Krogh, Anders Stærmose

    1999-01-01

    This work investigates whether mRNA has a lower estimated folding free energy than random sequences. The free energy estimates are calculated by the mfold program for prediction of RNA secondary structures. For a set of 46 mRNAs it is shown that the predicted free energy is not significantly different from random sequences with the same dinucleotide distribution. For random sequences with the same mononucleotide distribution it has previously been shown that the native mRNA sequences have a lower predicted free energy, which indicates a more stable structure than random sequences. However, dinucleotide content is important when assessing the significance of predicted free energy as the physical stability of RNA secondary structure is known to depend on dinucleotide base stacking energies. Even known RNA secondary structures, like tRNAs, can be shown to have predicted free energies...
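
    One simple way to generate random sequences whose dinucleotide content resembles that of a native sequence is to sample from a first-order Markov chain fitted to it. The sketch below is an assumption for illustration, not the procedure used in the paper: it matches dinucleotide frequencies only approximately, whereas an exact dinucleotide-preserving shuffle needs a more involved algorithm. The function names and the example sequence are hypothetical.

```python
import random
from collections import Counter, defaultdict

def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

def markov_random_sequence(seq, rng):
    """Sample a same-length sequence from a first-order Markov chain fitted to seq."""
    successors = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        successors[a].append(b)   # empirical transition table
    out = [seq[0]]
    while len(out) < len(seq):
        options = successors.get(out[-1])
        if not options:
            # Dead end: this symbol never had a successor in seq,
            # so fall back to the overall base composition.
            options = list(seq)
        out.append(rng.choice(options))
    return "".join(out)

native = "AUGGCUACGUAGCUAGCUAACGGAUCC"   # made-up example sequence
rng = random.Random(1)
control = markov_random_sequence(native, rng)
print(dinucleotide_counts(native).most_common(3))
print(dinucleotide_counts(control).most_common(3))
```

    Each control sequence would then be folded with the same program as the native sequence so that the predicted free energies can be compared.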

  6. Event selection with a Random Forest in IceCube

    Energy Technology Data Exchange (ETDEWEB)

    Ruhe, Tim [TU, Dortmund (Germany); Collaboration: IceCube-Collaboration

    2011-07-01

    The Random Forest method is a multivariate algorithm that can be used for classification and regression respectively. The Random Forest implemented in the RapidMiner learning environment has been used for training and validation on data and Monte Carlo simulations of the IceCube neutrino telescope. Latest results are presented.

  7. Whole genome sequencing of peach (Prunus persica L.) for SNP identification and selection.

    Science.gov (United States)

    Ahmad, Riaz; Parfitt, Dan E; Fass, Joseph; Ogundiwin, Ebenezer; Dhingra, Amit; Gradziel, Thomas M; Lin, Dawei; Joshi, Nikhil A; Martinez-Garcia, Pedro J; Crisosto, Carlos H

    2011-11-22

    The application of next generation sequencing technologies and bioinformatic scripts to identify high frequency SNPs distributed throughout the peach genome is described. Three peach genomes were sequenced using Roche 454 and Illumina/Solexa technologies to obtain long contigs for alignment to the draft 'Lovell' peach sequence as well as sufficient depth of coverage for 'in silico' SNP discovery. The sequences were aligned to the 'Lovell' peach genome released April 01, 2010 by the International Peach Genome Initiative (IPGI). 'Dr. Davis', 'F8, 1-42' and 'Georgia Belle' were sequenced to add SNPs segregating in two breeding populations, Pop DF ('Dr. Davis' × 'F8, 1-42') and Pop DG ('Dr. Davis' × 'Georgia Belle'). Roche 454 sequencing produced 980,000 total reads with 236 Mb sequence for 'Dr. Davis' and 735,000 total reads with 172 Mb sequence for 'F8, 1-42'. 84 bp × 84 bp paired end Illumina/Solexa sequences yielded 25.5, 21.4, 25.5 million sequences for 'Dr. Davis', 'F8, 1-42' and 'Georgia Belle', respectively. BWA/SAMtools were used for alignment of raw reads and SNP detection, with custom PERL scripts for SNP filtering. Velvet's Columbus module was used for sequence assembly. Comparison of aligned and overlapping sequences from both Roche 454 and Illumina/Solexa resulted in the selection of 6654 high quality SNPs for 'Dr. Davis' vs. 'F8, 1-42' and 'Georgia Belle', distributed on eight major peach genome scaffolds as defined from the 'Lovell' assembly. The eight scaffolds contained about 215-225 Mb of peach genomic sequences with one SNP/~ 40,000 bases. All sequences from Roche 454 and Illumina/Solexa have been submitted to NCBI for public use in the Short Read Archive database. SNPs have been deposited in the NCBI SNP database.

  8. A sequence-based method to predict the impact of regulatory variants using random forest.

    Science.gov (United States)

    Liu, Qiao; Gan, Mingxin; Jiang, Rui

    2017-03-14

    Most disease-associated variants identified by genome-wide association studies (GWAS) exist in noncoding regions. In spite of the common agreement that such variants may disrupt biological functions of their hosting regulatory elements, it remains a great challenge to characterize the risk of a genetic variant within the implicated genome sequence. Therefore, it is essential to develop an effective computational model that is not only capable of predicting the potential risk of a genetic variant but also valid in interpreting how the function of the genome is affected with the occurrence of the variant. We developed a method named kmerForest that used a random forest classifier with k-mer counts to predict accessible chromatin regions purely based on DNA sequences. We demonstrated that our method outperforms existing methods in distinguishing known accessible chromatin regions from random genomic sequences. Furthermore, the performance of our method can further be improved with the incorporation of sequence conservation features. Based on this model, we assessed importance of the k-mer features by a series of permutation experiments, and we characterized the risk of a single nucleotide polymorphism (SNP) on the function of the genome using the difference between the importance of the k-mer features affected by the occurrence of the SNP. We conducted a series of experiments and showed that our model can well discriminate between pathogenic and normal SNPs. Particularly, our model correctly prioritized SNPs that are proved to be enriched for the binding sites of FOXA1 in breast cancer cell lines from previous studies. We presented a novel method to interpret functional genetic variants purely based on DNA sequences. The proposed k-mer based score offers an effective means of measuring the impact of SNPs on the function of the genome, thus shedding light on the identification of genetic risk factors underlying complex traits and diseases.
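
    The k-mer count features mentioned above can be illustrated with a short sketch. It shows only how a DNA sequence might be turned into a fixed-length count vector and how a single-base change alters that vector; it is not the kmerForest code, the function names are hypothetical, and the classifier and importance-based scoring are omitted.

```python
from itertools import product

def kmer_count_vector(seq, k):
    """Count each of the 4**k possible k-mers (in lexicographic ACGT order) in seq."""
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    vec = [0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:          # skip k-mers containing N or other symbols
            vec[j] += 1
    return vec

def kmer_delta(ref, alt, k):
    """Per-k-mer count change caused by replacing ref with alt."""
    r, a = kmer_count_vector(ref, k), kmer_count_vector(alt, k)
    return [y - x for x, y in zip(r, a)]

ref = "ACGTA"
alt = "ACTTA"                      # hypothetical SNP: G -> T at the third base
delta = kmer_delta(ref, alt, 2)
print([i for i, d in enumerate(delta) if d != 0])
```

    In a model of this kind, the affected k-mers identified here would be the features whose importance determines the score assigned to the variant.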

  9. Random amino acid mutations and protein misfolding lead to Shannon limit in sequence-structure communication.

    Directory of Open Access Journals (Sweden)

    Andreas Martin Lisewski

    2008-09-01

    The transmission of genomic information from coding sequence to protein structure during protein synthesis is subject to stochastic errors. To analyze transmission limits in the presence of spurious errors, Shannon's noisy channel theorem is applied to a communication channel between amino acid sequences and their structures established from a large-scale statistical analysis of protein atomic coordinates. While Shannon's theorem confirms that in close to native conformations information is transmitted with limited error probability, additional random errors in sequence (amino acid substitutions) and in structure (structural defects) trigger a decrease in communication capacity toward a Shannon limit at 0.010 bits per amino acid symbol at which communication breaks down. In several controls, simulated error rates above a critical threshold and models of unfolded structures always produce capacities below this limiting value. Thus an essential biological system can be realistically modeled as a digital communication channel that is (a) sensitive to random errors and (b) restricted by a Shannon error limit. This forms a novel basis for predictions consistent with observed rates of defective ribosomal products during protein synthesis, and with the estimated excess of mutual information in protein contact potentials.

  10. Random amino acid mutations and protein misfolding lead to Shannon limit in sequence-structure communication.

    Science.gov (United States)

    Lisewski, Andreas Martin

    2008-09-01

    The transmission of genomic information from coding sequence to protein structure during protein synthesis is subject to stochastic errors. To analyze transmission limits in the presence of spurious errors, Shannon's noisy channel theorem is applied to a communication channel between amino acid sequences and their structures established from a large-scale statistical analysis of protein atomic coordinates. While Shannon's theorem confirms that in close to native conformations information is transmitted with limited error probability, additional random errors in sequence (amino acid substitutions) and in structure (structural defects) trigger a decrease in communication capacity toward a Shannon limit at 0.010 bits per amino acid symbol at which communication breaks down. In several controls, simulated error rates above a critical threshold and models of unfolded structures always produce capacities below this limiting value. Thus an essential biological system can be realistically modeled as a digital communication channel that is (a) sensitive to random errors and (b) restricted by a Shannon error limit. This forms a novel basis for predictions consistent with observed rates of defective ribosomal products during protein synthesis, and with the estimated excess of mutual information in protein contact potentials.

  11. (abstract) Synthesis of Speaker Facial Movements to Match Selected Speech Sequences

    Science.gov (United States)

    Scott, Kenneth C.

    1994-01-01

    We are developing a system for synthesizing image sequences that simulate the facial motion of a speaker. To perform this synthesis, we are pursuing two major areas of effort. We are developing the necessary computer graphics technology to synthesize a realistic image sequence of a person speaking selected speech sequences. Next, we are developing a model that expresses the relation between spoken phonemes and face/mouth shape. A subject is video taped speaking an arbitrary text that contains expression of the full list of desired database phonemes. The subject is video taped from the front speaking normally, recording both audio and video detail simultaneously. Using the audio track, we identify the specific video frames on the tape relating to each spoken phoneme. From this range we digitize the video frame which represents the extreme of mouth motion/shape. Thus, we construct a database of images of face/mouth shape related to spoken phonemes. A selected audio speech sequence is recorded which is the basis for synthesizing a matching video sequence; the speaker need not be the same as used for constructing the database. The audio sequence is analyzed to determine the spoken phoneme sequence and the relative timing of the enunciation of those phonemes. Synthesizing an image sequence corresponding to the spoken phoneme sequence is accomplished using a graphics technique known as morphing. Image sequence keyframes necessary for this processing are based on the spoken phoneme sequence and timing. We have been successful in synthesizing the facial motion of a native English speaker for a small set of arbitrary speech segments. Our future work will focus on advancement of the face shape/phoneme model and independent control of facial features.

  12. Study on MAX-MIN Ant System with Random Selection in Quadratic Assignment Problem

    Science.gov (United States)

    Iimura, Ichiro; Yoshida, Kenji; Ishibashi, Ken; Nakayama, Shigeru

    Ant Colony Optimization (ACO), which is a type of swarm intelligence inspired by ants' foraging behavior, has been studied extensively and its effectiveness has been shown by many researchers. Previous studies have reported that MAX-MIN Ant System (MMAS) is one of the effective ACO algorithms. The MMAS maintains the balance of intensification and diversification concerning pheromone by limiting the quantity of pheromone to the range of minimum and maximum values. In this paper, we propose MAX-MIN Ant System with Random Selection (MMASRS) for improving the search performance even further. The MMASRS is a new ACO algorithm that introduces random selection into MMAS. The random selection is one of the edge-choosing methods by agents (ants). In our experimental evaluation using ten quadratic assignment problems, we have proved that the proposed MMASRS with the random selection is superior to the conventional MMAS without the random selection in terms of search performance.
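
    The two ingredients named above, pheromone limits and random selection, can be sketched for an assignment problem as follows. The abstract does not give the exact selection rule, so the sketch assumes a fixed probability `random_rate` of choosing uniformly at random and roulette-wheel selection otherwise; all names and parameter values are hypothetical.

```python
import random

def choose_location(pheromone_row, free_locations, random_rate, rng):
    """Pick a location for one facility."""
    if rng.random() < random_rate:
        # Random selection: ignore the pheromone and pick uniformly.
        return rng.choice(free_locations)
    # Otherwise pick with probability proportional to the pheromone.
    weights = [pheromone_row[loc] for loc in free_locations]
    return rng.choices(free_locations, weights=weights, k=1)[0]

def build_assignment(pheromone, random_rate, rng):
    """Assign every facility (row) to a distinct location (column)."""
    free = list(range(len(pheromone)))
    assignment = []
    for row in pheromone:
        loc = choose_location(row, free, random_rate, rng)
        assignment.append(loc)
        free.remove(loc)
    return assignment

def update_pheromone(pheromone, best_assignment, deposit, rho, tau_min, tau_max):
    """Evaporate, reinforce the best assignment, then clamp to [tau_min, tau_max]."""
    for i, row in enumerate(pheromone):
        for j in range(len(row)):
            value = (1.0 - rho) * row[j]
            if best_assignment[i] == j:
                value += deposit
            row[j] = min(tau_max, max(tau_min, value))

rng = random.Random(0)
pheromone = [[1.0] * 4 for _ in range(4)]
ant = build_assignment(pheromone, random_rate=0.1, rng=rng)
update_pheromone(pheromone, ant, deposit=5.0, rho=0.5, tau_min=0.2, tau_max=2.0)
print(ant, pheromone[0])
```

    The clamp in `update_pheromone` keeps every choice possible, and the random-selection branch adds a further source of diversification on top of it.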

  13. Cardiorespiratory Kinetics Determined by Pseudo-Random Binary Sequences - Comparisons between Walking and Cycling.

    Science.gov (United States)

    Koschate, J; Drescher, U; Thieschäfer, L; Heine, O; Baum, K; Hoffmann, U

    2016-12-01

    This study aims to compare cardiorespiratory kinetics as a response to a standardised work rate protocol with pseudo-random binary sequences between cycling and walking in young healthy subjects. Muscular and pulmonary oxygen uptake (V̇O2) kinetics as well as heart rate kinetics were expected to be similar for walking and cycling. Cardiac data and V̇O2 of 23 healthy young subjects were measured in response to pseudo-random binary sequences. Kinetics were assessed applying time series analysis. Higher maxima of cross-correlation functions between work rate and the respective parameter indicate faster kinetics responses. Muscular V̇O2 kinetics were estimated from heart rate and pulmonary V̇O2 using a circulatory model. Muscular (walking vs. cycling [mean±SD in arbitrary units]: 0.40±0.08 vs. 0.41±0.08) and pulmonary V̇O2 kinetics (0.35±0.06 vs. 0.35±0.06) were not different, although the time courses of the cross-correlation functions of pulmonary V̇O2 showed unexpected biphasic responses. Heart rate kinetics (0.50±0.14 vs. 0.40±0.14; P=0.017) was faster for walking. Regarding the biphasic cross-correlation functions of pulmonary V̇O2 during walking, the assessment of muscular V̇O2 kinetics via pseudo-random binary sequences requires a circulatory model to account for cardio-dynamic distortions. Faster heart rate kinetics for walking should be considered by comparing results from cycle and treadmill ergometry. © Georg Thieme Verlag KG Stuttgart · New York.

  14. Interference Suppression Performance of Automotive UWB Radars Using Pseudo Random Sequences

    Directory of Open Access Journals (Sweden)

    I. Pasya

    2015-12-01

    Ultra wideband (UWB) automotive radars have attracted attention from the viewpoint of reducing traffic accidents. The performance of automotive radars may be degraded by interference from nearby radars using the same frequency. In this study, a scenario where two cars pass each other on a road was considered. Considering the utilization of cross-polarization, the desired-to-undesired signal power ratio (DUR) was found to vary approximately from -10 to 30 dB. Different pseudo random sequences were employed for spectrum spreading the different radar signals to mitigate the interference effects. This paper evaluates the interference suppression provided by maximum length sequence (MLS) and Gold sequence (GS) through numerical simulations of the radar’s performance in terms of probability of false alarm and probability of detection. It was found that MLS and GS yielded nearly the same performance when the DUR is -10 dB (worst case); for example, when fixing the probability of false alarm to 0.0001, the probabilities of detection were 0.964 and 0.946, respectively. GS is more advantageous than MLS because more distinct sequences of the same length are available in GS than in MLS.
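
    A maximum length sequence of the kind used for spectrum spreading can be generated with a linear feedback shift register. The sketch below is illustrative only: the register length of 5 bits and the tap positions are assumptions, not the parameters of the paper, and the radar simulation itself is not reproduced.

```python
def mls(n_bits=5, taps=(5, 3)):
    """Generate one period (2**n_bits - 1 bits) of a maximum length sequence.

    The taps select which delayed bits are XORed to form the next bit; the
    default (5, 3) is assumed here and corresponds to a primitive polynomial
    of degree 5.
    """
    state = [1] * n_bits               # any non-zero start state works
    out = []
    for _ in range(2 ** n_bits - 1):
        out.append(state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]  # shift the register by one position
    return out

def periodic_autocorrelation(bits, shift):
    """Periodic autocorrelation of the sequence mapped to +1/-1 chips."""
    n = len(bits)
    chips = [1 - 2 * b for b in bits]
    return sum(chips[i] * chips[(i + shift) % n] for i in range(n))

sequence = mls()
print(len(sequence), sum(sequence))
print([periodic_autocorrelation(sequence, s) for s in range(4)])
```

    The sharp autocorrelation peak at zero shift, with a small constant value at every other shift, is what allows a receiver to separate its own echo from signals spread with other codes.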

  15. Partial summations of stationary sequences of non-Gaussian random variables

    DEFF Research Database (Denmark)

    Mohr, Gunnar; Ditlevsen, Ove Dalager

    1996-01-01

    The distribution of the sum of a finite number of identically distributed random variables is in many cases easily determined given that the variables are independent. The moments of any order of the sum can always be expressed by the moments of the single term without computational problems...... of convergence of the distribution of a sum (or an integral) of mutually dependent random variables to the Gaussian distribution. The paper is closely related to the work in Ditlevsen el al. [Ditlevsen, O., Mohr, G. & Hoffmeyer, P. Integration of non-Gaussian fields. Prob. Engng Mech 11 (1996) 15-23](2)........ However, in the case of dependency between the terms even calculation of a few of the first moments of the sum presents serious computational problems. By use of computerized symbol manipulations it is practicable to obtain exact moments of partial sums of stationary sequences of mutually dependent...

  16. Chunking movements into sequence: the visual pre-selection of subsequent goals.

    Science.gov (United States)

    Baldauf, Daniel

    2011-04-01

    The chunking of individual movements into sequences has been studied extensively from a motor point of view. Here we approach the planning of sequential behavior from a perceptual perspective investigating the sensorimotor transformations that accompany visually guided sequential behavior. We show that visual attention pre-selects subsequent goals only if two movements are planned to be carried out in rapid succession and therefore are integrated into one common action. This causes visual attention to select both intended goal locations in advance. In contrast, in more slowly executed motor sequences, the single movements are programmed one-by-one and subsequent movement goals are only later visually prepared ('just in time'). The visual selection of a subsequent goal location crucially depends on the speed of the planned sequence: the longer the inter-reach delay, the less visual attention is deployed to the subsequent goal initially. Copyright © 2011 Elsevier Ltd. All rights reserved.

  17. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  18. The Jackprot Simulation Couples Mutation Rate with Natural Selection to Illustrate How Protein Evolution Is Not Random

    Science.gov (United States)

    Espinosa, Avelina; Bai, Chunyan Y.

    2016-01-01

    Protein evolution is not a random process. Views which attribute randomness to molecular change, deleterious nature to single-gene mutations, insufficient geological time, or population size for molecular improvements to occur, or invoke “design creationism” to account for complexity in molecular structures and biological processes, are unfounded. Scientific evidence suggests that natural selection tinkers with molecular improvements by retaining adaptive peptide sequence. We used slot-machine probabilities and ion channels to show biological directionality on molecular change. Because ion channels reside in the lipid bilayer of cell membranes, their residue location must be in balance with the membrane's hydrophobic/philic nature; a selective “pore” for ion passage is located within the hydrophobic region. We contrasted the random generation of DNA sequence for KcsA, a bacterial two-transmembrane-domain (2TM) potassium channel, from Streptomyces lividans, with an under-selection scenario, the “jackprot,” which predicted much faster evolution than by chance. We wrote a computer program in JAVA APPLET version 1.0 and designed an online interface, The Jackprot Simulation http://faculty.rwu.edu/cbai/JackprotSimulation.htm, to model a numerical interaction between mutation rate and natural selection during a scenario of polypeptide evolution. Winning the “jackprot,” or highest-fitness complete-peptide sequence, required cumulative smaller “wins” (rewarded by selection) at the first, second, and third positions in each of the 161 KcsA codons (“jackdons” that led to “jackacids” that led to the “jackprot”). The “jackprot” is a didactic tool to demonstrate how mutation rate coupled with natural selection suffices to explain the evolution of specialized proteins, such as the complex six-transmembrane (6TM) domain potassium, sodium, or calcium channels. Ancestral DNA sequences coding for 2TM-like proteins underwent nucleotide

  19. The Jackprot Simulation Couples Mutation Rate with Natural Selection to Illustrate How Protein Evolution Is Not Random.

    Science.gov (United States)

    Paz-Y-Miño C, Guillermo; Espinosa, Avelina; Bai, Chunyan Y

    2011-09-01

    Protein evolution is not a random process. Views which attribute randomness to molecular change, deleterious nature to single-gene mutations, insufficient geological time, or population size for molecular improvements to occur, or invoke "design creationism" to account for complexity in molecular structures and biological processes, are unfounded. Scientific evidence suggests that natural selection tinkers with molecular improvements by retaining adaptive peptide sequence. We used slot-machine probabilities and ion channels to show biological directionality on molecular change. Because ion channels reside in the lipid bilayer of cell membranes, their residue location must be in balance with the membrane's hydrophobic/philic nature; a selective "pore" for ion passage is located within the hydrophobic region. We contrasted the random generation of DNA sequence for KcsA, a bacterial two-transmembrane-domain (2TM) potassium channel, from Streptomyces lividans, with an under-selection scenario, the "jackprot," which predicted much faster evolution than by chance. We wrote a computer program in JAVA APPLET version 1.0 and designed an online interface, The Jackprot Simulation http://faculty.rwu.edu/cbai/JackprotSimulation.htm, to model a numerical interaction between mutation rate and natural selection during a scenario of polypeptide evolution. Winning the "jackprot," or highest-fitness complete-peptide sequence, required cumulative smaller "wins" (rewarded by selection) at the first, second, and third positions in each of the 161 KcsA codons ("jackdons" that led to "jackacids" that led to the "jackprot"). The "jackprot" is a didactic tool to demonstrate how mutation rate coupled with natural selection suffices to explain the evolution of specialized proteins, such as the complex six-transmembrane (6TM) domain potassium, sodium, or calcium channels. Ancestral DNA sequences coding for 2TM-like proteins underwent nucleotide "edition" and gene duplications to generate the 6
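
    The contrast between pure chance and cumulative selection can be illustrated with a toy simulation. This is not the authors' Java program: it assumes a made-up six-nucleotide target, equal base probabilities, and a selection step that keeps every correct position once it has been hit.

```python
import random

def tries_with_selection(target, alphabet, rng):
    """Count random draws needed when each correct position is retained."""
    tries = 0
    for wanted in target:
        while True:
            tries += 1
            if rng.choice(alphabet) == wanted:
                break              # selection keeps this position and moves on
    return tries

def expected_tries_without_selection(target_length, alphabet_size):
    """Expected draws when the whole sequence must appear in a single draw."""
    return alphabet_size ** target_length

rng = random.Random(42)
target = "ATGGCC"                  # hypothetical short target
print(tries_with_selection(target, "ACGT", rng))
print(expected_tries_without_selection(len(target), 4))
```

    With selection the expected number of draws grows linearly with sequence length (about four per nucleotide here), whereas without selection it grows exponentially, which is the point the slot-machine analogy makes for the 161 KcsA codons.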

  20. Application of Ammonium Persulfate for Selective Oxidation of Guanines for Nucleic Acid Sequencing

    Directory of Open Access Journals (Sweden)

    Yafen Wang

    2017-07-01

    Nucleic acids can be sequenced by a chemical procedure that partially damages the nucleotide positions at their base repetition. Many methods have been reported for the selective recognition of guanine. The accurate identification of guanine in both single and double regions of DNA and RNA remains a challenging task. Herein, we present a new, non-toxic and simple method for the selective recognition of guanine in both DNA and RNA sequences via ammonium persulfate modification. This strategy can be further successfully applied to the detection of 5-methylcytosine by using PCR.

  1. A Comparison of mRNA Sequencing with Random Primed and 3'-Directed Libraries.

    Science.gov (United States)

    Xiong, Yuguang; Soumillon, Magali; Wu, Jie; Hansen, Jens; Hu, Bin; van Hasselt, Johan G C; Jayaraman, Gomathi; Lim, Ryan; Bouhaddou, Mehdi; Ornelas, Loren; Bochicchio, Jim; Lenaeus, Lindsay; Stocksdale, Jennifer; Shim, Jaehee; Gomez, Emilda; Sareen, Dhruv; Svendsen, Clive; Thompson, Leslie M; Mahajan, Milind; Iyengar, Ravi; Sobie, Eric A; Azeloglu, Evren U; Birtwistle, Marc R

    2017-11-07

    Creating a cDNA library for deep mRNA sequencing (mRNAseq) is generally done by random priming, creating multiple sequencing fragments along each transcript. A 3'-end-focused library approach cannot detect differential splicing, but has potentially higher throughput at a lower cost, along with the ability to improve quantification by using transcript molecule counting with unique molecular identifiers (UMI) that correct PCR bias. Here, we compare an implementation of such a 3'-digital gene expression (3'-DGE) approach with "conventional" random primed mRNAseq. Given our particular datasets on cultured human cardiomyocyte cell lines, we find that, while conventional mRNAseq detects ~15% more genes and needs ~500,000 fewer reads per sample for equivalent statistical power, the resulting differentially expressed genes, biological conclusions, and gene signatures are highly concordant between two techniques. We also find good quantitative agreement at the level of individual genes between two techniques for both read counts and fold changes between given conditions. We conclude that, for high-throughput applications, the potential cost savings associated with 3'-DGE approach are likely a reasonable tradeoff for modest reduction in sensitivity and inability to observe alternative splicing, and should enable many larger scale studies focusing on not only differential expression analysis, but also quantitative transcriptome profiling.
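
    The transcript molecule counting with unique molecular identifiers mentioned above can be sketched in a few lines. The sketch assumes reads have already been assigned to genes and that UMIs are compared exactly, with no error correction; the read data and names are hypothetical.

```python
from collections import defaultdict

def umi_counts(reads):
    """reads: iterable of (gene, umi) pairs; returns {gene: distinct UMI count}."""
    seen = defaultdict(set)
    for gene, umi in reads:
        seen[gene].add(umi)        # PCR duplicates share a UMI and collapse here
    return {gene: len(umis) for gene, umis in seen.items()}

reads = [
    ("GENE_A", "AACG"),
    ("GENE_A", "AACG"),            # duplicate of the first read
    ("GENE_A", "TTGC"),
    ("GENE_B", "AACG"),
]
print(umi_counts(reads))
```

    Counting distinct UMIs rather than reads is what removes the amplification bias, since every copy of one original molecule carries the same identifier.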

  2. RSARF: Prediction of residue solvent accessibility from protein sequence using random forest method

    KAUST Repository

    Ganesan, Pugalenthi

    2012-01-01

    Prediction of protein structure from its amino acid sequence is still a challenging problem. A complete physicochemical understanding of protein folding is essential for accurate structure prediction. Knowledge of residue solvent accessibility gives useful insights into protein structure prediction and function prediction. In this work, we propose a random forest method, RSARF, to predict residue accessible surface area from protein sequence information. Training and testing were performed using 120 proteins containing 22006 residues. For each residue, buried and exposed state was computed using five thresholds (0%, 5%, 10%, 25%, and 50%). The prediction accuracies for the 0%, 5%, 10%, 25%, and 50% thresholds are 72.9%, 78.25%, 78.12%, 77.57% and 72.07%, respectively. Further, comparison of RSARF with other methods using a benchmark dataset containing 20 proteins shows that our approach is useful for prediction of residue solvent accessibility from protein sequence without using structural information. The RSARF program, datasets and supplementary data are available at http://caps.ncbs.res.in/download/pugal/RSARF/.
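
    The two-state labelling at five thresholds can be illustrated with a short sketch. The abstract does not state the exact rule, so the code assumes a residue is buried when its relative solvent accessibility is at or below the threshold and exposed otherwise. The function names are hypothetical and this is not the RSARF program itself.

```python
THRESHOLDS = (0.0, 5.0, 10.0, 25.0, 50.0)  # percent, as listed in the abstract

def label_states(rsa_percent, thresholds=THRESHOLDS):
    """Return {threshold: 'B' or 'E'} for one residue.

    Assumption: a residue is buried ('B') when its relative solvent
    accessibility is <= the threshold, and exposed ('E') otherwise.
    """
    return {t: ("B" if rsa_percent <= t else "E") for t in thresholds}

def accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# A residue with 7.5% relative accessibility (made-up value).
example = label_states(7.5)
print(example)

# Two-state accuracy for one threshold on a made-up four-residue stretch.
print(accuracy("BEEB", "BEBB"))
```

    A per-threshold accuracy like the ones quoted in the abstract would be this fraction computed over all test residues for one threshold at a time.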

  3. Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat

    NARCIS (Netherlands)

    Atanur, S.S.; Diaz, A.G.; Maratou, K.; Sarkis, A.; Rotival, M.; Game, L.; Tschannen, M.R.; Kaisaki, P.J.; Otto, G.W.; Ma, M.C.; Keane, T.M.; Hummel, O.; Saar, K.; Chen, W.; Guryev, V.; Gopalakrishnan, K.; Garrett, M.R.; Joe, B.; Citterio, L.; Bianchi, G.; McBride, M.; Dominiczak, A.; Adams, D.J.; Serikawa, T.; Flicek, P.; Cuppen, E.; Hubner, N.; Petretto, E.; Gauguier, D.; Kwitek, A.; Jacob, H.; Aitman, T.J.

    2013-01-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and

  4. Genome Sequencing Reveals Loci under Artificial Selection that Underlie Disease Phenotypes in the Laboratory Rat

    NARCIS (Netherlands)

    Atanur, Santosh S.; Diaz, Ana Garcia; Maratou, Klio; Sarkis, Allison; Rotival, Maxime; Game, Laurence; Tschannen, Michael R.; Kaisaki, Pamela J.; Otto, Georg W.; Ma, Man Chun John; Keane, Thomas M.; Hummel, Oliver; Saar, Kathrin; Chen, Wei; Guryev, Victor; Gopalakrishnan, Kathirvel; Garrett, Michael R.; Joe, Bina; Citterio, Lorena; Bianchi, Giuseppe; McBride, Martin; Dominiczak, Anna; Adams, David J.; Serikawa, Tadao; Flicek, Paul; Cuppen, Edwin; Hubner, Norbert; Petretto, Enrico; Gauguier, Dominique; Kwitek, Anne; Jacob, Howard; Aitman, Timothy J.

    2013-01-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and

  5. Genetic selection and DNA sequences of 4.5S RNA homologs

    DEFF Research Database (Denmark)

    Brown, S; Thon, G; Tolentino, E

    1989-01-01

    A general strategy for cloning the functional homologs of an Escherichia coli gene was used to clone homologs of 4.5S RNA from other bacteria. The genes encoding these homologs were selected by their ability to complement a deletion of the gene for 4.5S RNA. DNA sequences of the regions encoding...

  6. Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.

    Science.gov (United States)

    Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun

    2018-03-01

    Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. The existence of additional R-ωATs having different sequence characteristics from previous ones is highly expected. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. Based on these backgrounds, sequences in RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Sequences with two profile analysis scores were plotted on two-dimensional score space. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.
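
    The candidate selection step, keeping sequences whose BCAT and DAT profile scores are similar, might look like the sketch below. The abstract gives no numeric similarity criterion, so the max_gap parameter, the function name, and the scores are hypothetical.

```python
def select_candidates(scores, max_gap):
    """Keep sequences whose two profile scores differ by at most `max_gap`.

    `scores` maps a sequence ID to a (bcat_score, dat_score) pair.
    `max_gap` is a hypothetical cutoff, not one taken from the paper.
    """
    selected = []
    for seq_id, (bcat, dat) in scores.items():
        if abs(bcat - dat) <= max_gap:   # near the diagonal of the 2D score space
            selected.append(seq_id)
    return sorted(selected)

# Made-up profile scores for four sequences.
toy_scores = {
    "seq_1": (120.0, 35.0),   # BCAT-like
    "seq_2": (80.0, 76.0),    # similar scores: candidate
    "seq_3": (40.0, 110.0),   # DAT-like
    "seq_4": (64.0, 70.0),    # similar scores: candidate
}
picked = select_candidates(toy_scores, max_gap=10.0)
print(picked)
```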

  7. Novel random peptide libraries displayed on AAV serotype 9 for selection of endothelial cell-directed gene transfer vectors.

    Science.gov (United States)

    Varadi, K; Michelfelder, S; Korff, T; Hecker, M; Trepel, M; Katus, H A; Kleinschmidt, J A; Müller, O J

    2012-08-01

    We have demonstrated the potential of random peptide libraries displayed on adeno-associated virus (AAV)2 to select for AAV2 vectors with improved efficiency for cell type-directed gene transfer. AAV9, however, may have advantages over AAV2 because of a lower prevalence of neutralizing antibodies in humans and more efficient gene transfer in vivo. Here we provide evidence that random peptide libraries can be displayed on AAV9 and can be utilized to select for AAV9 capsids redirected to the cell type of interest. We generated an AAV9 peptide display library, which ensures that the displayed peptides correspond to the packaged genomes and performed four consecutive selection rounds on human coronary artery endothelial cells in vitro. This screening yielded AAV9 library capsids with distinct peptide motifs enabling up to 40-fold improved transduction efficiencies compared with wild-type (wt) AAV9 vectors. Incorporating sequences selected from AAV9 libraries into AAV2 capsids could not increase transduction as efficiently as in the AAV9 context. To analyze the potential on endothelial cells in the intact natural vascular context, human umbilical veins were incubated with the selected AAV in situ and endothelial cells were isolated. Fluorescence-activated cell sorting analysis revealed a 200-fold improved transduction efficiency compared with wt AAV9 vectors. Furthermore, AAV9 vectors with targeting sequences selected from AAV9 libraries revealed an increased transduction efficiency in the presence of human intravenous immunoglobulins, suggesting a reduced immunogenicity. We conclude that our novel AAV9 peptide library is functional and can be used to select for vectors for future preclinical and clinical gene transfer applications.

  8. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Energy Technology Data Exchange (ETDEWEB)

    Tuskan, Gerald A [ORNL; Gunter, Lee E [ORNL; DiFazio, Stephen P [West Virginia University

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis -type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.

  9. Extensive purifying selection acting on synonymous sites in HIV-1 Group M sequences

    Directory of Open Access Journals (Sweden)

    Martin Darren

    2008-12-01

    Background: Positive selection pressure acting on protein-coding sequences is usually inferred when the rate of nonsynonymous substitution is greater than the synonymous rate. However, purifying selection acting directly on the nucleotide sequence can lower the synonymous substitution rate. This could result in false inference of positive selection because when synonymous changes at some sites are under purifying selection, the average synonymous rate is an underestimate of the neutral rate of evolution. Even though HIV-1 coding sequences contain a number of regions that function at the nucleotide level, and are thus likely to be affected by purifying selection, studies of positive selection assume that synonymous substitutions can be used to estimate the neutral rate of evolution. Results: We modelled site-to-site variation in the synonymous substitution rate across coding regions of the HIV-1 genome. Synonymous substitution rates were found to vary significantly within and between genes. Surprisingly, regions of the genome that encode proteins in more than one frame had significantly higher synonymous substitution rates than regions coding in a single frame. We found evidence of strong purifying selection pressure affecting synonymous mutations in fourteen regions with known functions. These included an exonic splicing enhancer, the rev-responsive element, the poly-purine tract and a transcription factor binding site. A further five highly conserved regions were located within known functional domains. We also found four conserved regions located in env and vpu which have not been characterized previously. Conclusion: We provide the coordinates of genomic regions with markedly lower synonymous substitution rates, which are putatively under the influence of strong purifying selection pressure at the nucleotide level as well as regions encoding proteins in more than one frame. These regions should be excluded from studies of positive
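
    The false-inference argument in the Background can be made concrete with a small numeric sketch. All rates are made-up illustrative values, not estimates from the study, and the function name omega is an assumption.

```python
def omega(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates (dN/dS)."""
    if ds <= 0:
        raise ValueError("synonymous rate must be positive")
    return dn / ds

# Hypothetical rates (substitutions per site), chosen only for illustration.
dn = 0.010              # nonsynonymous rate in some region
neutral_ds = 0.012      # synonymous rate if synonymous changes were truly neutral
constrained_ds = 0.008  # synonymous rate lowered by purifying selection on the RNA

true_ratio = omega(dn, neutral_ds)          # below 1: no positive selection
apparent_ratio = omega(dn, constrained_ds)  # above 1: looks like positive selection

print(round(true_ratio, 3), round(apparent_ratio, 3))
```

    With the same nonsynonymous rate, the ratio crosses 1 only because the denominator underestimates the neutral rate, which is the artefact the abstract warns about.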

  10. In vivo selection of randomly mutated retroviral genomes

    NARCIS (Netherlands)

    Berkhout, B.; Klaver, B.

    1993-01-01

    Darwinian evolution, that is the outgrowth of the fittest variants in a population, usually applies to living organisms over long periods of time. Recently, in vitro selection/amplification techniques have been developed that allow for the rapid evolution of functionally active nucleic acids from a

  11. Positive Selection or Free to Vary? Assessing the Functional Significance of Sequence Change Using Molecular Dynamics.

    Directory of Open Access Journals (Sweden)

    Jane R Allison

    Evolutionary arms races between pathogens and their hosts may be manifested as selection for rapid evolutionary change of key genes, and are sometimes detectable through sequence-level analyses. In the case of protein-coding genes, such analyses frequently predict that specific codons are under positive selection. However, detecting positive selection can be non-trivial, and false positive predictions are a common concern in such analyses. It is therefore helpful to place such predictions within a structural and functional context. Here, we focus on the p19 protein from tombusviruses. P19 is a homodimer that sequesters siRNAs, thereby preventing the host RNAi machinery from shutting down viral infection. Sequence analysis of the p19 gene is complicated by the fact that it is constrained at the sequence level by overprinting of a viral movement protein gene. Using homology modeling, in silico mutation and molecular dynamics simulations, we assess how non-synonymous changes to two residues involved in forming the dimer interface (one invariant, and one predicted to be under positive selection) impact molecular function. Interestingly, we find that both observed variation and potential variation (where a non-synonymous change to p19 would be synonymous for the overprinted movement protein) do not significantly impact protein structure or RNA binding. Consequently, while several methods identify residues at the dimer interface as being under positive selection, MD results suggest they are functionally indistinguishable from a site that is free to vary. Our analyses serve as a caveat to using sequence-level analyses in isolation to detect and assess positive selection, and emphasize the importance of also accounting for how non-synonymous changes impact structure and function.

  12. G-STRATEGY: Optimal Selection of Individuals for Sequencing in Genetic Association Studies

    Science.gov (United States)

    Wang, Miaoyan; Jakobsdottir, Johanna; Smith, Albert V.; McPeek, Mary Sara

    2017-01-01

    In a large-scale genetic association study, the number of phenotyped individuals available for sequencing may, in some cases, be greater than the study’s sequencing budget will allow. In that case, it can be important to prioritize individuals for sequencing in a way that optimizes power for association with the trait. Suppose a cohort of phenotyped individuals is available, with some subset of them possibly already sequenced, and one wants to choose an additional fixed-size subset of individuals to sequence in such a way that the power to detect association is maximized. When the phenotyped sample includes related individuals, power for association can be gained by including partial information, such as phenotype data of ungenotyped relatives, in the analysis, and this should be taken into account when assessing whom to sequence. We propose G-STRATEGY, which uses simulated annealing to choose a subset of individuals for sequencing that maximizes the expected power for association. In simulations, G-STRATEGY performs extremely well for a range of complex disease models and outperforms other strategies with, in many cases, relative power increases of 20–40% over the next best strategy, while maintaining correct type 1 error. G-STRATEGY is computationally feasible even for large datasets and complex pedigrees. We apply G-STRATEGY to data on HDL and LDL from the AGES-Reykjavik and REFINE-Reykjavik studies, in which G-STRATEGY is able to closely approximate the power of sequencing the full sample by selecting for sequencing only a small subset of the individuals. PMID:27256766
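
    A generic simulated annealing search over fixed-size subsets can be sketched as below. This is not the G-STRATEGY implementation: the swap move, the geometric cooling schedule, the parameter defaults, and the toy objective (a stand-in for expected power) are all assumptions made for illustration.

```python
import math
import random

def anneal_subset(items, k, score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a size-k subset of `items` with a high `score` value.

    Generic simulated annealing with single-swap moves; every default here
    is an illustrative assumption. Returns (best_subset, best_score).
    """
    if not 0 < k < len(items):
        raise ValueError("k must satisfy 0 < k < len(items)")
    rng = random.Random(seed)
    current = set(rng.sample(items, k))
    current_score = score(current)
    best, best_score = set(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose swapping one selected item for one unselected item.
        out_item = rng.choice(sorted(current))
        in_item = rng.choice(sorted(set(items) - current))
        candidate = (current - {out_item}) | {in_item}
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score
    return best, best_score

# Toy objective: each hypothetical individual carries a made-up "information" value.
people = list(range(20))
info = {p: (p * 7) % 11 for p in people}

def toy_score(subset):
    return sum(info[p] for p in subset)

best, best_score = anneal_subset(people, k=5, score=toy_score)
print(sorted(best), best_score)
```

    In a real application the objective would depend on the whole subset jointly (for example through relatedness between individuals), which is why a stochastic search is used rather than simply ranking individuals.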

  13. Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology

    Science.gov (United States)

    Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological datasets there is limited guidance on variable selection methods for RF modeling. Typically, e...

  14. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    Directory of Open Access Journals (Sweden)

    Wei Li

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

  15. [Selection and optimal sequence of critical elements for medication review: A simulation with hospital pharmacy residents].

    Science.gov (United States)

    Dubois, S; Barbier, A; Thibault, M; Atkinson, S; Bussières, J-F

    2017-03-01

    The main objective of this study was to compare the responses of pharmacy residents regarding critical steps for medication order review, in the presence or absence of clinical pharmacists on patient care units, to describe the sequence of these steps and to compare them to an optimal sequence. The secondary objectives were to test this sequence in a simulation and to assess the residents' level of agreement on medication order review. Twenty-two validation steps were selected from guidelines. A simulation on order review was organized in three steps: selecting elements judged to be necessary or not for the order review critical path, then organizing this sequence in chronological order, implementation of this critical path on two simulated practical cases, resident perceptions about order review in their training. Forty-one residents participated in the activity. Responses were heterogeneous regarding the elements' sequence and the time required for the review of a simulated case (3-13 minutes). A majority of residents considered that their training was insufficient (29/41), that pharmacists validated differently (27/41), and that it was impossible to review the 22 proposed items for each prescription (30/41). This article highlights heterogeneous medication order review practices among pharmacy residents, which they attribute to a lack of training in their curriculum. It is essential to acquire medication order review standards both locally and nationally. Copyright © 2016 Académie Nationale de Pharmacie. Published by Elsevier Masson SAS. All rights reserved.

  16. Selection is not required to produce invariant T-cell receptor gamma-gene junctional sequences.

    Science.gov (United States)

    Asarnow, D M; Cado, D; Raulet, D H

    1993-03-11

    Recombination of V-, D- and J-gene segments can generate an enormous diversity of T-cell antigen receptor (TCR) gene sequences. Although many gamma delta T cells fully exploit this diversification process, those in the epidermal and vaginal epithelium do not, predominantly expressing invariant gamma delta receptors in which the V-(D)-J junctional sequences in almost all the productive rearrangements are identical. The almost exclusive use of identical TCRs by cells in these sites is thought to reflect recognition of a stress-induced autologous antigen. To explain the prevalence of the invariant junctional sequences, it has been proposed that thymic selection operates on a population of originally diverse progenitor cells, resulting in a homogeneous repertoire. Alternatively the invariant sequences may result from biases in the recombination machinery in the fetal thymic progenitors of these cells. We report here the use of mice into which mutated TCR gamma-gene rearrangement substrates have been introduced as transgenes to demonstrate directly that the canonical TCR V gamma 3-J gamma 1 and V gamma 4-J gamma 1 sequences occur at high frequency in the absence of the possibility of selection for the protein products.

  17. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data

    Directory of Open Access Journals (Sweden)

    Sheng Yang

    2015-01-01

    Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genomics Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes.

  18. A Systematic Evaluation of Feature Selection and Classification Algorithms Using Simulated and Real miRNA Sequencing Data.

    Science.gov (United States)

    Yang, Sheng; Guo, Li; Shao, Fang; Zhao, Yang; Chen, Feng

    2015-01-01

    Sequencing is widely used to discover associations between microRNAs (miRNAs) and diseases. However, the negative binomial distribution (NB) and high dimensionality of data obtained using sequencing can lead to low-power results and low reproducibility. Several statistical learning algorithms have been proposed to address sequencing data, and although evaluation of these methods is essential, such studies are relatively rare. The performance of seven feature selection (FS) algorithms, including baySeq, DESeq, edgeR, the rank sum test, lasso, particle swarm optimistic decision tree, and random forest (RF), was compared by simulation under different conditions based on the difference of the mean, the dispersion parameter of the NB, and the signal to noise ratio. Real data were used to evaluate the performance of RF, logistic regression, and support vector machine. Based on the simulation and real data, we discuss the behaviour of the FS and classification algorithms. The Apriori algorithm identified frequent item sets (mir-133a, mir-133b, mir-183, mir-937, and mir-96) from among the deregulated miRNAs of six datasets from The Cancer Genomics Atlas. Taking these findings altogether and considering computational memory requirements, we propose a strategy that combines edgeR and DESeq for large sample sizes.
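
    The frequent item set step can be illustrated with a minimal level-wise (Apriori-style) search. This is a sketch under stated assumptions: the function name, the absolute support threshold, and the toy transactions (placeholder items "a", "b", "c" rather than real miRNA lists) are made up, and the study's own settings are not given in the abstract.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search for item sets present in >= min_support transactions.

    Returns {frozenset(items): support_count}.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]   # size-1 candidates
    result = {}
    size = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        kept = {c: n for c, n in counts.items() if n >= min_support}
        result.update(kept)
        size += 1
        # Join step: unions of kept sets that have exactly `size` items.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every (size - 1)-subset must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in kept for s in combinations(c, size - 1))]
    return result

# Each "transaction" stands for the deregulated items of one dataset (made up).
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(toy, min_support=3)
for itemset, support in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

    Here every single item and every pair reaches the support threshold, while the full triple appears in only two transactions and is dropped.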

  19. The frequency of drugs in randomly selected drivers in Denmark

    DEFF Research Database (Denmark)

    Simonsen, Kirsten Wiese; Steentoft, Anni; Hels, Tove

    Introduction Driving under the influence of alcohol and drugs is a global problem. In Denmark as well as in other countries there is an increasing focus on impaired driving. Little is known about the occurrence of psychoactive drugs in the general traffic. Therefore the European commission...... initiated the DRUID project. This roadside study is the Danish part of the EU-project DRUID (Driving under the Influence of Drugs, Alcohol, and Medicines) and included three representative regions in Denmark. Methods Oral fluid samples (n = 3002) were collected randomly from drivers using a sampling scheme...... stratified by time, season, and road type. The oral fluid samples were screened for 29 illegal and legal psychoactive substances and metabolites as well as ethanol. Results Fourteen (0.5%) drivers were positive for ethanol (alone or in combination with drugs) at concentrations above 0.53 g/l, which...

  20. Sample Selection in Randomized Experiments: A New Method Using Propensity Score Stratified Sampling

    Science.gov (United States)

    Tipton, Elizabeth; Hedges, Larry; Vaden-Kiernan, Michael; Borman, Geoffrey; Sullivan, Kate; Caverly, Sarah

    2014-01-01

    Randomized experiments are often seen as the "gold standard" for causal research. Despite the fact that experiments use random assignment to treatment conditions, units are seldom selected into the experiment using probability sampling. Very little research on experimental design has focused on how to make generalizations to well-defined…

  1. Pseudo cluster randomization dealt with selection bias and contamination in clinical trials

    NARCIS (Netherlands)

    Teerenstra, S.; Melis, R.J.F.; Peer, P.G.M.; Borm, G.F.

    2006-01-01

    BACKGROUND AND OBJECTIVES: When contamination is present, randomization on a patient level leads to dilution of the treatment effect. The usual solution is to randomize on a cluster level, but at the cost of efficiency and more importantly, this may introduce selection bias. Furthermore, it may slow

  2. Pseudo-random-bit-sequence phase modulation for reduced errors in a fiber optic gyroscope.

    Science.gov (United States)

    Chamoun, Jacob; Digonnet, Michel J F

    2016-12-15

    Low noise and drift in a laser-driven fiber optic gyroscope (FOG) are demonstrated by interrogating the sensor with a low-coherence laser. The laser coherence was reduced by broadening its optical spectrum using an external electro-optic phase modulator driven by either a sinusoidal or a pseudo-random bit sequence (PRBS) waveform. The noise reduction measured in a FOG driven by a modulated laser agrees with the calculations based on the broadened laser spectrum. Using PRBS modulation, the linewidth of a laser was broadened from 10 MHz to more than 10 GHz, leading to a measured FOG noise of only 0.00073 deg/√h and a drift of 0.023 deg/h. To the best of our knowledge, these are the lowest noise and drift reported in a laser-driven FOG, and this noise is below the requirement for the inertial navigation of aircraft.
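
    For readers unfamiliar with PRBS waveforms, the sketch below generates one with a linear-feedback shift register. The abstract does not say which PRBS pattern or length was used, so the choice of PRBS7 (polynomial x^7 + x^6 + 1) and the function name are assumptions for illustration only.

```python
def prbs7(n_bits, state=0x7F):
    """Generate n_bits of a PRBS7 sequence (polynomial x^7 + x^6 + 1).

    `state` is the 7-bit shift register seed and must be nonzero.
    """
    if not 0 < state < 128:
        raise ValueError("state must be a nonzero 7-bit value")
    bits = []
    for _ in range(n_bits):
        # Feedback is the XOR of the two highest register bits (taps 7 and 6).
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return bits

# Two full periods: the pattern repeats after 2**7 - 1 = 127 bits.
seq = prbs7(254)
print("".join(map(str, seq[:32])))
print("period repeats:", seq[:127] == seq[127:])
```

    Driving a phase modulator with such a bit stream switches the optical phase between two levels in a noise-like pattern, which is what spreads the laser spectrum. The modulation depth and bit rate are experimental parameters not modelled here.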

  3. Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences

    Science.gov (United States)

    De Silva, Dilrini R.; Nichols, Richard; Elgar, Greg

    2014-01-01

    Comparison of polymorphism at synonymous and non-synonymous sites in protein-coding DNA can provide evidence for selective constraint. Non-coding DNA that forms part of the regulatory landscape presents more of a challenge since there is not such a clear-cut distinction between sites under stronger and weaker selective constraint. Here, we consider putative regulatory elements termed Conserved Non-coding Elements (CNEs) defined by their high level of sequence identity across all vertebrates. Some mutations in these regions have been implicated in developmental disorders; we analyse CNE polymorphism data to investigate whether such deleterious effects are widespread in humans. Single nucleotide variants from the HapMap and 1000 Genomes Projects were mapped across nearly 2000 CNEs. In the 1000 Genomes data we find a significant excess of rare derived alleles in CNEs relative to coding sequences; this pattern is absent in HapMap data, apparently obscured by ascertainment bias. The distribution of polymorphism within CNEs is not uniform; we could identify two categories of sites by exploiting deep vertebrate alignments: stretches that are non-variant, and those that have at least one substitution. The conserved category has fewer polymorphic sites and a greater excess of rare derived alleles, which can be explained by a large proportion of sites under strong purifying selection within humans – higher than that for non-synonymous sites in most protein coding regions, and comparable to that at the strongly conserved trans-dev genes. Conversely, the more evolutionarily labile CNE sites have an allele frequency distribution not significantly different from non-synonymous sites. Future studies should exploit genome-wide re-sequencing to obtain better coverage in selected non-coding regions, given the likelihood that mutations in evolutionarily conserved enhancer sequences are deleterious. Discovery pipelines should validate non-coding variants to aid in identifying causal

  4. Evidence for Clonal Expansion After Antibiotic Selection Pressure: Pneumococcal Multilocus Sequence Types Before and After Mass Azithromycin Treatments

    Science.gov (United States)

    Keenan, Jeremy D.; Klugman, Keith P.; McGee, Lesley; Vidal, Jorge E.; Chochua, Sopio; Hawkins, Paulina; Cevallos, Vicky; Gebre, Teshome; Tadesse, Zerihun; Emerson, Paul M.; Jorgensen, James H.; Gaynor, Bruce D.; Lietman, Thomas M.

    2015-01-01

    Background. A clinical trial of mass azithromycin distributions for trachoma created a convenient experiment to test the hypothesis that antibiotic use selects for clonal expansion of preexisting resistant bacterial strains. Methods. Twelve communities in Ethiopia received mass azithromycin distributions every 3 months for 1 year. A random sample of 10 children aged 0–9 years from each community was monitored by means of nasopharyngeal swab sampling before mass azithromycin distribution and after 4 mass treatments. Swab specimens were tested for Streptococcus pneumoniae, and isolates underwent multilocus sequence typing. Results. Of 82 pneumococcal isolates identified before treatment, 4 (5%) exhibited azithromycin resistance, representing 3 different sequence types (STs): 177, 6449, and 6494. The proportion of isolates that were classified as one of these 3 STs and were resistant to azithromycin increased after 4 mass azithromycin treatments (14 of 96 isolates [15%]; P = .04). Using a classification index, we found evidence for a relationship between ST and macrolide resistance after mass treatments (P azithromycin treatment (P = .045). Conclusions. Resistant clones present before mass azithromycin treatments increased in frequency after treatment, consistent with the theory that antibiotic selection pressure results in clonal expansion of existing resistant strains. PMID:25293366

  5. Sequence features associated with microRNA strand selection in humans and flies

    Directory of Open Access Journals (Sweden)

    Menzel Corinna

    2009-09-01

    Background: During microRNA (miRNA) maturation in humans and flies, Drosha and Dicer cut the precursor transcript, thereby producing a short RNA duplex. One strand of this duplex becomes a functional component of the RNA-Induced Silencing Complex (RISC), while the other is eliminated. While thermodynamic asymmetry of the duplex ends appears to play a decisive role in the strand selection process, the details of the selection mechanism are not yet understood. Results: Here, we assess miRNA strand selection bias in humans and fruit flies by analyzing the sequence composition and relative expression levels of the two strands of the precursor duplex in these species. We find that the sequence elements associated with preferential miRNA strand selection and/or rejection differ between the two species. Further, we identify another feature that distinguishes human and fly miRNA processing machinery: the relative accuracy of the Drosha and Dicer enzymes. Conclusion: Our result provides clues to the mechanistic aspects of miRNA strand selection in humans and other mammals. Further, it indicates that human and fly miRNA processing pathways are more distinct than currently recognized. Finally, the observed strand selection determinants are instrumental in the rational design of efficient miRNA-based expression regulators.

  6. Pulsed magnetization transfer contrast MRI by a sequence with water selective excitation

    Energy Technology Data Exchange (ETDEWEB)

    Schick, F. [Univ. of Tuebingen (Germany)

    1996-01-01

    A water selective SE imaging sequence was developed providing suitable properties for the assessment of magnetization transfer (MT) effects in tissues with considerable amounts of fat. The sequence with water selective excitation and slice selective refocusing combines the following features: The RF exposure on the macromolecular protons is relatively low for single slice imaging without MT prepulses, since no additional pulses for fat saturation are necessary. Water selection by frequency selective excitation diminishes faults in the subtraction of images recorded with and without MT prepulses (which might arise from movements). High differences in the signal amplitudes from hyaline cartilage and muscle tissue were obtained comparing images recorded with irradiation of the series of prepulses for MT and those lacking MT prepulses. Applications of the described water selective approach for the assessment of MT effects in lesions of cartilage and bone are demonstrated. MT saturation was also examined in muscles with fatty degeneration of patients suffering from progressive muscular dystrophy. The described technique allows determination of MT effects with good precision in a single slice, especially in regions with dominating fat signals. 22 refs., 5 figs.

  7. HLA DNA sequence variation among human populations: molecular signatures of demographic and selective events.

    Directory of Open Access Journals (Sweden)

    Stéphane Buhler

    Full Text Available Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene. We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used

  8. LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone

    KAUST Repository

    Chen, Peng

    2014-12-03

    Background: Protein-ligand binding is important for some proteins to perform their functions. Protein-ligand binding sites are the residues of proteins that physically bind to ligands. Despite recent advances in computational prediction of protein-ligand binding sites, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. Results: In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. We propose a combination technique to reduce the effects of different sliding residue windows in the process of encoding input feature vectors. Moreover, due to the highly imbalanced samples between the ligand-binding sites and non-ligand-binding sites, we construct several balanced data sets, for each of which a random forest (RF)-based classifier is trained. The ensemble of these RF classifiers forms a sequence-based protein-ligand binding site predictor. Conclusions: Experimental results on CASP9 and CASP8 data sets demonstrate that our method compares favorably with the state-of-the-art protein-ligand binding site prediction methods.
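
    The balanced-ensemble idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the balanced data sets are built, so the undersampling scheme below is an assumption, and `train_threshold_stub` is a hypothetical one-feature placeholder standing in for a random forest. All function names are illustrative.

```python
import random
from collections import Counter


def balanced_subsets(positives, negatives, n_subsets, seed=0):
    """Pair all minority samples with disjoint, equally sized chunks of the majority class.

    Assumption: the majority class is undersampled without replacement.
    """
    rng = random.Random(seed)
    shuffled = negatives[:]
    rng.shuffle(shuffled)
    size = len(positives)
    subsets = []
    for i in range(n_subsets):
        chunk = shuffled[i * size:(i + 1) * size]
        if len(chunk) < size:  # not enough majority samples left
            break
        subsets.append(positives + chunk)
    return subsets


def train_threshold_stub(samples):
    """Hypothetical stand-in for a random forest: a midpoint threshold on one feature."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0


def majority_vote(classifiers, x):
    """Combine the per-subset classifiers by simple majority vote."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Toy data: (feature value, label), with far more negatives than positives.
positives = [(0.9, 1), (0.8, 1)]
negatives = [(0.1, 0), (0.2, 0), (0.3, 0), (0.15, 0), (0.25, 0), (0.05, 0)]
subsets = balanced_subsets(positives, negatives, n_subsets=3)
classifiers = [train_threshold_stub(s) for s in subsets]
```

    Using an odd number of subsets avoids tied votes in this toy setting; the published method may combine its classifiers differently.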

  9. Acceptance sampling using judgmental and randomly selected samples

    Energy Technology Data Exchange (ETDEWEB)

    Sego, Landon H.; Shulman, Stanley A.; Anderson, Kevin K.; Wilson, John E.; Pulsipher, Brent A.; Sieber, W. Karl

    2010-09-01

    We present a Bayesian model for acceptance sampling where the population consists of two groups, each with different levels of risk of containing unacceptable items. Expert opinion, or judgment, may be required to distinguish between the high and low-risk groups. Hence, high-risk items are likely to be identified (and sampled) using expert judgment, while the remaining low-risk items are sampled randomly. We focus on the situation where all observed samples must be acceptable. Consequently, the objective of the statistical inference is to quantify the probability that a large percentage of the unsampled items in the population are also acceptable. We demonstrate that traditional (frequentist) acceptance sampling and simpler Bayesian formulations of the problem are essentially special cases of the proposed model. We explore the properties of the model in detail, and discuss the conditions necessary to ensure that required sample sizes are a non-decreasing function of the population size. The method is applicable to a variety of acceptance sampling problems, and, in particular, to environmental sampling where the objective is to demonstrate the safety of reoccupying a remediated facility that has been contaminated with a lethal agent.

  10. AQUa: An Adaptive Framework for Compression of Sequencing Quality Scores with Random Access Functionality.

    Science.gov (United States)

    Paridaens, Tom; Van Wallendael, Glenn; De Neve, Wesley; Lambert, Peter

    2017-09-25

    The past decade has seen the introduction of new technologies that significantly lowered the cost of genome sequencing. As a result, the amount of genomic data that must be stored and transmitted is increasing exponentially. To mitigate storage and transmission issues, we introduce a framework for lossless compression of quality scores. This paper proposes AQUa, an adaptive framework for lossless compression of quality scores. To compress these quality scores, AQUa makes use of a configurable set of coding tools, extended with a Context-Adaptive Binary Arithmetic Coding scheme (CABAC). When benchmarking AQUa against generic single-pass compressors, file sizes are reduced by up to 38.49% when comparing with GNU Gzip and by up to 6.48% when comparing with 7-Zip at the Ultra Setting, while still providing support for random access. When comparing AQUa with the purpose-built, single-pass, and state-of-the-art compressor SCALCE, which does not support random access, file sizes are reduced by up to 21.14%. When comparing AQUa with the purpose-built, dual-pass, and state-of-the-art compressor QVZ, which does not support random access, file sizes are larger by 6.42% to 33.47%. However, for one test file, the file size is 0.38% smaller, illustrating the strength of our single-pass compression framework. This work has been spurred by the current activity on genomic information representation (MPEG-G) within the ISO/IEC SC29/WG11 technical committee. The software is available on Github: https://github.com/tparidae/AQUa. Tom Paridaens (tom.paridaens@ugent.be).

  11. Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

    Science.gov (United States)

    A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...

  12. Whole genome sequencing of Plasmodium falciparum from dried blood spots using selective whole genome amplification.

    Science.gov (United States)

    Oyola, Samuel O; Ariani, Cristina V; Hamilton, William L; Kekre, Mihir; Amenga-Etego, Lucas N; Ghansah, Anita; Rutledge, Gavin G; Redmond, Seth; Manske, Magnus; Jyothi, Dushyanth; Jacob, Chris G; Otto, Thomas D; Rockett, Kirk; Newbold, Chris I; Berriman, Matthew; Kwiatkowski, Dominic P

    2016-12-20

    Translating genomic technologies into healthcare applications for the malaria parasite Plasmodium falciparum has been limited by the technical and logistical difficulties of obtaining high quality clinical samples from the field. Sampling by dried blood spot (DBS) finger-pricks can be performed safely and efficiently with minimal resource and storage requirements compared with venous blood (VB). Here, the use of selective whole genome amplification (sWGA) to sequence the P. falciparum genome from clinical DBS samples was evaluated, and the results compared with current methods that use leucodepleted VB. Parasite DNA with high (>95%) human DNA contamination was selectively amplified by Phi29 polymerase using short oligonucleotide probes of 8-12 mers as primers. These primers were selected on the basis of their differential frequency of binding the desired (P. falciparum DNA) and contaminating (human) genomes. Using the sWGA method, clinical samples from 156 malaria patients, including 120 paired samples for head-to-head comparison of DBS and leucodepleted VB, were sequenced. Greater than 18-fold enrichment of P. falciparum DNA was achieved from DBS extracts. The parasitaemia threshold to achieve >5× coverage for 50% of the genome was 0.03% (40 parasites per 200 white blood cells). Over 99% SNP concordance between VB and DBS samples was achieved after excluding missing calls. The sWGA methods described here provide a reliable and scalable way of generating P. falciparum genome sequence data from DBS samples. The current data indicate that it will be possible to get good quality sequence on most if not all drug resistance loci from the majority of symptomatic malaria patients. This technique overcomes a major limiting factor in P. falciparum genome sequencing from field samples, and paves the way for large-scale epidemiological applications.
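
    The primer selection criterion described above (differential binding frequency between the target and contaminating genomes) can be sketched roughly as a k-mer ranking. This is an illustrative simplification, not the authors' pipeline: it counts exact matches on one strand only, ignores melting temperature and primer spacing, and the add-one pseudocount and function names are assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer on the given strand."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def rank_primers(target, background, k, top=5):
    """Rank k-mers by the ratio of target frequency to background frequency."""
    t_counts = kmer_counts(target, k)
    b_counts = kmer_counts(background, k)
    t_total = max(len(target) - k + 1, 1)
    b_total = max(len(background) - k + 1, 1)
    scored = []
    for kmer, n in t_counts.items():
        t_freq = n / t_total
        # Add-one pseudocount so k-mers absent from the background do not divide by zero.
        b_freq = (b_counts.get(kmer, 0) + 1) / (b_total + 1)
        scored.append((t_freq / b_freq, kmer))
    # Highest ratio first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [kmer for _, kmer in scored[:top]]


# Toy sequences: an AT-rich "target" and a GC-rich "background".
best = rank_primers("ATATATATAT", "GCGCGCGCATGC", k=2, top=2)
```

    In this toy case "TA" outranks "AT" because "AT" also occurs once in the background sequence.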

  13. Footprinting: a method for determining the sequence selectivity, affinity and kinetics of DNA-binding ligands.

    Science.gov (United States)

    Hampshire, Andrew J; Rusling, David A; Broughton-Head, Victoria J; Fox, Keith R

    2007-06-01

    Footprinting is a simple method for assessing the sequence selectivity of DNA-binding ligands. The method is based on the ability of the ligand to protect DNA from cleavage at its binding site. This review describes the use of DNase I and hydroxyl radicals, the most commonly used footprinting probes, in footprinting experiments. The success of a footprinting experiment depends on using an appropriate DNA substrate and we describe how these can best be chosen or designed. Although footprinting was originally developed for assessing a ligand's sequence selectivity, it can also be employed to estimate the binding strength (quantitative footprinting) and to assess the association and dissociation rate constants for slow binding reactions.

  14. Comparative analysis of idiom selection and sequencing in Estonian basic school EFL coursebooks

    Directory of Open Access Journals (Sweden)

    Rita Anita Forssten

    2017-05-01

    Full Text Available The article investigates the selection and sequencing of the idioms encountered in two locally-produced and international coursebook series currently employed in Estonian basic schools. It is hypothesized that there exists a positive correlation between idioms’ difficulty and coursebooks’ language proficiency level. The hypothesis is tested through a statistical analysis of the idioms found which are categorized in terms of their analysability into three categories where category 1 includes analysable semi-literal idioms, category 2 comprises analysable semi-transparent idioms, and category 3 encompasses non-analysable opaque idioms, and then analysed through an online language corpus (British National Corpus. The results of the study reveal that the coursebook authors under discussion have disregarded idioms’ frequency as a criterion for selection or sequencing, whereas the factor utilized to some extent is the degree of analysability.

  15. Automatic seed selection for segmentation of liver cirrhosis in laparoscopic sequences

    Science.gov (United States)

    Sinha, Rahul; Marcinczak, Jan Marek; Grigat, Rolf-Rainer

    2014-03-01

    For computer aided diagnosis based on laparoscopic sequences, image segmentation is one of the basic steps which define the success of all further processing. However, many image segmentation algorithms require prior knowledge which is given by interaction with the clinician. We propose an automatic seed selection algorithm for segmentation of liver cirrhosis in laparoscopic sequences which assigns each pixel a probability of being cirrhotic liver tissue or background tissue. Our approach is based on a trained classifier using SIFT and RGB features with PCA. Due to the unique illumination conditions in laparoscopic sequences of the liver, a very low dimensional feature space can be used for classification via logistic regression. The methodology is evaluated on 718 cirrhotic liver and background patches that are taken from laparoscopic sequences of 7 patients. Using a linear classifier we achieve a precision of 91% in a leave-one-patient-out cross-validation. Furthermore, we demonstrate that with logistic probability estimates, seeds with high certainty of being cirrhotic liver tissue can be obtained. For example, our precision of liver seeds increases to 98.5% if only seeds with more than 95% probability of being liver are used. Finally, these automatically selected seeds can be used as priors in Graph Cuts which is demonstrated in this paper.
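
    The certainty-based seed selection described above can be sketched as thresholding logistic regression probabilities. The weights, bias and feature vectors below are hypothetical, and selecting background seeds at the mirrored threshold (probability below 5%) is an assumption; the abstract only reports the 95% cut-off for liver seeds.

```python
import math


def logistic(z):
    """Standard logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def select_seeds(features, weights, bias, threshold=0.95):
    """Return (pixel index, label) pairs for pixels classified with high certainty.

    Pixels whose probability lies between 1 - threshold and threshold are left
    unlabelled so that a later step such as Graph Cuts can decide them.
    """
    seeds = []
    for i, x in enumerate(features):
        p_liver = logistic(sum(w * v for w, v in zip(weights, x)) + bias)
        if p_liver > threshold:
            seeds.append((i, "liver"))
        elif p_liver < 1.0 - threshold:  # assumption: symmetric background cut-off
            seeds.append((i, "background"))
    return seeds


# Three pixels with a one-dimensional (already PCA-reduced) feature each.
seeds = select_seeds([[4.0], [0.0], [-4.0]], weights=[1.0], bias=0.0)
```

    The middle pixel has probability 0.5 and is therefore not used as a seed.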

  16. Identifying selection in the within-host evolution of influenza using viral sequence data.

    Directory of Open Access Journals (Sweden)

    Christopher J R Illingworth

    2014-07-01

    Full Text Available The within-host evolution of influenza is a vital component of its epidemiology. A question of particular interest is the role that selection plays in shaping the viral population over the course of a single infection. We here describe a method to measure selection acting upon the influenza virus within an individual host, based upon time-resolved genome sequence data from an infection. Analysing sequence data from a transmission study conducted in pigs, describing part of the haemagglutinin gene (HA1) of an influenza virus, we find signatures of non-neutrality in six of a total of sixteen infections. We find evidence for both positive and negative selection acting upon specific alleles, while in three cases, the data suggest the presence of time-dependent selection. In one infection we observe what is potentially a specific immune response against the virus; a non-synonymous mutation in an epitope region of the virus is found to be under initially positive, then strongly negative selection. Crucially, given the lack of homologous recombination in influenza, our method accounts for linkage disequilibrium between nucleotides at different positions in the haemagglutinin gene, allowing for the analysis of populations in which multiple mutations are present at any given time. Our approach offers a new insight into the dynamics of influenza infection, providing a detailed characterisation of the forces that underlie viral evolution.

  17. RANDOM FORESTS-BASED FEATURE SELECTION FOR LAND-USE CLASSIFICATION USING LIDAR DATA AND ORTHOIMAGERY

    Directory of Open Access Journals (Sweden)

    H. Guan

    2012-07-01

    Full Text Available The development of lidar systems, especially those incorporating high-resolution camera components, has shown great potential for urban classification. However, automatically selecting the best features for land-use classification is challenging. Random Forests, a newly developed machine learning algorithm, is receiving considerable attention in the field of image classification and pattern recognition. In particular, it can provide a measure of variable importance. Thus, in this study the performance of Random Forests-based feature selection for urban areas was explored. First, we extract features from lidar data, including height-based, intensity-based GLCM measures; other spectral features can be obtained from imagery, such as the Red, Blue and Green bands, and GLCM-based measures. Finally, Random Forests is used to automatically select the optimal and uncorrelated features for land-use classification. 0.5-meter resolution lidar data and aerial imagery are used to assess the feature selection performance of Random Forests in the study area located in Mannheim, Germany. The results clearly demonstrate that Random Forests-based feature selection can improve classification performance.

  18. Effective automated feature construction and selection for classification of biological sequences.

    Directory of Open Access Journals (Sweden)

    Uday Kamath

    Full Text Available Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features. We present an algorithmic framework (EFFECT) for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not. To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or modification

  19. Sequence variation in human succinate dehydrogenase genes: evidence for long-term balancing selection on SDHA

    Directory of Open Access Journals (Sweden)

    Lawrence Elizabeth C

    2007-03-01

    Full Text Available Background: Balancing selection operating for long evolutionary periods at a locus is characterized by the maintenance of distinct alleles because of a heterozygote or rare-allele advantage. The loci under balancing selection are distinguished by their unusually high polymorphism levels. In this report, we provide statistical and comparative genetic evidence suggesting that the SDHA gene is under long-term balancing selection. SDHA encodes the major catalytic subunit (flavoprotein, Fp) of the succinate dehydrogenase enzyme complex (SDH; mitochondrial complex II). The inhibition of Fp by homozygous SDHA mutations or by 3-nitropropionic acid poisoning causes central nervous system pathologies. In contrast, heterozygous mutations in SDHB, SDHC, and SDHD, the other SDH subunit genes, cause hereditary paraganglioma (PGL) tumors, which show constitutive activation of pathways induced by oxygen deprivation (hypoxia). Results: We sequenced the four SDH subunit genes (10.8 kb) in 24 African American and 24 European American samples. We also sequenced the SDHA gene (2.8 kb) in 18 chimpanzees. Increased nucleotide diversity distinguished the human SDHA gene from its chimpanzee ortholog and from the PGL genes. Sequence analysis uncovered two common SDHA missense variants and refuted the previous suggestions that these variants originate from different genetic loci. Two highly dissimilar SDHA haplotype clusters were present in intermediate frequencies in both racial groups. The SDHA variation pattern showed statistically significant deviations from neutrality by the Tajima, Fu and Li, Hudson-Kreitman-Aguadé, and Depaulis haplotype number tests. Empirically, the elevated values of the nucleotide diversity (% π = 0.231) and the Tajima statistic (D = 1.954) in the SDHA gene were comparable with the most outstanding cases for balancing selection in the African American population. Conclusion: The SDHA gene has a strong signature of balancing selection. The
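
    The two statistics quoted in the abstract above, nucleotide diversity (π) and Tajima's D, can be computed from aligned sequences with the standard textbook formulas. The sketch below is illustrative only: it assumes equal-length, gap-free sequences, counts any site with more than one state as segregating, and uses toy data rather than the SDHA alignment. It returns π per site (not in percent).

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Return (pi per site, Tajima's D) for aligned, equal-length sequences.

    D is None when there are no segregating sites. At least 4 sequences are
    required, because the variance term is zero for smaller samples.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(seqs[0])
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for j in range(length) if len({seq[j] for seq in seqs}) > 1)
    # k_hat: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    k_hat = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    pi = k_hat / length
    if s == 0:
        return pi, None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (k_hat - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, d


# Two divergent haplotype clusters at intermediate frequency give a positive D.
pi_bal, d_bal = tajimas_d(["AAAA", "AAAA", "TTTT", "TTTT"])
# A single rare variant gives a negative D.
pi_rare, d_rare = tajimas_d(["AAAA", "AAAA", "AAAA", "AAAT"])
```

    A positive D, as in the first toy sample, is the pattern the abstract interprets as a signature of balancing selection.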

  20. Differentiating Schistosoma haematobium from related animal schistosomes by PCR amplifying inter-repeat sequences flanking newly selected repeated sequences.

    Science.gov (United States)

    Abbasi, Ibrahim; Hamburger, Joseph; Kariuki, Curtis; Mungai, Peter L; Muchiri, Eric M; King, Charles H

    2012-12-01

    In schistosomiasis elimination programs, successful discrimination of Schistosoma haematobium from the related animal Schistosoma parasites will be essential for accurate detection of human parasite transmission. Polymerase chain reaction assays employing primers from two newly selected repeated sequences, named Sh73 and Sh77, did not discriminate S. haematobium when amplifying Sh73-77 intra- or inter-repeats. However, amplification between Sh73 and the previously described DraI repeat exhibited discriminative banding patterns for S. haematobium and Schistosoma bovis (sensitivity 1 pg and 10 pg, respectively). It also enabled banding pattern discrimination of Schistosoma curassoni and Schistosoma intercalatum, but Schistosoma mattheei and Schistosoma margrebowiei did not yield amplicons. Similar inter-repeat amplification between Sh77 and DraI yielded amplicons with discriminative banding for S. haematobium, and S. bovis; however, S. mattheei was detected only at low sensitivity (1 ng). The Sh73/DraI assay detected snails infected with S. haematobium, S. bovis, or both, and should prove useful for screening snails where discrimination of S. haematobium from related schistosomes is required.

  1. SHAPE Selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data

    Science.gov (United States)

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R.; Krogh, Anders; Vinther, Jeppe

    2015-01-01

    Selective 2′ Hydroxyl Acylation analyzed by Primer Extension (SHAPE) is an accurate method for probing of RNA secondary structure. In existing SHAPE methods, the SHAPE probing signal is normalized to a no-reagent control to correct for the background caused by premature termination of the reverse transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES-based selection of cDNA–RNA hybrids on streptavidin beads effectively removes the large majority of background signal present in SHAPE probing data and that sequencing-based SHAPES data contain the same amount of RNA structure data as regular sequencing-based SHAPE data obtained through normalization to a no-reagent control. Moreover, the selection efficiently enriches for probed RNAs, suggesting that the SHAPES strategy will be useful for applications with high-background and low-probing signal such as in vivo RNA structure probing. PMID:25805860

  2. Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies.

    Science.gov (United States)

    Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

    2016-07-07

    The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.
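
    The idea of screening SNPs on both FST and the allele frequency difference between two pools can be sketched as follows. This is not the authors' novel approach, whose details are not given in the abstract. It uses a basic heterozygosity-based FST for two equally weighted populations, applies no pooled-sequencing corrections, and the thresholds and SNP names are hypothetical.

```python
def snp_fst(p1, p2):
    """Per-SNP FST = (H_T - H_S) / H_T for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:  # allele fixed in both populations: no differentiation
        return 0.0
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def flag_candidates(snps, fst_min, diff_min):
    """Keep SNPs that exceed both the FST and the frequency-difference thresholds."""
    return [
        name
        for name, p1, p2 in snps
        if snp_fst(p1, p2) >= fst_min and abs(p1 - p2) >= diff_min
    ]


# (SNP name, allele frequency in pool 1, allele frequency in pool 2)
snps = [("snp_a", 0.9, 0.1), ("snp_b", 0.5, 0.45)]
candidates = flag_candidates(snps, fst_min=0.25, diff_min=0.3)
```

    Requiring both criteria guards against SNPs with a high FST that arises from small absolute frequency shifts at low-frequency alleles.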

  3. Fully Automatic Myocardial Segmentation of Contrast Echocardiography Sequence Using Random Forests Guided by Shape Model.

    Science.gov (United States)

    Li, Yuanwei; Ho, Chin Pang; Toulemonde, Matthieu; Chahal, Navtej; Senior, Roxy; Tang, Meng-Xing

    2017-09-26

    Myocardial contrast echocardiography (MCE) is an imaging technique that assesses left ventricle function and myocardial perfusion for the detection of coronary artery diseases. Automatic MCE perfusion quantification is challenging and requires accurate segmentation of the myocardium from noisy and time-varying images. Random forests (RF) have been successfully applied to many medical image segmentation tasks. However, the pixel-wise RF classifier ignores contextual relationships between label outputs of individual pixels. RF which only utilizes local appearance features is also susceptible to data suffering from large intensity variations. In this paper, we demonstrate how to overcome the above limitations of classic RF by presenting a fully automatic segmentation pipeline for myocardial segmentation in full-cycle 2D MCE data. Specifically, a statistical shape model is used to provide shape prior information that guide the RF segmentation in two ways. First, a novel shape model (SM) feature is incorporated into the RF framework to generate a more accurate RF probability map. Second, the shape model is fitted to the RF probability map to refine and constrain the final segmentation to plausible myocardial shapes. We further improve the performance by introducing a bounding box detection algorithm as a preprocessing step in the segmentation pipeline. Our approach on 2D image is further extended to 2D+t sequences which ensures temporal consistency in the final sequence segmentations. When evaluated on clinical MCE datasets, our proposed method achieves notable improvement in segmentation accuracy and outperforms other state-of-the-art methods including the classic RF and its variants, active shape model and image registration.

  4. Transcriptome Sequencing of Lima Bean (Phaseolus lunatus) to Identify Putative Positive Selection in Phaseolus and Legumes.

    Science.gov (United States)

    Li, Fengqi; Cao, Depan; Liu, Yang; Yang, Ting; Wang, Guirong

    2015-07-03

    The identification of genes under positive selection is a central goal of evolutionary biology. Many legume species, including Phaseolus vulgaris (common bean) and Phaseolus lunatus (lima bean), have important ecological and economic value. In this study, we sequenced and assembled the transcriptome of one Phaseolus species, lima bean. A comparison with the genomes of six other legume species, including the common bean, Medicago, lotus, soybean, chickpea, and pigeonpea, revealed 15 and 4 orthologous groups with signatures of positive selection among the two Phaseolus species and among the seven legume species, respectively. Characterization of these positively selected genes using Non redundant (nr) annotation, gene ontology (GO) classification, GO term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses revealed that these genes are mostly involved in thylakoids, photosynthesis and metabolism. This study identified genes that may be related to the divergence of the Phaseolus and legume species. These detected genes are particularly good candidates for subsequent functional studies.

  5. Transcriptome Sequencing of Lima Bean (Phaseolus lunatus) to Identify Putative Positive Selection in Phaseolus and Legumes

    Directory of Open Access Journals (Sweden)

    Fengqi Li

    2015-07-01

    Full Text Available The identification of genes under positive selection is a central goal of evolutionary biology. Many legume species, including Phaseolus vulgaris (common bean) and Phaseolus lunatus (lima bean), have important ecological and economic value. In this study, we sequenced and assembled the transcriptome of one Phaseolus species, lima bean. A comparison with the genomes of six other legume species, including the common bean, Medicago, lotus, soybean, chickpea, and pigeonpea, revealed 15 and 4 orthologous groups with signatures of positive selection among the two Phaseolus species and among the seven legume species, respectively. Characterization of these positively selected genes using Non redundant (nr) annotation, gene ontology (GO) classification, GO term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses revealed that these genes are mostly involved in thylakoids, photosynthesis and metabolism. This study identified genes that may be related to the divergence of the Phaseolus and legume species. These detected genes are particularly good candidates for subsequent functional studies.

  6. Parametric and non-parametric masking of randomness in sequence alignments can be improved and leads to better resolved trees

    Directory of Open Access Journals (Sweden)

    von Reumont Björn M

    2010-03-01

    Full Text Available Background: Methods of alignment masking, which refers to the technique of excluding alignment blocks prior to tree reconstructions, have been successful in improving the signal-to-noise ratio in sequence alignments. However, the lack of formally well-defined methods to identify randomness in sequence alignments has prevented a routine application of alignment masking. In this study, we compared the effects on tree reconstructions of the most commonly used profiling method (GBLOCKS), which uses a predefined set of rules in combination with alignment masking, with a new profiling approach (ALISCORE) based on Monte Carlo resampling within a sliding window, using different data sets and alignment methods. While the GBLOCKS approach excludes variable sections above a certain threshold, the choice of which is left arbitrary, the ALISCORE algorithm is free of a priori rating of parameter space and therefore more objective. Results: ALISCORE was successfully extended to amino acids using a proportional model and empirical substitution matrices to score randomness in multiple sequence alignments. A complex bootstrap resampling leads to an even distribution of scores of randomly similar sequences to assess randomness of the observed sequence similarity. Testing performance on real data, both masking methods, GBLOCKS and ALISCORE, helped to improve tree resolution. The sliding window approach was less sensitive to different alignments of identical data sets and performed equally well on all data sets. Concurrently, ALISCORE is capable of dealing with different substitution patterns and heterogeneous base composition. ALISCORE and the most relaxed GBLOCKS gap parameter setting performed best on all data sets. Correspondingly, Neighbor-Net analyses showed the largest decrease in conflict. Conclusions: Alignment masking improves signal-to-noise ratio in multiple sequence alignments prior to phylogenetic reconstruction. Given the robust performance of alignment

  7. Least squares deconvolution for leak detection with a pseudo random binary sequence excitation

    Science.gov (United States)

    Nguyen, Si Tran Nguyen; Gong, Jinzhe; Lambert, Martin F.; Zecchin, Aaron C.; Simpson, Angus R.

    2018-01-01

    Leak detection and localisation is critical for water distribution system pipelines. This paper examines the use of the time-domain impulse response function (IRF) for leak detection and localisation in a pressurised water pipeline with a pseudo random binary sequence (PRBS) signal excitation. Compared to the conventional step wave generated using a single fast operation of a valve closure, a PRBS signal offers advantageous correlation properties, in that the signal has very low autocorrelation for lags different from zero and low cross correlation with other signals including noise and other interference. These properties result in a significant improvement in the IRF signal to noise ratio (SNR), leading to more accurate leak localisation. In this paper, the estimation of the system IRF is formulated as an optimisation problem in which the l2 norm of the IRF is minimised to suppress the impact of noise and interference sources. Both numerical and experimental data are used to verify the proposed technique. The resultant estimated IRF provides not only accurate leak location estimation, but also good sensitivity to small leak sizes due to the improved SNR.
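    The correlation property described above can be illustrated with a minimal sketch. It assumes a 7-bit linear-feedback shift register (the widely used PRBS7 polynomial x^7 + x^6 + 1), which is an illustrative choice and not necessarily the signal used in the paper; the function names are hypothetical. The sketch shows only the autocorrelation behaviour, not the least squares deconvolution itself.

```python
def prbs7(seed=0x01):
    """Return one full period (127 bits) of a PRBS7 sequence.

    Assumption: shift-register form of x^7 + x^6 + 1; the seed must be non-zero.
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(127):
        # Feedback bit is the XOR of the two oldest register bits.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def circular_autocorrelation(bits, lag):
    """Circular autocorrelation of the +/-1 mapped sequence at the given lag."""
    n = len(bits)
    signal = [1 - 2 * b for b in bits]  # map 0 -> +1, 1 -> -1
    return sum(signal[k] * signal[(k + lag) % n] for k in range(n))


sequence = prbs7()
peak = circular_autocorrelation(sequence, 0)
off_peak = {circular_autocorrelation(sequence, lag) for lag in range(1, 127)}
```

    For a maximal-length sequence such as this one, the zero-lag value equals the period while every other lag gives the same small value, which is the "very low autocorrelation for lags different from zero" that the abstract refers to.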

  8. Controlled random sequences: methods of convex analysis and problems with functional constraints

    Science.gov (United States)

    Piunovskii, A. B.

    1998-12-01

    Contents. Introduction § 1. Controlled random sequences: main definitions and traditional approaches § 1.1. Description of the mathematical model § 1.2. Models with integral functionals § 1.3. Homogeneous Markov decision processes with average cost criteria § 2. Application of methods of convex analysis § 2.1. Properties of the space $\mathcal{D}$ § 2.2. Existence of optimal policies § 2.3. Sufficiency of selectors § 2.4. Preliminary results. The notion of an occupation measure § 2.5. Markov decision processes with total cost criteria and occupation measures § 2.6. Discounted costs and the corresponding occupation measures § 2.7. Average costs and ergodic occupation measures § 3. Problems with functional constraints § 3.1. General results § 3.2. Preliminary conclusions § 3.3. Markov decision processes with total cost criteria § 3.4. Homogeneous Markov decision processes with discounting § 3.5. Homogeneous Markov decision processes with average cost criteria § 3.6. Other constrained problems, related topics, and future prospects. Conclusion. Appendix. Elements of convex analysis and measure theory. Bibliography

  9. SNP selection and classification of genome-wide SNP data using stratified sampling random forests.

    Science.gov (United States)

    Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K

    2012-09-01

    For high-dimensional genome-wide association (GWA) case-control data of complex disease, a large portion of the single-nucleotide polymorphisms (SNPs) is usually irrelevant to the disease. A simple random sampling method in random forest, using the default mtry parameter to choose the feature subspace, will select too many subspaces without informative SNPs. An exhaustive search for an optimal mtry is often required in order to include useful and relevant SNPs and to discard the vast number of non-informative SNPs. However, this is too time-consuming and not favorable in GWA for high-dimensional data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for GWA high-dimensional data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. This stratified sampling procedure ensures that each subspace contains enough useful SNPs, avoids the very high computational cost of an exhaustive search for an optimal mtry, and maintains the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprised of 408 803 SNPs and Alzheimer case-control data comprised of 380 157 SNPs) to demonstrate that the proposed stratified sampling method is effective, and that it can generate a better random forest with higher accuracy and a lower error bound than Breiman's random forest generation method. For the Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders and merit further biological investigation.
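    The stratified subspace selection described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the informativeness scores, the function name stratified_subspace, and the handling of groups smaller than the requested sample size are all hypothetical.

```python
import random


def stratified_subspace(scores, n_groups, per_group, rng):
    """Pick feature indices by sampling equally from equal-width score bins."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_groups or 1.0  # fall back to 1.0 if all scores are equal
    groups = [[] for _ in range(n_groups)]
    for index, score in enumerate(scores):
        # The maximum score would land in bin n_groups, so clamp it to the last bin.
        bin_id = min(int((score - lo) / width), n_groups - 1)
        groups[bin_id].append(index)
    subspace = []
    for group in groups:
        # Assumption: a group with fewer members than per_group contributes all of them.
        subspace.extend(rng.sample(group, min(per_group, len(group))))
    return groups, subspace


# Hypothetical informativeness scores for ten SNPs.
scores = [0.01, 0.02, 0.03, 0.05, 0.04, 0.45, 0.50, 0.55, 0.90, 0.99]
groups, subspace = stratified_subspace(
    scores, n_groups=3, per_group=2, rng=random.Random(0)
)
```

    Each decision tree would draw its own subspace this way, so weakly and strongly informative SNPs are both represented without tuning mtry.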

  10. An efficient method of wavelength interval selection based on random frog for multivariate spectral calibration

    Science.gov (United States)

    Yun, Yong-Huan; Li, Hong-Dong; Wood, Leslie R. E.; Fan, Wei; Wang, Jia-Jun; Cao, Dong-Sheng; Xu, Qing-Song; Liang, Yi-Zeng

    2013-07-01

    Wavelength selection is a critical step for producing better prediction performance when applied to spectral data. Considering the fact that the vibrational and rotational spectra have continuous features of spectral bands, we propose a novel method of wavelength interval selection based on random frog, called interval random frog (iRF). To obtain all the possible continuous intervals, spectra are first divided into intervals by a moving window of fixed width over the whole spectrum. These overlapping intervals are ranked by applying random frog coupled with PLS, and the optimal ones are chosen. This method has been applied to two near-infrared spectral datasets, displaying higher efficiency in wavelength interval selection than other methods. The source code of iRF can be freely downloaded for academic research at the website: http://code.google.com/p/multivariate-calibration/downloads/list.
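    The interval-generation step can be sketched as below. This covers only the moving-window enumeration; the ranking by random frog coupled with PLS is not shown, and the function name and half-open index convention are assumptions made for illustration.

```python
def moving_window_intervals(n_points, width):
    """List every contiguous interval [start, end) of the given fixed width."""
    if not 1 <= width <= n_points:
        raise ValueError("width must be between 1 and n_points")
    # Sliding by one point at a time yields n_points - width + 1 overlapping intervals.
    return [(start, start + width) for start in range(n_points - width + 1)]


# A toy spectrum with six wavelength points and a window of three points.
intervals = moving_window_intervals(n_points=6, width=3)
```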

  11. Sequence-selective binding of C8-conjugated pyrrolobenzodiazepines (PBDs) to DNA.

    Science.gov (United States)

    Basher, Mohammad A; Rahman, Khondaker Miraz; Jackson, Paul J M; Thurston, David E; Fox, Keith R

    2017-11-01

    DNA footprinting and melting experiments have been used to examine the sequence-specific binding of C8-conjugates of pyrrolobenzodiazepines (PBDs) and benzofused rings including benzothiophene and benzofuran, which are attached using pyrrole- or imidazole-containing linkers. The conjugates modulate the covalent attachment points of the PBDs, so that they bind best to guanines flanked by A/T-rich sequences on either the 5'- or 3'-side. The linker affects the binding, and pyrrole produces larger changes than imidazole. Melting studies with 14-mer oligonucleotide duplexes confirm covalent attachment of the conjugates, which show a different selectivity to anthramycin and reveal that more than one ligand molecule can bind to each duplex. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Pairwise selection assembly for sequence-independent construction of long-length DNA.

    Science.gov (United States)

    Blake, William J; Chapman, Brad A; Zindal, Anuradha; Lee, Michael E; Lippow, Shaun M; Baynes, Brian M

    2010-05-01

    The engineering of biological components has been facilitated by de novo synthesis of gene-length DNA. Biological engineering at the level of pathways and genomes, however, requires a scalable and cost-effective assembly of DNA molecules that are longer than approximately 10 kb, and this remains a challenge. Here we present the development of pairwise selection assembly (PSA), a process that involves hierarchical construction of long-length DNA through the use of a standard set of components and operations. In PSA, activation tags at the termini of assembly sub-fragments are reused throughout the assembly process to activate vector-encoded selectable markers. Marker activation enables stringent selection for a correctly assembled product in vivo, often obviating the need for clonal isolation. Importantly, construction via PSA is sequence-independent, and does not require primary sequence modification (e.g. the addition or removal of restriction sites). The utility of PSA is demonstrated in the construction of a completely synthetic 91-kb chromosome arm from Saccharomyces cerevisiae.

  13. A population genetics-phylogenetics approach to inferring natural selection in coding sequences.

    Directory of Open Access Journals (Sweden)

    Daniel J Wilson

    2011-12-01

    Full Text Available Through an analysis of polymorphism within and divergence between species, we can hope to learn about the distribution of selective effects of mutations in the genome, changes in the fitness landscape that occur over time, and the location of sites involved in key adaptations that distinguish modern-day species. We introduce a novel method for the analysis of variation in selection pressures within and between species, spatially along the genome and temporally between lineages. We model codon evolution explicitly using a joint population genetics-phylogenetics approach that we developed for the construction of multiallelic models with mutation, selection, and drift. Our approach has the advantage of performing direct inference on coding sequences, inferring ancestral states probabilistically, utilizing allele frequency information, and generalizing to multiple species. We use a Bayesian sliding window model for intragenic variation in selection coefficients that efficiently combines information across sites and captures spatial clustering within the genome. To demonstrate the utility of the method, we infer selective pressures acting in Drosophila melanogaster and D. simulans from polymorphism and divergence data for 100 X-linked coding regions.

  14. Two-Level Adaptive Algebraic Multigrid for a Sequence of Problems with Slowly Varying Random Coefficients [Adaptive Algebraic Multigrid for Sequence of Problems with Slowly Varying Random Coefficients]

    Energy Technology Data Exchange (ETDEWEB)

    Kalchev, D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Ketelsen, C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Vassilevski, P. S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2013-11-07

    Our paper proposes an adaptive strategy for reusing a coarse space previously constructed by algebraic multigrid to build a two-level solver for a problem with nearby characteristics. A main target application is the solution of the linear problems that arise throughout a sequence of Markov chain Monte Carlo simulations of subsurface flow with an uncertain permeability field. We demonstrate the efficacy of the method with an extensive set of numerical experiments.

  15. EVIDENCE FOR THE UNIVERSALITY OF PROPERTIES OF RED-SEQUENCE GALAXIES IN X-RAY- AND RED-SEQUENCE-SELECTED CLUSTERS AT z ∼ 1

    Energy Technology Data Exchange (ETDEWEB)

    Foltz, R.; Wilson, G.; DeGroot, A. [Department of Physics and Astronomy, University of California Riverside, 900 University Avenue, Riverside, CA 92521 (United States); Rettura, A. [Infrared Processing and Analysis Center, California Institute of Technology, KS 314-6, Pasadena, CA 91125 (United States); Van der Burg, R. F. J. [Laboratoire AIM, IRFU/Service d’Astrophysique—CEA/DSM—CNRS—Université Paris Diderot, Bât. 709, CEA-Saclay, F-91191 Gif-sur-Yvette Cedex (France); Muzzin, A. [Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA (United Kingdom); Lidman, C. [Australian Astronomical Observatory, P.O. Box 915, North Ryde NSW 1670 (Australia); Demarco, R. [Department of Astronomy, Universidad de Concepcion, Barrio Universitario. Casilla 160-C, Concepcion (Chile); Nantais, Julie [Grupo Astronomía, Departamento de Ciencias Físicas, Universidad Andrés Bello, República 220, Santiago (Chile); Yee, H., E-mail: ryan.foltz@email.ucr.edu, E-mail: gillian.wilson@ucr.edu, E-mail: adegr001@ucr.edu, E-mail: arettura@astro.caltech.edu, E-mail: remco.van-der-burg@cea.fr, E-mail: avmuzzin@ast.cam.ac.uk, E-mail: clidman@aao.gov.au, E-mail: rdemarco@astro-udec.cl, E-mail: julie.nantais@unab.cl, E-mail: hyee@astro.utoronto.ca [Dept of Astronomy and Astrophysics, University of Toronto, 50 Saint George Street, Toronto, ON M5S 3H4 (Canada)

    2015-10-20

    We study the slope, intercept, and scatter of the color–magnitude and color–mass relations for a sample of 10 infrared red-sequence-selected clusters at z ∼ 1. The quiescent galaxies in these clusters formed the bulk of their stars above z ≳ 3 with an age spread Δt ≳ 1 Gyr. We compare UVJ color–color and spectroscopic-based galaxy selection techniques, and find a 15% difference in the galaxy populations classified as quiescent by these methods. We compare the color–magnitude relations from our red-sequence selected sample with X-ray- and photometric-redshift-selected cluster samples of similar mass and redshift. Within uncertainties, we are unable to detect any difference in the ages and star formation histories of quiescent cluster members in clusters selected by different methods, suggesting that the dominant quenching mechanism is insensitive to cluster baryon partitioning at z ∼ 1.

  16. Two-year Randomized Clinical Trial Of Self-etching Adhesives And Selective Enamel Etching

    OpenAIRE

    Pena, CE; Rodrigues, JA; Ely, C; Giannini, M; Reis, AF

    2016-01-01

    Objective: The aim of this randomized, controlled prospective clinical trial was to evaluate the clinical effectiveness of restoring noncarious cervical lesions with two self-etching adhesive systems applied with or without selective enamel etching. Methods: A one-step self-etching adhesive (Xeno V+) and a two-step self-etching system (Clearfil SE Bond) were used. The effectiveness of phosphoric acid selective etching of enamel margins was also evaluated. Fifty-six cavities were restored with...

  17. Selection of guide sequences that direct efficient cleavage of mRNA by human ribonuclease P.

    Science.gov (United States)

    Yuan, Y; Altman, S

    1994-03-04

    Any RNA, when in a complex with another oligoribonucleotide known as an external guide sequence (EGS), can become a substrate for ribonuclease P. Simulation of evolution in vitro was used to select EGSs that bind tightly to a target substrate messenger RNA and that increase the efficiency of cleavage of the target by human ribonuclease P to a level equal to that achieved with natural substrates. The most efficient EGSs form transfer RNA precursor-like structures with the target RNA, in which the analog of the anticodon stem has been disrupted, an indication that selection for the optimal substrate for ribonuclease P yields an RNA structure different from that of present-day transfer RNA precursors.

  18. Development and evaluation of a non-ribosomal random PCR and next-generation sequencing based assay for detection and sequencing of hand, foot and mouth disease pathogens.

    Science.gov (United States)

    Nguyen, Anh To; Tran, Thanh Tan; Hoang, Van Minh Tu; Nghiem, Ngoc My; Le, Nhu Nguyen Truc; Le, Thanh Thi My; Phan, Qui Tu; Truong, Khanh Huu; Le, Nhan Nguyen Thanh; Ho, Viet Lu; Do, Viet Chau; Ha, Tuan Manh; Nguyen, Hung Thanh; Nguyen, Chau Van Vinh; Thwaites, Guy; van Doorn, H Rogier; Le, Tan Van

    2016-07-07

    Hand, foot and mouth disease (HFMD) has become a major public health problem across the Asia-Pacific region, and is commonly caused by enterovirus A71 (EV-A71) and coxsackievirus A6 (CV-A6), CV-A10 and CV-A16. Generating pathogen whole-genome sequences is essential for understanding their evolutionary biology. The frequent replacements among EV serotypes and the limited number of available whole-genome sequences hinder the development of overlapping PCRs for whole-genome sequencing. We developed and evaluated a non-ribosomal random PCR (rPCR) and next-generation sequencing based assay for sequence-independent whole-genome amplification and sequencing of HFMD pathogens. A total of 16 EV-A71/CV-A6/CV-A10/CV-A16 PCR positive rectal/throat swabs (Cp values: 20.9-33.3) were used for assay evaluation. Our assay clearly outperformed the conventional rPCR in terms of the total number of EV-A71 reads and the percentage of EV-A71 reads: 2.6 % (1275/50,000 reads) vs. 0.1 % (31/50,000) and 6 % (3008/50,000) vs. 0.9 % (433/50,000) for two samples with Cp values of 30 and 26, respectively. Additionally, the assay could generate genome sequences with coverage of 94-100 % for 4 different enterovirus serotypes in 73 % of the tested samples, representing the first whole-genome sequences of CV-A6/10/16 from Vietnam, and assigned the correct serotype in 100 % of 24 tested specimens. In all but three, the consensus sequences obtained from two replicates of the same sample were 100 % identical, suggesting that our assay is highly reproducible. In conclusion, we have successfully developed a non-ribosomal rPCR and next-generation sequencing based assay for sensitive detection and direct whole-genome sequencing of HFMD pathogens from clinical samples.

  19. Intragraft selection of the T cell receptor repertoire by class I MHC sequences in tolerant recipients.

    Directory of Open Access Journals (Sweden)

    Dahai Liu

    Full Text Available BACKGROUND: Allograft tolerance of ACI (RT1(a)) recipients to WF (RT1(u)) hearts can be induced by allochimeric class I MHC molecules containing donor-type (RT1A(u)) immunogenic epitopes displayed on recipient-type (RT1A(a)) sequences. Here, we sought the mechanisms by which allochimeric sequences may affect responding T cells through T cell receptor (TCR) repertoire restriction. METHODOLOGY/PRINCIPAL FINDINGS: The soluble [alpha(1h) (u)]-RT1.A(a) allochimeric molecule was delivered into ACI recipients of WF hearts in the presence of a sub-therapeutic dose of cyclosporine (CsA). The TCR Vbeta spectrotyping of the splenocytes and cardiac allografts showed that the Vbeta gene families were differentially expressed within the TCR repertoire in allochimeric- or high-dose CsA-treated tolerant recipients at day +5 and +7 of post-transplantation. However, at day 30 of post-transplantation the allochimeric molecule-treated rats showed the restriction of TCR repertoire with altered dominant size peaks representing preferential clonal expansion of Vbeta7, Vbeta11, Vbeta13, Vbeta14, and Vbeta15 genes. Moreover, we found a positive correlation between the alteration of Vbeta profile, restriction of TCR repertoire, and the establishment of allograft tolerance. CONCLUSIONS: Our findings indicate that presentation of allochimeric MHC class I sequences that partially mimic donor and recipient epitopes may induce a unique tolerant state by selecting alloresponsive Vbeta genes.

  20. Hebbian Learning in a Random Network Captures Selectivity Properties of the Prefrontal Cortex.

    Science.gov (United States)

    Lindsay, Grace W; Rigotti, Mattia; Warden, Melissa R; Miller, Earl K; Fusi, Stefano

    2017-11-08

    Complex cognitive behaviors, such as context-switching and rule-following, are thought to be supported by the prefrontal cortex (PFC). Neural activity in the PFC must thus be specialized to specific tasks while retaining flexibility. Nonlinear "mixed" selectivity is an important neurophysiological trait for enabling complex and context-dependent behaviors. Here we investigate (1) the extent to which the PFC exhibits computationally relevant properties, such as mixed selectivity, and (2) how such properties could arise via circuit mechanisms. We show that PFC cells recorded from male and female rhesus macaques during a complex task show a moderate level of specialization and structure that is not replicated by a model wherein cells receive random feedforward inputs. While random connectivity can be effective at generating mixed selectivity, the data show significantly more mixed selectivity than predicted by a model with otherwise matched parameters. A simple Hebbian learning rule applied to the random connectivity, however, increases mixed selectivity and enables the model to match the data more accurately. To explain how learning achieves this, we provide analysis along with a clear geometric interpretation of the impact of learning on selectivity. After learning, the model also matches the data on measures of noise, response density, clustering, and the distribution of selectivities. Of two styles of Hebbian learning tested, the simpler and more biologically plausible option better matches the data. These modeling results provide clues about how neural properties important for cognition can arise in a circuit and make clear experimental predictions regarding how various measures of selectivity would evolve during animal training. SIGNIFICANCE STATEMENT The prefrontal cortex is a brain region believed to support the ability of animals to engage in complex behavior. How neurons in this area respond to stimuli-and in particular, to combinations of stimuli ("mixed

  1. Selecting Optimal Parameters of Random Linear Network Coding for Wireless Sensor Networks

    DEFF Research Database (Denmark)

    Heide, Janus; Zhang, Qi; Fitzek, Frank

    2013-01-01

    This work studies how to select optimal code parameters of Random Linear Network Coding (RLNC) in Wireless Sensor Networks (WSNs). With Rateless Deluge [1] the authors proposed to apply Network Coding (NC) for Over-the-Air Programming (OAP) in WSNs, and demonstrated that with NC a significant...

  2. PHASTpep: Analysis Software for Discovery of Cell-Selective Peptides via Phage Display and Next-Generation Sequencing.

    Directory of Open Access Journals (Sweden)

    Lindsey T Brinton

    Full Text Available Next-generation sequencing has enhanced the phage display process, allowing for the quantification of millions of sequences resulting from the biopanning process. In response, many valuable analysis programs focused on specificity and finding targeted motifs or consensus sequences were developed. For targeted drug delivery and molecular imaging, it is also necessary to find peptides that are selective, targeting only the cell type or tissue of interest. We present a new analysis strategy and accompanying software, PHage Analysis for Selective Targeted PEPtides (PHASTpep), which identifies highly specific and selective peptides. Using this process, we discovered and validated, both in vitro and in vivo in mice, two sequences (HTTIPKV and APPIMSV) targeted to pancreatic cancer-associated fibroblasts that escaped identification using previously existing software. Our selectivity analysis makes it possible to discover peptides that target a specific cell type and avoid other cell types, enhancing clinical translatability by circumventing complications with systemic use.

  3. Filtering and ranking techniques for automated selection of high-quality 16S rRNA gene sequences.

    Science.gov (United States)

    De Smet, Wim; De Loof, Karel; De Vos, Paul; Dawyndt, Peter; De Baets, Bernard

    2013-12-01

    StrainInfo has augmented its type strain and species/subspecies passports with a recommendation for a high-quality 16S rRNA gene sequence available from the public sequence databases. These recommendations are generated by an automated pipeline that collects all candidate 16S rRNA gene sequences for a prokaryotic type strain, filters out low-quality sequences and retains a high-quality sequence from the remaining pool. Due to thorough automation, recommendations can be renewed daily using the latest updates of the public sequence databases and the latest species descriptions. We discuss the quality criteria constructed to filter and rank available 16S rRNA gene sequences, and show how a partially ordered set (poset) ranking algorithm can be applied to solve the multi-criteria ranking problem of selecting the best candidate sequence. The proof of concept of the recommender system is validated by comparing the results of automated selection with an expert selection made in the All-Species Living Tree Project. Based on these validation results, the pipeline may reliably be applied for non-type strains and developed further for the automated selection of housekeeping genes. Copyright © 2013 Elsevier GmbH. All rights reserved.
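    The idea behind a partially ordered set (poset) ranking can be sketched with a simple dominance test. This is an illustrative assumption about how multi-criteria candidates might be compared, not the StrainInfo pipeline itself; the criteria, candidate names, and function names are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def maximal_elements(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, criteria in candidates.items()
        if not any(dominates(other, criteria) for other in candidates.values())
    ]


# Hypothetical criteria: (sequence length, fraction of unambiguous bases).
candidates = {
    "seqA": (1500, 0.99),
    "seqB": (1450, 0.999),
    "seqC": (1200, 0.95),
}
best = maximal_elements(candidates)
```

    Here seqA and seqB are incomparable (one is longer, the other has fewer ambiguous bases), so both remain as maximal elements; a full ranking scheme would still need a rule to choose a single sequence from among them.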

  4. Protein Ordered Sequences are Formed by Random Joining of Amino Acids in Protein 0th-Order Structure, Followed by Evolutionary Process

    Science.gov (United States)

    Ikehara, Kenji

    2014-12-01

    Only random processes should occur on the primitive Earth. In contrast, many ordered sequences are synthesized according to genetic information on the present Earth. In this communication, I have proposed an idea that protein 0th-order structures or specific amino acid compositions would mediate the transfer from random process to formation of ordered sequences, after formation of double-stranded genes.

  5. Impact of Selection Bias on Treatment Effect Size Estimates in Randomized Trials of Oral Health Interventions: A Meta-epidemiological Study.

    Science.gov (United States)

    Saltaji, H; Armijo-Olivo, S; Cummings, G G; Amin, M; da Costa, B R; Flores-Mir, C

    2018-01-01

    Emerging evidence suggests that design flaws of randomized controlled trials can result in over- or underestimation of the treatment effect size (ES). The objective of this study was to examine associations between treatment ES estimates and adequacy of sequence generation, allocation concealment, and baseline comparability among a sample of oral health randomized controlled trials. For our analysis, we selected all meta-analyses that included a minimum of 5 oral health randomized controlled trials and used continuous outcomes. We extracted data, in duplicate, related to items of selection bias (sequence generation, allocation concealment, and baseline comparability) in the Cochrane Risk of Bias tool. Using a 2-level meta-meta-analytic approach with a random effects model to allow for intra- and inter-meta-analysis heterogeneity, we quantified the impact of selection bias on the magnitude of ES estimates. We identified 64 meta-analyses, including 540 randomized controlled trials analyzing 137,957 patients. Sequence generation was judged to be adequate (at low risk of bias) in 32% (n = 173) of trials, and baseline comparability was judged to be adequate in 77.8% of trials. Allocation concealment was unclear in the majority of trials (n = 458, 84.8%). We identified significantly larger treatment ES estimates in trials that had inadequate/unknown sequence generation (difference in ES = 0.13; 95% CI: 0.01 to 0.25) and inadequate/unknown allocation concealment (difference in ES = 0.15; 95% CI: 0.02 to 0.27). In contrast, baseline imbalance (difference in ES = 0.01, 95% CI: -0.09 to 0.12) was not associated with inflated or underestimated ES. In conclusion, treatment ES estimates were 0.13 and 0.15 larger in trials with inadequate/unknown sequence generation and inadequate/unknown allocation concealment, respectively. Therefore, authors of systematic reviews using oral health randomized controlled trials should perform sensitivity analyses based on the adequacy of

  6. Genome-Based Selection and Characterization of Fusarium circinatum-Specific Sequences

    Directory of Open Access Journals (Sweden)

    Mkhululi N. Maphosa

    2016-03-01

    Full Text Available Fusarium circinatum is an important pathogen of pine trees and its management in the commercial forestry environment relies largely on early detection, particularly in seedling nurseries. The fact that the entire genome of this pathogen is available opens new avenues for the development of diagnostic tools for this fungus. In this study we identified open reading frames (ORFs) unique to F. circinatum and determined that they were specific to the pathogen. The ORF identification process involved bioinformatics-based screening of all the putative F. circinatum ORFs against public databases. This was followed by functional characterization of ORFs found to be unique to F. circinatum. We used PCR- and hybridization-based approaches to confirm the presence of selected unique genes in different strains of F. circinatum and their absence from other Fusarium species for which genome sequence data are not yet available. These included species that are closely related to F. circinatum as well as those that are commonly encountered in the forestry environment. Thirty-six ORFs were identified as potentially unique to F. circinatum. Nineteen of these encode proteins with known domains while the other 17 encode proteins of unknown function. The results of our PCR analyses and hybridization assays showed that three of the selected genes were present in all of the strains of F. circinatum tested and absent from the other Fusarium species screened. These data thus indicate that the selected genes are common and unique to F. circinatum. These genes thus could be good candidates for use in rapid, in-the-field diagnostic assays specific to F. circinatum. Our study further demonstrates how genome sequence information can be mined for the identification of new diagnostic markers for the detection of plant pathogens.

  7. Tehran Air Pollutants Prediction Based on Random Forest Feature Selection Method

    Science.gov (United States)

    Shamsoddini, A.; Aboodi, M. R.; Karami, J.

    2017-09-01

    Air pollution, one of the most serious forms of environmental pollution, poses a huge threat to human life. Air pollution leads to environmental instability, and has harmful and undesirable effects on the environment. Modern methods for predicting pollutant concentrations can improve decision making and provide appropriate solutions. This study examines the performance of Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Network methods, in order to achieve an efficient model to estimate carbon monoxide, nitrogen dioxide, sulfur dioxide and PM2.5 contents in the air. The results indicated that Artificial Neural Networks fed by the attributes selected by the Random Forest feature selection method performed more accurately than other models for all pollutants. The estimation accuracy for sulfur dioxide was lower than for the other air contaminants, whereas nitrogen dioxide was predicted more accurately than the other pollutants.

  8. TEHRAN AIR POLLUTANTS PREDICTION BASED ON RANDOM FOREST FEATURE SELECTION METHOD

    Directory of Open Access Journals (Sweden)

    A. Shamsoddini

    2017-09-01

    Air pollution, one of the most serious forms of environmental pollution, poses a huge threat to human life. Air pollution leads to environmental instability, and has harmful and undesirable effects on the environment. Modern methods for predicting pollutant concentrations can improve decision making and provide appropriate solutions. This study examines the performance of Random Forest feature selection in combination with multiple linear regression and Multilayer Perceptron Artificial Neural Network methods, with the aim of building an efficient model for estimating carbon monoxide, nitrogen dioxide, sulfur dioxide and PM2.5 concentrations in the air. The results indicated that Artificial Neural Networks fed with the attributes selected by the Random Forest feature selection method were more accurate than the other models for all pollutants. Estimation accuracy was lowest for sulfur dioxide, whereas nitrogen dioxide was predicted more accurately than the other pollutants.

  9. Classification of epileptic EEG signals based on simple random sampling and sequential feature selection.

    Science.gov (United States)

    Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui

    2016-06-01

    Electroencephalogram (EEG) signals are used broadly in the medical fields. The main applications of EEG signals are the diagnosis and treatment of diseases such as epilepsy, Alzheimer's disease and sleep disorders. This paper presents a new method which extracts and selects features from multi-channel EEG signals. This research focuses on three main points. Firstly, a simple random sampling (SRS) technique is used to extract features from the time domain of EEG signals. Secondly, the sequential feature selection (SFS) algorithm is applied to select the key features and to reduce the dimensionality of the data. Finally, the features extracted by SRS and selected by SFS are forwarded to a least square support vector machine (LS_SVM) classifier to classify the EEG signals. The experimental results show that the method achieves 99.90%, 99.80% and 100% for classification accuracy, sensitivity and specificity, respectively.
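
    The SRS feature-extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `srs_features`, the choice of summary statistics (min, max, mean, median, standard deviation), the sample sizes, and the synthetic signal are all assumptions, since the abstract does not specify them.

```python
import random
import statistics


def srs_features(segment, sample_size, n_samples, seed=0):
    """Draw simple random samples (without replacement) from one EEG
    segment and summarise each sample with time-domain statistics.

    The statistics chosen here are an assumption for illustration.
    """
    rng = random.Random(seed)
    features = []
    for _ in range(n_samples):
        sample = rng.sample(segment, sample_size)
        features.extend([
            min(sample),
            max(sample),
            statistics.mean(sample),
            statistics.median(sample),
            statistics.stdev(sample),
        ])
    return features


# Synthetic stand-in for one EEG channel (not real data).
segment = [((i * 37) % 101) / 10.0 for i in range(1000)]
feats = srs_features(segment, sample_size=100, n_samples=4)
print(len(feats))  # 4 samples x 5 statistics = 20 features
```

    The resulting feature vector would then go to a feature-selection step and a classifier; both are outside this sketch.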

  10. Quasi-Coherent Noise Jamming to LFM Radar Based on Pseudo-random Sequence Phase-modulation

    Directory of Open Access Journals (Sweden)

    N. Tai

    2015-12-01

    A novel quasi-coherent noise jamming method is proposed against linear frequency modulation (LFM) signals and pulse compression radar. Based on the structure of digital radio frequency memory (DRFM), the jamming signal is generated by pseudo-random sequence phase modulation of the sampled radar signal. The characteristics of the jamming signal in the time domain and frequency domain are analyzed in detail. Results from the ambiguity function indicate that a blanket jamming effect along the range direction is formed when the jamming signal passes through the matched filter. By flexibly controlling the parameters of the interrupted-sampling pulse and the pseudo-random sequence, different covering distances and jamming effects can be achieved. At equivalent jamming power, this jamming obtains higher processing gain than non-coherent jamming. The jamming signal raises the detection threshold so that the real target avoids detection. Simulation results and a circuit engineering implementation validate that the jamming signal covers the real target effectively.
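
    The core idea of phase-modulating sampled radar data with a pseudo-random sequence can be sketched as below. This is a simplified illustration under stated assumptions: the 7-stage shift register, its feedback polynomial (x^7 + x + 1), the one-bit-per-sample chip rate, and the toy chirp parameters are hypothetical choices, and the interrupted-sampling stage from the abstract is omitted.

```python
import cmath
import math


def m_sequence(seed=(1, 0, 0, 0, 0, 0, 0), length=127):
    """Output bits of a 7-stage LFSR with recurrence
    a[n+7] = a[n+1] XOR a[n] (polynomial x^7 + x + 1, period 127).

    The register length and polynomial are assumed for illustration.
    """
    state = list(seed)
    bits = []
    for _ in range(length):
        bits.append(state[-1])
        feedback = state[-1] ^ state[-2]
        state = [feedback] + state[:-1]
    return bits


def phase_modulate(samples, bits):
    """Shift the phase of each complex sample by 0 or pi according to
    the corresponding pseudo-random bit."""
    return [s * cmath.exp(1j * math.pi * b) for s, b in zip(samples, bits)]


# Toy baseband LFM chirp standing in for the sampled radar pulse.
n = 127
chirp = [cmath.exp(1j * math.pi * 0.002 * k * k) for k in range(n)]
jam = phase_modulate(chirp, m_sequence(length=n))
```

    Because the modulation only flips phase, each jamming sample keeps the magnitude of the stored radar sample while its correlation with the matched filter is spread out.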

  11. Evaluation of artificial selection in Standard Poodles using whole-genome sequencing.

    Science.gov (United States)

    Friedenberg, Steven G; Meurs, Kathryn M; Mackay, Trudy F C

    2016-12-01

    Identifying regions of artificial selection within dog breeds may provide insights into genetic variation that underlies breed-specific traits or diseases, particularly if these traits or disease predispositions are fixed within a breed. In this study, we searched for runs of homozygosity (ROH) and calculated the d_i statistic (which is based upon F_ST) to identify regions of artificial selection in Standard Poodles using high-coverage, whole-genome sequencing data of 15 Standard Poodles and 49 dogs across seven other breeds. We identified consensus ROH regions ≥1 Mb in length and common to at least ten Standard Poodles covering 0.6% of the genome, and d_i regions that most distinguish Standard Poodles from other breeds covering 3.7% of the genome. Within these regions, we identified enriched gene pathways related to olfaction, digestion, and taste, as well as pathways related to adrenal hormone biosynthesis, T cell function, and protein ubiquitination that could contribute to the pathogenesis of some Poodle-prevalent autoimmune diseases. We also validated variants related to hair coat and skull morphology that have previously been identified as being under selective pressure in Poodles, and flagged additional polymorphisms in genes such as ITGA2B, CBX4, and TNXB that may represent strong candidates for other common Poodle disorders.
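
    A run-of-homozygosity scan of the kind mentioned above can be illustrated with a deliberately naive sketch. The genotype coding (0 and 2 homozygous, 1 heterozygous), the function name `find_roh`, and the toy data are assumptions; production ROH callers typically tolerate some heterozygous or missing calls, which this sketch does not.

```python
def find_roh(positions, genotypes, min_length_bp=1_000_000):
    """Return (start, end) for every stretch of consecutive homozygous
    calls that spans at least min_length_bp.

    genotypes: 0 = homozygous reference, 1 = heterozygous,
               2 = homozygous alternate (assumed coding).
    """
    runs = []
    start = None
    prev = None
    for pos, g in zip(positions, genotypes):
        if g != 1:
            if start is None:
                start = pos
            prev = pos
        else:
            # A heterozygous call closes the current run, if any.
            if start is not None and prev - start >= min_length_bp:
                runs.append((start, prev))
            start = None
    if start is not None and prev - start >= min_length_bp:
        runs.append((start, prev))
    return runs


# Toy data: one variant every 100 kb, two heterozygous calls.
positions = [100_000 * i for i in range(30)]
genotypes = [0] * 12 + [1] + [2] * 5 + [1] + [0] * 11
print(find_roh(positions, genotypes))
# [(0, 1100000), (1900000, 2900000)]
```

    The middle stretch is homozygous but only 400 kb long, so it falls below the 1 Mb threshold and is not reported.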

  12. The role of upstream sequences in selecting the reading frame on tmRNA

    Directory of Open Access Journals (Sweden)

    Dewey Jonathan D

    2008-06-01

    Background: tmRNA acts first as a tRNA and then as an mRNA to rescue stalled ribosomes in eubacteria. Two unanswered questions about tmRNA function remain: how does tmRNA, lacking an anticodon, bypass the decoding machinery and enter the ribosome? Secondly, how does the ribosome choose the proper codon to resume translation on tmRNA? According to the -1 triplet hypothesis, the answer to both questions lies in the unique properties of the three nucleotides upstream of the first tmRNA codon. These nucleotides assume an A-form conformation that mimics the codon-anticodon interaction, leading to recognition by the decoding center and choice of the reading frame. The -1 triplet hypothesis is important because it is the most credible model in which direct binding and recognition by the ribosome sets the reading frame on tmRNA. Results: Conformational analysis predicts that 18 triplets cannot form the correct structure to function as the -1 triplet of tmRNA. We tested the tmRNA activity of all possible -1 triplet mutants using a genetic assay in Escherichia coli. While many mutants displayed reduced activity, our findings do not match the predictions of this model. Additional mutagenesis identified sequences further upstream that are required for tmRNA function. An immunoblot assay for translation of the tmRNA tag revealed that certain mutations in U85, A86, and the -1 triplet sequence result in improper selection of the first codon and translation in the wrong frame (-1 or +1) in vivo. Conclusion: Our findings disprove the -1 triplet hypothesis. The -1 triplet is not required for accommodation of tmRNA into the ribosome, although it plays a minor role in frame selection. Our results strongly disfavor direct ribosomal recognition of the upstream sequence, instead supporting a model in which the binding of a separate ligand to A86 is primarily responsible for frame selection.

  13. Personal name in Igbo Culture: A dataset on randomly selected personal names and their statistical analysis.

    Science.gov (United States)

    Okagbue, Hilary I; Opanuga, Abiodun A; Adamu, Muminu O; Ugwoke, Paulinus O; Obasi, Emmanuela C M; Eze, Grace A

    2017-12-01

    This data article contains a statistical analysis of Igbo personal names and a randomly selected sample of such names. The data are presented as follows: 1) a simple random sample of Igbo personal names and the gender associated with each name; 2) the distribution of vowels, consonants and letters of the alphabet in the personal names; 3) the distribution of name length; 4) the distribution of initial and terminal letters of Igbo personal names. The significance of the data is discussed.
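
    The four tabulations listed above are simple frequency counts, which can be sketched as follows. The five names are illustrative examples only and are not taken from the dataset; the plain `aeiou` vowel set is also a simplifying assumption that ignores the dotted Igbo vowels.

```python
from collections import Counter

# Illustrative names only; NOT drawn from the published dataset.
names = ["Chinedu", "Ngozi", "Obinna", "Adaeze", "Emeka"]

# Simplified vowel set (assumption): dotted Igbo vowels are ignored.
VOWELS = set("aeiou")


def name_stats(names):
    """Tabulate name lengths, initial and terminal letters, and total
    vowel and consonant counts for a list of names."""
    lengths = Counter(len(n) for n in names)
    initials = Counter(n[0].lower() for n in names)
    terminals = Counter(n[-1].lower() for n in names)
    vowels = sum(ch in VOWELS for n in names for ch in n.lower())
    letters = sum(ch.isalpha() for n in names for ch in n)
    return lengths, initials, terminals, vowels, letters - vowels


lengths, initials, terminals, n_vowels, n_consonants = name_stats(names)
print(lengths[5], terminals["a"], n_vowels, n_consonants)  # 2 2 15 14
```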

  14. Chirality- and sequence-selective successive self-sorting via specific homo- and complementary-duplex formations.

    Science.gov (United States)

    Makiguchi, Wataru; Tanabe, Junki; Yamada, Hidekazu; Iida, Hiroki; Taura, Daisuke; Ousaka, Naoki; Yashima, Eiji

    2015-06-08

    Self-recognition and self-discrimination within complex mixtures are of fundamental importance in biological systems, which entirely rely on the preprogrammed monomer sequences and homochirality of biological macromolecules. Here we report artificial chirality- and sequence-selective successive self-sorting of chiral dimeric strands bearing carboxylic acid or amidine groups joined by chiral amide linkers with different sequences through homo- and complementary-duplex formations. A mixture of carboxylic acid dimers linked by racemic-1,2-cyclohexane bis-amides with different amide sequences (NHCO or CONH) self-associate to form homoduplexes in a completely sequence-selective way, the structures of which are different from each other depending on the linker amide sequences. The further addition of an enantiopure amide-linked amidine dimer to a mixture of the racemic carboxylic acid dimers resulted in the formation of a single optically pure complementary duplex with a 100% diastereoselectivity and complete sequence specificity stabilized by the amidinium-carboxylate salt bridges, leading to the perfect chirality- and sequence-selective duplex formation.

  15. Cryptographic pseudo-random sequences from the chaotic Hénon ...

    Indian Academy of Sciences (India)

    dimensional discrete-time Hénon map is proposed. Properties of the proposed sequences pertaining to linear complexity, linear complexity profile, correlation and auto-correlation are investigated. All these properties of the sequences suggest a ...

  16. A novel whole genome amplification method using type IIS restriction enzymes to create overhangs with random sequences.

    Science.gov (United States)

    Pan, Xiaoming; Wan, Baihui; Li, Chunchuan; Liu, Yu; Wang, Jing; Mou, Haijin; Liang, Xingguo

    2014-08-20

    Ligation-mediated polymerase chain reaction (LM-PCR) is a whole genome amplification (WGA) method in which genomic DNA is cleaved into numerous fragments and all of the fragments are then amplified by PCR after attaching a universal end sequence. However, self-ligation of these fragments can occur and may cause biased amplification, restricting the method's application. To decrease the self-ligation probability, here we use type IIS restriction enzymes to digest genomic DNA into fragments with 4-5 nt overhangs of random sequence. After an adapter with random end sequences is ligated to these fragments, PCR is carried out and almost all DNA sequences present are amplified. In this study, the whole genome of Vibrio parahaemolyticus was amplified and the amplification efficiency was evaluated by quantitative PCR. The results suggested that our approach could provide sufficient genomic DNA of good quality to meet the requirements of various genetic analyses. Copyright © 2014. Published by Elsevier B.V.

  17. SNP calling using genotype model selection on high-throughput sequencing data

    KAUST Repository

    You, Na

    2012-01-16

    Motivation: A review of the available single nucleotide polymorphism (SNP) calling procedures for Illumina high-throughput sequencing (HTS) platform data reveals that most rely mainly on base-calling and mapping qualities as sources of error when calling SNPs. Thus, errors not involved in base-calling or alignment, such as those in genomic sample preparation, are not accounted for. Results: A novel method of consensus and SNP calling, Genotype Model Selection (GeMS), is given which accounts for the errors that occur during the preparation of the genomic sample. Simulations and real data analyses indicate that GeMS has the best performance balance of sensitivity and positive predictive value among the tested SNP callers. © The Author 2012. Published by Oxford University Press. All rights reserved.

  18. In vitro selection of optimal DNA substrates for T4 RNA ligase

    Science.gov (United States)

    Harada, Kazuo; Orgel, Leslie E.

    1993-01-01

    We have used in vitro selection techniques to characterize DNA sequences that are ligated efficiently by T4 RNA ligase. We find that the ensemble of selected sequences ligated about 10 times as efficiently as the random mixture of sequences used as the input for selection. Surprisingly, the majority of the selected sequences approximated a well-defined consensus sequence.

  19. Inference of gorilla demographic and selective history from whole-genome sequence data.

    Science.gov (United States)

    McManus, Kimberly F; Kelley, Joanna L; Song, Shiya; Veeramah, Krishna R; Woerner, August E; Stevison, Laurie S; Ryder, Oliver A; Great Ape Genome Project; Kidd, Jeffrey M; Wall, Jeffrey D; Bustamante, Carlos D; Hammer, Michael F

    2015-03-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Identification of peptide sequences that selectively bind to pentaerythritol trinitrate hemisuccinate, a surrogate of PETN, via phage display technology.

    Science.gov (United States)

    Kubas, George; Rees, William; Caguiat, Jonathan; Asch, David; Fagan, Diana; Cortes, Pedro

    2017-03-01

    The present research investigates the identification of amino acid sequences that selectively bind to a pentaerythritol tetranitrate (PETN) explosive surrogate. Through the use of a phage display technique and enzyme-linked immunosorbent assays (ELISA), a peptide library was tested against pentaerythritol trinitrate hemisuccinate (PETNH), a surrogate of PETN, to screen for those with amino acids having affinity toward the explosive. The results suggest that the library contains peptides selective to PETNH. Following three rounds of panning, clones were picked and tested for specificity toward PETNH. ELISA results from these samples show that each phage clone has some level of selectivity for binding to PETNH. The peptides from these clones have been sequenced and shown to contain certain common amino acid segments among them. This work represents a technological platform for identifying amino-acid sequences selective toward any bio-chem analyte of interest. © 2016 Wiley Periodicals, Inc.

  1. K-Ras(G12D)-selective inhibitory peptides generated by random peptide T7 phage display technology.

    Science.gov (United States)

    Sakamoto, Kotaro; Kamada, Yusuke; Sameshima, Tomoya; Yaguchi, Masahiro; Niida, Ayumu; Sasaki, Shigekazu; Miwa, Masanori; Ohkubo, Shoichi; Sakamoto, Jun-Ichi; Kamaura, Masahiro; Cho, Nobuo; Tani, Akiyoshi

    2017-03-11

    Amino-acid mutations of Gly 12 (e.g. G12D, G12V, G12C) of V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (K-Ras), the most promising drug target in cancer therapy, are major growth drivers in various cancers. Although over 30 years have passed since the discovery of these mutations in most cancer patients, effective mutated K-Ras inhibitors have not been marketed. Here, we report novel and selective inhibitory peptides to K-Ras(G12D). We screened random peptide libraries displayed on T7 phage against purified recombinant K-Ras(G12D), with thorough subtraction of phages bound to wild-type K-Ras, and obtained KRpep-2 (Ac-RRCPLYISYDPVCRR-NH2) as a consensus sequence. KRpep-2 showed more than 10-fold binding- and inhibition-selectivity to K-Ras(G12D), both in SPR analysis and GDP/GTP exchange enzyme assay. K_D and IC50 values were 51 and 8.9 nM, respectively. After subsequent sequence optimization, we successfully generated KRpep-2d (Ac-RRRRCPLYISYDPVCRRRR-NH2) that inhibited enzyme activity of K-Ras(G12D) with IC50 = 1.6 nM and significantly suppressed ERK-phosphorylation, downstream of K-Ras(G12D), along with A427 cancer cell proliferation at 30 μM peptide concentration. To our knowledge, this is the first report of a K-Ras(G12D)-selective inhibitor, contributing to the development and study of K-Ras(G12D)-targeting drugs. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis.

    Science.gov (United States)

    Fernandes, Andrew D; Reid, Jennifer Ns; Macklaim, Jean M; McMurrough, Thomas A; Edgell, David R; Gloor, Gregory B

    2014-01-01

    Experimental designs that take advantage of high-throughput sequencing to generate datasets include RNA sequencing (RNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), sequencing of 16S rRNA gene fragments, metagenomic analysis and selective growth experiments. In each case the underlying data are similar and are composed of counts of sequencing reads mapped to a large number of features in each sample. Despite this underlying similarity, the data analysis methods used for these experimental designs are all different, and do not translate across experiments. Alternative methods have been developed in the physical and geological sciences that treat similar data as compositions. Compositional data analysis methods transform the data to relative abundances with the result that the analyses are more robust and reproducible. Data from an in vitro selective growth experiment, an RNA-seq experiment and the Human Microbiome Project 16S rRNA gene abundance dataset were examined by ALDEx2, a compositional data analysis tool that uses Bayesian methods to infer technical and statistical error. The ALDEx2 approach is shown to be suitable for all three types of data: it correctly identifies both the direction and differential abundance of features in the differential growth experiment, it identifies a substantially similar set of differentially expressed genes in the RNA-seq dataset as the leading tools and it identifies as differential the taxa that distinguish the tongue dorsum and buccal mucosa in the Human Microbiome Project dataset. The design of ALDEx2 reduces the number of false positive identifications that result from datasets composed of many features in few samples. Statistical analysis of high-throughput sequencing datasets composed of per feature counts showed that the ALDEx2 R package is a simple and robust tool, which can be applied to RNA-seq, 16S rRNA gene sequencing and differential growth datasets, and by extension to other techniques that use a

  3. Simulated Performance Evaluation of a Selective Tracker Through Random Scenario Generation

    DEFF Research Database (Denmark)

    Hussain, Dil Muhammad Akbar

    2006-01-01

    The paper presents a simulation study on the performance of a target tracker using a selective track splitting filter algorithm through a random scenario implemented on a digital signal processor. In a typical track splitting filter all the observations that fall inside a likelihood ellipse...... are used for update; however, in our proposed selective track splitting filter fewer observations are used for track update. Much of the previous performance work [1] has been done on specific (deterministic) scenarios. One of the reasons for considering the specific scenarios, which were...

  4. Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat.

    Science.gov (United States)

    Atanur, Santosh S; Diaz, Ana Garcia; Maratou, Klio; Sarkis, Allison; Rotival, Maxime; Game, Laurence; Tschannen, Michael R; Kaisaki, Pamela J; Otto, Georg W; Ma, Man Chun John; Keane, Thomas M; Hummel, Oliver; Saar, Kathrin; Chen, Wei; Guryev, Victor; Gopalakrishnan, Kathirvel; Garrett, Michael R; Joe, Bina; Citterio, Lorena; Bianchi, Giuseppe; McBride, Martin; Dominiczak, Anna; Adams, David J; Serikawa, Tadao; Flicek, Paul; Cuppen, Edwin; Hubner, Norbert; Petretto, Enrico; Gauguier, Dominique; Kwitek, Anne; Jacob, Howard; Aitman, Timothy J

    2013-08-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and insulin resistance, along with their respective control strains. Altogether, we identified more than 13 million single-nucleotide variants, indels, and structural variants across these rat strains. Analysis of strain-specific selective sweeps and gene clusters implicated genes and pathways involved in cation transport, angiotensin production, and regulators of oxidative stress in the development of cardiovascular disease phenotypes in rats. Many of the rat loci that we identified overlap with previously mapped loci for related traits in humans, indicating the presence of shared pathways underlying these phenotypes in rats and humans. These data represent a step change in resources available for evolutionary analysis of complex traits in disease models. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Classification of epileptic EEG signals based on simple random sampling and sequential feature selection

    OpenAIRE

    Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui

    2016-01-01

    Electroencephalogram (EEG) signals are used broadly in the medical fields. The main applications of EEG signals are the diagnosis and treatment of diseases such as epilepsy, Alzheimer's disease and sleep disorders. This paper presents a new method which extracts and selects features from multi-channel EEG signals. This research focuses on three main points. Firstly, a simple random sampling (SRS) technique is used to extract features from the time domain of EEG signals. Secondly, the sequential fea...

  6. A Method for Large-Scale Screening of Random Sequence Libraries to Determine the Function of Unstructured Regions from Essential Proteins.

    Science.gov (United States)

    Millau, Jean-François; Guillemette, Benoit; Gaudreau, Luc

    2017-01-01

    In this chapter we present a method allowing the screening of random sequences to discover essential aspects of unstructured protein regions in yeast. The approach can be applied to any protein with unstructured peptide sequences for which functions are difficult to decipher, for example the N-terminal tails of histones. The protocol first describes the building and preparation of a large library of random peptides in fusion with a protein of interest. Recent technical advances in oligonucleotide synthesis allow the construction of long random sequences up to 35 residues long. The protocol details the screening of the library in yeast for sequences that can functionally replace an unstructured domain in an essential protein in vivo. Our method typically identifies sequences that, while being totally different from the wild type, retain essential features allowing yeast to live. This collection of proteins with functional synthetic sequences can subsequently be used in phenotypic tests or genetic screens in order to discover genetic interaction.

  7. Random Tagging Genotyping by Sequencing (rtGBS), an Unbiased Approach to Locate Restriction Enzyme Sites across the Target Genome.

    Directory of Open Access Journals (Sweden)

    Elena Hilario

    Genotyping by sequencing (GBS) is a restriction enzyme based targeted approach developed to reduce genome complexity and discover genetic markers when a priori sequence information is unavailable. Sufficient coverage at each locus is essential to distinguish heterozygous from homozygous sites accurately. The number of GBS samples that can be pooled in one sequencing lane is limited by the number of restriction sites present in the genome and the read depth required at each site per sample for accurate calling of single-nucleotide polymorphisms. Loci bias was observed using a slight modification of the Elshire et al. method: some restriction enzyme sites were represented in higher proportions while others were poorly represented or absent. This bias could be due to the quality of genomic DNA, the endonuclease and ligase reaction efficiency, the distance between restriction sites, the preferential amplification of small library restriction fragments, or bias towards cluster formation of small amplicons during the sequencing process. To overcome these issues, we have developed a GBS method based on randomly tagging genomic DNA (rtGBS). By randomly landing on the genome, we can, with less bias, find restriction sites that are far apart and undetected by the standard GBS (stdGBS) method. The study comprises two types of biological replicates: six different kiwifruit plants and two independent DNA extractions per plant; and three types of technical replicates: four samples of each DNA extraction, stdGBS vs. rtGBS methods, and two independent library amplifications, each sequenced in separate lanes. A statistically significant unbiased distribution of restriction fragment size by rtGBS showed that this method targeted 49% (39,145) of BamH I sites shared with the reference genome, compared to only 14% (11,513) by stdGBS.
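
    Since the bias discussed above depends on the distance between restriction sites, a small in silico digest helps make the idea concrete. The sketch below locates BamH I sites (recognition sequence GGATCC, cut after the first G) in a sequence and reports fragment lengths. The function name `restriction_fragments` and the toy sequence are hypothetical; this is not the authors' pipeline.

```python
def restriction_fragments(sequence, site="GGATCC", cut_offset=1):
    """Return cut positions for a recognition site and the lengths of
    the fragments a complete digest would produce.

    Defaults model BamH I, which cuts G^GATCC (offset 1 into the site).
    """
    seq = sequence.upper()
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = seq.find(site, i + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:])]
    return cuts, lengths


# Toy sequence with two BamH I sites (not real genomic data).
toy = "AAAA" + "GGATCC" + "T" * 20 + "GGATCC" + "CC"
cuts, lengths = restriction_fragments(toy)
print(cuts, lengths)  # [5, 31] [5, 26, 7]
```

    Run on a reference genome, the list of fragment lengths gives the size distribution against which observed site coverage could be compared.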

  8. Selection bias and subject refusal in a cluster-randomized controlled trial

    Directory of Open Access Journals (Sweden)

    Rochelle Yang

    2017-07-01

    Background: Selection bias and non-participation bias are major methodological concerns which impact external validity. Cluster-randomized controlled trials are especially prone to selection bias as it is impractical to blind clusters to their allocation into intervention or control. This study assessed the impact of selection bias in a large cluster-randomized controlled trial. Methods: The Improved Cardiovascular Risk Reduction to Enhance Rural Primary Care (ICARE) study examined the impact of a remote pharmacist-led intervention in twelve medical offices. To assess eligibility, a standardized form containing patient demographics and medical information was completed for each screened patient. Eligible patients were approached by the study coordinator for recruitment. Both the study coordinator and the patient were aware of the site’s allocation prior to consent. Patients who consented or declined to participate were compared across control and intervention arms for differing characteristics. Statistical significance was determined using a two-tailed, equal variance t-test and a chi-square test with adjusted Bonferroni p-values. Results were adjusted for random cluster variation. Results: There were 2749 completed screening forms returned to research staff with 461 subjects who had either consented or declined participation. Patients with poorly controlled diabetes were found to be significantly more likely to decline participation in intervention sites compared to those in control sites. A higher mean diastolic blood pressure was seen in patients with uncontrolled hypertension who declined in the control sites compared to those who declined in the intervention sites. However, these findings were no longer significant after adjustment for random variation among the sites. After this adjustment, females were found to be significantly more likely to consent than males (odds ratio = 1.41; 95% confidence interval = 1.03, 1

  9. Simple sequence repeat markers in genetic divergence and marker-assisted selection of rice cultivars: a review.

    Science.gov (United States)

    Kaur, Shubhneet; Panesar, Parmjit S; Bera, Manab B; Kaur, Varinder

    2015-01-01

    Sequencing of the rice genome has facilitated the understanding of rice evolution and has been utilized extensively for mining of DNA markers to facilitate marker-assisted breeding. Simple sequence repeat (SSR) markers, which are tandemly repeated nucleotide sequence motifs flanked by unique sequences, are presently the marker of choice in rice improvement due to their abundance, co-dominant inheritance, high levels of allelic diversity, and simple reproducible assay. The current level of genome coverage by SSR markers in rice is sufficient to employ them for genotype identification and marker-assisted selection in breeding for mapping of genes and quantitative trait loci analysis. This review provides comprehensive information on the mapping and applications of SSR markers in investigation of rice cultivars to study their genetic divergence and marker-assisted selection of important agronomic traits.

  10. In vitro selection of external guide sequences for directing RNase P-mediated inhibition of viral gene expression.

    Science.gov (United States)

    Zhou, Tianhong; Kim, Joseph; Kilani, Ahmed F; Kim, Kihoon; Dunn, Walter; Jo, Solomon; Nepomuceno, Edward; Liu, Fenyong

    2002-08-16

    External guide sequences (EGSs) are small RNA molecules that bind to a target mRNA, form a complex resembling the structure of a tRNA, and render the mRNA susceptible to hydrolysis by RNase P, a tRNA processing enzyme. An in vitro selection procedure was used to select EGSs that direct human RNase P to cleave the mRNA encoding thymidine kinase (TK) of herpes simplex virus 1. One of the selected EGSs, TK17, was at least 35 times more active in directing RNase P in cleaving TK mRNA in vitro than the EGS derived from a natural tRNA sequence. TK17, when in complex with the TK mRNA sequence, resembles a portion of tRNA structure and exhibits an enhanced binding affinity to the target mRNA. Moreover, a reduction of 95 and 50% in the TK expression was found in herpes simplex virus 1-infected cells that expressed the selected EGS and the EGS derived from the natural tRNA sequence, respectively. Our study provides direct evidence that EGS molecules isolated by the selection procedure are effective in tissue culture. These results also demonstrate the potential for using the selection procedure as a general approach for the generation of highly effective EGSs for gene-targeting application.

  11. Selective enrichment and sequencing of whole mitochondrial genomes in the presence of nuclear encoded mitochondrial pseudogenes (numts).

    Directory of Open Access Journals (Sweden)

    Jonci N Wolff

    Numts are an integral component of many eukaryote genomes offering a snapshot of the evolutionary process that led from the incorporation of an α-proteobacterium into a larger eukaryotic cell some 1.8 billion years ago. Although numt sequences can be harnessed as molecular markers, these sequences often remain unidentified and are mistaken for genuine mtDNA, leading to erroneous interpretation of mtDNA data sets. It is therefore indispensable that during the process of amplifying and sequencing mitochondrial genes, preventive measures are taken to ensure the exclusion of numts to guarantee the recovery of genuine mtDNA. This applies to mtDNA analyses in general but especially to studies where mtDNAs are sequenced de novo as the launch pad for subsequent mtDNA-based research. By using a combination of dilution series and nested rolling circle amplification (RCA), we present a novel strategy to selectively amplify mtDNA and exclude the amplification of numt sequence. We have successfully applied this strategy to de novo sequence the mtDNA of the Black Field Cricket Teleogryllus commodus, a species known to contain numts. Aligning our assembled sequence to the reference genome of Teleogryllus emma (GenBank EU557269.1) led to the identification of a numt sequence in the reference sequence. This unexpected result further highlights the need for a reliable and accessible strategy to eliminate this source of error.

  12. Effect of non-random mating on genomic and BLUP selection schemes

    Directory of Open Access Journals (Sweden)

    Nirea Kahsay G

    2012-04-01

    Background: The risk of long-term unequal contribution of mating pairs to the gene pool is that deleterious recessive genes can be expressed. Such consequences could be alleviated by appropriately designing and optimizing breeding schemes, i.e. by improving selection and mating procedures. Methods: We studied the effect of mating designs (random, minimum coancestry and minimum covariance of ancestral contributions) on rate of inbreeding and genetic gain for schemes with different information sources, i.e. sib test or own performance records, different genetic evaluation methods, i.e. BLUP or genomic selection, and different family structures, i.e. factorial or pair-wise. Results: Results showed that substantial differences in rates of inbreeding due to mating design were present under schemes with a pair-wise family structure, for which minimum coancestry turned out to be more effective to generate lower rates of inbreeding. Specifically, substantial reductions in rates of inbreeding were observed in schemes using sib test records and BLUP evaluation. However, with a factorial family structure, differences in rates of inbreeding due to mating designs were minor. Moreover, non-random mating had only a small effect in breeding schemes that used genomic evaluation, regardless of the information source. Conclusions: It was concluded that minimum coancestry remains an efficient mating design when BLUP is used for genetic evaluation or when the size of the population is small, whereas the effect of non-random mating is smaller in schemes using genomic evaluation.

  13. Feature Selection and the Class Imbalance Problem in Predicting Protein Function from Sequence

    NARCIS (Netherlands)

    Al-Shahib, A.; Breitling, R.; Gilbert, D.

    2005-01-01

    Abstract: When the standard approach to predict protein function by sequence homology fails, other alternative methods can be used that require only the amino acid sequence for predicting function. One such approach uses machine learning to predict protein function directly from amino acid sequence

  14. KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily

    DEFF Research Database (Denmark)

    Pons, Tirso; Vazquez, Miguel; Matey-Hernandez, María Luisa

    2016-01-01

    Background: The association between aberrant signal processing by protein kinases and human diseases such as cancer was established a long time ago. However, understanding the link between sequence variants in the protein kinase superfamily and the mechanistic complex traits at the molecular level remains challenging: cells tolerate most genomic alterations and only a minor fraction disrupt molecular function sufficiently and drive disease. Results: KinMutRF is a novel random-forest method to automatically identify pathogenic variants in human kinases. Twenty six decision trees implemented as a random forest ponder a battery of features that characterize the variants: a) at the gene level, including membership to a Kinbase group and Gene Ontology terms; b) at the PFAM domain level; and c) at the residue level, the types of amino acids involved, changes in biochemical properties, functional...

  15. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy.

    Science.gov (United States)

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
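
    The averaging rule and the reported evaluation measures (sensitivity, specificity, accuracy, MCC) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 0.5 decision threshold, and the toy probabilities are hypothetical and are not taken from the paper or its web server.

```python
import math

def average_ensemble(prob_lists):
    """Average per-sample positive-class probabilities across base classifiers."""
    n_classifiers = len(prob_lists)
    return [sum(column) / n_classifiers for column in zip(*prob_lists)]

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / len(y_true),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Toy data (hypothetical): three base classifiers scoring four proteins.
probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.4, 0.3, 0.1],
    [0.7, 0.3, 0.7, 0.6],
]
truth = [1, 0, 1, 1]

avg = average_ensemble(probs)
labels = [1 if p >= 0.5 else 0 for p in avg]  # assumed threshold of 0.5
metrics = binary_metrics(truth, labels)
print(labels, metrics)
```

    Averaging probabilities rather than taking a majority vote lets a confident base classifier outweigh two uncertain ones; whether the original study averaged probabilities or class labels is not stated in the abstract.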

  16. Prediction of antimicrobial peptides based on sequence alignment and feature selection methods.

    Directory of Open Access Journals (Sweden)

    Ping Wang

    Antimicrobial peptides (AMPs) represent a class of natural peptides that form a part of the innate immune system, and this kind of 'nature's antibiotics' is quite promising for solving the problem of increasing antibiotic resistance. In view of this, it is highly desired to develop an effective computational method for accurately predicting novel AMPs because it can provide us with more candidates and useful insights for drug design. In this study, a new method for predicting AMPs was implemented by integrating the sequence alignment method and the feature selection method. It was observed that the overall jackknife success rate by the new predictor on a newly constructed benchmark dataset was over 80.23%, and the Matthews correlation coefficient was 0.73, indicating a good prediction. Moreover, an in-depth feature analysis indicates that the results are quite consistent with the previous knowledge that some amino acids are preferential in AMPs and that these amino acids do play an important role in the antimicrobial activity. For the convenience of most experimental scientists who want to use the prediction method without following the mathematical details, a user-friendly web-server is provided at http://amp.biosino.org/.

  17. Non-contrast-enhanced pulmonary vein MRI with a spatially selective slab inversion preparation sequence.

    Science.gov (United States)

    Hu, Peng; Chuang, Michael L; Kissinger, Kraig V; Goddu, Beth; Goepfert, Lois A; Rofsky, Neil M; Manning, Warren J; Nezafat, Reza

    2010-02-01

    We propose a non-contrast-enhanced, three-dimensional, free-breathing, electrocardiogram-gated, gradient recalled echo sequence with a slab-selective inversion for pulmonary vein (PV) MRI. A sagittal inversion slab was applied prior to data acquisition to suppress structures adjacent to the left atrium (LA) and PVs, thereby improving the conspicuity of the PV and LA. Compared with other MR angiography methods using an inversion pulse, the proposed method does not require signal subtraction and the inversion slab is not parallel to the imaging slab. The feasibility of the proposed method was demonstrated in healthy subjects. The inversion slab thickness and inversion time were optimized to be 60 mm and 500 ms, respectively. Compared to conventional gradient recalled echo imaging without inversion, the proposed technique significantly increased the contrast-to-noise ratios between the LA and the right atrium by 20-fold (P slab (P > 0.3). The proposed technique greatly enhances the conspicuity of the PVs and LA without significant loss of signal-to-noise ratio.

  18. A specific brushing sequence and plaque removal efficacy : a randomized split-mouth design

    NARCIS (Netherlands)

    van der Sluijs, E.; Slot, D.E.; Hennequin-Hoenderdos, N.L.; van der Weijden, G.A.

    2018-01-01

    Aim: It has been propagated by the dental care professionals to start toothbrushing the lingual aspect of teeth first. In general, it is assumed that these surfaces of teeth are more difficult to clean. The evidence to support this recommendation is sparse. Method: In this randomized controlled

  19. Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

    Directory of Open Access Journals (Sweden)

    Regad Leslie

    2010-01-01

    Background: In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results: The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence. Conclusions: Our algorithms prove to be effective and able to handle real data sets with

  20. Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

    Science.gov (United States)

    2010-01-01

    Background: In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results: The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence. Conclusions: Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well
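
    The idea of Markov chain embedding with a deterministic finite automaton can be illustrated with a small Python sketch. It is not one of the paper's Algorithms 1 to 3: it takes the straightforward route of computing the exact count distribution in each sequence and then convolving across independent sequences, which is the step Algorithm 1 is said to avoid. The function names, the single-word pattern, the order-1 homogeneous model, and the toy parameters are all assumptions made for illustration.

```python
def build_dfa(pattern, alphabet):
    """KMP-style automaton: state = length of the longest pattern prefix
    that is a suffix of the text read so far (overlaps allowed)."""
    m = len(pattern)
    dfa = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            seen = pattern[:state] + ch
            k = min(m, len(seen))
            while k > 0 and pattern[:k] != seen[-k:]:
                k -= 1
            dfa[state][ch] = k
    return dfa

def count_distribution(pattern, length, alphabet, start, trans):
    """Exact distribution of the number of (overlapping) occurrences of
    `pattern` in one order-1 Markov sequence of the given length (>= 1)."""
    dfa = build_dfa(pattern, alphabet)
    m = len(pattern)
    # Embedded chain state: (DFA state, last letter, occurrences so far).
    dist = {}
    for a in alphabet:
        q = dfa[0][a]
        key = (q, a, 1 if q == m else 0)
        dist[key] = dist.get(key, 0.0) + start[a]
    for _ in range(length - 1):
        nxt = {}
        for (q, a, c), p in dist.items():
            for b in alphabet:
                q2 = dfa[q][b]
                key = (q2, b, c + (1 if q2 == m else 0))
                nxt[key] = nxt.get(key, 0.0) + p * trans[a][b]
        dist = nxt
    by_count = {}
    for (_, _, c), p in dist.items():
        by_count[c] = by_count.get(c, 0.0) + p
    return [by_count.get(c, 0.0) for c in range(max(by_count) + 1)]

def convolve(p, q):
    """Distribution of the sum of two independent counts."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

# Toy model (hypothetical): two letters, uniform start and transitions.
alphabet = "ab"
start = {"a": 0.5, "b": 0.5}
trans = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 0.5, "b": 0.5}}

single = count_distribution("aa", 3, alphabet, start, trans)
total = convolve(single, single)  # two independent sequences of length 3
print(single, total)
```

    Treating the set as independent sequences differs from concatenating them into one long sequence, because concatenation can create occurrences that span a junction; this is the approximation the abstract describes as tempting.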

  1. DRUMS: Disk Repository with Update Management and Select option for high throughput sequencing data.

    Science.gov (United States)

    Nettling, Martin; Thieme, Nils; Both, Andreas; Grosse, Ivo

    2014-02-04

    New technologies for analyzing biological samples, like next generation sequencing, are producing a growing amount of data together with quality scores. Moreover, software tools (e.g., for mapping sequence reads, calculating transcription factor binding probabilities, estimating epigenetic modification enriched regions, or determining single nucleotide polymorphisms) increase this amount of position-specific DNA-related data even further. Hence, requesting data becomes challenging and expensive and is often implemented using specialised hardware. In addition, picking specific data as fast as possible becomes increasingly important in many fields of science. The general problem of handling big data sets was addressed by developing specialized databases like HBase, HyperTable or Cassandra. However, these database solutions also require specialized or distributed hardware, leading to expensive investments. To the best of our knowledge, there is no database capable of (i) storing billions of position-specific DNA-related records, (ii) performing fast and resource saving requests, and (iii) running on a single standard computer. Here, we present DRUMS (Disk Repository with Update Management and Select option), satisfying demands (i)-(iii). It tackles the weaknesses of traditional databases while handling position-specific DNA-related data in an efficient manner. DRUMS is capable of storing up to billions of records. Moreover, it focuses on optimizing related single lookups as range requests, which are needed constantly for computations in bioinformatics. To validate the power of DRUMS, we compare it to the widely used MySQL database. The test setting considers two biological data sets. We use standard desktop hardware as test environment. DRUMS outperforms MySQL in writing and reading records by a factor of two up to a factor of 10000. Furthermore, it can work with significantly larger data sets. Our work focuses on mid-sized data sets up to several billion

  2. Assessing the amount of quadruplex structures present within G₂-tract synthetic random-sequence DNA libraries.

    Directory of Open Access Journals (Sweden)

    Simon A McManus

    The process of in vitro selection has led to the discovery of many aptamers with potential to be developed into inhibitors and biosensors, but problems in isolating aptamers against certain targets with desired affinity and specificity still remain. One possible improvement is to use libraries enhanced for motifs repeatedly isolated in aptamer molecules. One such frequently observed motif is the two-tiered guanine quadruplex. In this study we investigated whether DNA libraries could be designed to contain a large fraction of molecules capable of folding into two-tiered guanine quadruplexes. Using comprehensive circular dichroism analysis, we found that DNA libraries could be designed to contain a large proportion of sequences that adopt guanine quadruplex structures. Analysis of individual sequences from a small library revealed a mixture of quadruplexes of different topologies providing the diversity desired for an in vitro selection. We also found that primer-binding sites are detrimental to quadruplex formation and devised a method for post-selection amplification of primer-less quadruplex libraries. With the development of guanine quadruplex enriched DNA libraries, it should be possible to improve the chances of isolating aptamers that utilize a quadruplex scaffold and enhance the success of in vitro selection experiments.

  3. Automated Gel Size Selection to Improve the Quality of Next-generation Sequencing Libraries Prepared from Environmental Water Samples.

    Science.gov (United States)

    Uyaguari-Diaz, Miguel I; Slobodan, Jared R; Nesbitt, Matthew J; Croxen, Matthew A; Isaac-Renton, Judith; Prystajecky, Natalie A; Tang, Patrick

    2015-04-17

    Next-generation sequencing of environmental samples can be challenging because of the variable DNA quantity and quality in these samples. High quality DNA libraries are needed for optimal results from next-generation sequencing. Environmental samples such as water may have low quality and quantities of DNA as well as contaminants that co-precipitate with DNA. The mechanical and enzymatic processes involved in extraction and library preparation may further damage the DNA. Gel size selection enables purification and recovery of DNA fragments of a defined size for sequencing applications. Nevertheless, this task is one of the most time-consuming steps in the DNA library preparation workflow. The protocol described here enables complete automation of agarose gel loading, electrophoretic analysis, and recovery of targeted DNA fragments. In this study, we describe a high-throughput approach to prepare high quality DNA libraries from freshwater samples that can be applied also to other environmental samples. We used an indirect approach to concentrate bacterial cells from environmental freshwater samples; DNA was extracted using a commercially available DNA extraction kit, and DNA libraries were prepared using a commercial transposon-based protocol. DNA fragments of 500 to 800 bp were gel size selected using Ranger Technology, an automated electrophoresis workstation. Sequencing of the size-selected DNA libraries demonstrated significant improvements to read length and quality of the sequencing reads.

  4. Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology.

    Science.gov (United States)

    Fox, Eric W; Hill, Ryan A; Leibowitz, Scott G; Olsen, Anthony R; Thornbrugh, Darren J; Weber, Marc H

    2017-07-01

    Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological data sets, there is limited guidance on variable selection methods for RF modeling. Typically, either a preselected set of predictor variables is used or stepwise procedures are employed which iteratively remove variables according to their importance measures. This paper investigates the application of variable selection methods to RF models for predicting probable biological stream condition. Our motivating data set consists of the good/poor condition of n = 1365 stream survey sites from the 2008/2009 National Rivers and Stream Assessment, and a large set (p = 212) of landscape features from the StreamCat data set as potential predictors. We compare two types of RF models: a full variable set model with all 212 predictors and a reduced variable set model selected using a backward elimination approach. We assess model accuracy using RF's internal out-of-bag estimate, and a cross-validation procedure with validation folds external to the variable selection process. We also assess the stability of the spatial predictions generated by the RF models to changes in the number of predictors and argue that model selection needs to consider both accuracy and stability. The results suggest that RF modeling is robust to the inclusion of many variables of moderate to low importance. We found no substantial improvement in cross-validated accuracy as a result of variable reduction. Moreover, the backward elimination procedure tended to select too few variables and exhibited numerous issues such as upwardly biased out-of-bag accuracy estimates and instabilities in the spatial predictions. We use simulations to further support and generalize results from the analysis of real data. A main purpose of this work is to elucidate issues of model selection bias and instability to ecologists interested in

  5. PReFerSim: fast simulation of demography and selection under the Poisson Random Field model.

    Science.gov (United States)

    Ortega-Del Vecchyo, Diego; Marsden, Clare D; Lohmueller, Kirk E

    2016-11-15

    The Poisson Random Field (PRF) model has become an important tool in population genetics to study weakly deleterious genetic variation under complicated demographic scenarios. Currently, there are no freely available software applications that allow simulation of genetic variation data under this model. Here we present PReFerSim, an ANSI C program that performs forward simulations under the PRF model. PReFerSim models changes in population size, arbitrary amounts of inbreeding, dominance and distributions of selective effects. Users can track summaries of genetic variation over time and output trajectories of selected alleles. PReFerSim is freely available at https://github.com/LohmuellerLab/PReFerSim (contact: klohmueller@ucla.edu).

  6. Filovirus RefSeq Entries: Evaluation and Selection of Filovirus Type Variants, Type Sequences, and Names

    Directory of Open Access Journals (Sweden)

    Jens H. Kuhn

    2014-09-01

    Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information’s (NCBI’s) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming [ (////-], we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences.

  7. Query-seeded iterative sequence similarity searching improves selectivity 5-20-fold.

    Science.gov (United States)

    Pearson, William R; Li, Weizhong; Lopez, Rodrigo

    2017-04-20

    Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity.
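
    The query-seeding step can be sketched in a few lines of Python. This is a simplified reading of the abstract, not the psisearch implementation: the function name, the gap character, and the toy alignment are hypothetical, and columns where the query itself is gapped are simply dropped on the assumption that a query-based MSA has one column per query residue.

```python
def query_seed(aligned_query, aligned_subject, gap="-"):
    """Return the subject row of a query-based MSA, with subject gaps
    filled by the corresponding query residues."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    seeded = []
    for q, s in zip(aligned_query, aligned_subject):
        if q == gap:
            # Subject insertion relative to the query: no query column exists.
            continue
        # Fill a gapped subject position with the query residue.
        seeded.append(q if s == gap else s)
    return "".join(seeded)

# Toy pairwise alignment (hypothetical sequences).
aligned_query = "MKT-AYIAK"
aligned_subject = "MRTG--IAR"

row = query_seed(aligned_query, aligned_subject)
print(row)
```

    Because every gapped position now carries a query residue, each subject row contributes some query-like signal to the column counts, which is one way to picture how the step anchors the PSSM to the original query.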

  8. Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold

    Science.gov (United States)

    Li, Weizhong; Lopez, Rodrigo

    2017-01-01

    Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999

  9. Filovirus RefSeq Entries: Evaluation and Selection of Filovirus Type Variants, Type Sequences, and Names

    Science.gov (United States)

    Kuhn, Jens H.; Andersen, Kristian G.; Bào, Yīmíng; Bavari, Sina; Becker, Stephan; Bennett, Richard S.; Bergman, Nicholas H.; Blinkova, Olga; Bradfute, Steven; Brister, J. Rodney; Bukreyev, Alexander; Chandran, Kartik; Chepurnov, Alexander A.; Davey, Robert A.; Dietzgen, Ralf G.; Doggett, Norman A.; Dolnik, Olga; Dye, John M.; Enterlein, Sven; Fenimore, Paul W.; Formenty, Pierre; Freiberg, Alexander N.; Garry, Robert F.; Garza, Nicole L.; Gire, Stephen K.; Gonzalez, Jean-Paul; Griffiths, Anthony; Happi, Christian T.; Hensley, Lisa E.; Herbert, Andrew S.; Hevey, Michael C.; Hoenen, Thomas; Honko, Anna N.; Ignatyev, Georgy M.; Jahrling, Peter B.; Johnson, Joshua C.; Johnson, Karl M.; Kindrachuk, Jason; Klenk, Hans-Dieter; Kobinger, Gary; Kochel, Tadeusz J.; Lackemeyer, Matthew G.; Lackner, Daniel F.; Leroy, Eric M.; Lever, Mark S.; Mühlberger, Elke; Netesov, Sergey V.; Olinger, Gene G.; Omilabu, Sunday A.; Palacios, Gustavo; Panchal, Rekha G.; Park, Daniel J.; Patterson, Jean L.; Paweska, Janusz T.; Peters, Clarence J.; Pettitt, James; Pitt, Louise; Radoshitzky, Sheli R.; Ryabchikova, Elena I.; Saphire, Erica Ollmann; Sabeti, Pardis C.; Sealfon, Rachel; Shestopalov, Aleksandr M.; Smither, Sophie J.; Sullivan, Nancy J.; Swanepoel, Robert; Takada, Ayato; Towner, Jonathan S.; van der Groen, Guido; Volchkov, Viktor E.; Volchkova, Valentina A.; Wahl-Jensen, Victoria; Warren, Travis K.; Warfield, Kelly L.; Weidmann, Manfred; Nichol, Stuart T.

    2014-01-01

    Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information’s (NCBI’s) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming [virus name> ()////-], we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences. PMID:25256396

  10. Genetic alterations of hepatocellular carcinoma by random amplified polymorphic DNA analysis and cloning sequencing of tumor differential DNA fragment

    Science.gov (United States)

    Xian, Zhi-Hong; Cong, Wen-Ming; Zhang, Shu-Hui; Wu, Meng-Chao

    2005-01-01

    AIM: To study the genetic alterations and their association with clinicopathological characteristics of hepatocellular carcinoma (HCC), and to find the tumor related DNA fragments. METHODS: DNA isolated from tumors and corresponding noncancerous liver tissues of 56 HCC patients was amplified by random amplified polymorphic DNA (RAPD) with 10 random 10-mer arbitrary primers. The RAPD bands showing obvious differences in tumor tissue DNA compared with that of normal tissue were separated, purified, cloned and sequenced. DNA sequences were analyzed and compared with GenBank data. RESULTS: A total of 56 cases of HCC were demonstrated to have genetic alterations, which were detected by at least one primer. The detectability of genetic alterations ranged from 20% to 70% in each case, and 17.9% to 50% in each primer. Serum HBV infection, tumor size, histological grade, tumor capsule, as well as tumor intrahepatic metastasis, might be correlated with genetic alterations on certain primers. A band of higher intensity in tumor DNA relative to normal DNA, corresponding to amplified fragments of approximately 480 bp, could be seen in 27 of 56 tumor samples using primer 4. Sequence analysis of these fragments showed 91% homology with Homo sapiens double homeobox protein DUX10 gene. CONCLUSION: Genetic alterations are a frequent event in HCC, and tumor related DNA fragments have been found in this study, which may be associated with hepatocarcinogenesis. RAPD is an effective method for the identification and analysis of genetic alterations in HCC, and may provide new information for further evaluating the molecular mechanism of hepatocarcinogenesis. PMID:15996039

  11. Selective oropharyngeal decontamination versus selective digestive decontamination in critically ill patients: a meta-analysis of randomized controlled trials

    Directory of Open Access Journals (Sweden)

    Zhao D

    2015-07-01

    Full Text Available Di Zhao,1,* Jian Song,2,* Xuan Gao,3 Fei Gao,4 Yupeng Wu,2 Yingying Lu,5 Kai Hou1 1Department of Neurosurgery, The First Hospital of Hebei Medical University, 2Department of Neurosurgery, 3Department of Neurology, The Second Hospital of Hebei Medical University, 4Hebei Provincial Procurement Centers for Medical Drugs and Devices, 5Department of Neurosurgery, The Second Hospital of Hebei Medical University, Shijiazhuang, People’s Republic of China *These authors contributed equally to this work Background: Selective digestive decontamination (SDD) and selective oropharyngeal decontamination (SOD) are associated with reduced mortality and infection rates among patients in intensive care units (ICUs); however, whether SOD has a superior effect to SDD remains uncertain. Hence, we conducted a meta-analysis of randomized controlled trials (RCTs) to compare SOD with SDD in terms of clinical outcomes and antimicrobial resistance rates in patients who were critically ill. Methods: RCTs published in PubMed, Embase, and Web of Science were systematically reviewed to compare the effects of SOD and SDD in patients who were critically ill. Outcomes included day-28 mortality, length of ICU stay, length of hospital stay, duration of mechanical ventilation, ICU-acquired bacteremia, and prevalence of antibiotic-resistant Gram-negative bacteria. Results were expressed as risk ratio (RR) with 95% confidence intervals (CIs), and weighted mean differences (WMDs) with 95% CIs. Pooled estimates were calculated using a fixed-effects model or random-effects model, depending on the heterogeneity among studies. Results: A total of four RCTs involving 23,822 patients met the inclusion criteria and were included in this meta-analysis. Among patients whose admitting specialty was surgery, cardiothoracic surgery (57.3%) and neurosurgery (29.7%) were the two main types of surgery being performed. Pooled results showed that SOD had similar effects as SDD in day-28 mortality (RR =1

  12. Ethnopharmacological versus random plant selection methods for the evaluation of the antimycobacterial activity

    Directory of Open Access Journals (Sweden)

    Danilo R. Oliveira

    2011-05-01

    Full Text Available The municipality of Oriximiná, Brazil, has 33 quilombola communities in remote areas, endowed with wide experience in the use of medicinal plants. An ethnobotanical survey was carried out in five of these communities. A free-listing method directed at the survey of species locally indicated against tuberculosis and lung problems was also applied. Data were analyzed by quantitative techniques: saliency index and major use agreement. Thirty-four informants reported 254 ethnospecies. Among these, 43 were surveyed for possible antimycobacterial activity. Based on this information, ten species obtained from the ethnodirected approach (ETHNO) and eighteen species obtained from the random approach (RANDOM) were assayed against Mycobacterium tuberculosis by the microdilution method, using resazurin as an indicator of cell viability. The best results for antimycobacterial activity were obtained for plants selected by the ethnopharmacological approach (50% ETHNO vs. 16.7% RANDOM). These results may be even more significant considering that therapeutic success in quilombola practice is complex: some plants are used as fortifying agents, depuratives, emetics, purgatives and bitter remedies, especially for infectious diseases, and are of great importance to the communities in curing or recovering health as a whole.

  13. Random forest variable selection in spatial malaria transmission modelling in Mpumalanga Province, South Africa.

    Science.gov (United States)

    Kapwata, Thandi; Gebreslasie, Michael T

    2016-11-16

    Malaria is an environmentally driven disease. In order to quantify the spatial variability of malaria transmission, it is imperative to understand the interactions between environmental variables and malaria epidemiology at a micro-geographic level using a novel statistical approach. The random forest (RF) statistical learning method, a relatively new variable-importance ranking method, measures the variable importance of potentially influential parameters through the percent increase of the mean squared error. As this value increases, so does the relative importance of the associated variable. The principal aim of this study was to create predictive malaria maps generated using the selected variables based on the RF algorithm in the Ehlanzeni District of Mpumalanga Province, South Africa. From the seven environmental variables used [temperature, lag temperature, rainfall, lag rainfall, humidity, altitude, and the normalized difference vegetation index (NDVI)], altitude was identified as the most influential predictor variable due to its high selection frequency. It was selected as the top predictor for 4 out of 12 months of the year, followed by NDVI, temperature and lag rainfall, which were each selected twice. The combination of climatic variables that produced the highest prediction accuracy was altitude, NDVI, and temperature. This suggests that these three variables have high predictive capabilities in relation to malaria transmission. Furthermore, it is anticipated that the predictive maps generated from predictions made by the RF algorithm could be used to monitor the progression of malaria and assist in intervention and prevention efforts with respect to malaria.
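
    The importance measure described in this abstract (percent increase of the mean squared error when a predictor is perturbed) can be illustrated with a minimal permutation sketch. This is a simplified version applied to an arbitrary fitted predictor; it is not the out-of-bag procedure of any particular random forest package, and the toy data, the column names and the stand-in `model` are assumptions made for illustration only.

```python
import random

def mse(predict, rows, targets):
    """Mean squared error of predict(row) against the targets."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def percent_increase_mse(predict, rows, targets, column, seed=0):
    """Percent increase in MSE after shuffling the values of one predictor column."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    # Rebuild every row with the shuffled value in the chosen column.
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return 100.0 * (mse(predict, permuted, targets) - baseline) / baseline

# Hypothetical toy data: columns are [altitude, noise]; the target depends on altitude only.
rng = random.Random(1)
rows = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(200)]
targets = [2.0 * r[0] + rng.gauss(0, 0.1) for r in rows]
model = lambda r: 2.0 * r[0]  # stand-in for a fitted forest (assumption)

inc_altitude = percent_increase_mse(model, rows, targets, column=0)
inc_noise = percent_increase_mse(model, rows, targets, column=1)
```

    In this toy setting the informative column yields a large increase while the ignored column yields none, which mirrors the ranking logic the abstract describes.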

  14. Random forest variable selection in spatial malaria transmission modelling in Mpumalanga Province, South Africa

    Directory of Open Access Journals (Sweden)

    Thandi Kapwata

    2016-11-01

    Full Text Available Malaria is an environmentally driven disease. In order to quantify the spatial variability of malaria transmission, it is imperative to understand the interactions between environmental variables and malaria epidemiology at a micro-geographic level using a novel statistical approach. The random forest (RF) statistical learning method, a relatively new variable-importance ranking method, measures the variable importance of potentially influential parameters through the percent increase of the mean squared error. As this value increases, so does the relative importance of the associated variable. The principal aim of this study was to create predictive malaria maps generated using the selected variables based on the RF algorithm in the Ehlanzeni District of Mpumalanga Province, South Africa. From the seven environmental variables used [temperature, lag temperature, rainfall, lag rainfall, humidity, altitude, and the normalized difference vegetation index (NDVI)], altitude was identified as the most influential predictor variable due to its high selection frequency. It was selected as the top predictor for 4 out of 12 months of the year, followed by NDVI, temperature and lag rainfall, which were each selected twice. The combination of climatic variables that produced the highest prediction accuracy was altitude, NDVI, and temperature. This suggests that these three variables have high predictive capabilities in relation to malaria transmission. Furthermore, it is anticipated that the predictive maps generated from predictions made by the RF algorithm could be used to monitor the progression of malaria and assist in intervention and prevention efforts with respect to malaria.

  15. Selecting the appropriate pacing mode for patients with sick sinus syndrome: evidence from randomized clinical trials.

    Science.gov (United States)

    Albertsen, A E; Nielsen, J C

    2003-12-01

    Several observational studies have indicated that selection of pacing mode may be important for the clinical outcome in patients with symptomatic bradycardia, affecting the development of atrial fibrillation (AF), thromboembolism, congestive heart failure, mortality and quality of life. In this paper we present and discuss the most recent data from six randomized trials on mode selection in patients with sick sinus syndrome (SSS). In pacing mode selection, VVI(R) pacing is the least attractive solution, increasing the incidence of AF and, as compared with AAI(R) pacing, also the incidence of heart failure, thromboembolism and death. VVI(R) pacing should not be used as the primary pacing mode in patients with SSS who do not have chronic AF. AAIR pacing is superior to DDDR pacing, reducing AF and preserving left ventricular function. Single-site right ventricular pacing, in VVI(R) or DDD(R) mode, causes an abnormal ventricular activation and contraction (called ventricular desynchronization), which results in a reduced left ventricular function. Despite the risk of AV block, we consider AAIR pacing to be the optimal pacing mode for isolated SSS today and an algorithm to select patients for AAIR pacing is suggested. Trials on new pacemaker algorithms minimizing right ventricular pacing as well as trials testing alternative pacing sites and multisite pacing to reduce ventricular desynchronization can be expected within the next years.

  16. Geography and genography: prediction of continental origin using randomly selected single nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Ramoni Marco F

    2007-03-01

    Full Text Available Background: Recent studies have shown that when individuals are grouped on the basis of genetic similarity, group membership corresponds closely to continental origin. There has been considerable debate about the implications of these findings in the context of larger debates about race and the extent of genetic variation between groups. Some have argued that clustering according to continental origin demonstrates the existence of significant genetic differences between groups and that these differences may have important implications for differences in health and disease. Others argue that clustering according to continental origin requires the use of large amounts of genetic data or specifically chosen markers and is indicative only of very subtle genetic differences that are unlikely to have biomedical significance. Results: We used small numbers of randomly selected single nucleotide polymorphisms (SNPs) from the International HapMap Project to train naïve Bayes classifiers for prediction of ancestral continent of origin. Predictive accuracy was tested on two independent data sets. Genetically similar groups should be difficult to distinguish, especially if only a small number of genetic markers are used. The genetic differences between continentally defined groups are sufficiently large that one can accurately predict ancestral continent of origin using only a minute, randomly selected fraction of the genetic variation present in the human genome. Genotype data from only 50 random SNPs was sufficient to predict ancestral continent of origin in our primary test data set with an average accuracy of 95%. Genetic variations informative about ancestry were common and widely distributed throughout the genome. Conclusion: Accurate characterization of ancestry is possible using small numbers of randomly selected SNPs. The results presented here show how investigators conducting genetic association studies can use small numbers of arbitrarily
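
    The classification idea in this abstract (a naïve Bayes classifier trained on a handful of SNP genotypes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: genotypes are assumed to be coded 0, 1 or 2 (copies of one allele), Laplace smoothing is an added choice, and the toy samples and the labels "A" and "B" are hypothetical.

```python
import math
from collections import Counter

def train_naive_bayes(genotypes, labels, n_states=3, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing.

    genotypes: equal-length lists with values 0, 1 or 2 (assumed coding);
    labels: one class label per sample.
    """
    class_counts = Counter(labels)
    n_snps = len(genotypes[0])
    # counts[label][snp][state] = number of training samples with that state
    counts = {c: [[0] * n_states for _ in range(n_snps)] for c in class_counts}
    for g, c in zip(genotypes, labels):
        for j, state in enumerate(g):
            counts[c][j][state] += 1
    model = {}
    for c, n_c in class_counts.items():
        log_prior = math.log(n_c / len(labels))
        log_lik = [[math.log((counts[c][j][s] + alpha) / (n_c + alpha * n_states))
                    for s in range(n_states)] for j in range(n_snps)]
        model[c] = (log_prior, log_lik)
    return model

def predict(model, genotype):
    """Return the class with the highest posterior log-score."""
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[j][s] for j, s in enumerate(genotype))
    return max(model, key=score)

# Hypothetical toy data: four samples, three SNPs, two groups.
train_x = [[0, 0, 1], [0, 1, 0], [2, 2, 1], [2, 1, 2]]
train_y = ["A", "A", "B", "B"]
model = train_naive_bayes(train_x, train_y)
```

    With real data, each SNP contributes one independent likelihood term, so adding more random SNPs simply adds more terms to the score.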

  17. Joint random beam and spectrum selection for spectrum sharing systems with partial channel state information

    KAUST Repository

    Abdallah, Mohamed M.

    2013-11-01

    In this work, we develop joint interference-aware random beam and spectrum selection schemes that provide enhanced performance for the secondary network under the condition that the interference observed at the primary receiver is below a predetermined acceptable value. We consider a secondary link composed of a transmitter equipped with multiple antennas and a single-antenna receiver sharing the same spectrum with a set of primary links composed of a single-antenna transmitter and a single-antenna receiver. The proposed schemes jointly select a beam, among a set of power-optimized random beams, as well as the primary spectrum that maximizes the signal-to-interference-plus-noise ratio (SINR) of the secondary link while satisfying the primary interference constraint. In particular, we consider the case where the interference level is described by a q-bit description of its magnitude, whereby we propose a technique to find the optimal quantizer thresholds in a mean square error (MSE) sense. © 2013 IEEE.

  18. Interference-aware random beam selection schemes for spectrum sharing systems

    KAUST Repository

    Abdallah, Mohamed

    2012-10-19

    Spectrum sharing systems have been recently introduced to alleviate the problem of spectrum scarcity by allowing secondary unlicensed networks to share the spectrum with primary licensed networks under acceptable interference levels to the primary users. In this work, we develop interference-aware random beam selection schemes that provide enhanced performance for the secondary network under the condition that the interference observed by the receivers of the primary network is below a predetermined/acceptable value. We consider a secondary link composed of a transmitter equipped with multiple antennas and a single-antenna receiver sharing the same spectrum with a primary link composed of a single-antenna transmitter and a single-antenna receiver. The proposed schemes select a beam, among a set of power-optimized random beams, that maximizes the signal-to-interference-plus-noise ratio (SINR) of the secondary link while satisfying the primary interference constraint for different levels of feedback information describing the interference level at the primary receiver. For the proposed schemes, we develop a statistical analysis for the SINR statistics as well as the capacity and bit error rate (BER) of the secondary link.
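
    The selection rule described in this abstract (pick the beam with the highest secondary-link SINR among those that respect the primary interference constraint) can be sketched as a simple filter-then-maximize step. This is a simplified illustration only: it assumes the SINR and the interference of each candidate beam are already known as plain numbers, it omits the power optimization and feedback quantization discussed in the abstract, and the function name and values are hypothetical.

```python
def select_beam(candidates, interference_limit):
    """Return the index of the feasible beam with the highest SINR, or None.

    candidates: list of (sinr, interference_at_primary) pairs, one per beam.
    A beam is feasible when its interference does not exceed the limit.
    """
    feasible = [(sinr, i) for i, (sinr, interference) in enumerate(candidates)
                if interference <= interference_limit]
    if not feasible:
        return None
    return max(feasible)[1]

# Hypothetical values: (secondary SINR, interference at the primary receiver).
beams = [(12.0, 0.8), (9.5, 0.2), (15.0, 1.4), (10.1, 0.4)]
best = select_beam(beams, interference_limit=0.5)
none_feasible = select_beam(beams, interference_limit=0.1)
```

    Here the beam with the highest raw SINR is rejected because it violates the constraint, and the best feasible beam is chosen instead.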

  19. Feature selection for outcome prediction in oesophageal cancer using genetic algorithm and random forest classifier.

    Science.gov (United States)

    Paul, Desbordes; Su, Ruan; Romain, Modzelewski; Sébastien, Vauclin; Pierre, Vera; Isabelle, Gardin

    2017-09-01

    The outcome prediction of patients can greatly help to personalize cancer treatment. A large number of quantitative features (clinical exams, imaging, …) are potentially useful to assess the patient outcome. The challenge is to choose the most predictive subset of features. In this paper, we propose a new feature selection strategy called GARF (genetic algorithm based on random forest), applied to features extracted from positron emission tomography (PET) images and clinical data. The most relevant features, predictive of the therapeutic response or prognostic of patient survival 3 years after the end of treatment, were selected using GARF on a cohort of 65 patients with locally advanced oesophageal cancer eligible for chemo-radiation therapy. The most relevant predictive results were obtained with a subset of 9 features leading to a random forest misclassification rate of 18±4% and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.823±0.032. The most relevant prognostic results were obtained with 8 features leading to an error rate of 20±7% and an AUC of 0.750±0.108. Both predictive and prognostic results show better performances using GARF than using 4 other studied methods. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Reducing animal sequencing redundancy by preferentially selecting animals with low-frequency haplotypes

    Science.gov (United States)

    Many studies leverage targeted whole genome sequencing (WGS) experiments in order to identify rare and causal variants within populations. As a natural consequence of experimental design, many of these surveys tend to sequence redundant haplotype segments due to high frequency in the base population...

  1. Optimal pseudorandom sequence selection for online c-VEP based BCI control applications

    DEFF Research Database (Denmark)

    Isaksen, Jonas L.; Mohebbi, Ali; Puthusserypady, Sadasivan

    2017-01-01

    Background: In a c-VEP BCI setting, test subjects can have highly varying performances when different pseudorandom sequences are applied as stimulus, and ideally, multiple codes should be supported. On the other hand, repeating the experiment with many different pseudorandom sequences is a labori...... predictor. Conclusions: The simple and fast method presented in this study as the Accuracy Score, allows c-VEP based BCI systems to support multiple pseudorandom sequences without increase in trial length. This allows for more personalized BCI systems with better performance to be tested without increased...

  2. Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data.

    Science.gov (United States)

    Jha, Pankaj; Lu, Dongsheng; Xu, Shuhua

    2015-01-01

    Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potentials. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection that typically indicates the functional potentials of certain genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) of Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variations of the noncoding genome within and between populations, and identified the natural selection footprints in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint of variations in NCS, while positive selection is more likely to be specific to some particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants, reported as risk alleles by genome-wide association studies, showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potentials that play important roles in disease etiology and human evolution.

  3. Genotypic detection of rifampicin and isoniazid resistant Mycobacterium tuberculosis strains by DNA sequencing: a randomized trial

    Directory of Open Access Journals (Sweden)

    El mashad Noha

    2009-01-01

    Full Text Available Background: Tuberculosis is a growing international health concern. It is the biggest killer among the infectious diseases in the world today. Early detection of drug resistance allows an appropriate treatment to be started. Resistance to drugs is due to particular genomic mutations in specific genes of Mycobacterium tuberculosis (MTB). The aim of this study was to identify the presence of isoniazid (INH) and rifampicin (RIF) drug resistance in new and previously treated tuberculosis (TB) cases using DNA sequencing. Methods: This study was carried out on 153 tuberculous patients with positive Bactec 460 culture for acid fast bacilli. Results: Of the 153 patients, 105 (68.6%) were new cases and 48 (31.4%) were previously treated cases. Drug susceptibility testing on Bactec revealed 50 cases resistant to one or more of the first-line antituberculous drugs. Genotypic analysis was done only for rifampicin resistant specimens (23 cases) and INH resistant specimens (26 cases) to detect mutations responsible for drug resistance by PCR amplification of the rpoB gene for rifampicin resistant cases and the KatG gene for isoniazid resistant cases. Finally, DNA sequencing was done for detection of mutations within the rpoB and KatG genes. Genotypic analysis of RIF resistant cases revealed that 20/23 cases (86.9%) of RIF resistance had an rpoB gene mutation versus 3 cases (13.1%) having no mutation, with a highly statistically significant difference between them (P Conclusion: We can conclude that rifampicin resistance could be used as a useful surrogate marker for estimation of multidrug resistance. In addition, the genotypic method was superior to the traditional phenotypic method, which is time-consuming, taking several weeks or longer.

  4. Partial Sequencing of 16S rRNA Gene of Selected Staphylococcus aureus Isolates and its Antibiotic Resistance

    Directory of Open Access Journals (Sweden)

    Harsi Dewantari Kusumaningrum

    2016-08-01

    Full Text Available The choice of primer used in 16S rRNA sequencing for identification of Staphylococcus species found in food is important. This study aimed to characterize Staphylococcus aureus isolates by partial sequencing based on the 16S rRNA gene employing primers 16sF, 63F or 1387R. The isolates were obtained from milk, egg dishes and chicken dishes and selected based on the presence of the sea gene, which is responsible for formation of enterotoxin A. Antibiotic susceptibility of the isolates towards six antibiotics was also tested. The use of 16sF generally resulted in higher identity percentage and query coverage compared to sequencing with 63F or 1387R. BLAST results of all isolates, sequenced by 16sF, showed 99% homology to the complete genomes of four S. aureus strains, with different characteristics on enterotoxin production and antibiotic resistance. Considering that all isolates were carrying the sea gene, indicated by the occurrence of a 120 bp amplicon after PCR amplification using primers SEA1/SEA2, the isolates were in closest agreement with S. aureus subsp. aureus ST288. This study indicated that 4 out of 8 selected isolates were resistant towards streptomycin. The 16S rRNA gene sequencing using 16sF is useful for identification of S. aureus. However, additional analysis, such as PCR employing a specific gene target, should provide valuable supplementary information when a specific characteristic is expected.

  5. Folded Proteins Occur Frequently in Libraries of Random Amino Acid Sequences

    Science.gov (United States)

    Davidson, Alan R.; Sauer, Robert T.

    1994-03-01

    A library of synthetic genes encoding 80- to 100-residue proteins composed mainly of random combinations of glutamine (Q), leucine (L), and arginine (R) has been expressed in Escherichia coli. These genes also encode an epitope tag and six carboxyl-terminal histidines. Screening of this library by immunoblotting showed that 5% of these QLR proteins are expressed at readily detectable levels. Three well-expressed QLR proteins were purified and characterized. Each of these proteins has significant α-helical content, is largely resistant to degradation by Pronase, and has a distinct oligomeric structure. In addition, one protein unfolds in a highly cooperative manner. These properties of the QLR proteins demonstrate that they possess folded structures with some native-like properties. The QLR proteins differ from most natural proteins, however, in being remarkably resistant to denaturant-induced and thermal-induced unfolding and in being relatively insoluble in the absence of denaturants.

  6. Fire detection system using random forest classification for image sequences of complex background

    Science.gov (United States)

    Kim, Onecue; Kang, Dong-Joong

    2013-06-01

    We present a fire alarm system based on image processing that detects fire accidents in various environments. To reduce false alarms that frequently appeared in earlier systems, we combined image features including color, motion, and blinking information. We specifically define the color conditions of fires in hue, saturation and value, and RGB color space. Fire features are represented as intensity variation, color mean and variance, motion, and image differences. Moreover, blinking fire features are modeled by using crossing patches. We propose an algorithm that classifies patches into fire or nonfire areas by using random forest supervised learning. We design an embedded surveillance device made with acrylonitrile butadiene styrene housing for stable fire detection in outdoor environments. The experimental results show that our algorithm works robustly in complex environments and is able to detect fires in real time.

  7. The sequencing of adverbial clauses of time in academic English: Random forest modelling

    Directory of Open Access Journals (Sweden)

    Abbas Ali Rezaee

    2016-12-01

    Full Text Available Adverbial clauses of time are positioned either before or after their associated main clauses. This study aims to assess the importance of discourse-pragmatics and processing-related constraints on the positioning of adverbial clauses of time in research articles of applied linguistics written by authors for whom English is considered a native language. Previous research has revealed that the ordering is co-determined by various factors from the domains of semantics and discourse-pragmatics (bridging, iconicity, and subordinator) and language processing (deranking, length, and complexity). This research conducts a multifactorial analysis on the motivators of the positioning of adverbial clauses of time in 100 research articles of applied linguistics. The study will use a random forest of conditional inference trees as the statistical technique to measure the weights of the aforementioned variables. It was found that iconicity and bridging, which are factors associated with discourse and semantics, are the two most salient predictors of clause ordering.

  8. Comparative genomics based on massive parallel transcriptome sequencing reveals patterns of substitution and selection across 10 bird species.

    Science.gov (United States)

    Künstner, Axel; Wolf, Jochen B W; Backström, Niclas; Whitney, Osceola; Balakrishnan, Christopher N; Day, Lainy; Edwards, Scott V; Janes, Daniel E; Schlinger, Barney A; Wilson, Richard K; Jarvis, Erich D; Warren, Wesley C; Ellegren, Hans

    2010-03-01

    Next-generation sequencing technology provides an attractive means to obtain large-scale sequence data necessary for comparative genomic analysis. To analyse the patterns of mutation rate variation and selection intensity across the avian genome, we performed brain transcriptome sequencing using Roche 454 technology of 10 different non-model avian species. Contigs from de novo assemblies were aligned to the two available avian reference genomes, chicken and zebra finch. In total, we identified 6499 different genes across all 10 species, with approximately 1000 genes found in each full run per species. We found evidence for a higher mutation rate of the Z chromosome than of autosomes (male-biased mutation) and a negative correlation between the neutral substitution rate (d(S)) and chromosome size. Analyses of the mean d(N)/d(S) ratio (omega) of genes across chromosomes supported the Hill-Robertson effect (the effect of selection at linked loci) and point at stochastic problems with omega as an independent measure of selection. Overall, this study demonstrates the usefulness of next-generation sequencing for obtaining genomic resources for comparative genomic analysis of non-model organisms.

  9. Specific and selective probes for Staphylococcus aureus from phage-displayed random peptide libraries.

    Science.gov (United States)

    De Plano, Laura M; Carnazza, Santina; Messina, Grazia M L; Rizzo, Maria Giovanna; Marletta, Giovanni; Guglielmino, Salvatore P P

    2017-09-01

    Staphylococcus aureus is a major human pathogen causing health care-associated and community-associated infections. Early diagnosis is essential to prevent disease progression and to reduce complications that can be serious. In this study, we selected, from a 9-mer phage peptide library, a phage clone displaying a peptide capable of specific binding to the S. aureus cell surface, namely St.au9IVS5 (peptide sequence RVRSAPSSS). The ability of the isolated phage clone to interact specifically with S. aureus and the efficacy of its bacteria-binding properties were established by using enzyme-linked immunosorbent assay (ELISA). We also demonstrated by Western blot analysis that the most reactive and selective phage peptide binds a 78 kDa protein on the bacterial cell surface. Furthermore, we observed selectivity of phage-bacteria binding, allowing identification of clinical isolates of S. aureus in comparison with a panel of other bacterial species. In order to explore the possibility of realizing a selective bacteria biosensor device, based on immobilization of affinity-selected phage, we have studied the physisorbed phage deposition onto a mica surface. Atomic Force Microscopy (AFM) was used to determine the organization of phage on the mica surface, and then the binding performance of mica-physisorbed phage to the bacterial target was evaluated over time by fluorescence microscopy. The system is able to bind specifically about 50% of S. aureus cells after 15 min and 90% after one hour. Due to its specificity and rapidity, this biosensing strategy paves the way to the further development of new cheap biosensors to be used in developing countries, as lab-on-chip (LOC) devices to detect bacterial agents in clinical diagnostics applications. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Probability distribution of intersymbol distances in random symbolic sequences: Applications to improving detection of keywords in texts and of amino acid clustering in proteins.

    Science.gov (United States)

    Carpena, Pedro; Bernaola-Galván, Pedro A; Carretero-Campos, Concepción; Coronado, Ana V

    2016-11-01

    Symbolic sequences have been extensively investigated in the past few years within the framework of statistical physics. Paradigmatic examples of such sequences are written texts, and deoxyribonucleic acid (DNA) and protein sequences. In these examples, the spatial distribution of a given symbol (a word, a DNA motif, an amino acid) is a key property usually related to the symbol importance in the sequence: The more uneven and far from random the symbol distribution, the higher the relevance of the symbol to the sequence. Thus, many techniques of analysis measure in some way the deviation of the symbol spatial distribution with respect to the random expectation. The problem is then to know the spatial distribution corresponding to randomness, which is typically considered to be either the geometric or the exponential distribution. However, these distributions are only valid for very large symbolic sequences and for many occurrences of the analyzed symbol. Here, we obtain analytically the exact, randomly expected spatial distribution valid for any sequence length and any symbol frequency, and we study its main properties. The knowledge of the distribution allows us to define a measure able to properly quantify the deviation from randomness of the symbol distribution, especially for short sequences and low symbol frequency. We apply the measure to the problem of keyword detection in written texts and to study amino acid clustering in protein sequences. In texts, we show how the results improve with respect to previous methods when short texts are analyzed. In proteins, which are typically short, we show how the measure quantifies unambiguously the amino acid clustering and characterize its spatial distribution.
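
    The comparison described in this abstract can be illustrated with a minimal sketch. It collects the distances between consecutive occurrences of a symbol and sets their observed frequencies against the geometric distribution, which the abstract names as the usual large-sequence approximation of the random expectation. The exact finite-size distribution derived by the authors is not given in the abstract and is not reproduced here; the function names are illustrative assumptions.

```python
from collections import Counter


def intersymbol_distances(sequence, symbol):
    """Return the distances between consecutive occurrences of `symbol`."""
    positions = [i for i, s in enumerate(sequence) if s == symbol]
    return [b - a for a, b in zip(positions, positions[1:])]


def geometric_pmf(d, p):
    """Probability of distance d (d >= 1) under the geometric approximation.

    This is only the large-sequence approximation, not the exact
    finite-size distribution obtained by the authors.
    """
    return p * (1.0 - p) ** (d - 1)


def compare_to_geometric(sequence, symbol):
    """Map each observed distance to (observed frequency, geometric expectation)."""
    distances = intersymbol_distances(sequence, symbol)
    if not distances:
        return {}
    # Symbol frequency estimated from the sequence itself (an assumption).
    p = sequence.count(symbol) / len(sequence)
    counts = Counter(distances)
    total = len(distances)
    return {d: (counts[d] / total, geometric_pmf(d, p)) for d in sorted(counts)}
```

    Large gaps between the two values in each pair suggest clustering or repulsion of the symbol; for short sequences and rare symbols the geometric reference is itself unreliable, which is the limitation the abstract addresses.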

  11. MR pulse sequences for selective relaxation time measurements: a phantom study

    DEFF Research Database (Denmark)

    Thomsen, C; Jensen, K E; Jensen, M

    1990-01-01

    The accuracy of relaxation time measurements of spectroscopic inversion recovery and CPMG multi-echo pulse sequences together with ISIS and stimulated echo-pulse methods have been tested on a reference phantom (test object no. 5, of the EEC Concerted Research Project). For the measurements...... a Siemens Magnetom wholebody magnetic resonance scanner operating at 1.5 Tesla was used. For comparison six imaging pulse sequences for relaxation time measurements were tested on the same phantom. The spectroscopic pulse sequences all had an accuracy better than 10% of the reference values....

  12. Improvements of the DANTE-Z sequence for band-selective excitation: application to multidimensional NMR spectroscopy

    Energy Technology Data Exchange (ETDEWEB)

    Roumestand, C.; Toma, F. (CEA Centre d' Etudes de Saclay, 91 - Gif-sur-Yvette (France). Dept. d' Ingenierie et d' Etudes des Proteines)

    1994-06-01

    New developments of the DANTE-Z sequence (1) are presented. In particular, improvements of the shape of the excitation profile are described. The easy implementation of DANTE-Z in the classic multidimensional homo- or heteronuclear experiments, and its numerous advantages (clean excitation profile, absence of phase gradient and of amplitude distortions, no need of instrumental adjustments...) make this sequence the easiest way to perform band-selective excitation in NMR spectroscopy. Most of these modifications have been checked on a sample of protein (toxin γ from Naja nigricollis (2)) dissolved in water. (authors). 13 refs., 4 figs.

  13. Selection of a Novel Aptamer Against Vitronectin Using Capillary Electrophoresis and Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Christopher H Stuart

    2016-01-01

    Full Text Available Breast cancer (BC) results in ≃40,000 deaths each year in the United States, and even among survivors treatment of the disease may have devastating consequences, including increased risk for heart disease and cognitive impairment resulting from the toxic effects of chemotherapy. Aptamer-mediated drug delivery can contribute to improved treatment outcomes through the selective delivery of chemotherapy to BC cells, provided suitable cancer-specific antigens can be identified. We report here the use of capillary electrophoresis in conjunction with next generation sequencing to develop VBA-01, the first aptamer to vitronectin (VN; Kd = 405 nmol/l), a protein that plays an important role in wound healing and that is present at elevated levels in BC tissue and in the blood of BC patients relative to the corresponding nonmalignant tissues. We used VBA-01 to develop DVBA-01, a dimeric aptamer complex, and conjugated doxorubicin (Dox) to DVBA-01 (7:1 ratio) using pH-sensitive, covalent linkages. Dox conjugation enhanced the thermal stability of the complex (60.2 versus 46.5°C) and did not decrease affinity for the VN target. The resulting DVBA-01-Dox complex displayed increased cytotoxicity to MDA-MB-231 BC cells that were cultured on plasticware coated with VN (1.8 × 10⁻⁶ mol/l) relative to uncoated plates (2.4 × 10⁻⁶ mol/l), or plates coated with the related protein fibronectin (2.1 × 10⁻⁶ mol/l). The VBA-01 aptamer was evaluated for binding to human BC tissue using immunohistochemistry and displayed tissue-specific binding and apparent association with BC cells. In contrast, a monoclonal antibody that preferentially binds to multimeric VN primarily stained extracellular matrix and vessel walls of BC tissue. Our results indicate a strong potential for using VN-targeting aptamers to improve drug delivery to treat BC.

  14. Multiplexed Spliced-Leader Sequencing: A high-throughput, selective method for RNA-seq in Trypanosomatids.

    Science.gov (United States)

    Cuypers, Bart; Domagalska, Malgorzata A; Meysman, Pieter; Muylder, Géraldine de; Vanaerschot, Manu; Imamura, Hideo; Dumetz, Franck; Verdonckt, Thomas Wolf; Myler, Peter J; Ramasamy, Gowthaman; Laukens, Kris; Dujardin, Jean-Claude

    2017-06-16

    High throughput sequencing techniques are poorly adapted for in vivo studies of parasites, which require prior in vitro culturing and purification. Trypanosomatids, a group of kinetoplastid protozoans, possess a distinctive feature in their transcriptional mechanism whereby a specific Spliced Leader (SL) sequence is added to the 5' end of each mRNA by trans-splicing. This allows Trypanosomatid RNA to be discriminated from mammalian RNA and forms the basis of our new multiplexed protocol for high-throughput, selective RNA-sequencing called SL-seq. We provided a proof-of-concept of SL-seq in Leishmania donovani, the main causative agent of visceral leishmaniasis in humans, and successfully applied the method to sequence Leishmania mRNA directly from infected macrophages and from highly diluted mixes with human RNA. mRNA profiles obtained with SL-seq corresponded largely to those obtained from conventional poly-A tail purification methods, indicating that both enumerate the same mRNA pool. However, SL-seq offers additional advantages, including lower sequencing depth requirements, fast and simple library prep and high resolution splice site detection. SL-seq is therefore ideal for fast and massively parallel sequencing of parasite transcriptomes directly from host tissues. Since SLs are also present in Nematodes, Cnidaria and primitive chordates, this method could also have high potential for transcriptomics studies in other organisms.

  15. MELCOR 1.8.2 calculations of selected sequences for the ABWR

    Energy Technology Data Exchange (ETDEWEB)

    Kmetyk, L.N.

    1994-07-01

    This report summarizes the results from MELCOR calculations of severe accident sequences in the ABWR and presents comparisons with MAAP calculations for the same sequences. MELCOR was run for two low-pressure and three high-pressure sequences to identify the materials which enter containment and are available for release to the environment (source terms), to study the potential effects of core-concrete interaction, and to obtain event timings during each sequence; the source terms include fission products and other materials such as those generated by core-concrete interactions. Sensitivity studies were done on the impact of assuming limestone rather than basaltic concrete and on the effect of quenching core debris in the cavity compared to having hot, unquenched debris present.

  16. Severe accident source term characteristics for selected Peach Bottom sequences predicted by the MELCOR Code

    Energy Technology Data Exchange (ETDEWEB)

    Carbajo, J.J. [Oak Ridge National Lab., TN (United States)

    1993-09-01

    The purpose of this report is to compare in-containment source terms developed for NUREG-1159, which used the Source Term Code Package (STCP), with those generated by MELCOR to identify significant differences. For this comparison, two short-term depressurized station blackout sequences (with a dry cavity and with a flooded cavity) and a Loss-of-Coolant Accident (LOCA) concurrent with complete loss of the Emergency Core Cooling System (ECCS) were analyzed for the Peach Bottom Atomic Power Station (a BWR-4 with a Mark I containment). The results indicate that for the sequences analyzed, the two codes predict similar total in-containment release fractions for each of the element groups. However, the MELCOR/CORBH Package predicts significantly longer times for vessel failure and reduced energy of the released material for the station blackout sequences (when compared to the STCP results). MELCOR also calculated smaller releases into the environment than STCP for the station blackout sequences.

  17. Bayesian Selection of Markov Models for Symbol Sequences: Application to Microsaccadic Eye Movements

    Science.gov (United States)

    Bettenbühl, Mario; Rusconi, Marco; Engbert, Ralf; Holschneider, Matthias

    2012-01-01

    Complex biological dynamics often generate sequences of discrete events which can be described as a Markov process. The order of the underlying Markovian stochastic process is fundamental for characterizing statistical dependencies within sequences. As an example for this class of biological systems, we investigate the Markov order of sequences of microsaccadic eye movements from human observers. We calculate the integrated likelihood of a given sequence for various orders of the Markov process and use this in a Bayesian framework for statistical inference on the Markov order. Our analysis shows that data from most participants are best explained by a first-order Markov process. This is compatible with recent findings of a statistical coupling of subsequent microsaccade orientations. Our method might prove to be useful for a broad class of biological systems. PMID:22970124
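
    The order-selection idea in this abstract can be sketched as follows. The abstract does not state the prior used by the authors, so this sketch assumes a symmetric Dirichlet prior (parameter `alpha`) on each row of transition probabilities and a uniform prior over orders, in which case the most probable order is the one with the largest integrated likelihood. All names and defaults are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from math import lgamma


def log_integrated_likelihood(seq, order, alphabet, alpha=1.0, skip=None):
    """Log marginal likelihood of seq[skip:] under a Markov model of given order.

    Assumes a symmetric Dirichlet(alpha) prior for each context (a labeled
    assumption). `skip` must be >= order; using the same skip for every
    order makes the scores comparable.
    """
    skip = order if skip is None else skip
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(skip, len(seq)):
        context = tuple(seq[i - order:i])  # empty tuple when order == 0
        counts[context][seq[i]] += 1
    a = len(alphabet)
    log_ml = 0.0
    for ctx_counts in counts.values():
        n = sum(ctx_counts.values())
        # Dirichlet-multinomial marginal for one context.
        log_ml += lgamma(a * alpha) - lgamma(a * alpha + n)
        for sym in alphabet:
            log_ml += lgamma(alpha + ctx_counts.get(sym, 0)) - lgamma(alpha)
    return log_ml


def select_order(seq, max_order, alpha=1.0):
    """Return (best order, scores) assuming a uniform prior over orders."""
    alphabet = sorted(set(seq))
    scores = {
        k: log_integrated_likelihood(seq, k, alphabet, alpha, skip=max_order)
        for k in range(max_order + 1)
    }
    return max(scores, key=scores.get), scores
```

    For a strictly alternating sequence such as `"AB" * 100`, the order-1 score is far higher than the order-0 score, because each symbol is fully determined by its predecessor.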

  18. Selection pressure from neutralizing antibodies drives sequence evolution during acute infection with hepatitis C virus.

    Science.gov (United States)

    Dowd, Kimberly A; Netski, Dale M; Wang, Xiao-Hong; Cox, Andrea L; Ray, Stuart C

    2009-06-01

    Despite recent characterization of hepatitis C virus-specific neutralizing antibodies, it is not clear to what extent immune pressure from neutralizing antibodies drives viral sequence evolution in vivo. This lack of understanding is particularly evident in acute infection, the phase when elimination or persistence of viral replication is determined and during which the importance of the humoral immune response has been largely discounted. We analyzed envelope glycoprotein sequence evolution and neutralization of sequential autologous hepatitis C virus pseudoparticles in 8 individuals throughout acute infection. Amino acid substitutions occurred throughout the envelope genes, primarily within the hypervariable region 1 of E2. When individualized pseudoparticles expressing sequential envelope sequences were used to measure neutralization by autologous sera, antibodies neutralizing earlier sequence variants were detected at earlier time points than antibodies neutralizing later variants, indicating clearance and evolution of viral variants in response to pressure from neutralizing antibodies. To demonstrate the effects of amino acid substitution on neutralization, site-directed mutagenesis of a pseudoparticle envelope sequence revealed amino acid substitutions in hypervariable region 1 that were responsible for a dramatic decrease in neutralization sensitivity over time. In addition, high-titer neutralizing antibodies peaked at the time of viral clearance in all spontaneous resolvers, whereas chronically evolving subjects displayed low-titer or absent neutralizing antibodies throughout early acute infection. These findings indicate that, during acute hepatitis C virus infection in vivo, virus-specific neutralizing antibodies drive sequence evolution and, in some individuals, play a role in determining the outcome of infection.

  19. Does the Use of a Decision Aid Improve Decision Making in Prosthetic Heart Valve Selection? A Multicenter Randomized Trial

    NARCIS (Netherlands)

    Korteland, Nelleke M.; Ahmed, Yunus; Koolbergen, David R.; Brouwer, Marjan; de Heer, Frederiek; Kluin, Jolanda; Bruggemans, Eline F.; Klautz, Robert J. M.; Stiggelbout, Anne M.; Bucx, Jeroen J. J.; Roos-Hesselink, Jolien W.; Polak, Peter; Markou, Thanasie; van den Broek, Inge; Ligthart, Rene; Bogers, Ad J. J. C.; Takkenberg, Johanna J. M.

    2017-01-01

    A Dutch online patient decision aid to support prosthetic heart valve selection was recently developed. A multicenter randomized controlled trial was conducted to assess whether use of the patient decision aid results in optimization of shared decision making in prosthetic heart valve selection. In

  20. Molecular Analysis of Date Palm Genetic Diversity Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeats (ISSRs).

    Science.gov (United States)

    El Sharabasy, Sherif F; Soliman, Khaled A

    2017-01-01

    The date palm is an ancient domesticated plant with great diversity and has been cultivated in the Middle East and North Africa for at least 5000 years. Date palm cultivars are classified based on the fruit moisture content, as dry, semidry, and soft dates. There are a number of biochemical and molecular techniques available for characterization of the date palm variation. This chapter focuses on the DNA-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeats (ISSR) techniques, in addition to biochemical markers based on isozyme analysis. These techniques, coupled with appropriate statistical tools, proved useful for determining phylogenetic relationships among date palm cultivars and provide information resources for date palm gene banks.

  1. H-DROP: an SVM based helical domain linker predictor trained with features optimized by combining random forest and stepwise selection.

    Science.gov (United States)

    Ebina, Teppei; Suzuki, Ryosuke; Tsuji, Ryotaro; Kuroda, Yutaka

    2014-08-01

    Domain linker prediction is attracting much interest as it can help identify novel domains suitable for high throughput proteomics analysis. Here, we report H-DROP, an SVM-based Helical Domain linker pRediction using OPtimal features. H-DROP is, to the best of our knowledge, the first predictor for specifically and effectively identifying helical linkers. This was made possible first because a large training dataset became available from IS-Dom, and second because we selected a small number of optimal features from a huge number of potential ones. The training helical linker dataset, which included 261 helical linkers, was constructed by detecting helical residues at the boundary regions of two independent structural domains listed in our previously reported IS-Dom dataset. 45 optimal feature candidates were selected from 3,000 features by random forest, which were further reduced to 26 optimal features by stepwise selection. The prediction sensitivity and precision of H-DROP were 35.2 and 38.8%, respectively. These values were over 10.7% higher than those of control methods including our previously developed DROP, which is a coil linker predictor, and PPRODO, which is trained with un-differentiated domain boundary sequences. Overall, these results indicated that helical linkers can be predicted from sequence information alone by using a strictly curated training data set for helical linkers and a carefully selected set of optimal features. H-DROP is available at http://domserv.lab.tuat.ac.jp.

  2. A large-scale study of the random variability of a coding sequence: a study on the CFTR gene.

    Science.gov (United States)

    Modiano, Guido; Bombieri, Cristina; Ciminelli, Bianca Maria; Belpinati, Francesca; Giorgi, Silvia; Georges, Marie des; Scotet, Virginie; Pompei, Fiorenza; Ciccacci, Cinzia; Guittard, Caroline; Audrézet, Marie Pierre; Begnini, Angela; Toepfer, Michael; Macek, Milan; Ferec, Claude; Claustres, Mireille; Pignatti, Pier Franco

    2005-02-01

    Coding single nucleotide substitutions (cSNSs) have been studied on hundreds of genes using small samples (n(g) approximately 100-150 genes). In the present investigation, a large random European population sample (average n(g) approximately 1500) was studied for a single gene, the CFTR (Cystic Fibrosis Transmembrane conductance Regulator). The nonsynonymous (NS) substitutions exhibited, in accordance with previous reports, a mean probability of being polymorphic (q > 0.005), much lower than that of the synonymous (S) substitutions, but they showed a similar rate of subpolymorphic (q < 0.005) variability. This indicates that, in autosomal genes that may have harmful recessive alleles (nonduplicated genes with important functions), genetic drift overwhelms selection in the subpolymorphic range of variability, making disadvantageous alleles behave as neutral. These results imply that the majority of the subpolymorphic nonsynonymous alleles of these genes are selectively negative or even pathogenic.

  3. Cooperation of deterministic dynamics and random noise in production of complex syntactical avian song sequences: a neural network model

    Directory of Open Access Journals (Sweden)

    Yuichi Yamashita

    2011-04-01

    Full Text Available How the brain learns and generates temporal sequences is a fundamental issue in neuroscience. The production of birdsongs, a process which involves complex learned sequences, provides researchers with an excellent biological model for this topic. The Bengalese finch in particular learns a highly complex song with syntactical structure. The nucleus HVC (HVC), a premotor nucleus within the avian song system, plays a key role in generating the temporal structures of their songs. From lesion studies, the nucleus interfacialis (NIf) projecting to the HVC is considered one of the essential regions that contribute to the complexity of their songs. However, the types of interaction between the HVC and the NIf that can produce complex syntactical songs remain unclear. In order to investigate the function of interactions between the HVC and NIf, we have proposed a neural network model based on previous biological evidence. The HVC is modeled by a recurrent neural network (RNN) that learns to generate temporal patterns of songs. The NIf is modeled as a mechanism that provides auditory feedback to the HVC and generates random noise that feeds into the HVC. The model showed that complex syntactical songs can be replicated by simple interactions between deterministic dynamics of the RNN and random noise. In the current study, the plausibility of the model is tested by comparing the changes in the songs of actual birds induced by pharmacological inhibition of the NIf with the changes in the songs produced by the model resulting from modification of parameters representing NIf functions. The efficacy of the model demonstrates that the changes of songs induced by pharmacological inhibition of the NIf can be interpreted as a trade-off between the effects of noise and the effects of feedback on the dynamics of the RNN of the HVC. These facts suggest that the current model provides a convincing hypothesis for the functional role of NIf-HVC interaction.

  4. Selective outcome reporting and sponsorship in randomized controlled trials in IVF and ICSI.

    Science.gov (United States)

    Braakhekke, M; Scholten, I; Mol, F; Limpens, J; Mol, B W; van der Veen, F

    2017-10-01

    Are randomized controlled trials (RCTs) on IVF and ICSI subject to selective outcome reporting and is this related to sponsorship? There are inconsistencies, independent from sponsorship, in the reporting of primary outcome measures in the majority of IVF and ICSI trials, indicating selective outcome reporting. RCTs are subject to bias at various levels. Of these biases, selective outcome reporting is particularly relevant to IVF and ICSI trials since there is a wide variety of outcome measures to choose from. An established cause of reporting bias is sponsorship. It is, at present, unknown whether RCTs in IVF/ICSI are subject to selective outcome reporting and whether this is related to sponsorship. We systematically searched RCTs on IVF and ICSI published between January 2009 and March 2016 in MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials and the publisher subset of PubMed. We analysed 415 RCTs. Per included RCT, we extracted data on impact factor of the journal, sample size, power calculation, and trial registry and thereafter data on primary outcome measure, the direction of trial results and sponsorship. Of the 415 identified RCTs, 235 were excluded from our primary analysis, because the sponsorship was not reported. Of the 180 RCTs included in our analysis, 7 trials did not report on any primary outcome measure and 107 of the remaining 173 trials (62%) reported on surrogate primary outcome measures. Of the 114 registered trials, 21 trials (18%) provided primary outcomes in their manuscript that were different from those in the trial registry. This indicates selective outcome reporting. We found no association between selective outcome reporting and sponsorship. We ran additional analyses to include the trials that had not reported sponsorship and found no outcomes that differed from our primary analysis. Since the majority of the trials did not report on sponsorship, there is a risk of sampling bias. IVF and ICSI trials are subject, to

  5. Characterization of the complete mitochondrial genome sequence of Homalogaster paloniae (Gastrodiscidae, Trematoda) and comparative analyses with selected digeneans.

    Science.gov (United States)

    Yang, Xin; Wang, Lixia; Feng, Hanli; Qi, Mingwei; Zhang, Zongze; Gao, Chong; Wang, Chunqun; Hu, Min; Fang, Rui; Li, Chengye

    2016-10-01

    Gastrodiscidae species are neglected but significant paramphistomes in small ruminants, which can lead to considerable economic losses to the livestock breeding industry. However, knowledge about their molecular ecology, population genetics, and phylogenetics is still limited. In the present study, we sequenced and analyzed, for the first time, the full mitochondrial (mt) genome of Homalogaster paloniae (14,490 bp). The gene content and organization of the H. paloniae mt genome are the same as those of other digeneans, such as Fasciola hepatica and Paramphistomum cervi. Interestingly, unlike other paramphistomes, H. paloniae is flat in shape, which is similar to Fasciola species such as F. hepatica. Phylogenetic analysis of H. paloniae and 17 other selected digeneans using concatenated amino acid sequences of the 12 protein-coding genes showed that Gastrodiscidae is closely related to Paramphistomidae and Gastrothylacidae. The availability of the mt genome sequence of H. paloniae should provide an important foundation for further molecular study of Gastrodiscidae and other digeneans.

  6. Active classifier selection for RGB-D object categorization using a Markov random field ensemble method

    Science.gov (United States)

    Durner, Maximilian; Márton, Zoltán; Hillenbrand, Ulrich; Ali, Haider; Kleinsteuber, Martin

    2017-03-01

    In this work, a new ensemble method for the task of category recognition in different environments is presented. The focus is on service robotic perception in an open environment, where the robot's task is to recognize previously unseen objects of predefined categories, based on training on a public dataset. We propose an ensemble learning approach to be able to flexibly combine complementary sources of information (different state-of-the-art descriptors computed on color and depth images), based on a Markov Random Field (MRF). By exploiting its specific characteristics, the MRF ensemble method can also be executed as a Dynamic Classifier Selection (DCS) system. In the experiments, the committee- and topology-dependent performance boost of our ensemble is shown. Despite reduced computational costs and using less information, our strategy performs on the same level as common ensemble approaches. Finally, the impact of large differences between datasets is analyzed.

  7. Clinical outcome of intracytoplasmic injection of spermatozoa morphologically selected under high magnification: a prospective randomized study.

    Science.gov (United States)

    Balaban, Basak; Yakin, Kayhan; Alatas, Cengiz; Oktem, Ozgur; Isiklar, Aycan; Urman, Bulent

    2011-05-01

    Recent evidence shows that the selection of spermatozoa based on the analysis of morphology under high magnification (×6000) may have a positive impact on embryo development in cases with severe male factor infertility and/or previous implantation failures. The objective of this prospective randomized study was to compare the clinical outcome of 87 intracytoplasmic morphologically selected sperm injection (IMSI) cycles with 81 conventional intracytoplasmic sperm injection (ICSI) cycles in an unselected infertile population. IMSI did not provide a significant improvement in the clinical outcome compared with ICSI although there were trends for higher implantation (28.9% versus 19.5%), clinical pregnancy (54.0% versus 44.4%) and live birth rates (43.7% versus 38.3%) in the IMSI group. However, severe male factor patients benefited from the IMSI procedure as shown by significantly higher implantation rates compared with their counterparts in the ICSI group (29.6% versus 15.2%, P=0.01). These results suggest that IMSI may improve IVF success rates in a selected group of patients with male factor infertility. New technological developments enable the real time examination of motile spermatozoa with an inverted light microscope equipped with high-power differential interference contrast optics, enhanced by digital imaging. High magnification (over ×6000) provides the identification of spermatozoa with a normal nucleus and nuclear content. Intracytoplasmic injection of spermatozoa selected according to fine nuclear morphology under high magnification may improve the clinical outcome in cases with severe male factor infertility. Copyright © 2010 Reproductive Healthcare Ltd. Published by Elsevier Ltd. All rights reserved.

  8. A Comparison of Dietary Habits between Recreational Runners and a Randomly Selected Adult Population in Slovenia.

    Science.gov (United States)

    Škof, Branko; Rotovnik Kozjek, Nada

    2015-09-01

    The aim of the study was to compare the dietary habits of recreational runners with those of a random sample of the general population. We also wanted to determine the influence of gender, age and sports performance of recreational runners on their basic diet and compliance with recommendations in sports nutrition. The study population consisted of 1,212 adult Slovenian recreational runners and 774 randomly selected residents of Slovenia between the ages of 18 and 65 years. The data on the dietary habits of our subjects were gathered by means of two questionnaires. The following parameters were evaluated: the type of diet, food pattern, the frequency of consumption of individual food groups, the use of dietary supplements, fluid intake, and alcohol consumption. Recreational runners had better compliance with recommendations for healthy nutrition than the general population. This pattern increased with the runner's age and performance level. Compared to male runners, female runners ate more regularly and had a more frequent consumption of food groups associated with a healthy diet (fruit, vegetables, whole grain foods, and low-fat dairy products). The consumption of simple sugars and use of nutritional supplements by well-trained runners was not in line with the values recommended for physically active individuals. Recreational runners are an exemplary population group that actively seeks to adopt a healthier lifestyle.

  9. Radiographic methods used before removal of mandibular third molars among randomly selected general dental clinics.

    Science.gov (United States)

    Matzen, Louise H; Petersen, Lars B; Wenzel, Ann

    2016-01-01

    To assess radiographic methods and diagnostically sufficient images used before removal of mandibular third molars among randomly selected general dental clinics. Furthermore, to assess factors predisposing for an additional radiographic examination. 2 observers visited 18 randomly selected clinics in Denmark and studied patient files, including radiographs of patients who had their mandibular third molar(s) removed. The radiographic unit and type of receptor were registered. A diagnostically sufficient image was defined as the whole tooth and mandibular canal were displayed in the radiograph (yes/no). Overprojection between the tooth and mandibular canal (yes/no) and patient-reported inferior alveolar nerve sensory disturbances (yes/no) were recorded. Regression analyses tested if overprojection between the third molar and the mandibular canal and an insufficient intraoral image predisposed for additional radiographic examination(s). 1500 mandibular third molars had been removed; 1090 had intraoral, 468 had panoramic and 67 had CBCT examination. 1000 teeth were removed after an intraoral examination alone, 433 after panoramic examination and 67 after CBCT examination. 90 teeth had an additional examination after intraoral. Overprojection between the tooth and mandibular canal was a significant factor (p < 0.001, odds ratio = 3.56) for an additional examination. 63.7% of the intraoral images were sufficient and 36.3% were insufficient, with no significant difference between images performed with phosphor plates and solid-state sensors (p = 0.6). An insufficient image predisposed for an additional examination (p = 0.008, odds ratio = 1.8) but was only performed in 11% of the cases. Most mandibular third molars were removed based on an intraoral examination although 36.3% were insufficient.

  10. In vitro selection of external guide sequences for directing human RNase P to cleave a target mRNA.

    Science.gov (United States)

    Raj, Stephen; Liu, Fenyong

    2004-01-01

    External guide sequences (EGSs) are oligonucleotides that consist of a sequence that is complementary to a target mRNA and recruit intracellular RNase P for specific degradation of the target RNA. Recent studies indicate that increasing the targeting activity of EGSs in directing human RNase P to cleave an mRNA in vitro can lead to better efficacies of the EGSs in inducing RNase P-mediated inhibition of the expression of the target mRNA in cultured cells. This chapter will describe the procedure for the generation of highly functional EGSs by in vitro selection. We also describe protocols for in vitro evaluation of the activity of the EGSs. These methods should provide general guidelines for using in vitro selection for generating highly active EGSs for gene-targeting applications.

  11. Quantitative profiling of selective Sox/POU pairing on hundreds of sequences in parallel by Coop-seq.

    Science.gov (United States)

    Chang, Yiming K; Srivastava, Yogesh; Hu, Caizhen; Joyce, Adam; Yang, Xiaoxiao; Zuo, Zheng; Havranek, James J; Stormo, Gary D; Jauch, Ralf

    2017-01-25

    Cooperative binding of transcription factors is known to be important in the regulation of gene expression programs conferring cellular identities. However, current methods to measure cooperativity parameters have been laborious and therefore limited to studying only a few sequence variants at a time. We developed Coop-seq (cooperativity by sequencing) that is capable of efficiently and accurately determining the cooperativity parameters for hundreds of different DNA sequences in a single experiment. We apply Coop-seq to 12 dimer pairs from the Sox and POU families of transcription factors using 324 unique sequences with changed half-site orientation, altered spacing and discrete randomization within the binding elements. The study reveals specific dimerization profiles of different Sox factors with Oct4. By contrast, Oct4 and the three neural class III POU factors Brn2, Brn4 and Oct6 assemble with Sox2 in a surprisingly indistinguishable manner. Two novel half-site configurations can support functional Sox/Oct dimerization in addition to known composite motifs. Moreover, Coop-seq uncovers a nucleotide switch within the POU half-site when spacing is altered, which is mirrored in genomic loci bound by Sox2/Oct4 complexes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. A Developmental and Sequenced One-to-One Educational Intervention (DS1-EI) for autism spectrum disorder: a randomized single-blind controlled trial.

    Directory of Open Access Journals (Sweden)

    TANET Antoine

    2016-09-01

    Full Text Available Introduction: Individuals with Autism Spectrum Disorder (ASD) who also exhibit severe to moderate ranges of intellectual disability (ID) still face many challenges (i.e. fewer evidence-based trials, less inclusion in school with peers). Methods: We implemented a novel model called the Developmental and Sequenced One-to-One Educational Intervention (DS1-EI) in 5-9-year-old children with co-occurring ASD and ID. The treatment protocol was adapted for school implementation by designing it using an educational agenda. The intervention was based on intensity, regular assessments, updating objectives, encouraging spontaneous communication, promoting skills through play with peers, supporting positive behaviours, providing supervision, capitalizing on teachers’ unique skills, and providing developmental and sequenced learning. Developmental learning implies that the focus of training is what is close to the developmental expectations given a child’s development in a specific domain. Sequenced learning means that the teacher changes the learning activities every 10-15 minutes to maintain the child’s attention in the context of an anticipated time agenda. We selected 11 French institutions in which we implemented the model in small classrooms. Each institution recruited participants per dyads matched by age, sex and developmental quotient. Patients from each dyad were then randomized to a DS1-EI group or a Treatment as usual (TAU) group for 36 months. The primary variables, the Childhood Autism Rating Scale (CARS) and the psychoeducational profile (PEP-3), will be blindly assessed by independent raters at the 18-month and 36-month follow-up. Discussion and baseline description: We enrolled 75 participants: 38 were randomized to the DS1-EI and 37 to the TAU groups. At enrolment, we found no significant differences in participants’ characteristics between groups. As expected, exposure to school was the only significant difference (9.4 (±4.1) h/week in

  13. The Primary Sequence of Acetylcholinesterase and Selective Antibodies for the Detection of Organophosphate Toxicity

    Science.gov (United States)

    1989-11-30

    Gene Family. Sequence identities come from published sequences (refs. 12-18) and data for bovine AChE are from B.P. Doctor and reflect ~85% of the...base, and incubated 2 h under N2 at 50°C with a 2-fold M excess of dithiothreitol over estimated total cysteine residues. To label cysteines, [14C] iodo ...californica acetylcholinesterase 8 by coupling to bovine serum albumin or encapsulation into liposomes containing lipid A as an adjuvant prior to immunization

  14. Private Selective Sweeps Identified from Next-Generation Pool-Sequencing Reveal Convergent Pathways under Selection in Two Inbred Schistosoma mansoni Strains

    Science.gov (United States)

    Clément, Julie A. J.; Toulza, Eve; Gautier, Mathieu; Parrinello, Hugues; Roquis, David; Boissier, Jérôme; Rognon, Anne; Moné, Hélène; Mouahid, Gabriel; Buard, Jérôme; Mitta, Guillaume; Grunau, Christoph

    2013-01-01

    Background: The trematode flatworms of the genus Schistosoma, the causative agents of schistosomiasis, are among the most prevalent parasites in humans, affecting more than 200 million people worldwide. In this study, we focused on two well-characterized strains of S. mansoni, to explore signatures of selection. Both strains are highly inbred and exhibit differences in life history traits, in particular in their compatibility with the intermediate host Biomphalaria glabrata. Methodology/Principal Findings: We performed high throughput sequencing of DNA from pools of individuals of each strain using Illumina technology and identified single nucleotide polymorphisms (SNP) and copy number variations (CNV). In total, 708,898 SNPs were identified and roughly 2,000 CNVs. The SNPs revealed low nucleotide diversity (π = 2×10⁻⁴) within each strain and a high differentiation level (Fst = 0.73) between them. Based on a recently developed in-silico approach, we further detected 12 and 19 private (i.e. specific non-overlapping) selective sweeps among the 121 and 151 sweeps found in total for each strain. Conclusions/Significance: Functional annotation of transcripts lying in the private selective sweeps revealed specific selection for functions related to parasitic interaction (e.g. cell-cell adhesion or redox reactions). Despite high differentiation between strains, we identified evolutionary convergence of genes related to proteolysis, known as a key virulence factor and a potential target of drug and vaccine development. Our data show that pool-sequencing can be used for the detection of selective sweeps in parasite populations and enables one to identify biological functions under selection. PMID:24349597

  15. Automatic sequences

    CERN Document Server

    Haeseler, Friedrich

    2003-01-01

    Automatic sequences are sequences that are produced by a finite automaton. Although they are not random, they may look random. They are complicated in the sense of not being ultimately periodic, and they may look rather complicated in the sense that it may not be easy to name the rule by which the sequence is generated; nevertheless, a rule that generates the sequence exists. The concept of automatic sequences has applications in algebra, number theory, finite automata and formal languages, and combinatorics on words. The text deals with different aspects of automatic sequences, in particular: a general introduction to automatic sequences; the basic (combinatorial) properties of automatic sequences; the algebraic approach to automatic sequences; and geometric objects related to automatic sequences.
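
    As a minimal sketch of the idea described in this record, the Thue-Morse sequence (a standard example of a 2-automatic sequence, chosen here for illustration and not taken from the book itself) can be generated by feeding the binary digits of n to a two-state automaton. The function name `thue_morse` and the transition table layout are assumptions made for this example.

```python
def thue_morse(n: int) -> int:
    """Return the n-th Thue-Morse term by running a 2-state automaton
    over the binary digits of n (most significant digit first)."""
    # transition[state][digit]: reading a 1 flips the state, reading a 0 keeps it.
    transition = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
    state = 0  # start state; the output of the automaton is its final state
    for digit in bin(n)[2:]:
        state = transition[state][int(digit)]
    return state


# First 16 terms: deterministic and rule-generated, yet not ultimately periodic.
prefix = [thue_morse(n) for n in range(16)]
print(prefix)
```

    The output looks irregular at first sight, but every term follows from the fixed transition table, which is the sense in which such sequences are "not random but may look random".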

  16. Cost-effective HRMA pre-sequence typing of clone libraries; application to phage display selection

    National Research Council Canada - National Science Library

    Pepers, Barry A; Schut, Menno H; Vossen, Rolf Ham; van Ommen, Gert-Jan B; den Dunnen, Johan T; van Roon-Mom, Willeke Mc

    2009-01-01

    Methodologies like phage display selection, in vitro mutagenesis and the determination of allelic expression differences include steps where large numbers of clones need to be compared and characterised...

  17. Clinical Application of Targeted Deep Sequencing in Solid-Cancer Patients and Utility for Biomarker-Selected Clinical Trials.

    Science.gov (United States)

    Kim, Seung Tae; Kim, Kyoung-Mee; Kim, Nayoung K D; Park, Joon Oh; Ahn, Soomin; Yun, Jae-Won; Kim, Kyu-Tae; Park, Se Hoon; Park, Peter J; Kim, Hee Cheol; Sohn, Tae Sung; Choi, Dong Il; Cho, Jong Ho; Heo, Jin Seok; Kwon, Wooil; Lee, Hyuk; Min, Byung-Hoon; Hong, Sung No; Park, Young Suk; Lim, Ho Yeong; Kang, Won Ki; Park, Woong-Yang; Lee, Jeeyun

    2017-10-01

    Molecular profiling of actionable mutations in refractory cancer patients has the potential to enable "precision medicine," wherein individualized therapies are guided based on genomic profiling. The molecular-screening program was intended to route participants to different candidate drugs in trials based on clinical-sequencing reports. In this screening program, we used a custom target-enrichment panel consisting of cancer-related genes to interrogate single-nucleotide variants, insertions and deletions, copy number variants, and a subset of gene fusions. From August 2014 through April 2015, 654 patients consented to participate in the program at Samsung Medical Center. Of these patients, 588 passed the quality control process for the 381-gene cancer-panel test, and 418 patients were included in the final analysis as being eligible for any anticancer treatment (127 gastric cancer, 122 colorectal cancer, 62 pancreatic/biliary tract cancer, 67 sarcoma/other cancer, and 40 genitourinary cancer patients). Of the 418 patients, 55 (12%) harbored a biomarker that guided them to a biomarker-selected clinical trial, and 184 (44%) patients harbored at least one genomic alteration that was potentially targetable. This study demonstrated that the panel-based sequencing program resulted in an increased rate of trial enrollment of metastatic cancer patients into biomarker-selected clinical trials. Given the expanding list of biomarker-selected trials, the guidance percentage to matched trials is anticipated to increase. © AlphaMed Press 2017.

  18. Detecting selection in the blue crab, Callinectes sapidus, using DNA sequence data from multiple nuclear protein-coding genes.

    Science.gov (United States)

    Yednock, Bree K; Neigel, Joseph E

    2014-01-01

    The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available.
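
    The McDonald-Kreitman test named in this record compares nonsynonymous and synonymous counts among fixed differences between species with the same counts among polymorphisms within a species. The sketch below is a generic illustration of that comparison, assuming a 2x2 table evaluated with a two-sided Fisher's exact test; the function name `mk_test` and the counts passed to it are illustrative assumptions, not data or code from the blue crab study.

```python
from math import comb


def mk_test(dn: int, ds: int, pn: int, ps: int) -> tuple[float, float]:
    """McDonald-Kreitman comparison on a 2x2 table.

    dn, ds: nonsynonymous / synonymous fixed differences between species.
    pn, ps: nonsynonymous / synonymous polymorphisms within a species.
    Returns (neutrality index, two-sided Fisher's exact p-value).
    """
    # Neutrality index: values below 1 indicate an excess of nonsynonymous
    # fixed differences, the pattern expected under positive selection.
    ni = (pn / ps) / (dn / ds)

    fixed = dn + ds        # row total: fixed differences
    nonsyn = dn + pn       # column total: nonsynonymous changes
    total = dn + ds + pn + ps

    def prob(k: int) -> float:
        # Hypergeometric probability of k nonsynonymous fixed differences
        # given the row and column totals.
        return comb(nonsyn, k) * comb(total - nonsyn, fixed - k) / comb(total, fixed)

    lowest = max(0, fixed - (total - nonsyn))
    highest = min(nonsyn, fixed)
    observed = prob(dn)
    # Two-sided p-value: sum over all tables no more likely than the observed one.
    p_value = sum(
        prob(k) for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )
    return ni, p_value


# Illustrative counts only (hypothetical input, not from the abstract above).
ni, p_value = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"neutrality index = {ni:.3f}, p = {p_value:.4f}")
```

    With these illustrative counts the neutrality index is below 1 and the p-value is below 0.05, which is the kind of pattern the abstract interprets as positive selection favoring amino acid replacements.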

  19. Detecting selection in the blue crab, Callinectes sapidus, using DNA sequence data from multiple nuclear protein-coding genes.

    Directory of Open Access Journals (Sweden)

    Bree K Yednock

    Full Text Available The identification of genes involved in the adaptive evolution of non-model organisms with uncharacterized genomes constitutes a major challenge. This study employed a rigorous and targeted candidate gene approach to test for positive selection on protein-coding genes of the blue crab, Callinectes sapidus. Four genes with putative roles in physiological adaptation to environmental stress were chosen as candidates. A fifth gene not expected to play a role in environmental adaptation was used as a control. Large samples (n>800) of DNA sequences from C. sapidus were used in tests of selective neutrality based on sequence polymorphisms. In combination with these, sequences from the congener C. similis were used in neutrality tests based on interspecific divergence. In multiple tests, significant departures from neutral expectations, indicative of positive selection, were found for the candidate gene trehalose 6-phosphate synthase (tps). These departures could not be explained by any of the historical population expansion or bottleneck scenarios that were evaluated in coalescent simulations. Evidence was also found for balancing selection at ATP-synthase subunit 9 (atps) using a maximum likelihood version of the Hudson, Kreitman, and Aguadé test, and positive selection favoring amino acid replacements within ATP/ADP translocase (ant) was detected using the McDonald-Kreitman test. In contrast, test statistics for the control gene, ribosomal protein L12 (rpl), which presumably has experienced the same demographic effects as the candidate loci, were not significantly different from neutral expectations and could readily be explained by demographic effects. Together, these findings demonstrate the utility of the candidate gene approach for investigating adaptation at the molecular level in a marine invertebrate for which extensive genomic resources are not available.

  20. Characterizing embryonic gene expression patterns in the mouse using nonredundant sequence-based selection

    DEFF Research Database (Denmark)

    Sousa-Nunes, Rita; Rana, Amer Ahmed; Kettleborough, Ross

    2003-01-01

    steps of axial specification and tissue patterning in the mouse. To avoid examining the same gene more than once, and to exclude potentially ubiquitously expressed housekeeping genes, cDNA sequence was derived from 1978 clones of the Endoderm library. These yielded 1440 distinct cDNAs, of which 123...

  1. Control group selection in critical care randomized controlled trials evaluating interventional strategies: An ethical assessment.

    Science.gov (United States)

    Silverman, Henry J; Miller, Franklin G

    2004-03-01

    Ethical concern has been raised with critical care randomized controlled trials in which the standard of care reflects a broad range of clinical practices. Commentators have argued that trials without an unrestricted control group, in which standard practices are implemented at the discretion of the attending physician, lack the ability to redefine the standard of care and might expose subjects to excessive harms due to an inability to stop early. To develop a framework for analyzing control group selection for critical care trials. Ethical analysis. A key ethical variable in trial design is the extent to which the control group adequately reflects standard care practices. Such a control group might incorporate either the "unrestricted" practices of physicians or a protocol that specifies and restricts the parameters of standard practices. Control group selection should be determined with respect to the following ethical objectives of trial design: 1) clinical value, 2) scientific validity, 3) efficiency and feasibility, and 4) protection of human subjects. Because these objectives may conflict, control group selection will involve trade-offs and compromises. Trials using a protocolized rather than an unrestricted standard care control group will likely have enhanced validity. However, if the protocolized control group is not representative of standard care practices, then trials that use such groups will offer less clinical value and could provide less assurance of protecting subjects compared with trials that use unrestricted control groups. For trials evaluating contrasting strategies that do not adequately represent standard practices, use of a third group that is more representative of standard practices will enhance clinical value and increase the ability to stop early if needed to protect subjects. These advantages might come at the expense of efficiency and feasibility. Weighing and balancing the competing ethical objectives of trial design should be

  2. Sequence diversity and natural selection at domain I of the apical membrane antigen 1 among Indian Plasmodium falciparum populations

    Directory of Open Access Journals (Sweden)

    Kumar Ashwani

    2007-11-01

    Full Text Available Background: The Plasmodium falciparum apical membrane antigen 1 (AMA1) is a leading malaria vaccine candidate antigen. The complete AMA1 protein comprises three domains, of which domain I exhibits high sequence polymorphism and is thus named the hyper-variable region (HVR). The present study describes the extent of genetic polymorphism and natural selection at domain I of the ama1 gene among Indian P. falciparum isolates. Methods: The part of the ama1 gene covering domain I was PCR amplified and sequenced from 157 P. falciparum isolates collected from five different geographical regions of India. Statistical and phylogenetic analyses of the sequences were done using DnaSP ver. 4.10.9 and MEGA version 3.0 packages. Results: A total of 57 AMA1 haplotypes were observed among 157 isolates sequenced. Forty-six of these 57 haplotypes are being reported here for the first time. The parasites collected from the high malaria transmission areas (Assam, Orissa, and Andaman and Nicobar Islands) showed more haplotypes (H) and higher nucleotide diversity (π) as compared to low malaria transmission areas (Uttar Pradesh and Goa). The comparison of all five Indian P. falciparum subpopulations indicated a moderate level of genetic differentiation and limited gene flow (fixation index ranging from 0.048 to 0.13) between populations. The difference between rates of non-synonymous and synonymous mutations, Tajima's D and McDonald-Kreitman test statistics suggested that the diversity at domain I of the AMA1 antigen is due to positive natural selection. The minimum number of recombination events was also high, indicating the possible role of recombination in generating AMA1 allelic diversity. Conclusion: The level of genetic diversity and diversifying selection were higher in Assam, Orissa, and Andaman and Nicobar Islands populations as compared to Uttar Pradesh and Goa. The amounts of gene flow among these populations were moderate. The data reported here will be valuable for the

  3. The adverse effect of selective cyclooxygenase-2 inhibitor on random skin flap survival in rats.

    Directory of Open Access Journals (Sweden)

    Haiyong Ren

    Full Text Available BACKGROUND: Cyclooxygenase-2 (COX-2) inhibitors provide desired analgesic effects after injury or surgery, but evidence suggests they also attenuate wound healing. This study investigates the effect of a COX-2 inhibitor on random skin flap survival. METHODS: The McFarlane flap model was established in 40 rats, which were evaluated in two groups; one group received Parecoxib and the other the same volume of saline by injection for 7 days. The necrotic area of the flap was measured, and specimens of the flap were stained with haematoxylin-eosin (HE) for histologic analysis. Immunohistochemical staining was performed to analyse the levels of VEGF and COX-2. RESULTS: 7 days after operation, the flap necrotic area ratio in the study group (66.65 ± 2.81%) was significantly larger than that of the control group (48.81 ± 2.33%) (P < 0.01). Histological analysis demonstrated angiogenesis, with mean vessel density per mm² being lower in the study group (15.4 ± 4.4) than in the control group (27.2 ± 4.1) (P < 0.05). The expression of COX-2 and VEGF protein in the intermediate area II was evaluated in the two groups by immunohistochemistry. The expression of COX-2 was 1022.45 ± 153.1 in the study group and 2638.05 ± 132.2 in the control group (P < 0.01). The expression of VEGF in the study and control groups was 2779.45 ± 472.0 vs 4938.05 ± 123.6 (P < 0.01). In the COX-2 inhibitor group, the expression of COX-2 and VEGF protein was markedly down-regulated compared with the control group. CONCLUSION: The selective COX-2 inhibitor had an adverse effect on random skin flap survival. Suppression of neovascularization induced by a low level of VEGF is the proposed biological mechanism.

  4. A computational method for selecting short peptide sequences for inorganic material binding.

    Science.gov (United States)

    Nayebi, Niloofar; Cetinel, Sibel; Omar, Sara Ibrahim; Tuszynski, Jack A; Montemagno, Carlo

    2017-11-01

    Discovering or designing biofunctionalized materials with improved quality highly depends on the ability to manipulate and control the peptide-inorganic interaction. Various peptides can be used as assemblers, synthesizers, and linkers in the material syntheses. In another context, specific and selective material-binding peptides can be used as recognition blocks in mining applications. In this study, we propose a new in silico method to select short 4-mer peptides with high affinity and selectivity for a given target material. This method is illustrated with the calcite (104) surface as an example, which has been experimentally validated. A calcite binding peptide can play an important role in our understanding of biomineralization. A practical aspect of calcite is a need for it to be selectively depressed in mining sites. © 2017 Wiley Periodicals, Inc.

  5. Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

    Science.gov (United States)

    M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

    2009-01-01

    The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...

  6. Sequences of epicuticular wax structures along stems in four selected tree species

    Directory of Open Access Journals (Sweden)

    Tomaszewski Dominik

    2014-09-01

    Full Text Available Wax layer formation accompanies the processes of epidermis and cuticle formation. To examine these changes, observations along current-year long shoots of four woody species (Acer negundo, A. rufinerve, Gymnocladus dioica, and Ginkgo biloba) were made. Long shoots are suitable objects for such observations, because from the same stem, several samples can be obtained that represent a well-defined sequence of fragments of different ages.

  7. Theoretical study of polymeric mixtures with different sequence statistics. II. Brazovskii class: Linear random copolymers with diblock copolymers

    Energy Technology Data Exchange (ETDEWEB)

    Qi, Shuyan [Department of Chemical Engineering, Department of Chemistry, and Materials Science Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720 (United States); Chakraborty, Arup K. [Department of Chemical Engineering, Department of Chemistry, and Materials Science Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, California 94720 (United States)

    2000-01-15

    We use a Landau theory to study the instability of the homogeneous state of a mixture of linear random copolymers and diblock copolymers. Interesting features of the calculated structure factors for different components of the mixture are found, which can be directly compared with scattering experiments with selectively deuterated samples. We also investigate the least stable concentration fluctuations and find four different types of segregation modes at the spinodal depending upon the characteristics of the mixture (e.g., average compositions, statistical correlation lengths and volume fractions). The different segregation modes are also indicative of the kinetic pathways leading to the formation of ordered microstructures. Experiments probing these pathways are suggested. (c) 2000 American Institute of Physics.

  8. Sequence characterized amplified region marker as a tool for selection of high-artemisinin containing species of Artemisia.

    Science.gov (United States)

    Asghari, Matin; Naghavi, Mohammad Reza; Hosseinzadeh, Abdol Hadi; Ranjbar, Mojtaba; Poorebrahim, Mansour

    2015-01-01

    Malaria is currently one of the most important causes of mortality in developing countries. High resistance to available antimalarial drugs has been reported frequently, thus it is crucial to focus on the discovery of new antimalarial drugs. Artemisinin, an effective antimalarial medication, is isolated from various Artemisia species. To identify the Artemisia species producing a high quantity of artemisinin, eight species of Artemisia were screened with the genetic sequence characterized amplified region (SCAR) marker for higher quantity of artemisinin. The DNA band corresponding to the SCAR marker was cloned into the pGEM®-T Easy vector and sequenced. The content of artemisinin in the tested species was also measured using a high-performance liquid chromatography (HPLC) assay. The primers designed for the high-artemisinin SCAR marker could amplify a specific band of approximately 1000 bp, which was present in two species, Artemisia annua and Artemisia absinthium. The SCAR marker sequences for these two species were submitted to the GenBank database under accession numbers KC337116 and KC465952. HPLC analysis indicated that the two selected Artemisia species, genetically recognized as high-artemisinin yielding plants, had higher artemisinin content in comparison to the other examined species. Therefore, in this study, we propose the developed SCAR marker as a complementary tool for confident detection of high artemisinin content in Artemisia species.

  9. Translational database selection and multiplexed sequence capture for up front filtering of reliable breast cancer biomarker candidates.

    Directory of Open Access Journals (Sweden)

    Patrik L Ståhl

    Full Text Available Biomarker identification is of utmost importance for the development of novel diagnostics and therapeutics. Here we make use of a translational database selection strategy, utilizing data from the Human Protein Atlas (HPA) on differentially expressed protein patterns in healthy and breast cancer tissues as a means to filter out potential biomarkers for underlying genetic causatives of the disease. DNA was isolated from ten breast cancer biopsies, and the protein coding and flanking non-coding genomic regions corresponding to the selected proteins were extracted in a multiplexed format from the samples using a single DNA sequence capture array. Deep sequencing revealed an even enrichment of the multiplexed samples and a great variation of genetic alterations in the tumors of the sampled individuals. Benefiting from the upstream filtering method, the final set of biomarker candidates could be completely verified through bidirectional Sanger sequencing, revealing a 40 percent false positive rate despite high read coverage. Of the variants encountered in translated regions, nine novel non-synonymous variations were identified and verified, two of which were present in more than one of the ten tumor samples.

  10. Genotyping-by-sequencing of pear (Pyrus spp.) accessions unravels novel patterns of genetic diversity and selection footprints.

    Science.gov (United States)

    Kumar, Satish; Kirk, Chris; Deng, Cecilia; Wiedow, Claudia; Knaebel, Mareike; Brewer, Lester

    2017-01-01

    Understanding of genetic diversity and marker-trait relationships in pears (Pyrus spp.) forms an important part of gene conservation and cultivar breeding. Accessions of Asian and European pear species, and interspecific hybrids were planted in a common garden experiment. Genotyping-by-sequencing (GBS) was used to genotype 214 accessions, which were also phenotyped for fruit quality traits. A combination of selection scans and association analyses were used to identify signatures of selection. Patterns of genetic diversity, population structure and introgression were also investigated. About 15 000 high-quality SNP markers were identified from the GBS data, of which 25% and 11% harboured private alleles for European and Asian species, respectively. Bayesian clustering analysis suggested negligible gene flow, resulting in highly significant population differentiation (Fst=0.45) between Asian and European pears. Interspecific hybrids displayed an average of 55% and 45% introgression from their Asian and European ancestors, respectively. Phenotypic (firmness, acidity, shape and so on) variation between accessions was significantly associated with genetic differentiation. Allele frequencies at large-effect SNP loci were significantly different between genetic groups, suggesting footprints of directional selection. Selection scan analyses identified over 20 outlier SNP loci with substantial statistical support, likely to be subject to directional selection or closely linked to loci under selection.

  11. Random genetic drift, natural selection, and noise in human cranial evolution.

    Science.gov (United States)

    Roseman, Charles C

    2016-08-01

    This study assesses the extent to which relationships among groups complicate comparative studies of adaptation in recent human cranial variation and the extent to which departures from neutral additive models of evolution hinder the reconstruction of population relationships among groups using cranial morphology. Using a maximum likelihood evolutionary model fitting approach and a mixed population genomic and cranial data set, I evaluate the relative fits of several widely used models of human cranial evolution. Moreover, I compare the goodness of fit of models of cranial evolution constrained by genomic variation to test hypotheses about population specific departures from neutrality. Models from population genomics are much better fits to cranial variation than are traditional models from comparative human biology. There is not enough evolutionary information in the cranium to reconstruct much of recent human evolution but the influence of population history on cranial variation is strong enough to cause comparative studies of adaptation serious difficulties. Deviations from a model of random genetic drift along a tree-like population history show the importance of environmental effects, gene flow, and/or natural selection on human cranial variation. Moreover, there is a strong signal of the effect of natural selection or an environmental factor on a group of humans from Siberia. The evolution of the human cranium is complex and no one evolutionary process has prevailed at the expense of all others. A holistic unification of phenome, genome, and environmental context, gives us a strong point of purchase on these problems, which is unavailable to any one traditional approach alone. Am J Phys Anthropol 160:582-592, 2016. © 2016 Wiley Periodicals, Inc.

  12. Variants of sequence family B Thermococcus kodakaraensis DNA polymerase with increased mismatch extension selectivity.

    Directory of Open Access Journals (Sweden)

    Claudia Huber

    Fidelity and selectivity of DNA polymerases are critical determinants for the biology of life, as well as important tools for biotechnological applications. DNA polymerases catalyze the formation of DNA strands by adding deoxynucleotides to a primer, which is complementarily bound to a template. To ensure the integrity of the genome, DNA polymerases select the correct nucleotide and further extend the nascent DNA strand. Thus, DNA polymerase fidelity is pivotal for ensuring that cells can replicate their genome with minimal error. DNA polymerases are, however, further optimized for more specific biotechnological or diagnostic applications. Here we report on the semi-rational design of mutant libraries derived by saturation mutagenesis at single sites of a 3'-5'-exonuclease deficient variant of Thermococcus kodakaraensis DNA polymerase (KOD pol) and the discovery of variants with enhanced mismatch extension selectivity by screening. Sites of potential interest for saturation mutagenesis were selected by their proximity to primer or template strands. The resulting libraries were screened via quantitative real-time PCR. We identified three variants with single amino acid exchanges (R501C, R606Q, and R606W) that exhibited increased mismatch extension selectivity. These variants were further characterized with respect to their potential in mismatch discrimination. Additionally, the identified enzymes were also able to differentiate between cytosine and 5-methylcytosine. Our results demonstrate the potential in characterizing and developing DNA polymerases for specific PCR based applications in DNA biotechnology and diagnostics.

  13. Polyamide Curvature and DNA Sequence Selective Recognition: Use of 4-Aminobenzamide to Adjust Curvature

    Science.gov (United States)

    Lajiness, Jamie; Sielaff, Alan; Mackay, Hilary; Brown, Toni; Kluza, Jerome; Nguyen, Binh; Wilson, W. David; Lee, Moses; Hartley, John A.

    2014-01-01

    Imidazole and pyrrole-containing polyamides belong to an important class of compounds that can be designed to target specific DNA sequences, and they are potentially useful in applications of controlling gene expression. The extent of polyamide curvature is an important consideration when studying the ability of such compounds to bind in the minor groove of DNA. The current study investigates the importance of curvature using polyamides of the form f-Im-Phenyl-Im, in which the imidazole heterocycles are placed in ortho-, meta-, and para-configurations of the phenyl moiety. The synthesis and biophysical evaluation of each compound binding to its cognate DNA sequence (5′-ACGCGT-3′) and a negative control sequence (5′-AAATTT-3′) is reported, along with their comparison to the parent binder, f-Im-Py-Im (3). ACGCGT is a medicinally significant sequence present in the MluI cell-cycle box (MCB) transcriptional element found in the promoter of a gene associated with cell division. The results demonstrated that the para-derivative has the greatest affinity for its cognate sequence, as indicated via thermal denaturation, CD, ITC, SPR analyses, and DNase I footprinting. ITC studies showed that binding of the para-isomer (2c) to ACGCGT was significantly more exothermic than binding to AAATTT. In contrast, no heat change was observed for binding of the meta- (2b) and ortho- (2a) isomers to both DNAs, due to low binding affinities. This is consistent with results from SPR studies, which indicate that the para-derivative binds in a 2:1 fashion to ACGCGT and binds weakly to ACCGGT (K = 1.8 × 10⁶ and 4.0 × 10⁴ M⁻¹, respectively). Interestingly, it binds in a 1:1 fashion to AAATTT (K = 5.4 × 10⁵ M⁻¹). The meta-compound does not bind to any sequence. The para-derivative also was the only compound to show an induced peak via CD at 330 nm, indicative of minor groove binding, and produced a ΔTm value of 5.8 °C. Molecular modeling experiments have been performed to

  14. Deep-sequencing revealing mutation dynamics in the miltefosine transporter gene in Leishmania infantum selected for miltefosine resistance.

    Science.gov (United States)

    Laffitte, Marie-Claude N; Leprohon, Philippe; Légaré, Danielle; Ouellette, Marc

    2016-10-01

    Miltefosine is the first oral drug used in chemotherapy against leishmaniasis. In vitro studies found that resistance to miltefosine in Leishmania is often associated with the acquisition of point mutations in the miltefosine transporter, leading to a decrease in drug uptake. In this study, the dynamics of mutations upon miltefosine selection was studied by deep-sequencing of the miltefosine transporter gene. Deep-sequencing data revealed that no mutation was detected in the miltefosine transporter at sub-inhibitory concentrations of miltefosine. We show that the prevalence of mutated alleles was increasing when the drug pressure heightened, that more mutations were observed in highly resistant mutants, and that most mutations remained when parasites were cultured for a few passages in the absence of miltefosine.

  15. CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests.

    Science.gov (United States)

    Ma, Li; Fan, Suohai

    2017-03-14

    The random forests algorithm is a type of classifier with prominent universality, a wide application range, and robustness for avoiding overfitting. But there are still some drawbacks to random forests. Therefore, to improve the performance of random forests, this paper seeks to improve imbalanced data processing, feature selection and parameter optimization. We propose the CURE-SMOTE algorithm for the imbalanced data classification problem. Experiments on imbalanced UCI data reveal that the combination of Clustering Using Representatives (CURE) enhances the original synthetic minority oversampling technique (SMOTE) algorithms effectively compared with the classification results on the original data using random sampling, Borderline-SMOTE1, safe-level SMOTE, C-SMOTE, and k-means-SMOTE. Additionally, the hybrid RF (random forests) algorithm has been proposed for feature selection and parameter optimization, which uses the minimum out of bag (OOB) data error as its objective function. Simulation results on binary and higher-dimensional data indicate that the proposed hybrid RF algorithms, hybrid genetic-random forests algorithm, hybrid particle swarm-random forests algorithm and hybrid fish swarm-random forests algorithm can achieve the minimum OOB error and show the best generalization ability. The training set produced from the proposed CURE-SMOTE algorithm is closer to the original data distribution because it contains minimal noise. Thus, better classification results are produced from this feasible and effective algorithm. Moreover, the hybrid algorithm's F-value, G-mean, AUC and OOB scores demonstrate that they surpass the performance of the original RF algorithm. Hence, this hybrid algorithm provides a new way to perform feature selection and parameter optimization.

  16. Inference of purifying and positive selection in three subspecies of chimpanzees (Pan troglodytes) from exome sequencing

    DEFF Research Database (Denmark)

    Bataillon, Thomas; Duan, Jinjie; Hvilsom, Christina

    2015-01-01

    We study genome-wide nucleotide diversity in three subspecies of extant chimpanzees using exome capture. After strict filtering, SNVs and indels were called and genotyped for >50% of exons at a mean coverage of 35x per individual. Central chimpanzees (P. t. troglodytes) are the most polymorphic... of recent gene flow from Western into Eastern chimpanzees. The striking contrast in X-linked vs. autosomal polymorphism and divergence previously reported in Central chimpanzees is also found in Eastern and Western chimpanzees. We show that the direction of selection (DoS) statistic exhibits a strong non-monotonic relationship with the strength of purifying selection S, making it inappropriate for estimating S. We instead use counts in synonymous vs. non-synonymous frequency classes to infer the distribution of S coefficients acting on non-synonymous mutations in each subspecies. The strength of purifying selection we...

  17. Performance of selected coded direct-sequence receiver structures in pulsed interference

    Science.gov (United States)

    Matis, K. R.; Modestino, J. W.

    The performance of short constraint-length convolutional codes is considered in conjunction with coherent BPSK direct-sequence spread-spectrum modulation in a variety of pulsed interference scenarios. The use of several types of imperfect interference estimation mechanisms, and the resulting impact on Viterbi decoder bit error probability performance, are considered. The digital receiver structures eliminate the need for a fast-acting analog AGC circuit, which is often susceptible to spoofing. For the short constraint-length codes under consideration, upper bounds are provided on bit error probability performance under idealized channel modeling assumptions. Departures from these idealized assumptions are analyzed through extensive Monte-Carlo computer simulation.

  18. Selective confinement of vibrations in composite systems with alternate quasi-regular sequences

    Energy Technology Data Exchange (ETDEWEB)

    Montalban, A. [Departamento de Ciencia y Tecnologia de Materiales, Division de Optica, Universidad Miguel Hernandez, 03202 Elche (Spain); Velasco, V.R. [Instituto de Ciencia de Materiales de Madrid, CSIC, Sor Juana Ines de la Cruz 3, 28049 Madrid (Spain)]. E-mail: vrvr@icmm.csic.es; Tutor, J. [Departamento de Fisica Aplicada, Universidad Autonoma de Madrid, Cantoblanco, 28049 Madrid (Spain); Fernandez-Velicia, F.J. [Departamento de Fisica de los Materiales, Facultad de Ciencias, Universidad Nacional de Educacion a Distancia, Senda del Rey 9, 28080 Madrid (Spain)

    2007-01-01

    We have studied the atom displacements and the vibrational frequencies of 1D systems formed by combinations of Fibonacci, Thue-Morse and Rudin-Shapiro quasi-regular stacks and their alternate ones. The materials are described by nearest-neighbor force constants and the corresponding atom masses, particularized to the Al, Ag systems. These structures exhibit differences in the frequency spectrum as compared to the original simple quasi-regular generations but the most important feature is the presence of separate confinement of the atom displacements in one of the sequences forming the total composite structure for different frequency ranges.
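
    As an illustration of the building blocks named in the abstract above, the following minimal sketch generates Fibonacci and Thue-Morse words by repeated substitution and maps them onto a chain of atomic masses. The substitution rules are the standard textbook ones; the assignment A -> Al and B -> Ag, the function names, and the omission of the Rudin-Shapiro sequence and of the "alternate" stacks are assumptions made here for brevity, not details taken from the paper.

```python
# Standard two-letter substitution rules for two quasi-regular sequences.
FIBONACCI = {"A": "AB", "B": "A"}
THUE_MORSE = {"A": "AB", "B": "BA"}

# Assumed mapping of the two blocks onto the Al/Ag system (masses in u).
ATOMIC_MASS_U = {"A": 26.98, "B": 107.87}


def generate(rules, generations, seed="A"):
    """Apply a substitution rule `generations` times, starting from `seed`."""
    word = seed
    for _ in range(generations):
        word = "".join(rules[letter] for letter in word)
    return word


def mass_chain(word):
    """Translate a word into the list of masses of a 1D chain."""
    return [ATOMIC_MASS_U[letter] for letter in word]


fib = generate(FIBONACCI, 4)
tm = generate(THUE_MORSE, 3)
print(fib, tm)
print(mass_chain(fib + tm))  # a simple composite stack: Fibonacci followed by Thue-Morse
```

    A composite structure in the sense of the abstract would concatenate such stacks; a vibrational calculation would additionally need nearest-neighbor force constants, which are not sketched here.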

  19. Noise-induced hearing loss in randomly selected New York dairy farmers.

    Science.gov (United States)

    May, J J; Marvel, M; Regan, M; Marvel, L H; Pratt, D S

    1990-01-01

    To understand better the effects of noise levels associated with dairy farming, we randomly selected 49 full-time dairy farmers from an established cohort. Medical and occupational histories were taken and standard audiometric testing was done. Forty-six males (94%) and three females (6%) with a mean age of 43.5 (± 13) years and an average of 29.4 (± 14) years in farming were tested. Pure Tone Average thresholds (PTA4) at 0.5, 1.0, 2.0, and 3.0 kHz plus High Frequency Average thresholds (HFA3) at 3.0, 4.0, and 6.0 kHz were calculated. Subjects with a loss of ≥ 20 dB in either ear were considered abnormal. Eighteen subjects (37%) had abnormal PTA4s and 32 (65%) abnormal HFA3s. The left ear was more severely affected in both groups (p ≤ .05, t-test). Significant associations were found between hearing loss and years worked (odds ratio 4.1, r = .53) and age (odds ratio 4.1, r = .59). No association could be found between hearing loss and measles; mumps; previous ear infections; or use of power tools, guns, motorcycles, snowmobiles, or stereo headphones. Our data suggest that among farmers, substantial hearing loss occurs especially in the high-frequency ranges. Presbycusis is an important confounding variable.
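
    The averaging rule described in the abstract above can be sketched as follows. The frequencies and the 20 dB cutoff come from the abstract; the audiogram values, dictionary layout, and function names are hypothetical and only serve the illustration.

```python
from statistics import mean

PTA4_FREQS_KHZ = (0.5, 1.0, 2.0, 3.0)  # pure tone average, four frequencies
HFA3_FREQS_KHZ = (3.0, 4.0, 6.0)       # high frequency average, three frequencies
ABNORMAL_DB = 20.0                     # loss of 20 dB or more counts as abnormal


def average_threshold(audiogram, freqs):
    """Mean hearing threshold (dB) over the given frequencies for one ear."""
    return mean(audiogram[f] for f in freqs)


def is_abnormal(left, right, freqs, cutoff=ABNORMAL_DB):
    """A subject is abnormal if either ear reaches the cutoff."""
    worse_ear = max(average_threshold(left, freqs), average_threshold(right, freqs))
    return worse_ear >= cutoff


# Hypothetical audiogram (kHz -> threshold in dB), not data from the study.
left = {0.5: 10, 1.0: 10, 2.0: 15, 3.0: 25, 4.0: 40, 6.0: 45}
right = {0.5: 10, 1.0: 5, 2.0: 10, 3.0: 15, 4.0: 25, 6.0: 30}

print("PTA4 abnormal:", is_abnormal(left, right, PTA4_FREQS_KHZ))
print("HFA3 abnormal:", is_abnormal(left, right, HFA3_FREQS_KHZ))
```

    In this made-up case the subject is normal on PTA4 but abnormal on HFA3, the pattern the abstract reports as the more common one.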

  20. Modeling Slotted Aloha as a Stochastic Game with Random Discrete Power Selection Algorithms

    Directory of Open Access Journals (Sweden)

    Rachid El-Azouzi

    2009-01-01

    We consider the uplink case of a cellular system where bufferless mobiles transmit over a common channel to a base station, using the slotted aloha medium access protocol. We study the performance of this system under several power differentiation schemes. Specifically, we consider a random set of selectable transmission powers and further study the impact of priorities given either to new arrival packets or to the backlogged ones. Later, we address a general capture model where a mobile successfully transmits a packet if its instantaneous SINR (signal to interference plus noise ratio) is larger than some fixed threshold. Under this capture model, we analyze both the cooperative team problem, in which a common goal is jointly optimized, and the noncooperative game problem, in which mobiles seek to optimize their own objectives. Furthermore, we derive the throughput and the expected delay, use them as the objectives to optimize, and provide a stability analysis as an alternative study. Exhaustive performance evaluations were carried out; we show that schemes with power differentiation significantly improve individual as well as global performance and could, in some cases, eliminate the bi-stable nature of slotted aloha.

  1. Footprint of Positive Selection in Treponema pallidum subsp. pallidum Genome Sequences Suggests Adaptive Microevolution of the Syphilis Pathogen

    Science.gov (United States)

    Centurion-Lara, Arturo; Jeffrey, Brendan M.; Le, Hoavan T.; Molini, Barbara J.; Lukehart, Sheila A.; Sokurenko, Evgeni V.; Rockey, Daniel D.

    2012-01-01

    In the rabbit model of syphilis, infection phenotypes associated with the Nichols and Chicago strains of Treponema pallidum (T. pallidum), though similar, are not identical. Between these strains, significant differences are found in expression of, and antibody responses to some candidate virulence factors, suggesting the existence of functional genetic differences between isolates. The Chicago strain genome was therefore sequenced and compared to the Nichols genome, available since 1998. Initial comparative analysis suggested the presence of 44 single nucleotide polymorphisms (SNPs), 103 small (≤3 nucleotides) indels, and 1 large (1204 bp) insertion in the Chicago genome with respect to the Nichols genome. To confirm the above findings, Sanger sequencing was performed on most loci carrying differences using DNA from Chicago and the Nichols strain used in the original T. pallidum genome project. A majority of the previously identified differences were found to be due to errors in the published Nichols genome, while the accuracy of the Chicago genome was confirmed. However, 20 SNPs were confirmed between the two genomes, and 16 (80.0%) were found in coding regions, with all being of non-synonymous nature, strongly indicating action of positive selection. Sequencing of 16 genomic loci harboring SNPs in 12 additional T. pallidum strains (SS14, Bal 3, Bal 7, Bal 9, Sea 81-3, Sea 81-8, Sea 86-1, Sea 87-1, Mexico A, UW231B, UW236B, and UW249C) was used to identify “Chicago-” or “Nichols-specific” differences. All but one of the 16 SNPs were “Nichols-specific”, with Chicago having identical sequences at these positions to almost all of the additional strains examined. These mutations could reflect differential adaptation of the Nichols strain to the rabbit host or pathoadaptive mutations acquired during human infection. Our findings indicate that SNPs among T. pallidum strains emerge under positive selection and, therefore, are likely to be functional in

  2. Selected mapping based orthogonal frequency division multiplexing system (OFDM) for the reduction of peak to average power ratio (PAPR) using higher number of novel phase sequences under 32-QAM

    Science.gov (United States)

    Gupta, Prabal; Singh, Balpreet; Arora, Krishan

    2017-07-01

    The very high peak to average power ratio (PAPR) is the biggest problem faced by OFDM systems, and it ultimately causes distortion in the transmitted data. Various techniques have been proposed in the literature for the reduction of PAPR. One important technique, known as Selected Mapping (SLM) or the distortion-less technique, has been proposed in several works for the reduction of PAPR. The SLM technique generally uses a number of randomly designed phase sequences in the frequency domain, so that after the inverse fast Fourier transform (IFFT), when the data are converted into the corresponding time-domain sequence, it can be optimized accordingly. In this paper, we propose an SLM scheme based on a higher number of novel phase sequences with 32-quadrature amplitude modulation (QAM) under various numbers of subcarriers (32, 64, 128, 256 and 512). Probabilistic analysis with the help of the complementary cumulative distribution function (CCDF) clearly depicts the remarkable performance of our proposed algorithm in comparison with a conventional OFDM system.
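
    The general SLM idea summarized above can be sketched in a few lines: multiply the frequency-domain symbols by several candidate phase sequences, transform each candidate to the time domain, and keep the one with the lowest PAPR. This is a generic illustration, not the authors' algorithm: the random phase set {1, -1, j, -j}, the use of QPSK in place of 32-QAM, the naive inverse DFT, and all names are assumptions chosen to keep the example self-contained.

```python
import cmath
import math
import random


def idft(freq_symbols):
    """Naive inverse DFT (O(N^2)); stands in for the IFFT of an OFDM modulator."""
    n = len(freq_symbols)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n) for k, x in enumerate(freq_symbols)) / n
        for t in range(n)
    ]


def papr_db(time_samples):
    """Peak to average power ratio of a time-domain block, in dB."""
    powers = [abs(s) ** 2 for s in time_samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def slm_select(freq_symbols, num_candidates, rng):
    """Return (lowest PAPR in dB, phase sequence that achieved it).

    The all-ones sequence (the unmodified block) is always one of the
    candidates, so the result is never worse than plain OFDM.
    """
    phase_set = (1, -1, 1j, -1j)
    best_seq = [1] * len(freq_symbols)
    best_papr = papr_db(idft(freq_symbols))
    for _ in range(num_candidates - 1):
        seq = [rng.choice(phase_set) for _ in freq_symbols]
        papr = papr_db(idft([x * p for x, p in zip(freq_symbols, seq)]))
        if papr < best_papr:
            best_papr, best_seq = papr, seq
    return best_papr, best_seq


rng = random.Random(0)
qpsk = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)  # QPSK used here instead of 32-QAM
symbols = [rng.choice(qpsk) for _ in range(32)]  # 32 subcarriers
original_papr = papr_db(idft(symbols))
selected_papr, phase_seq = slm_select(symbols, num_candidates=8, rng=rng)
print(f"PAPR without SLM: {original_papr:.2f} dB, with SLM: {selected_papr:.2f} dB")
```

    In a real link the receiver must also learn which phase sequence was chosen, typically through side information, so that it can undo the rotation before demapping.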

  3. Difference in muscle activation patterns during high-speed versus standard-speed yoga: A randomized sequence crossover study.

    Science.gov (United States)

    Potiaumpai, Melanie; Martins, Maria Carolina Massoni; Wong, Claudia; Desai, Trusha; Rodriguez, Roberto; Mooney, Kiersten; Signorile, Joseph F

    2017-02-01

    To compare the difference in muscle activation between high-speed yoga and standard-speed yoga and to compare muscle activation of the transitions between poses and the held phases of a yoga pose. Randomized sequence crossover trial. Setting: A laboratory of neuromuscular research and active aging. Interventions: Eight minutes of continuous Sun Salutation B was performed at a high speed and at a standard speed, separately. Electromyography was used to quantify normalized muscle activation patterns of eight upper and lower body muscles (pectoralis major, medial deltoids, lateral head of the triceps, middle fibers of the trapezius, vastus medialis, medial gastrocnemius, thoracic extensor spinae, and external obliques) during the high-speed and standard-speed yoga protocols. Difference in normalized muscle activation between high-speed yoga and standard-speed yoga. Normalized muscle activity signals were significantly higher in all eight muscles during the transition phases of poses compared to the held phases (pyoga across the entire session. Our results show that transitions from one held phase of a pose to another produce higher normalized muscle activity than the held phases of the poses and that overall activity is greater during high-speed yoga than standard-speed yoga. Therefore, the transition speed and associated number of poses should be considered when targeting specific improvements in performance. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Analysis of heart rate and oxygen uptake kinetics studied by two different pseudo-random binary sequence work rate amplitudes.

    Science.gov (United States)

    Drescher, U; Koschate, J; Schiffer, T; Schneider, S; Hoffmann, U

    2017-06-01

    The aim of the study was to compare the kinetics responses of heart rate (HR), pulmonary (V˙O2pulm) and predicted muscular (V˙O2musc) oxygen uptake between two different pseudo-random binary sequence (PRBS) work rate (WR) amplitudes both below anaerobic threshold. Eight healthy individuals performed two PRBS WR protocols implying changes between 30W and 80W and between 30W and 110W. HR and V˙O2pulm were measured beat-to-beat and breath-by-breath, respectively. V˙O2musc was estimated applying the approach of Hoffmann et al. (Eur J Appl Physiol 113: 1745-1754, 2013) considering a circulatory model for venous return and cross-correlation functions (CCF) for the kinetics analysis. HR and V˙O2musc kinetics seem to be independent of WR intensity (p>0.05). V˙O2pulm kinetics show prominent differences in the lag of the CCF maximum (39±9s; 31±4s; pkinetics remain unchanged. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Impact of Sequencing Radiation Therapy and Chemotherapy on Long-Term Local Toxicity for Early Breast Cancer: Results of a Randomized Study at 15-Year Follow-Up

    Energy Technology Data Exchange (ETDEWEB)

    Pinnarò, Paola; Giordano, Carolina; Farneti, Alessia [Department of Radiation Oncology, Regina Elena National Cancer Institute, Rome (Italy); Strigari, Lidia; Landoni, Valeria [Department of Physics, Regina Elena National Cancer Institute, Rome (Italy); Marucci, Laura; Petrongari, Maria Grazia [Department of Radiation Oncology, Regina Elena National Cancer Institute, Rome (Italy); Sanguineti, Giuseppe, E-mail: sanguineti@ifo.it [Department of Radiation Oncology, Regina Elena National Cancer Institute, Rome (Italy)

    2016-07-15

    Purpose: To compare long-term late local toxicity after either concomitant or sequential chemoradiation therapy after breast-conserving surgery. Methods and Materials: From 1997 to 2002, women aged 18 to 75 years who underwent breast-conserving surgery and axillary dissection for early breast cancer and in whom CMF (cyclophosphamide, methotrexate, and 5-fluorouracil) chemotherapy was planned were randomized between concomitant and sequential radiation therapy. Radiation therapy was delivered to the whole breast through tangential fields to 50 Gy in 20 fractions over a period of 4 weeks, followed by an electron boost. Surviving patients were tentatively contacted and examined between March and September 2014. Patients in whom progressive disease had developed or who had undergone further breast surgery were excluded. Local toxicity (fibrosis, telangiectasia, and breast atrophy or retraction) was scored blindly to the treatment received. A logistic regression was run to investigate the effect of treatment sequence after correction for several patient-, treatment-, and tumor-related covariates on selected endpoints. The median time to cross-sectional analysis was 15.7 years (range, 12.0-17.8 years). Results: Of 206 patients randomized, 154 (74.8%) were potentially eligible. Of these, 43 (27.9%) refused participation and 4 (2.6%) had been lost to follow-up, and for 5 (3.2%), we could not restore planning data; thus, the final number of analyzed patients was 102. No grade 4 toxicity had been observed, whereas the number of grade 3 toxicity events was low (<8%) for each item, allowing pooling of grade 2 and 3 events for further analysis. Treatment sequence (concomitant vs sequential) was an independent predictor of grade 2 or 3 fibrosis according to both the National Cancer Institute Common Terminology Criteria for Adverse Events (odds ratio [OR], 4.05; 95% confidence interval [CI], 1.34-12.2; P=.013) and the SOMA (Subjective, Objective, Management and Analytic
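
    The reported odds ratio, confidence interval, and P value for fibrosis can be cross-checked with a short calculation, shown here as a sketch. It assumes a Wald-type interval that is symmetric on the log-odds scale, which is the usual output of logistic regression software but is not stated in the abstract; the function name is hypothetical.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p value.

    Assumes the 95% CI was built as exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_sided


# Values quoted in the abstract: OR 4.05, 95% CI 1.34-12.2, P=.013.
se, z, p = wald_from_ci(4.05, 1.34, 12.2)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")

# Patient flow quoted in the abstract: 206 randomized, 154 potentially eligible.
eligible, refused, lost, no_plan = 154, 43, 4, 5
print("analyzed:", eligible - refused - lost - no_plan)
print(f"eligible fraction: {eligible / 206:.1%}")
```

    Under that assumption the recovered p value agrees with the reported P=.013, and the patient-flow arithmetic reproduces the 102 analyzed patients.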

  6. SHAPE selection (SHAPES) enrich for RNA structure signal in SHAPE sequencing-based probing data

    DEFF Research Database (Denmark)

    Poulsen, Line Dahl; Kielpinski, Lukasz Jan; Salama, Sofie R

    2015-01-01

    transcriptase. Here, we introduce a SHAPE Selection (SHAPES) reagent, N-propanone isatoic anhydride (NPIA), which retains the ability of SHAPE reagents to accurately probe RNA structure, but also allows covalent coupling between the SHAPES reagent and a biotin molecule. We demonstrate that SHAPES...

  7. Inference of purifying and positive selection in three subspecies of chimpanzees (Pan troglodytes) from exome sequencing.

    Science.gov (United States)

    Bataillon, Thomas; Duan, Jinjie; Hvilsom, Christina; Jin, Xin; Li, Yingrui; Skov, Laurits; Glemin, Sylvain; Munch, Kasper; Jiang, Tao; Qian, Yu; Hobolth, Asger; Wang, Jun; Mailund, Thomas; Siegismund, Hans R; Schierup, Mikkel H

    2015-03-30

    We study genome-wide nucleotide diversity in three subspecies of extant chimpanzees using exome capture. After strict filtering, Single Nucleotide Polymorphisms and indels were called and genotyped for greater than 50% of exons at a mean coverage of 35× per individual. Central chimpanzees (Pan troglodytes troglodytes) are the most polymorphic (nucleotide diversity, θw = 0.0023 per site) followed by Eastern (P. t. schweinfurthii) chimpanzees (θw = 0.0016) and Western (P. t. verus) chimpanzees (θw = 0.0008). A demographic scenario of divergence without gene flow fits the patterns of autosomal synonymous nucleotide diversity well except for a signal of recent gene flow from Western into Eastern chimpanzees. The striking contrast in X-linked versus autosomal polymorphism and divergence previously reported in Central chimpanzees is also found in Eastern and Western chimpanzees. We show that the direction of selection statistic exhibits a strong nonmonotonic relationship with the strength of purifying selection S, making it inappropriate for estimating S. We instead use counts in synonymous versus nonsynonymous frequency classes to infer the distribution of S coefficients acting on nonsynonymous mutations in each subspecies. The strength of purifying selection we infer is congruent with the differences in effective sizes of each subspecies: Central chimpanzees are undergoing the strongest purifying selection followed by Eastern and Western chimpanzees. Coding indels show stronger selection against indels changing the reading frame than observed in human populations. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
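
    Two of the summary statistics mentioned above have compact standard definitions, sketched below. Watterson's estimator is θw = S / (a_n · L) with a_n = Σ 1/i for i = 1..n-1, and the direction of selection statistic is commonly defined as DoS = Dn/(Dn+Ds) - Pn/(Pn+Ps). All input counts in the example are hypothetical and are not taken from the chimpanzee data.

```python
def watterson_theta(segregating_sites, n_chromosomes, n_sites):
    """Watterson's estimator of nucleotide diversity, per site."""
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return segregating_sites / (a_n * n_sites)


def direction_of_selection(dn, ds, pn, ps):
    """DoS from counts of nonsynonymous/synonymous divergence (dn, ds)
    and nonsynonymous/synonymous polymorphism (pn, ps)."""
    return dn / (dn + ds) - pn / (pn + ps)


# Hypothetical counts, for illustration only.
theta = watterson_theta(segregating_sites=230, n_chromosomes=20, n_sites=100_000)
dos = direction_of_selection(dn=10, ds=10, pn=5, ps=15)
print(f"theta_w = {theta:.5f} per site, DoS = {dos:+.2f}")
```

    A positive DoS is usually read as a sign of adaptive divergence and a negative one as segregating slightly deleterious variants; the abstract's point is that its relationship with the strength of purifying selection is not monotonic, so it should not be inverted to estimate S.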

  8. Engineering of restriction endonucleases: using methylation activity of the bifunctional endonuclease Eco57I to select the mutant with a novel sequence specificity.

    Science.gov (United States)

    Rimseliene, Renata; Maneliene, Zita; Lubys, Arvydas; Janulaitis, Arvydas

    2003-03-21

    Type II restriction endonucleases (REs) are widely used tools in molecular biology, biotechnology and diagnostics. Efforts to generate new specificities by structure-guided design and random mutagenesis have been unsuccessful so far. We have developed a new procedure called the methylation activity-based selection (MABS) for generating REs with a new specificity. MABS uses a unique property of bifunctional type II REs to methylate DNA targets they recognize. The procedure includes three steps: (1) conversion of a bifunctional RE into a monofunctional DNA-modifying enzyme by cleavage center disruption; (2) mutagenesis and selection of mutants with altered DNA modification specificity based on their ability to protect predetermined DNA targets; (3) reconstitution of the cleavage center's wild-type structure. The efficiency of the MABS technique was demonstrated by altering the sequence specificity of the bifunctional RE Eco57I from 5'-CTGAAG to 5'-CTGRAG, and thus generating the mutant restriction endonuclease (and DNA methyltransferase) of a specificity not known before. This study provides evidence that MABS is a promising technique for generation of REs with new specificities.
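
    The change in specificity from 5'-CTGAAG to 5'-CTGRAG relaxes one position to the IUPAC code R (A or G). The sketch below shows what that means for site matching by expanding IUPAC codes into a regular expression. The test sequence and function names are made up, only a subset of IUPAC codes is included, and only the given strand is scanned.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def site_regex(site):
    """Compile a recognition site written with IUPAC codes into a regex."""
    return re.compile("".join(IUPAC[base] for base in site))


def find_sites(sequence, site):
    """Return 0-based start positions of the site on the given strand."""
    return [m.start() for m in site_regex(site).finditer(sequence)]


dna = "TTCTGAAGCCCTGGAGAA"  # hypothetical sequence
print("wild type CTGAAG:", find_sites(dna, "CTGAAG"))
print("mutant    CTGRAG:", find_sites(dna, "CTGRAG"))
```

    The relaxed site matches everything the wild-type site matches plus CTGGAG, which is why the mutant is described as having a new, broader specificity.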

  9. The prevalence of symptoms associated with pulmonary tuberculosis in randomly selected children from a high burden community

    OpenAIRE

    Marais, B.; Obihara, C; Gie, R.; Schaaf, H; Hesseling, A.; Lombard, C.; Enarson, D; Bateman, E; Beyers, N

    2005-01-01

    Background: Diagnosis of childhood tuberculosis is problematic and symptom based diagnostic approaches are often promoted in high burden settings. This study aimed (i) to document the prevalence of symptoms associated with tuberculosis among randomly selected children living in a high burden community, and (ii) to compare the prevalence of these symptoms in children without tuberculosis to those in children with newly diagnosed tuberculosis.

  10. Selecting soybean resistant to the cyst nematode Heterodera glycines using simple sequence repeat (microsatellite) markers.

    Science.gov (United States)

    Espindola, S M C G; Hamawaki, O T; Oliveira, A P; Hamawaki, C D L; Hamawaki, R L; Takahashi, L M

    2016-03-11

    The soybean cyst nematode (SCN) is a major cause of soybean yield reduction. The objective of this study was to evaluate the efficiency of marker-assisted selection to identify genotypes resistant to SCN race 3 infection, using Sat_168 and Sat-141 resistance quantitative trait loci. The experiment was carried out under greenhouse conditions, using soybean populations originated from crosses between susceptible and resistant parent stock: CD-201 (susceptible) and Foster IAC (resistant), Conquista (susceptible) and S83-30 (resistant), La-Suprema (susceptible) and S57-11 (resistant), and Parecis (susceptible) and S65-50 (resistant). Plants were inoculated with SCN and evaluated according to the female index (FI); those with FI < 10% were classified as resistant to nematode infection. Plants were genotyped for SCN resistance using microsatellite markers Sat-141 and Sat_168. Marker selection efficiency was analyzed by a contingency table, taking into account genotypic versus phenotypic evaluations for each line. These markers were shown to be useful tools for selection of SCN race 3 resistance.

  11. Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests.

    Science.gov (United States)

    Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A

    2017-09-15

    Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations (p≫n), these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. Code

  12. A Sensitive and Selective Label-Free Electrochemical DNA Biosensor for the Detection of Specific Dengue Virus Serotype 3 Sequences

    Directory of Open Access Journals (Sweden)

    Natália Oliveira

    2015-07-01

    Dengue fever is the most prevalent vector-borne disease in the world, with nearly 100 million people infected every year. Early diagnosis and identification of the pathogen are crucial steps for the treatment and prevention of the disease, mainly in areas where the co-circulation of different serotypes is common, increasing the occurrence of dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS). Due to the lack of fast and inexpensive methods available for the identification of dengue serotypes, herein we report the development of an electrochemical DNA biosensor for the detection of sequences of dengue virus serotype 3 (DENV-3). The DENV-3 probe was designed using bioinformatics software, and differential pulse voltammetry (DPV) was used for electrochemical analysis. The results showed that a 22-mer sequence was the best DNA probe for the identification of DENV-3. The optimum concentration of the DNA probe immobilized onto the electrode surface was 500 nM, and the system had a low detection limit (3.09 nM). Moreover, this system allows selective detection of DENV-3 sequences in buffer and human serum solutions. Therefore, the application of DNA biosensors for diagnostics at the molecular level may contribute to future advances in the implementation of specific, effective and rapid detection methods for the diagnosis of dengue viruses.

  13. Bayesian dose selection design for a binary outcome using restricted response adaptive randomization.

    Science.gov (United States)

    Meinzer, Caitlyn; Martin, Renee; Suarez, Jose I

    2017-09-08

    In phase II trials, the most efficacious dose is usually not known. Moreover, given limited resources, it is difficult to robustly identify a dose while also testing for a signal of efficacy that would support a phase III trial. Recent designs have sought to be more efficient by exploring multiple doses through the use of adaptive strategies. However, the added flexibility may potentially increase the risk of making incorrect assumptions and reduce the total amount of information available across the dose range as a function of imbalanced sample size. To balance these challenges, a novel placebo-controlled design is presented in which a restricted Bayesian response adaptive randomization (RAR) is used to allocate a majority of subjects to the optimal dose of active drug, defined as the dose with the lowest probability of poor outcome. However, the allocation between subjects who receive active drug or placebo is held constant to retain the maximum possible power for a hypothesis test of overall efficacy comparing the optimal dose to placebo. The design properties and optimization of the design are presented in the context of a phase II trial for subarachnoid hemorrhage. For a fixed total sample size, a trade-off exists between the ability to select the optimal dose and the probability of rejecting the null hypothesis. This relationship is modified by the allocation ratio between active and control subjects, the choice of RAR algorithm, and the number of subjects allocated to an initial fixed allocation period. While a responsive RAR algorithm improves the ability to select the correct dose, there is an increased risk of assigning more subjects to a worse arm as a function of ephemeral trends in the data. A subarachnoid treatment trial is used to illustrate how this design can be customized for specific objectives and available data. 
    Bayesian adaptive designs are a flexible approach to addressing multiple questions surrounding the optimal dose for treatment efficacy
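
    The allocation idea described in this record (a fixed placebo share, with the active share steered toward the dose that currently looks best) can be sketched as follows. This is a minimal illustration, not the trial's algorithm: the Beta(1, 1) priors, the 25% placebo fraction, the Monte Carlo estimate and the function name rar_allocation are all assumptions. The "best" dose is taken to be the one with the lowest posterior probability of poor outcome, as stated in the abstract.

```python
import random


def rar_allocation(poor, total, placebo_fraction=0.25, draws=10000, seed=0):
    """Return (placebo_share, [share per active dose]).

    poor[i] / total[i] are the poor outcomes and subjects observed so far on
    active dose i. Each dose gets a Beta(1 + poor, 1 + good) posterior (an
    assumed uniform prior). The active share is split in proportion to the
    posterior probability that each dose has the lowest poor-outcome rate.
    """
    rng = random.Random(seed)
    k = len(poor)
    wins = [0] * k
    for _ in range(draws):
        sampled = [rng.betavariate(1 + poor[i], 1 + total[i] - poor[i]) for i in range(k)]
        wins[sampled.index(min(sampled))] += 1
    p_best = [w / draws for w in wins]
    active_share = 1.0 - placebo_fraction  # placebo share is held constant
    return placebo_fraction, [active_share * p for p in p_best]


if __name__ == "__main__":
    placebo, active = rar_allocation(poor=[8, 4, 1], total=[10, 10, 10])
    print("placebo:", placebo)
    print("active doses:", [round(a, 3) for a in active])
```

    Because the placebo share never changes, only the split among active doses responds to accumulating data, which mirrors the trade-off the abstract describes between dose selection and power for the efficacy test.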

  14. KinMutRF: a random forest classifier of sequence variants in the human protein kinase superfamily.

    Science.gov (United States)

    Pons, Tirso; Vazquez, Miguel; Matey-Hernandez, María Luisa; Brunak, Søren; Valencia, Alfonso; Izarzugaza, Jose Mg

    2016-06-23

    The association between aberrant signal processing by protein kinases and human diseases such as cancer was established a long time ago. However, understanding the link between sequence variants in the protein kinase superfamily and the mechanistic complex traits at the molecular level remains challenging: cells tolerate most genomic alterations, and only a minor fraction disrupt molecular function sufficiently to drive disease. KinMutRF is a novel random-forest method to automatically identify pathogenic variants in human kinases. Twenty-six decision trees implemented as a random forest weigh a battery of features that characterize the variants: a) at the gene level, including membership of a Kinbase group and Gene Ontology terms; b) at the PFAM domain level; and c) at the residue level, the types of amino acids involved, changes in biochemical properties, and functional annotations from UniProt, Phospho.ELM and FireDB. KinMutRF identifies disease-associated variants satisfactorily (Acc: 0.88, Prec: 0.82, Rec: 0.75, F-score: 0.78, MCC: 0.68) when trained and cross-validated with the 3689 human kinase variants from UniProt that have been annotated as neutral or pathogenic. All unclassified variants were excluded from the training set. Furthermore, KinMutRF is discussed with respect to two independent kinase-specific sets of mutations not included in the training and testing, Kin-Driver (643 variants) and Pon-BTK (1495 variants). Moreover, we provide predictions for the 848 protein kinase variants in UniProt that remained unclassified. A public implementation of KinMutRF, including documentation and examples, is available online ( http://kinmut2.bioinfo.cnio.es ). The source code for local installation is released under a GPL version 3 license, and can be downloaded from https://github.com/Rbbt-Workflows/KinMut2 . KinMutRF is capable of classifying kinase variation with good performance. Predictions by KinMutRF compare favorably in a benchmark with other state
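
    The performance figures quoted in this record (accuracy, precision, recall, F-score, MCC) are standard functions of a binary confusion matrix. The sketch below shows how they relate; the confusion-matrix counts in the usage example are hypothetical and are not taken from the KinMutRF paper. The only values from the abstract are the reported precision (0.82) and recall (0.75), which reproduce the reported F-score of 0.78.

```python
from math import sqrt


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F1 and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score(precision, recall),
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Consistency check using the precision and recall reported in the abstract.
    print(round(f_score(0.82, 0.75), 2))
    # Hypothetical counts, for illustration only.
    print(binary_metrics(tp=75, fp=15, tn=185, fn=25))
```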

  15. Using ArcMap, Google Earth, and Global Positioning Systems to select and locate random households in rural Haiti

    Directory of Open Access Journals (Sweden)

    Wampler Peter J

    2013-01-01

    Full Text Available Background: A remote sensing technique was developed which combines a Geographic Information System (GIS), Google Earth, and Microsoft Excel to identify home locations for a random sample of households in rural Haiti. The method was used to select homes for ethnographic and water quality research in a region of rural Haiti located within 9 km of a local hospital and source of health education in Deschapelles, Haiti. The technique does not require access to governmental records or ground based surveys to collect household location data and can be performed in a rapid, cost-effective manner. Methods: The random selection of households and the location of these households during field surveys were accomplished using GIS, Google Earth, Microsoft Excel, and handheld Garmin GPSmap 76CSx GPS units. Homes were identified and mapped in Google Earth, exported to ArcMap 10.0, and a random list of homes was generated using Microsoft Excel which was then loaded onto handheld GPS units for field location. The development and use of a remote sensing method was essential to the selection and location of random households. Results: A total of 537 homes initially were mapped and a randomized subset of 96 was identified as potential survey locations. Over 96% of the homes mapped using Google Earth imagery were correctly identified as occupied dwellings. Only 3.6% of the occupants of mapped homes visited declined to be interviewed. 16.4% of the homes visited were not occupied at the time of the visit due to work away from the home or market days. A total of 55 households were located using this method during the 10 days of fieldwork in May and June of 2012. Conclusions: The method used to generate and field locate random homes for surveys and water sampling was an effective means of selecting random households in a rural environment lacking geolocation infrastructure. The success rate for locating households using a handheld GPS was excellent and only

  16. Using ArcMap, Google Earth, and Global Positioning Systems to select and locate random households in rural Haiti.

    Science.gov (United States)

    Wampler, Peter J; Rediske, Richard R; Molla, Azizur R

    2013-01-18

    A remote sensing technique was developed which combines a Geographic Information System (GIS), Google Earth, and Microsoft Excel to identify home locations for a random sample of households in rural Haiti. The method was used to select homes for ethnographic and water quality research in a region of rural Haiti located within 9 km of a local hospital and source of health education in Deschapelles, Haiti. The technique does not require access to governmental records or ground based surveys to collect household location data and can be performed in a rapid, cost-effective manner. The random selection of households and the location of these households during field surveys were accomplished using GIS, Google Earth, Microsoft Excel, and handheld Garmin GPSmap 76CSx GPS units. Homes were identified and mapped in Google Earth, exported to ArcMap 10.0, and a random list of homes was generated using Microsoft Excel which was then loaded onto handheld GPS units for field location. The development and use of a remote sensing method was essential to the selection and location of random households. A total of 537 homes initially were mapped and a randomized subset of 96 was identified as potential survey locations. Over 96% of the homes mapped using Google Earth imagery were correctly identified as occupied dwellings. Only 3.6% of the occupants of mapped homes visited declined to be interviewed. 16.4% of the homes visited were not occupied at the time of the visit due to work away from the home or market days. A total of 55 households were located using this method during the 10 days of fieldwork in May and June of 2012. The method used to generate and field locate random homes for surveys and water sampling was an effective means of selecting random households in a rural environment lacking geolocation infrastructure. The success rate for locating households using a handheld GPS was excellent and only rarely was local knowledge required to identify and locate households.
This
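
    The sampling step in this record (drawing a random subset of 96 homes from 537 mapped homes) was done by the authors in Microsoft Excel. An equivalent draw can be sketched in a few lines; the integer home identifiers, the seed and the function name select_households are assumptions made for illustration and do not reproduce the study's actual list.

```python
import random


def select_households(home_ids, k, seed=None):
    """Draw k distinct homes uniformly at random, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(home_ids), k))


if __name__ == "__main__":
    mapped_homes = range(1, 538)  # 537 mapped homes, numbered 1..537 (assumed IDs)
    subset = select_households(mapped_homes, 96, seed=2012)
    print(len(subset), "homes selected;", "first five:", subset[:5])
    print("sampling fraction: {:.1%}".format(len(subset) / len(mapped_homes)))
```

    Fixing the seed makes the draw reproducible, so the same list can be regenerated later and checked against the waypoints loaded onto the GPS units.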

  17. HIV-1 Nef sequence and functional compartmentalization in the gut is not due to differential cytotoxic T lymphocyte selective pressure.

    Directory of Open Access Journals (Sweden)

    Martha J Lewis

    Full Text Available The gut is the largest lymphoid organ in the body and a site of active HIV-1 replication and immune surveillance. The gut is a reservoir of persistent infection in some individuals with fully suppressed plasma viremia on combination antiretroviral therapy (cART), although the cause of this persistence is unknown. The HIV-1 accessory protein Nef contributes to persistence through multiple functions including immune evasion and increasing infectivity. Previous studies showed that Nef's function is shaped by cytotoxic T lymphocyte (CTL) responses and that there are distinct populations of Nef within tissue compartments. We asked whether Nef's sequence and/or function are compartmentalized in the gut and how compartmentalization relates to local CTL immune responses. Primary nef quasispecies from paired plasma and sigmoid colon biopsies from chronically infected subjects not on therapy were sequenced and cloned into Env(-) Vpu(-) pseudotyped reporter viruses. CTL responses were mapped by IFN-γ ELISpot using expanded CD8+ cells from blood and gut with pools of overlapping peptides covering the entire HIV proteome. CD4 and MHC Class I Nef-mediated downregulation was measured by flow cytometry. Multiple tests indicated compartmentalization of nef sequences in 5 of 8 subjects. There was also compartmentalization of function with MHC Class I downregulation relatively well preserved, but significant loss of CD4 downregulation specifically by gut quasispecies in 5 of 7 subjects. There was no compartmentalization of CTL responses in 6 of 8 subjects, and the selective pressure on quasispecies correlated with the magnitude of the CTL response regardless of location. These results demonstrate that Nef adapts via diverse pathways to local selective pressures within gut mucosa, which may be predominated by factors other than CTL responses such as target cell availability. The finding of a functionally distinct population within gut mucosa offers some insight into how HIV-1

  18. An expressed sequence tag (EST) library for Drosophila serrata, a model system for sexual selection and climatic adaptation studies

    Directory of Open Access Journals (Sweden)

    McGraw Elizabeth A

    2009-01-01

    Full Text Available Background: The native Australian fly Drosophila serrata belongs to the highly speciose montium subgroup of the melanogaster species group. It has recently emerged as an excellent model system with which to address a number of important questions, including the evolution of traits under sexual selection and traits involved in climatic adaptation along latitudinal gradients. Understanding the molecular genetic basis of such traits has been limited by a lack of genomic resources for this species. Here, we present the first expressed sequence tag (EST) collection for D. serrata that will enable the identification of genes underlying sexually-selected phenotypes and physiological responses to environmental change and may help resolve controversial phylogenetic relationships within the montium subgroup. Results: A normalized cDNA library was constructed from whole fly bodies at several developmental stages, including larvae and adults. Assembly of 11,616 clones sequenced from the 3' end allowed us to identify 6,607 unique contigs, of which at least 90% encoded peptides. Partial transcripts were discovered from a variety of genes of evolutionary interest by BLASTing contigs against the 12 Drosophila genomes currently sequenced. By incorporating into the cDNA library multiple individuals from populations spanning a large portion of the geographical range of D. serrata, we were able to identify 11,057 putative single nucleotide polymorphisms (SNPs), with 278 different contigs having at least one "double hit" SNP that is highly likely to be a real polymorphism. At least 394 EST-associated microsatellite markers, representing 355 different contigs, were also found, providing an additional set of genetic markers. The assembled EST library is available online at http://www.chenowethlab.org/serrata/index.cgi. Conclusion: We have provided the first gene collection and largest set of polymorphic genetic markers, to date, for the fly D. serrata. The EST

  19. Precision medicine ethics: selected issues and developments in next-generation sequencing, clinical oncology, and ethics.

    Science.gov (United States)

    Fiore, Robin N; Goodman, Kenneth W

    2016-01-01

    In early 2015 the National Institutes of Health launched a new, national Precision Medicine Initiative with the primary goal of rapidly improving the prevention, diagnosis, and treatment of cancers. The first-stage emphasis on oncology presents unique opportunities for clinical oncology to influence how the ethical challenges of precision medicine are to be articulated and addressed. Thus, a review of recent developments in connection with the Initiative, in particular on core ethics issues in clinical genomics, is a useful starting point. Unique ethical issues arise in precision medicine because of the enormous amounts of data generated by clinical whole-genome or whole-exome sequencing and the extent of current uncertainties with respect to data interpretations and disease associations. Among the most ethically challenging issues for clinicians are complicated informed consent processes; returning results, particularly secondary and incidental findings; and privacy and confidentiality. The first tests of precision medicine ethics in practice will be in clinical oncology, providing a rare opportunity to shape the agenda and integrate practical ethics considerations. These efforts can benefit from pre-existing research ethics analyses and recommendations from clinical and translational genetics research.

  20. Stereochemical Sequence Ion Selectivity: Proline versus Pipecolic-acid-containing Protonated Peptides

    Science.gov (United States)

    Abutokaikah, Maha T.; Guan, Shanshan; Bythell, Benjamin J.

    2017-01-01

    Substitution of proline by pipecolic acid, the six-membered ring congener of proline, results in vastly different tandem mass spectra. The well-known proline effect is eliminated and amide bond cleavage C-terminal to pipecolic acid dominates instead. Why do these two ostensibly similar residues produce dramatically differing spectra? Recent evidence indicates that the proton affinities of these residues are similar, so are unlikely to explain the result [Raulfs et al., J. Am. Soc. Mass Spectrom. 25, 1705-1715 (2014)]. An additional hypothesis based on increased flexibility was also advocated. Here, we provide a computational investigation of the "pipecolic acid effect," to test this and other hypotheses to determine if theory can shed additional light on this fascinating result. Our calculations provide evidence for both the increased flexibility of pipecolic-acid-containing peptides, and structural changes in the transition structures necessary to produce the sequence ions. The most striking computational finding is inversion of the stereochemistry of the transition structures leading to "proline effect"-type amide bond fragmentation between the proline/pipecolic acid-congeners: R (proline) to S (pipecolic acid). Additionally, our calculations predict substantial stabilization of the amide bond cleavage barriers for the pipecolic acid congeners by reduction in deleterious steric interactions and provide evidence for the importance of experimental energy regime in rationalizing the spectra.

  1. Simple Sequence Repeat Analysis of Selected NSIC-registered Coffee Varieties in the Philippines

    Directory of Open Access Journals (Sweden)

    Daisy May C. Santos

    2016-06-01

    Full Text Available Coffee (Coffea sp.) is an important commercial crop worldwide. Three species of coffee are used as beverage, namely Coffea arabica, C. canephora, and C. liberica. Coffea arabica L. is the most cultivated among the three coffee species due to its taste quality, rich aroma, and low caffeine content. Despite its inferior taste and aroma, C. canephora Pierre ex A. Froehner, which has the highest caffeine content, is the second most widely cultivated because of its resistance to coffee diseases. On the other hand, C. liberica W.Bull ex Hiern is characterized by its very strong taste and flavor. The Philippines used to be a leading exporter of coffee until coffee rust destroyed the farms in Batangas, home of the famous Kapeng Barako. The country has been attempting to revive the coffee industry by focusing on the production of specialty coffee with varieties registered with the National Seed Industry Council (NSIC). Correct identification and isolation of pure coffee beans are the main factors that determine coffee’s market value. Local farms usually misidentify and mix coffee beans of different varieties, leading to the depreciation of their value. This study used simple sequence repeat (SSR) markers to evaluate and distinguish Philippine NSIC-registered coffee species and varieties. The neighbor-joining tree generated using PAUP showed high bootstrap support, separating C. arabica, C. canephora, and C. liberica from each other. Among the twenty primer pairs used, seven were able to distinguish C. arabica, nine C. liberica, and one C. canephora.

  2. Web Platform vs In-Person Genetic Counselor for Return of Carrier Results From Exome Sequencing: A Randomized Clinical Trial.

    Science.gov (United States)

    Biesecker, Barbara B; Lewis, Katie L; Umstead, Kendall L; Johnston, Jennifer J; Turbitt, Erin; Fishler, Kristen P; Patton, John H; Miller, Ilana M; Heidlebaugh, Alexis R; Biesecker, Leslie G

    2018-01-22

    A critical bottleneck in clinical genomics is the mismatch between large volumes of results and the availability of knowledgeable professionals to return them. To test whether a web-based platform is noninferior to a genetic counselor for educating patients about their carrier results from exome sequencing. A randomized noninferiority trial conducted in a longitudinal sequencing cohort at the National Institutes of Health from February 5, 2014, to December 16, 2016, was used to compare the web-based platform with a genetic counselor. Among the 571 eligible participants, 1 to 7 heterozygous variants were identified in genes that cause a phenotype that is recessively inherited. Surveys were administered after cohort enrollment, immediately following trial education, and 1 month and 6 months later to primarily healthy postreproductive participants who expressed interest in learning their carrier results. Both intention-to-treat and per-protocol analyses were applied. A web-based platform that integrated education on carrier results with personal test results was designed to directly parallel disclosure education by a genetic counselor. The sessions took a mean (SD) time of 21 (10.6), and 27 (9.3) minutes, respectively. The primary outcomes and noninferiority margins (δNI) were knowledge (0 to 8, δNI = -1), test-specific distress (0 to 30, δNI = +1), and decisional conflict (15 to 75, δNI = +6). After 462 participants (80.9%) provided consent and were randomized, all but 3 participants (n = 459) completed surveys following education and counseling; 398 (86.1%) completed 1-month surveys and 392 (84.8%) completed 6-month surveys. Participants were predominantly well-educated, non-Hispanic white, married parents; mean (SD) age was 63 (63.1) years and 246 (53.6%) were men. The web platform was noninferior to the genetic counselor on outcomes assessed at 1 and 6 months: knowledge (mean group difference, -0.18; lower limit of 97.5% CI, -0.63;
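
    The noninferiority logic in this record reduces to comparing one confidence limit with a prespecified margin. The sketch below is illustrative only and is not the trial's analysis code; the function name and the direction flag are assumptions. It uses the one result fully visible in the truncated abstract: for knowledge (higher is better, margin -1), the lower limit of the 97.5% CI for the group difference was -0.63.

```python
def is_noninferior(ci_limit, margin, higher_is_better):
    """Check a one-sided noninferiority criterion.

    For outcomes where higher is better, pass the LOWER confidence limit of
    the (new minus reference) difference; it must lie above the negative margin.
    For outcomes where lower is better, pass the UPPER confidence limit;
    it must lie below the positive margin.
    """
    if higher_is_better:
        return ci_limit > margin
    return ci_limit < margin


if __name__ == "__main__":
    # Knowledge score: margin -1, lower CI limit -0.63 (values from the abstract).
    print(is_noninferior(ci_limit=-0.63, margin=-1, higher_is_better=True))
```

    For test-specific distress and decisional conflict, where lower scores are better, the same function would be called with the upper confidence limit and the positive margins (+1 and +6) listed in the abstract.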

  3. Domestication and the storage starch biosynthesis pathway: signatures of selection from a whole sorghum genome sequencing strategy.

    Science.gov (United States)

    Campbell, Bradley C; Gilding, Edward K; Mace, Emma S; Tai, Shuaishuai; Tao, Yongfu; Prentis, Peter J; Thomelin, Pauline; Jordan, David R; Godwin, Ian D

    2016-12-01

    Next-generation sequencing of complete genomes has given researchers unprecedented levels of information to study the multifaceted evolutionary changes that have shaped elite plant germplasm. In conjunction with population genetic analytical techniques and detailed online databases, we can more accurately capture the effects of domestication on entire biological pathways of agronomic importance. In this study, we explore the genetic diversity and signatures of selection in all predicted gene models of the storage starch synthesis pathway of Sorghum bicolor, utilizing a diversity panel containing lines categorized as either 'Landraces' or 'Wild and Weedy' genotypes. Amongst a total of 114 genes involved in starch synthesis, 71 had at least a single signal of purifying selection and 62 a signal of balancing selection and others a mix of both. This included key genes such as STARCH PHOSPHORYLASE 2 (SbPHO2, under balancing selection), PULLULANASE (SbPUL, under balancing selection) and ADP-glucose pyrophosphorylases (SHRUNKEN2, SbSH2 under purifying selection). Effectively, many genes within the primary starch synthesis pathway had a clear reduction in nucleotide diversity between the Landraces and wild and weedy lines indicating that the ancestral effects of domestication are still clearly identifiable. There was evidence of the positional rate variation within the well-characterized primary starch synthesis pathway of sorghum, particularly in the Landraces, whereby low evolutionary rates upstream and high rates downstream in the metabolic pathway were expected. This observation did not extend to the wild and weedy lines or the minor starch synthesis pathways. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  4. Cations Form Sequence Selective Motifs within DNA Grooves via a Combination of Cation-Pi and Ion-Dipole/Hydrogen Bond Interactions: e71420

    National Research Council Canada - National Science Library

    Mikaela Stewart; Tori Dunlap; Elizabeth Dourlain; Bryce Grant; Lori McFail-Isom

    2013-01-01

    ... variation including helical flexibility and conformation. Cation-pi interactions between solvent cations or their first hydration shell waters and the faces of DNA bases form sequence selectively and contribute to DNA structural heterogeneity...

  5. The optimal sequence for bronchial brushing and forceps biopsy in lung cancer diagnosis: a random control study.

    Science.gov (United States)

    Hou, Gang; Miao, Yuan; Hu, Xue-Jun; Wang, Wei; Wang, Qiu-Yue; Wu, Guang-Ping; Wang, En-Hua; Kang, Jian

    2016-03-01

    Optimizing basic techniques in diagnostic bronchoscopy is important for improving medical services in developing countries. In this study, the optimal sequence of bronchial brushing relative to bronchial biopsy for lung cancer diagnosis was evaluated. A total of 420 patients with visible endobronchial tumors were prospectively and randomly enrolled in two groups: a pre-biopsy brushing group, which received two brushings before biopsy and two brushings afterwards, the latter serving as a self-control for the intra-group comparison with the pre-biopsy brushings; and a post-biopsy brushing group, which received only two brushings after biopsy, compared with the pre-biopsy brushings as the inter-group comparison. Diagnostic yield of brushing was compared before and after biopsy, as well as for different tumor pathologies and bronchoscopic morphologies. The occurrence of treated bleeding, defined as bleeding that needed further intervention with argon plasma coagulation and/or anti-coagulation drugs, was also compared between the two groups. Only patients with a definitive cytological or histological diagnosis of lung cancer based on bronchoscopy or other confirmatory techniques were included. Patients were excluded if they had submucosal lesions, extrinsic compressions, pulmonary metastasis of extrapulmonary malignancies or uncommon non-small cell lung carcinoma (NSCLC). A total of 362 patients who met the inclusion criteria were analyzed. Diagnostic yield for pre-biopsy brushing (49.2%, 88/179) was significantly higher than for post-biopsy brushing within the same pre-biopsy brushing group (31.8%, 57/179) (P=0.007) as the intra-group comparison, and significantly higher than for post-biopsy brushing in the post group (30.6%, 56/183) (Pcancer. In cases of endobronchial exophytic tumors, pre-biopsy brushing appears to be superior to post-biopsy brushing.
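
    The diagnostic yields quoted in this record follow directly from the reported counts, and the inter-group comparison (two independent groups) can be illustrated with a pooled two-proportion z-test. The abstract does not state which test the authors used, so the choice of test here is an assumption, and the resulting p-value should not be read as the paper's. The intra-group comparison is paired and would need a paired method, which the reported counts alone do not support.

```python
from math import erfc, sqrt


def diagnostic_yield(positive, total):
    """Diagnostic yield as a percentage."""
    return 100.0 * positive / total


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


if __name__ == "__main__":
    # Counts as reported in the abstract.
    print(round(diagnostic_yield(88, 179), 1))  # pre-biopsy brushing
    print(round(diagnostic_yield(57, 179), 1))  # post-biopsy brushing, same group
    print(round(diagnostic_yield(56, 183), 1))  # post-biopsy brushing, post group
    z, p = two_proportion_z(88, 179, 56, 183)
    print("z = {:.2f}, two-sided p = {:.4f}".format(z, p))
```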

  6. Comparison of grouper infection with two different iridoviruses using transcriptome sequencing and multiple reference species selection.

    Science.gov (United States)

    Liu, Chun-Cheng; Ho, Li-Ping; Yang, Cin-Hang; Kao, Tsung-Yu; Chou, Hsin-Yiu; Pai, Tun-Wen

    2017-12-01

    Due to high-density aquafarming in Taiwan, groupers are commonly infected with two different iridoviruses: Megalocytivirus (grouper iridovirus of Taiwan, TGIV) and Ranavirus (grouper iridovirus, GIV). Iridoviral diseases cause mass mortality, and surviving fish retain these pathogens, which can then be horizontally transferred. These viruses have therefore become a major challenge for grouper aquaculture. In this study, comparisons of the biological responses of groupers to infection with these two different iridoviruses were performed. A novel approach for transcriptomic analysis was proposed to enhance the discovery of differentially expressed genes and associated biological pathways. In this method, suitable and available reference species are selected from the NCBI taxonomy tree and the Ensembl and KEGG databases instead of either choosing only one model species or adopting the NCBI non-redundant dataset as references. Our results show that selection of multiple appropriate model species as references increases the efficiency and performance of analyses compared to those of traditional approaches. Using this method, 17 shared pathways and 5 specific pathways were found to be significantly differentially expressed following infection with the two iridoviruses, among which 11 pathways were additionally identified based on the proposed method of multiple reference species selection. Among the pathways responsive to infection with a specific iridovirus, the spliceosomal pathway (ko03040; p-value = 0.0011) was exclusively associated with TGIV infection, while the glycolysis/gluconeogenesis pathway (ko00010; p-value = 0.0032) was associated with GIV infection. These findings and designed corresponding biological experiments may facilitate a deeper understanding of the mechanisms by which both TGIV and GIV cause fatal infections, as well as the ways in which they induce different pathologies and symptoms. We believe that the proposed novel mechanism for de novo

  7. Templated synthesis of peptide nucleic acids via sequence-selective base-filling reactions.

    Science.gov (United States)

    Heemstra, Jennifer M; Liu, David R

    2009-08-19

    The templated synthesis of nucleic acids has previously been achieved through the backbone ligation of preformed nucleotide monomers or oligomers. In contrast, here we demonstrate templated nucleic acid synthesis using a base-filling approach in which individual bases are added to abasic sites of a peptide nucleic acid (PNA). Because nucleobase substrates in this approach are not self-reactive, a base-filling approach may reduce the formation of nontemplated reaction products. Using either reductive amination or amine acylation chemistries, we observed efficient and selective addition of each of the four nucleobases to an abasic site in the middle of the PNA strand. We also describe the addition of single nucleobases to the end of a PNA strand through base filling, as well as the tandem addition of two bases to the middle of the PNA strand. These findings represent an experimental foundation for nonenzymatic information transfer through base filling.

  8. Exploring evidence of positive selection reveals genetic basis of meat quality traits in Berkshire pigs through whole genome sequencing.

    Science.gov (United States)

    Jeong, Hyeonsoo; Song, Ki-Duk; Seo, Minseok; Caetano-Anollés, Kelsey; Kim, Jaemin; Kwak, Woori; Oh, Jae-Don; Kim, EuiSoo; Jeong, Dong Kee; Cho, Seoae; Kim, Heebal; Lee, Hak-Kyo

    2015-08-20

    Natural and artificial selection following domestication has led to the existence of more than a hundred pig breeds, as well as incredible variation in phenotypic traits. Berkshire pigs are regarded as having superior meat quality compared to other breeds. As the meat production industry seeks selective breeding approaches to improve profitable traits such as meat quality, information about genetic determinants of these traits is in high demand. However, most of the studies have been performed using trained sensory panel analysis without investigating the underlying genetic factors. Here we investigate the relationship between genomic composition and this phenotypic trait by scanning for signatures of positive selection in whole-genome sequencing data. We generated genomes of 10 Berkshire pigs at a total of 100.6 coverage depth, using the Illumina HiSeq 2000 platform. Along with the genomes of 11 Landrace and 13 Yorkshire pigs, we identified 18.9 million SNVs and 3.4 million indels in the mapped regions. By applying between-population statistical tests (XP-EHH and XP-CLR), we identified several genes related to lipid metabolism, intramuscular fatty acid deposition, and muscle fiber type that contribute to pork quality (TG, FABP1, AKIRIN2, GLP2R, TGFBR3, JPH3, ICAM2, and ERN1). A statistical enrichment test was also conducted to detect breed-specific genetic variation. In addition, a de novo short sequence read assembly strategy identified several candidate genes (SLC25A14, IGF1, PI4KA, CACNA1A) as also contributing to lipid metabolism. Results revealed several candidate genes involved in Berkshire meat quality; most of these genes are involved in lipid metabolism and intramuscular fat deposition. These results can provide a basis for future research on the genomic characteristics of Berkshire pigs.

  9. sDFIRE: Sequence-specific statistical energy function for protein structure prediction by decoy selections.

    Science.gov (United States)

    Hoque, Md Tamjidul; Yang, Yuedong; Mishra, Avdesh; Zhou, Yaoqi

    2016-05-05

    An important unsolved problem in molecular and structural biology is the protein folding and structure prediction problem. One major bottleneck for solving this is the lack of an accurate energy to discriminate near-native conformations against other possible conformations. Here we have developed sDFIRE energy function, which is an optimized linear combination of DFIRE (the Distance-scaled Finite Ideal gas Reference state based Energy), the orientation dependent (polar-polar and polar-nonpolar) statistical potentials, and the matching scores between predicted and model structural properties including predicted main-chain torsion angles and solvent accessible surface area. The weights for these scoring terms are optimized by three widely used decoy sets consisting of a total of 134 proteins. Independent tests on CASP8 and CASP9 decoy sets indicate that sDFIRE outperforms other state-of-the-art energy functions in selecting near native structures and in the Pearson's correlation coefficient between the energy score and structural accuracy of the model (measured by TM-score). © 2016 Wiley Periodicals, Inc.

  10. Selection of DNA Aptamers for Ovarian Cancer Biomarker CA125 Using One-Pot SELEX and High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Delia J. Scoville

    2017-01-01

    Full Text Available CA125 is a mucin glycoprotein whose concentration in serum correlates with a woman’s risk of developing ovarian cancer and also indicates response to therapy in diagnosed patients. Accurate detection of this large, complex protein in patient samples is of great clinical relevance. We suggest that powerful new diagnostic tools may be enabled by the development of nucleic acid aptamers with affinity for CA125. Here, we report on our use of One-Pot SELEX to isolate single-stranded DNA aptamers with affinity for CA125, followed by high-throughput sequencing of the selected oligonucleotides. This data-rich approach, combined with bioinformatics tools, enabled the entire selection process to be characterized. Using fluorescence anisotropy and affinity probe capillary electrophoresis, the binding affinities of four aptamer candidates were evaluated. Two aptamers, CA125_1 and CA125_12, both without primers, were found to bind to clinically relevant concentrations of the protein target. Binding was differently influenced by the presence of Mg2+ ions, being required for binding of CA125_1 and abrogating binding of CA125_12. In conclusion, One-Pot SELEX was found to be a promising selection method that yielded DNA aptamers to a clinically important protein target.

  11. Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus

    Directory of Open Access Journals (Sweden)

    Matthew D. MacManes

    2014-10-01

    Full Text Available As a direct result of intense heat and aridity, deserts are thought to be among the most harsh of environments, particularly for their mammalian inhabitants. Given that osmoregulation can be challenging for these animals, with failure resulting in death, strong selection should be observed on genes related to the maintenance of water and solute balance. One such animal, Peromyscus eremicus, is native to the desert regions of the southwest United States and may live its entire life without oral fluid intake. As a first step toward understanding the genetics that underlie this phenotype, we present a characterization of the P. eremicus transcriptome. We assay four tissues (kidney, liver, brain, testes) from a single individual and supplement this with population level renal transcriptome sequencing from 15 additional animals. We identified a set of transcripts undergoing both purifying and balancing selection based on estimates of Tajima’s D. In addition, we used the branch-site test to identify a transcript—Slc2a9, likely related to desert osmoregulation—undergoing enhanced selection in P. eremicus relative to a set of related non-desert rodents.

  12. Characterization of clinically-attenuated Burkholderia mallei by whole genome sequencing: candidate strain for exclusion from Select Agent lists.

    Directory of Open Access Journals (Sweden)

    Steven E Schutzer

    Full Text Available BACKGROUND: Burkholderia mallei is an understudied biothreat agent responsible for glanders which can be lethal in humans and animals. Research with this pathogen has been hampered in part by constraints of Select Agent regulations for safety reasons. Whole genomic sequencing (WGS) is an apt approach to characterize newly discovered or poorly understood microbial pathogens. METHODOLOGY/PRINCIPAL FINDINGS: We performed WGS on a strain of B. mallei, SAVP1, previously pathogenic, that was experimentally infected in 6 equids (4 ponies, 1 mule, 1 donkey), natural hosts, for purposes of producing antibodies. Multiple high inocula were used in some cases. Unexpectedly SAVP1 appeared to be avirulent in the ponies and mule, and attenuated in the donkey, but induced antibodies. We determined the genome sequence of SAVP1 and compared it to a strain that was virulent in horses and a human. In comparison, this phenotypically avirulent SAVP1 strain was missing multiple genes including all the animal type III secretory system (T3SS) complex of genes demonstrated to be essential for virulence in mice and hamster models. The loss of these genes in the SAVP1 strain appears to be the consequence of a multiple gene deletion across insertion sequence (IS) elements in the B. mallei genome. Therefore, the strain by itself is unlikely to revert naturally to its virulent phenotype. There were other genes present in one strain and not the other and vice-versa. CONCLUSION/SIGNIFICANCE: The discovery that this strain of B. mallei was both avirulent in the natural host ponies, and did not possess T3SS associated genes may be fortuitous to advance biodefense research. The deleted virulence-essential T3SS is not likely to be re-acquired naturally. These findings may provide a basis for exclusion of SAVP1 from the Select Agent regulation or at least discussion of what else would be required for exclusion.
This exclusion could accelerate research by investigators not possessing BSL-3

  13. Anomalous stress diffusion, Omori's law and Continuous Time Random Walk in the 2010 Efpalion aftershock sequence (Corinth rift, Greece)

    Science.gov (United States)

    Michas, Georgios; Vallianatos, Filippos; Karakostas, Vassilios; Papadimitriou, Eleftheria; Sammonds, Peter

    2014-05-01

    The Efpalion aftershock sequence occurred in January 2010, when an M=5.5 earthquake was followed four days later by another strong event (M=5.4) and numerous aftershocks (Karakostas et al., 2012). This activity interrupted a 15-year period of low to moderate earthquake occurrence in the Corinth rift, where the last major event was the 1995 Aigion earthquake (M=6.2). Coulomb stress analysis performed in previous studies (Karakostas et al., 2012; Sokos et al., 2012; Ganas et al., 2013) indicated that the second major event and most of the aftershocks were triggered due to stress transfer. The aftershock production rate decays as a power-law with time according to the modified Omori law (Utsu et al., 1995) with an exponent larger than one for the first four days, while after the occurrence of the second strong event the exponent turns to unity. We consider the earthquake sequence as a point process in time and space and study its spatiotemporal evolution considering a Continuous Time Random Walk (CTRW) model with a joint probability density function of inter-event times and jumps between the successive earthquakes (Metzler and Klafter, 2000). Jump length distribution exhibits finite variance, whereas inter-event times scale as a q-generalized gamma distribution (Michas et al., 2013) with a long power-law tail. These properties are indicative of a subdiffusive process in terms of CTRW. Additionally, the mean square displacement of aftershocks is constant with time after the occurrence of the first event, while it changes to a power-law with exponent close to 0.15 after the second major event, illustrating a slow diffusive process. During the first four days aftershocks cluster around the epicentral area of the second major event, while after that and taking as a reference the second event, the aftershock zone is migrating slowly with time to the west near the epicentral area of the first event.
This process is much slower from what would be expected from normal diffusion, a
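
    For orientation, the two power laws this abstract refers to can be written in their standard textbook forms. The symbols below (K, c, p, alpha) are conventional notation and do not appear in the abstract itself; only the exponent regimes (p > 1 then p ≈ 1, and alpha ≈ 0.15) come from the text above.

```latex
% Modified Omori law: aftershock rate n(t) at time t after the main event,
% with productivity K, time offset c, and decay exponent p
n(t) = \frac{K}{(t + c)^{p}}

% Mean square displacement in a CTRW picture:
% alpha = 1 is normal diffusion, 0 < alpha < 1 is subdiffusion
\langle r^{2}(t) \rangle \propto t^{\alpha}, \qquad 0 < \alpha < 1
```

    Read this way, the abstract reports p > 1 during the first four days and p ≈ 1 afterwards, and alpha ≈ 0.15 after the second major event, which is far below the normal-diffusion value alpha = 1.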

  14. Widespread sequence variations in VAMP1 across vertebrates suggest a potential selective pressure from botulinum neurotoxins.

    Directory of Open Access Journals (Sweden)

    Lisheng Peng

    2014-07-01

    Full Text Available Botulinum neurotoxins (BoNT/A-G), the most potent toxins known, act by cleaving three SNARE proteins required for synaptic vesicle exocytosis. Previous studies on BoNTs have generally utilized the major SNARE homologues expressed in brain (VAMP2, syntaxin 1, and SNAP-25). However, BoNTs target peripheral motor neurons and cause death by paralyzing respiratory muscles such as the diaphragm. Here we report that VAMP1, but not VAMP2, is the SNARE homologue predominantly expressed in adult rodent diaphragm motor nerve terminals and in differentiated human motor neurons. In contrast to the highly conserved VAMP2, BoNT-resistant variations in VAMP1 are widespread across vertebrates. In particular, we identified a polymorphism at position 48 of VAMP1 in rats, which renders VAMP1 either resistant (I48) or sensitive (M48) to BoNT/D. Taking advantage of this finding, we showed that rat diaphragms with I48 in VAMP1 are insensitive to BoNT/D compared to rat diaphragms with M48 in VAMP1. This unique intra-species comparison establishes VAMP1 as a physiological toxin target in diaphragm motor nerve terminals, and demonstrates that the resistance of VAMP1 to BoNTs can underlie the insensitivity of a species to members of BoNTs. Consistently, human VAMP1 contains I48, which may explain why humans are insensitive to BoNT/D. Finally, we report that residue 48 of VAMP1 varies frequently between M and I across seventeen closely related primate species, suggesting a potential selective pressure from members of BoNTs for resistance in vertebrates.

  15. The basic science and mathematics of random mutation and natural selection.

    Science.gov (United States)

    Kleinman, Alan

    2014-12-20

    The mutation and natural selection phenomenon can and often does cause the failure of antimicrobial, herbicidal, pesticide and cancer treatments selection pressures. This phenomenon operates in a mathematically predictable behavior, which when understood leads to approaches to reduce and prevent the failure of the use of these selection pressures. The mathematical behavior of mutation and selection is derived using the principles given by probability theory. The derivation of the equations describing the mutation and selection phenomenon is carried out in the context of an empirical example. Copyright © 2014 John Wiley & Sons, Ltd.

  16. Selectivity by host plants affects the distribution of arbuscular mycorrhizal fungi: evidence from ITS rDNA sequence metadata.

    Science.gov (United States)

    Yang, Haishui; Zang, Yanyan; Yuan, Yongge; Tang, Jianjun; Chen, Xin

    2012-04-12

    Arbuscular mycorrhizal fungi (AMF) can form obligate symbioses with the vast majority of land plants, and AMF distribution patterns have received increasing attention from researchers. At the local scale, the distribution of AMF is well documented. Studies at large scales, however, are limited because intensive sampling is difficult. Here, we used ITS rDNA sequence metadata obtained from public databases to study the distribution of AMF at continental and global scales. We also used these sequence metadata to investigate whether host plant is the main factor that affects the distribution of AMF at large scales. We defined 305 ITS virtual taxa (ITS-VTs) among all sequences of the Glomeromycota by using a comprehensive maximum likelihood phylogenetic analysis. Each host taxonomic order averaged about 53% specific ITS-VTs, and approximately 60% of the ITS-VTs were host specific. Those ITS-VTs with wide host range showed wide geographic distribution. Most ITS-VTs occurred in only one type of host functional group. The distributions of most ITS-VTs were limited across ecosystem, across continent, across biogeographical realm, and across climatic zone. Non-metric multidimensional scaling analysis (NMDS) showed that AMF community composition differed among functional groups of hosts, and among ecosystem, continent, biogeographical realm, and climatic zone. The Mantel test showed that AMF community composition was significantly correlated with plant community composition among ecosystem, among continent, among biogeographical realm, and among climatic zone. The structural equation modeling (SEM) showed that the effects of ecosystem, continent, biogeographical realm, and climatic zone were mainly indirect on AMF distribution, but plant had strongly direct effects on AMF. The distribution of AMF as indicated by ITS rDNA sequences showed a pattern of high endemism at large scales. This pattern indicates high specificity of AMF for host at different scales (plant taxonomic

  17. Characterization and Profiling of Liver microRNAs by RNA-sequencing in Cattle Divergently Selected for Residual Feed Intake

    Directory of Open Access Journals (Sweden)

    Wijdan Al-Husseini

    2016-10-01

    Full Text Available MicroRNAs (miRNAs) are short non-coding RNAs that post-transcriptionally regulate expression of mRNAs in many biological pathways. Liver plays an important role in the feed efficiency of animals and high and low efficient cattle demonstrated different gene expression profiles by microarray. Here we report comprehensive miRNAs profiles by next-gen deep sequencing in Angus cattle divergently selected for residual feed intake (RFI) and identify miRNAs related to feed efficiency in beef cattle. Two microRNA libraries were constructed from pooled RNA extracted from livers of low and high RFI cattle, and sequenced by Illumina genome analyser. In total, 23,628,103 high quality short sequence reads were obtained and more than half of these reads were matched to the bovine genome (UMD 3.1). We identified 305 known bovine miRNAs. Bta-miR-143, bta-miR-30, bta-miR-122, bta-miR-378, and bta-let-7 were the top five most abundant miRNAs families expressed in liver, representing more than 63% of expressed miRNAs. We also identified 52 homologous miRNAs and 10 novel putative bovine-specific miRNAs, based on precursor sequence and the secondary structure and utilizing the miRBase (v. 21). We compared the miRNAs profile between high and low RFI animals and ranked the most differentially expressed bovine known miRNAs. Bovine miR-143 was the most abundant miRNA in the bovine liver and comprised 20% of total expressed mapped miRNAs. The most highly expressed miRNA in liver of mice and humans, miR-122, was the third most abundant in our cattle liver samples. We also identified 10 putative novel bovine-specific miRNA candidates. Differentially expressed miRNAs between high and low RFI cattle were identified with 18 miRNAs being up-regulated and 7 other miRNAs down-regulated in low RFI cattle. Our study has identified comprehensive miRNAs expressed in bovine liver. Some of the expressed miRNAs are novel in cattle.
The differentially expressed miRNAs between high and low RFI

  18. Use of molecular diversity of Mycoplasma gallisepticum by gene-targeted sequencing (GTS) and random amplified polymorphic DNA (RAPD) analysis for epidemiological studies.

    Science.gov (United States)

    Ferguson, Naola M; Hepp, Diego; Sun, Shulei; Ikuta, Nilo; Levisohn, Sharon; Kleven, Stanley H; García, Maricarmen

    2005-06-01

    A total of 67 Mycoplasma gallisepticum field isolates from the USA, Israel and Australia, and 10 reference strains, were characterized by gene-targeted sequencing (GTS) analysis of portions of the putative cytadhesin pvpA gene, the cytadhesin gapA gene, the cytadhesin mgc2 gene, and an uncharacterized hypothetical surface lipoprotein-encoding gene designated genome coding DNA sequence (CDS) MGA_0319. The regions of the surface-protein-encoding genes targeted in this analysis were found to be stable within a strain, after sequencing different in vitro passages of M. gallisepticum reference strains. Gene sequences were first analysed on the basis of gene size polymorphism. The pvpA and mgc2 genes are characterized by the presence of different nucleotide insertions/deletions. However, differentiation of isolates based solely on pvpA/mgc2 PCR size polymorphism was not found to be a reliable method to differentiate among M. gallisepticum isolates. On the other hand, GTS analysis based on the nucleotide sequence identities of individual and multiple genes correlated with epidemiologically linked isolates and with random amplified polymorphic DNA (RAPD) analysis. GTS analysis of individual genes, gapA, MGA_0319, mgc2 and pvpA, identified 17, 16, 20 and 22 sequence types, respectively. GTS analysis using multiple gene sequences mgc2/pvpA and gapA/MGA_0319/mgc2/pvpA identified 38 and 40 sequence types, respectively. GTS of multiple surface-protein-encoding genes showed better discriminatory power than RAPD analysis, which identified 36 pattern types from the same panel of M. gallisepticum strains. These results are believed to provide the first evidence that typing of M. gallisepticum isolates by GTS analysis of surface-protein genes is a sensitive and reproducible typing method and will allow rapid global comparisons between laboratories.

  19. Nucleotide sequence analyses of the MRP1 gene in four populations suggest negative selection on its coding region

    Directory of Open Access Journals (Sweden)

    Ryan Stephen

    2006-05-01

    Full Text Available Abstract Background: The MRP1 gene encodes the 190 kDa multidrug resistance-associated protein 1 (MRP1/ABCC1) and effluxes diverse drugs and xenobiotics. Sequence variations within this gene might account for differences in drug response in different individuals. To facilitate association studies of this gene with diseases and/or drug response, exons and flanking introns of MRP1 were screened for polymorphisms in 142 DNA samples from four different populations. Results: Seventy-one polymorphisms, including 60 biallelic single nucleotide polymorphisms (SNPs), ten insertions/deletions (indel) and one short tandem repeat (STR) were identified. Thirty-four of these polymorphisms have not been previously reported. Interestingly, the STR polymorphism at the 5' untranslated region (5'UTR) occurs at high but different frequencies in the different populations. Frequencies of common polymorphisms in our populations were comparable to those of similar populations in HAPMAP or Perlegen. Nucleotide diversity indices indicated that the coding region of MRP1 may have undergone negative selection or recent population expansion. SNPs E10/1299 G>T (R433S) and E16/2012 G>T (G671V), which occur at low frequency in only one or two of four populations examined, were predicted to be functionally deleterious and hence are likely to be under negative selection. Conclusion: Through in silico approaches, we identified two rare SNPs that are potentially negatively selected. These SNPs may be useful for studies associating this gene with rare events including adverse drug reactions.

  20. Natural selection in a population of Drosophila melanogaster explained by changes in gene expression caused by sequence variation in core promoter regions.

    Science.gov (United States)

    Sato, Mitsuhiko P; Makino, Takashi; Kawata, Masakado

    2016-02-09

    Understanding the evolutionary forces that influence variation in gene regulatory regions in natural populations is an important challenge for evolutionary biology because natural selection for such variations could promote adaptive phenotypic evolution. Recently, whole-genome sequence analyses have identified regulatory regions subject to natural selection. However, these studies could not identify the relationship between sequence variation in the detected regions and change in gene expression levels. We analyzed sequence variations in core promoter regions, which are critical regions for gene regulation in higher eukaryotes, in a natural population of Drosophila melanogaster, and identified core promoter sequence variations associated with differences in gene expression levels subjected to natural selection. Among the core promoter regions whose sequence variation could change transcription factor binding sites and explain differences in expression levels, three core promoter regions were detected as candidates associated with purifying selection or selective sweep and seven as candidates associated with balancing selection, excluding the possibility of linkage between these regions and core promoter regions. CHKov1, which confers resistance to the sigma virus and related insecticides, was identified as having core promoter regions that have been subject to selective sweep, although it could not be denied that selection for variation in core promoter regions was due to linked single nucleotide polymorphisms in the regulatory region outside core promoter regions. Nucleotide changes in core promoter regions of CHKov1 caused the loss of two basal transcription factor binding sites and acquisition of one transcription factor binding site, resulting in decreased gene expression levels. Of nine core promoter regions associated with balancing selection, brat and CG9044 are associated with neuromuscular junction development, and Nmda1 is associated with learning

  1. Pseudo cluster randomization: a treatment allocation method to minimize contamination and selection bias.

    NARCIS (Netherlands)

    Borm, G.F.; Melis, R.J.F.; Teerenstra, S.; Peer, P.G.M.

    2005-01-01

    In some clinical trials, treatment allocation on a patient level is not feasible, and whole groups or clusters of patients are allocated to the same treatment. If, for example, a clinical trial is investigating the efficacy of various patient coaching methods and randomization is done on a patient

  2. DS/LPI autocorrelation detection in noise plus random-tone interference. [Direct Sequence Low-Probability of Intercept]

    Science.gov (United States)

    Hinedi, S.; Polydoros, A.

    1988-01-01

    The authors present and analyze a frequency-noncoherent two-lag autocorrelation statistic for the wideband detection of random BPSK signals in noise-plus-random-multitone interference. It is shown that this detector is quite robust to the presence or absence of interference and its specific parameter values, contrary to the case of an energy detector. The rule assumes knowledge of the data rate and the active scenario under H0. It is concluded that the real-time autocorrelation domain and its samples (lags) are a viable approach for detecting random signals in dense environments.

  3. Sequence correction of random coil chemical shifts: correlation between neighbor correction factors and changes in the Ramachandran distribution

    DEFF Research Database (Denmark)

    Kjærgaard, Magnus; Poulsen, Flemming Martin

    2011-01-01

    Random coil chemical shifts are necessary for secondary chemical shift analysis, which is the main NMR method for identification of secondary structure in proteins. One of the largest challenges in the determination of random coil chemical shifts is accounting for the effect of neighboring residues. The contributions from the neighboring residues are typically removed by using neighbor correction factors determined based on each residue's effect on glycine chemical shifts. Due to its unusual conformational freedom, glycine may be particularly unrepresentative for the remaining residue types. In this study, we use random coil peptides containing glutamine instead of glycine to determine the random coil chemical shifts and the neighbor correction factors. The resulting correction factors correlate to changes in the populations of the major wells in the Ramachandran plot, which demonstrates that changes...

  4. Presence of psychoactive substances in oral fluid from randomly selected drivers in Denmark

    DEFF Research Database (Denmark)

    Simonsen, K. Wiese; Steentoft, A.; Hels, Tove

    2012-01-01

    This roadside study is the Danish part of the EU-project DRUID (Driving under the Influence of Drugs, Alcohol, and Medicines) and included three representative regions in Denmark. Oral fluid samples (n = 3002) were collected randomly from drivers using a sampling scheme stratified by time, season... of narcotic drugs. It can be concluded that driving under the influence of drugs is as serious a road safety problem as drunk driving.

  5. Presence of psychoactive substances in oral fluid from randomly selected drivers in Denmark

    DEFF Research Database (Denmark)

    Simonsen, Kirsten Wiese; Steentoft, Anni; Hels, Tove

    2012-01-01

    This roadside study is the Danish part of the EU-project DRUID (Driving under the Influence of Drugs, Alcohol, and Medicines) and included three representative regions in Denmark. Oral fluid samples (n = 3002) were collected randomly from drivers using a sampling scheme stratified by time, season... It can be concluded that driving under the influence of drugs is as serious a road safety problem as drunk driving.

  6. Feature selection and classification of mechanical fault of an induction motor using random forest classifier

    Directory of Open Access Journals (Sweden)

    Raj Kumar Patel

    2016-09-01

    Full Text Available Fault detection and diagnosis is the most important technology in condition-based maintenance (CBM) systems for rotating machinery. This paper experimentally explores the development of a random forest (RF) classifier, a recently emerged machine learning technique, for multi-class mechanical fault diagnosis in the bearing of an induction motor. Firstly, the vibration signals are collected from the bearing using an accelerometer sensor. Parameters from the vibration signal are extracted in the form of statistical features and used as input features for the classification problem. These features are classified through RF classifiers for a four-class problem. The prime objective of this paper is to evaluate the effectiveness of the random forest classifier on bearing fault diagnosis. The obtained results are compared with an existing artificial intelligence technique, the neural network. The analysis of results shows better performance and higher accuracy than the existing techniques.

  7. Random-effects linear modeling and sample size tables for two special crossover designs of average bioequivalence studies: the four-period, two-sequence, two-formulation and six-period, three-sequence, three-formulation designs.

    Science.gov (United States)

    Diaz, Francisco J; Berg, Michel J; Krebill, Ron; Welty, Timothy; Gidal, Barry E; Alloway, Rita; Privitera, Michael

    2013-12-01

    Due to concern and debate in the epilepsy medical community and to the current interest of the US Food and Drug Administration (FDA) in revising approaches to the approval of generic drugs, the FDA is currently supporting ongoing bioequivalence studies of antiepileptic drugs, the EQUIGEN studies. During the design of these crossover studies, the researchers could not find commercial or non-commercial statistical software that quickly allowed computation of sample sizes for their designs, particularly software implementing the FDA requirement of using random-effects linear models for the analyses of bioequivalence studies. This article presents tables for sample-size evaluations of average bioequivalence studies based on the two crossover designs used in the EQUIGEN studies: the four-period, two-sequence, two-formulation design, and the six-period, three-sequence, three-formulation design. Sample-size computations assume that random-effects linear models are used in bioequivalence analyses with crossover designs. Random-effects linear models have been traditionally viewed by many pharmacologists and clinical researchers as just mathematical devices to analyze repeated-measures data. In contrast, a modern view of these models attributes an important mathematical role in theoretical formulations in personalized medicine to them, because these models not only have parameters that represent average patients, but also have parameters that represent individual patients. Moreover, the notation and language of random-effects linear models have evolved over the years. Thus, another goal of this article is to provide a presentation of the statistical modeling of data from bioequivalence studies that highlights the modern view of these models, with special emphasis on power analyses and sample-size computations.

  8. Selective nerve root blocks vs. caudal epidural injection for single level prolapsed lumbar intervertebral disc - A prospective randomized study.

    Science.gov (United States)

    Singh, Sudhir; Kumar, Sanjiv; Chahal, Gaurav; Verma, Reetu

    2017-01-01

    Chronic lumbar radiculopathy has a lifetime prevalence of 5.3% in men and 3.7% in women. It usually resolves spontaneously, but up to 30% cases will have pronounced symptoms even after one year. A prospective randomized single-blind study was conducted to compare the efficacy of caudal epidural steroid injection and selective nerve root block in management of pain and disability in cases of lumbar disc herniation. Eighty patients with confirmed single-level lumbar disc herniation were equally divided in two groups: (a) caudal epidural and (b) selective nerve root block group, by a computer-generated random allocation method. The caudal group received three injections of steroid mixed with local anesthetics while selective nerve root block group received single injection of steroid mixed with local anesthetic agent. Patients were assessed for pain relief and reduction in disability. In SNRB group, pain reduced by more than 50% up till 6 months, while in caudal group more than 50% reduction of pain was maintained till 1 year. The reduction in ODI in SNRB group was 52.8% till 3 months, 48.6% till 6 months, and 46.7% at 1 year, while in caudal group the improvement was 59.6%, 64.6%, 65.1%, and 65.4% at corresponding follow-up periods, respectively. Caudal epidural block is an easy and safe method with better pain relief and improvement in functional disability than selective nerve root block. Selective nerve root block injection is technically more demanding and has to be given by a skilled anesthetist.

  9. Major histocompatibility complex class II DOA sequences from three Antarctic seal species verify stabilizing selection on the DO locus.

    Science.gov (United States)

    Decker, D J; Stewart, B S; Lehman, N

    2002-12-01

    To provide additional support for the sequence conservation and hence the regulatory role of the MHC class II DOA locus, we obtained the nucleotide sequences of exon 2 and exon 3, along with the intervening intron, of the Ross seal, and sequences from the exon 2 region from the Weddell and leopard seals. These are the first reports of the sequences of this locus from a carnivore species. The results demonstrate strong conservation among mammals for the exon sequence and produce a gene genealogy that is consistent in topology with a species tree.

  10. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito

    2007-09-01

    Full Text Available Abstract Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
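
    The definition of an optimal N-map given above can be illustrated with a naive sketch. This is not the authors' algorithm and does not reach the stated O(|s| × |t| × N) bound; it is a brute-force dynamic program written only to make the definition concrete. The scoring scheme (+1 match, -1 mismatch), the requirement that each part lie fully inside t, and all function names are assumptions introduced for this illustration.

```python
from functools import lru_cache

# Assumed DNA alphabet; the abstract does not specify one.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def best_gapless_score(part, t, match=1, mismatch=-1):
    """Best gapless alignment score of `part`, or its reverse complement,
    over every offset where it fits entirely inside `t` (an assumption)."""
    best = float("-inf")
    for candidate in (part, reverse_complement(part)):
        for offset in range(len(t) - len(candidate) + 1):
            window = t[offset:offset + len(candidate)]
            score = sum(match if a == b else mismatch
                        for a, b in zip(candidate, window))
            best = max(best, score)
    return best


def optimal_n_map_score(s, t, n):
    """Maximum summed score over all ways of cutting `s` into `n`
    non-empty consecutive parts, each placed independently on `t`."""

    @lru_cache(maxsize=None)
    def best(i, k):
        # Best total score for cutting the prefix s[:i] into k parts.
        if k == 0:
            return 0 if i == 0 else float("-inf")
        return max(
            (best(j, k - 1) + best_gapless_score(s[j:i], t)
             for j in range(k - 1, i)),
            default=float("-inf"),
        )

    return best(len(s), n)


if __name__ == "__main__":
    s = "ACGTTTGA"        # "ACGT" followed by "TTGA"
    t = "GGACGTGGTCAAGG"  # holds "ACGT" and "TCAA", the reverse complement of "TTGA"
    print(optimal_n_map_score(s, t, 1), optimal_n_map_score(s, t, 2))
```

    In this toy case a single part cannot match t perfectly, while two parts (one of them reverse complemented) can, which is the kind of shuffled or reversed element the method is meant to capture.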

  11. Development and comparison of algorithms for generating a scan sequence for a random access scanner. [ZAP (and flow charts for ZIP and SCAN), in FORTRAN for DEC-10]

    Energy Technology Data Exchange (ETDEWEB)

    Eason, R. O.

    1980-09-01

    Many data acquisition systems incorporate high-speed scanners to convert analog signals into digital format for further processing. Some systems multiplex many channels into a single scanner. A random access scanner whose scan sequence is specified by a table in random access memory will permit different scan rates on different channels. Generation of this scan table can be a tedious manual task when there are many channels (e.g. 50), when there are more than a few scan rates (e.g. 5), and/or when the ratio of the highest scan rate to the lowest scan rate becomes large (e.g. 100:1). This work develops an algorithm that generates these scan sequences for the random access scanner and implements it on a digital computer. Application of number theory to the mathematical statement of the problem led to development of several algorithms which were implemented in FORTRAN. The most efficient of these algorithms operates by partitioning the problem into a set of subproblems. Through recursion, it solves each subproblem by partitioning it repeatedly into even smaller parts, continuing until a set of simple problems is created. From this process, a pictorial representation or wheel diagram of the problem can be constructed. From the wheel diagram and a description of the original problem, a scan table can be constructed. In addition, the wheel diagram can be used as a method of storing the scan sequence in a smaller amount of memory. The most efficient partitioning algorithm solved most scan table problems in less than a second of CPU time. Some types of problems, however, required as much as a few minutes of CPU time. 26 figures, 2 tables.
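
    To make the scan-table problem concrete, here is a minimal greedy sketch. It is not one of the report's algorithms (ZAP, ZIP, SCAN) and does not use the wheel diagram: it assumes each channel is described by a hypothetical integer period in table slots, sizes the table as the least common multiple of the periods, and gives each channel the first offset whose slots are all free. The greedy choice can fail on inputs that a smarter method would solve.

```python
from math import lcm  # multi-argument lcm requires Python 3.9+


def build_scan_table(periods):
    """Return a scan table in which each channel recurs exactly every `period` slots.

    `periods` maps a channel name to its sampling period in slots (an assumed
    input format). Slots left unused stay None.
    """
    length = lcm(*periods.values())
    table = [None] * length
    # Place the fastest channels first; they are the hardest to fit.
    for channel, period in sorted(periods.items(), key=lambda kv: kv[1]):
        for offset in range(period):
            slots = range(offset, length, period)
            if all(table[i] is None for i in slots):
                for i in slots:
                    table[i] = channel
                break
        else:
            raise ValueError(f"no free offset for channel {channel!r}")
    return table


# Channel A is scanned twice as often as channels B and C.
print(build_scan_table({"A": 2, "B": 4, "C": 4}))
```

    With periods that divide one another, as in this example, the table fills completely; the wide rate ratios mentioned in the abstract (e.g. 100:1) make the table long, which is presumably why a more compact representation was of interest.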

  12. Design of Long Period Pseudo-Random Sequences from the Addition of m-Sequences over 𝔽_p

    OpenAIRE

    Ren Jian

    2004-01-01

    Pseudo-random sequences with good correlation properties and large linear span are widely used in code division multiple access (CDMA) communication systems and cryptology for reliable and secure information transmission. In this paper, sequences with long period, large complexity, balanced statistics, and low cross-correlation are constructed from the addition of m-sequences with pairwise-prime linear spans (AMPLS). Using m-sequences as building blocks, the proposed method proved to...
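
    The idea of adding m-sequences can be illustrated with a small sketch over 𝔽_2 (the paper works over 𝔽_p in general; the choice p = 2, the two primitive polynomials x^2 + x + 1 and x^3 + x + 1, the initial states, and the function names are assumptions for illustration). The two component sequences have linear spans 2 and 3 and periods 3 and 7; their termwise sum has period 21.

```python
def lfsr(coeffs, state, length, p=2):
    """Generate `length` terms of a[n+L] = sum(coeffs[i] * a[n+i]) mod p."""
    seq = list(state)
    span = len(coeffs)
    while len(seq) < length:
        seq.append(sum(c * x for c, x in zip(coeffs, seq[-span:])) % p)
    return seq[:length]


def least_period(seq):
    """Smallest shift under which the observed terms repeat."""
    n = len(seq)
    for period in range(1, n):
        if all(seq[i] == seq[i + period] for i in range(n - period)):
            return period
    return n


N_TERMS = 84  # several full periods, so the period check is meaningful

a = lfsr([1, 1], [1, 0], N_TERMS)        # x^2 + x + 1: a[n+2] = a[n] + a[n+1]
b = lfsr([1, 1, 0], [1, 0, 0], N_TERMS)  # x^3 + x + 1: a[n+3] = a[n] + a[n+1]
total = [(x + y) % 2 for x, y in zip(a, b)]

print(least_period(a), least_period(b), least_period(total))
```

    Because the component periods are coprime, the period of the sum is their product, which is how short building blocks yield a long-period sequence.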

  13. Determining Phylogenetic Relationships Among Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeat (ISSR) Markers.

    Science.gov (United States)

    Haider, Nadia

    2017-01-01

    Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.

  14. Selection of locations of knots for linear splines in random regression test-day models.

    Science.gov (United States)

    Jamrozik, J; Bohmanova, J; Schaeffer, L R

    2010-04-01

    Using spline functions (segmented polynomials) in regression models requires the knowledge of the location of the knots. Knots are the points at which independent linear segments are connected. Optimal positions of knots for linear splines of different orders were determined in this study for different scenarios, using existing estimates of covariance functions and an optimization algorithm. The traits considered were test-day milk, fat and protein yields, and somatic cell score (SCS) in the first three lactations of Canadian Holsteins. Two ranges of days in milk (from 5 to 305 and from 5 to 365) were taken into account. In addition, four different populations of Holstein cows, from Australia, Canada, Italy and New Zealand, were examined with respect to first lactation (305 days) milk only. The estimates of genetic and permanent environmental covariance functions were based on single- and multiple-trait test-day models, with Legendre polynomials of order 4 as random regressions. A differential evolution algorithm was applied to find the best location of knots for splines of orders 4 to 7 and the criterion for optimization was the goodness-of-fit of the spline covariance function. Results indicated that the optimal position of knots for linear splines differed between genetic and permanent environmental effects, as well as between traits and lactations. Different populations also exhibited different patterns of optimal knot locations. With linear splines, different positions of knots should therefore be used for different effects and traits in random regression test-day models when analysing milk production traits.
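
    As an illustration of how knots enter a linear spline, the sketch below computes the covariates for one day in milk: at most two adjacent knots receive non-zero weight, and the weights sum to one. The knot positions used in the example are hypothetical, not the optimal locations reported in the study, and the function name is an assumption.

```python
from bisect import bisect_right


def linear_spline_covariates(t, knots):
    """Weights placed on each knot when a linear spline is evaluated at t.

    `knots` must be sorted in increasing order and t must lie within their range.
    """
    if not knots[0] <= t <= knots[-1]:
        raise ValueError("t lies outside the range covered by the knots")
    weights = [0.0] * len(knots)
    # Index of the knot at the right end of the segment containing t.
    right = min(bisect_right(knots, t), len(knots) - 1)
    left = right - 1
    frac = (t - knots[left]) / (knots[right] - knots[left])
    weights[left] = 1.0 - frac
    weights[right] = frac
    return weights


# Hypothetical knots (days in milk) for a 5-305 day lactation.
print(linear_spline_covariates(100, [5, 50, 150, 305]))
```

    Moving a knot changes which observations share a segment, which is why knot placement affects the fit of the resulting covariance function.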

  15. A Randomized Controlled Trial of Cognitive Debiasing Improves Assessment and Treatment Selection for Pediatric Bipolar Disorder

    Science.gov (United States)

    Jenkins, Melissa M.; Youngstrom, Eric A.

    2015-01-01

    Objective: This study examined the efficacy of a new cognitive debiasing intervention in reducing decision-making errors in the assessment of pediatric bipolar disorder (PBD). Method: The study was a randomized controlled trial using case vignette methodology. Participants were 137 mental health professionals working in different regions of the US (M=8.6±7.5 years of experience). Participants were randomly assigned to a (1) brief overview of PBD (control condition), or (2) the same brief overview plus a cognitive debiasing intervention (treatment condition) that educated participants about common cognitive pitfalls (e.g., base-rate neglect; search satisficing) and taught corrective strategies (e.g., mnemonics, Bayesian tools). Both groups evaluated four identical case vignettes. Primary outcome measures were clinicians’ diagnoses and treatment decisions. The vignette characters’ race/ethnicity was experimentally manipulated. Results: Participants in the treatment group showed better overall judgment accuracy, p < .001, and committed significantly fewer decision-making errors, p < .001. Inaccurate and somewhat accurate diagnostic decisions were significantly associated with different treatment and clinical recommendations, particularly in cases where participants missed comorbid conditions, failed to detect the possibility of hypomania or mania in depressed youths, and misdiagnosed classic manic symptoms. In contrast, effects of patient race were negligible. Conclusions: The cognitive debiasing intervention outperformed the control condition. Examining specific heuristics in cases of PBD may identify especially problematic mismatches between typical habits of thought and characteristics of the disorder. The debiasing intervention was brief and delivered via the Web; it has the potential to generalize and extend to other diagnoses as well as to various practice and training settings. PMID:26727411

  16. Genome-wide association data classification and SNPs selection using two-stage quality-based Random Forests.

    Science.gov (United States)

    Nguyen, Thanh-Tung; Huang, Joshua; Wu, Qingyao; Nguyen, Thuy; Li, Mark

    2015-01-01

    Single-nucleotide polymorphism (SNP) selection and identification are the most important tasks in genome-wide association data analysis. The problem is difficult because genome-wide association data is very high dimensional and a large portion of SNPs in the data is irrelevant to the disease. Advanced machine learning methods have been successfully used in genome-wide association studies (GWAS) for identification of genetic variants that have relatively big effects in some common, complex diseases. Among them, the most successful one is Random Forests (RF). Despite performing well in terms of prediction accuracy in some data sets of moderate size, RF still struggles in GWAS to select informative SNPs and build accurate prediction models. In this paper, we propose to use a new two-stage quality-based sampling method in random forests, named ts-RF, for SNP subspace selection for GWAS. The method first applies p-value assessment to find a cut-off point that separates informative and irrelevant SNPs into two groups. The informative SNPs group is further divided into two sub-groups: highly informative and weakly informative SNPs. When sampling the SNP subspace for building trees for the forest, only those SNPs from the two sub-groups are taken into account. The feature subspaces always contain highly informative SNPs when used to split a node of a tree. This approach enables one to generate more accurate trees with a lower prediction error, while possibly avoiding overfitting. It allows one to detect interactions of multiple SNPs with the diseases, and to reduce the dimensionality and the amount of genome-wide association data needed for learning the RF model.
Extensive experiments on two genome-wide SNP data sets (Parkinson case-control data comprised of 408,803 SNPs and Alzheimer case-control data comprised of 380,157 SNPs) and 10 gene data sets have demonstrated that the proposed model significantly reduced prediction errors and outperformed

  17. Role of selective V2-receptor-antagonism in septic shock: a randomized, controlled, experimental study

    OpenAIRE

    Rehberg, Sebastian; Ertmer, Christian; Lange, Matthias; Morelli, Andrea; Whorton, Elbert; Strohhäcker, Anne-Katrin; Dünser, Martin Wolfgang; Lipke, Erik; Kampmeier, Tim G; Aken, Hugo; Traber, Daniel L; Westphal, Martin

    2010-01-01

    ABSTRACT: INTRODUCTION: V2-receptor (V2R) stimulation potentially aggravates sepsis-induced vasodilation, fluid accumulation and microvascular thrombosis. Therefore, the present study was performed to determine the effects of a first-line therapy with the selective V2R-antagonist (Propionyl1-D-Tyr(Et)2-Val4-Abu6-Arg8,9)-Vasopressin on cardiopulmonary hemodynamics and organ function vs. the mixed V1aR/V2R-agonist arginine vasopressin (AVP) or placebo in an established ovine model of septic s...

  18. Conversion of the random amplified polymorphic DNA (RAPD ...

    African Journals Online (AJOL)

    Conversion of the random amplified polymorphic DNA (RAPD) marker UBC#116 linked to Fusarium crown and root rot resistance gene (Frl) into a co-dominant sequence characterized amplified region (SCAR) marker for marker-assisted selection of tomato.

  19. Host cell selection of Murray Valley encephalitis virus variants altered at an RGD sequence in the envelope protein and in mouse virulence.

    Science.gov (United States)

    Lobigs, M; Usha, R; Nestorowicz, A; Marshall, I D; Weir, R C; Dalgarno, L

    1990-06-01

    We have passaged the prototype strain of Murray Valley encephalitis virus in SW13 (human) cells, sequenced the E and M genes, and examined the virulence of the passaged virus for 21-day-old mice following intracranial and intraperitoneal inoculation. Six independent passage series were carried out: four in the presence of mouse hyperimmune ascitic fluid and two without antibody. Changes were observed in the E protein deduced amino acid sequence for each of the six 10th passage stocks sequenced. Eleven changes were observed in total for the six stocks sequenced; these were at residues 117, 118, 390, 423, and 460. Nine of the changes were nonconservative. Five of the six passaged variants were altered at Asp 390 which is part of an Arg-Gly-Asp (RGD) sequence. This change resulted from adaptation to SW13 cells rather than from antibody selection. The RGD sequence (and residue 423) falls within a region which is highly conserved between flaviviruses and is strongly hydrophilic. All five variants which were altered at Asp 390 were attenuated in 21-day-old mice following i.p. inoculation. We propose that the domain of E encompassing the RGD sequence is an important determinant of flavivirus pathogenicity.

  20. Conflicts of Interest, Selective Inertia, and Research Malpractice in Randomized Clinical Trials: An Unholy Trinity.

    Science.gov (United States)

    Berger, Vance W

    2015-08-01

    Recently a great deal of attention has been paid to conflicts of interest in medical research, and the Institute of Medicine has called for more research into this important area. One research question that has not received sufficient attention concerns the mechanisms of action by which conflicts of interest can result in biased and/or flawed research. What discretion do conflicted researchers have to sway the results one way or the other? We address this issue from the perspective of selective inertia, or an unnatural selection of research methods based on which are most likely to establish the preferred conclusions, rather than on which are most valid. In many cases it is abundantly clear that a method that is not being used in practice is superior to the one that is being used in practice, at least from the perspective of validity, and that it is only inertia, as opposed to any serious suggestion that the incumbent method is superior (or even comparable), that keeps the inferior procedure in use, to the exclusion of the superior one. By focusing on these flawed research methods we can go beyond statements of potential harm from real conflicts of interest, and can more directly assess actual (not potential) harm.

  1. Participant-selected music and physical activity in older adults following cardiac rehabilitation: a randomized controlled trial.

    Science.gov (United States)

    Clark, Imogen N; Baker, Felicity A; Peiris, Casey L; Shoebridge, Georgie; Taylor, Nicholas F

    2017-03-01

    To evaluate effects of participant-selected music on older adults' achievement of activity levels recommended in the physical activity guidelines following cardiac rehabilitation. A parallel group randomized controlled trial with measurements at Weeks 0, 6 and 26. A multisite outpatient rehabilitation programme of a publicly funded metropolitan health service. Adults aged 60 years and older who had completed a cardiac rehabilitation programme. Experimental participants selected music to support walking with guidance from a music therapist. Control participants received usual care only. The primary outcome was the proportion of participants achieving activity levels recommended in physical activity guidelines. Secondary outcomes compared amounts of physical activity, exercise capacity, cardiac risk factors, and exercise self-efficacy. A total of 56 participants, mean age 68.2 years (SD = 6.5), were randomized to the experimental (n = 28) and control groups (n = 28). There were no differences between groups in proportions of participants achieving activity recommended in physical activity guidelines at Week 6 or 26. Secondary outcomes demonstrated between-group differences in male waist circumference at both measurements (Week 6 difference -2.0 cm, 95% CI -4.0 to 0; Week 26 difference -2.8 cm, 95% CI -5.4 to -0.1), and observed effect sizes favoured the experimental group for amounts of physical activity (d = 0.30), exercise capacity (d = 0.48), and blood pressure (d = -0.32). Participant-selected music did not increase the proportion of participants achieving recommended amounts of physical activity, but may have contributed to exercise-related benefits.

  2. Content analysis of a stratified random selection of JVME articles: 1974-2004.

    Science.gov (United States)

    Olson, Lynne E

    2011-01-01

    A content analysis was performed on a random sample (N = 168) of 25% of the articles published in the Journal of Veterinary Medical Education (JVME) per year from 1974 through 2004. Over time, there were increased numbers of authors per paper, more cross-institutional collaborations, greater prevalence of references or endnotes, and lengthier articles, which could indicate a trend toward publications describing more complex or complete work. The number of first authors that could be identified as female was greatest for the most recent time period studied (2000-2004). Two different categorization schemes were created to assess the content of the publications. The first categorization scheme identified the most frequently published topics as admissions, descriptions of courses, the effect of changing teaching methods, issues facing the profession, and examples of uses of technology. The second categorization scheme identified the subset of articles that described medical education research on the basis of the purpose of the research, which represented only 14% of the sample articles (24 of 168). Of that group, only three of 24, or 12%, represented studies based on a firm conceptual framework that could be confirmed or refuted by the study's results. The results indicate that JVME is meeting its broadly based mission and that publications in the veterinary medical education literature have features common to publications in medicine and medical education.
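
    The sampling scheme described above (a fixed fraction of articles drawn at random within each publication year) can be sketched as follows. The article identifiers, per-year counts, seed, and function name are invented for illustration and do not reproduce the study's actual sample.

```python
import random


def stratified_sample(items_by_stratum, fraction, seed=0):
    """Draw the same fraction of items, without replacement, from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = {}
    for stratum, items in items_by_stratum.items():
        k = round(len(items) * fraction)
        sample[stratum] = rng.sample(items, k)
    return sample


# Hypothetical article lists for two publication years.
articles = {
    1974: [f"1974-{i:02d}" for i in range(20)],
    1975: [f"1975-{i:02d}" for i in range(24)],
}
chosen = stratified_sample(articles, 0.25)
print({year: len(ids) for year, ids in chosen.items()})
```

    Sampling within each year keeps every year represented in proportion to its output, which a simple random draw over the whole archive would not guarantee.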

  3. Capturing the Flatness of a peer-to-peer lending network through random and selected perturbations

    Science.gov (United States)

    Karampourniotis, Panagiotis D.; Singh, Pramesh; Uparna, Jayaram; Horvat, Emoke-Agnes; Szymanski, Boleslaw K.; Korniss, Gyorgy; Bakdash, Jonathan Z.; Uzzi, Brian

    Null models are established tools that have been used in network analysis to uncover various structural patterns. They quantify the deviance of an observed network measure from that given by the null model. We construct a null model for weighted, directed networks to identify biased links (carrying significantly different weights than expected according to the null model) and thus quantify the flatness of the system. Using this model, we study the flatness of Kiva, a large international crowdfinancing network of borrowers and lenders, aggregated to the country level. The dataset spans the years from 2006 to 2013. Our longitudinal analysis shows that the flatness of the system is decreasing over time, meaning the proportion of biased inter-country links is growing. We extend our analysis by testing the robustness of the flatness of the network under perturbations of the links' weights or the nodes themselves. Examples of such perturbations are event shocks (e.g. erecting walls) or regulatory shocks (e.g. Brexit). We find that flatness is unaffected by random shocks, but changes when shocks target links with a large weight or bias. The methods we use to capture the flatness are based on analytics, simulations, and numerical computations using Shannon's maximum entropy. Supported by ARL NS-CTA.

  4. Benefits of Selected Physical Exercise Programs in Detention: A Randomized Controlled Study

    Directory of Open Access Journals (Sweden)

    Claudia Battaglia

    2013-10-01

    Full Text Available The aim of the study was to determine which kind of physical activity could be useful to inmate populations to improve their health status and fitness levels. A repeated measures design was used to evaluate the effects of two different training protocols on subjects in a state of detention, tested pre- and post-experimental protocol. Seventy-five male subjects were enrolled in the study and randomly allocated to three groups: the cardiovascular plus resistance training protocol group (CRT) (n = 25; mean age 30.9 ± 8.9 years), the high-intensity strength training protocol group (HIST) (n = 25; mean age 33.9 ± 6.8 years), and a control group (C) (n = 25; mean age 32.9 ± 8.9 years) receiving no treatment. All subjects underwent a clinical assessment and fitness tests. MANOVA revealed significant multivariate effects on group (p < 0.01) and group-training interaction (p < 0.05). The CRT protocol proved the most effective at reaching the best outcome in fitness tests. Both CRT and HIST protocols produced significant gains in the functional capacity (cardio-respiratory capacity and cardiovascular disease risk decrease) of incarcerated males. The significant gains obtained in functional capacity reflect the great potential of supervised exercise interventions for improving the health status of incarcerated people.

  5. Evolution towards ergodic behavior of stationary fractal random processes with memory: application to the study of long-range correlations of nucleotide sequences in DNA

    Science.gov (United States)

    Vlad, Marcel Ovidiu; Schönfisch, Birgitt; Mackey, Michael C.

    1996-02-01

    The possible occurrence of ergodic behavior for large times is investigated in the case of stationary random processes with memory. It is shown that for finite times the time average of a state function is generally a random variable and thus two types of cumulants can be introduced: for the time average and for the statistical ensemble, respectively. In the limit of infinite time a transition from the random to the deterministic behavior of the time average may occur, resulting in an ergodic behavior. The conditions of occurrence of this transition are investigated by analyzing the scaling behavior of the cumulants of the time average. A general approach for the computation of these cumulants is developed; explicit computations are presented both for short and long memory in the particular case of separable stationary processes for which the cumulants of a statistical ensemble can be factorized into products of functions depending on binary time differences. In both cases the ergodic behavior emerges for large times provided that the cumulants of a statistical ensemble decrease to zero as the time differences increase to infinity. The analysis leads to the surprising conclusion that the scaling behavior of the cumulants of the time average is relatively insensitive to the type of memory considered: both for short and long memory the cumulants of the time average obey inverse different from zero for large time differences, then the time average is random even as the length of the total time interval tends to infinity and the ergodic behavior no longer holds. The theory is applied to the study of long range correlations of nucleotide sequences in DNA; in this case the length t of a sequence of nucleotides plays the role of the time variable. A proportionality relationship is established between the cumulants of the pyrimidine excess in a sequence of length t and the cumulants of the time (length) average of the probability of occurrence of a pyrimidine.
It is shown

  6. Reduced plasma aldosterone concentrations in randomly selected patients with insulin-dependent diabetes mellitus.

    LENUS (Irish Health Repository)

    Cronin, C C

    2012-02-03

    Abnormalities of the renin-angiotensin system have been reported in patients with diabetes mellitus and with diabetic complications. In this study, plasma concentrations of prorenin, renin, and aldosterone were measured in a stratified random sample of 110 insulin-dependent (Type 1) diabetic patients attending our outpatient clinic. Fifty-four age- and sex-matched control subjects were also examined. Plasma prorenin concentration was higher in patients without complications than in control subjects when upright (geometric mean (95% confidence interval (CI)): 75.9 (55.0-105.6) vs 45.1 (31.6-64.3) mU/L, p < 0.05). There was no difference in plasma prorenin concentration between patients without and with microalbuminuria and between patients without and with background retinopathy. Plasma renin concentration, both when supine and upright, was similar in control subjects, in patients without complications, and in patients with varying degrees of diabetic microangiopathy. Plasma aldosterone was suppressed in patients without complications in comparison to control subjects (74 (58-95) vs 167 (140-199) ng/L, p < 0.001) and was also suppressed in patients with microvascular disease. Plasma potassium was significantly higher in patients than in control subjects (mean ± standard deviation: 4.10 ± 0.36 vs 3.89 ± 0.26 mmol/L; p < 0.001) and plasma sodium was significantly lower (138 ± 4 vs 140 ± 2 mmol/L; p < 0.001). We conclude that plasma prorenin is not a useful early marker for diabetic microvascular disease. Despite apparently normal plasma renin concentrations, plasma aldosterone is suppressed in insulin-dependent diabetic patients.

  7. A Permutation Importance-Based Feature Selection Method for Short-Term Electricity Load Forecasting Using Random Forest

    Directory of Open Access Journals (Sweden)

    Nantian Huang

    2016-09-01

    Full Text Available The prediction accuracy of short-term load forecast (STLF) depends on prediction model choice and feature selection result. In this paper, a novel random forest (RF)-based feature selection method for STLF is proposed. First, 243 related features were extracted from historical load data and the time information of prediction points to form the original feature set. Subsequently, the original feature set was used to train an RF as the original model. After the training process, the prediction error of the original model on the test set was recorded and the permutation importance (PI) value of each feature was obtained. Then, an improved sequential backward search method was used to select the optimal forecasting feature subset based on the PI value of each feature. Finally, the optimal forecasting feature subset was used to train a new RF model as the final prediction model. Experiments showed that the prediction accuracy of RF trained by the optimal forecasting feature subset was higher than that of the original model and comparative models based on support vector regression and artificial neural network.
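
    The permutation importance step can be sketched in a few lines. This is a generic illustration, not the paper's implementation: a fixed function stands in for the trained random forest, mean squared error stands in for the forecasting error measure, and the toy data, seed, and function names are assumptions. A feature's importance is the increase in error after its column is shuffled.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Error increase caused by shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild each row with only feature j replaced by its shuffled value.
        permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mean_squared_error(model, permuted, targets) - baseline)
    return importances


# Toy data: the target depends on feature 0 only; feature 1 is irrelevant.
rows = [(float(i), float(i % 3)) for i in range(20)]
targets = [2.0 * r[0] for r in rows]


def stand_in_model(row):
    """Stand-in for a trained model (assumption): uses feature 0 only."""
    return 2.0 * row[0]


print(permutation_importance(stand_in_model, rows, targets, n_features=2))
```

    A feature with near-zero importance, like the second one here, is the kind of candidate a backward search would drop first.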

  8. Effectiveness of a selective, personality-targeted prevention program for adolescent alcohol use and misuse: a cluster randomized controlled trial.

    Science.gov (United States)

    Conrod, Patricia J; O'Leary-Barrett, Maeve; Newton, Nicola; Topper, Lauren; Castellanos-Ryan, Natalie; Mackie, Clare; Girard, Alain

    2013-03-01

    Selective school-based alcohol prevention programs targeting youth with personality risk factors for addiction and mental health problems have been found to reduce substance use and misuse in those with elevated personality profiles. To report 24-month outcomes of the Teacher-Delivered Personality-Targeted Interventions for Substance Misuse Trial (Adventure trial) in which school staff were trained to provide interventions to students with 1 of 4 high-risk (HR) profiles: anxiety sensitivity, hopelessness, impulsivity, and sensation seeking and to examine the indirect herd effects of this program on the broader low-risk (LR) population of students who were not selected for intervention. Cluster randomized controlled trial. Secondary schools in London, United Kingdom. A total of 1210 HR and 1433 LR students in the ninth grade (mean [SD] age, 13.7 [0.33] years). Schools were randomized to provide brief personality-targeted interventions to HR youth or treatment as usual (statutory drug education in class). Participants were assessed for drinking, binge drinking, and problem drinking before randomization and at 6-monthly intervals for 2 years. Two-part latent growth models indicated long-term effects of the intervention on drinking rates (β = -0.320, SE = 0.145, P = .03) and binge drinking rates (β = -0.400, SE = 0.179, P = .03) and growth in binge drinking (β = -0.716, SE = 0.274, P = .009) and problem drinking (β = -0.452, SE = 0.193, P = .02) for HR youth. The HR youth were also found to benefit from the interventions during the 24-month follow-up on drinking quantity (β = -0.098, SE = 0.047, P = .04), growth in drinking quantity (β = -0.176, SE = 0.073, P = .02), and growth in binge drinking frequency (β = -0.183, SE = 0.092, P = .047). Some herd effects in LR youth were observed, specifically on drinking rates (β = -0.259, SE = 0.132, P = .049) and growth of binge drinking (β = -0.244, SE = 0.073, P = .001), during the 24-month follow-up. Findings further

  9. Sensitive and selective amplification of methylated DNA sequences using helper-dependent chain reaction in combination with a methylation-dependent restriction enzymes.

    Science.gov (United States)

    Rand, Keith N; Young, Graeme P; Ho, Thu; Molloy, Peter L

    2013-01-07

    We have developed a novel technique for specific amplification of rare methylated DNA fragments in a high background of unmethylated sequences that avoids the need of bisulphite conversion. The methylation-dependent restriction enzyme GlaI is used to selectively cut methylated DNA. Then targeted fragments are tagged using specially designed 'helper' oligonucleotides that are also used to maintain selection in subsequent amplification cycles in a process called 'helper-dependent chain reaction'. The process uses disabled primers called 'drivers' that can only prime on each cycle if the helpers recognize specific sequences within the target amplicon. In this way, selection for the sequence of interest is maintained throughout the amplification, preventing amplification of unwanted sequences. Here we show how the method can be applied to methylated Septin 9, a promising biomarker for early diagnosis of colorectal cancer. The GlaI digestion and subsequent amplification can all be done in a single tube. A detection sensitivity of 0.1% methylated DNA in a background of unmethylated DNA was achieved, which was similar to the well-established Heavy Methyl method that requires bisulphite-treated DNA.

  10. Preference option randomized design (PORD) for comparative effectiveness research: Statistical power for testing comparative effect, preference effect, selection effect, intent-to-treat effect, and overall effect.

    Science.gov (United States)

    Heo, Moonseong; Meissner, Paul; Litwin, Alain H; Arnsten, Julia H; McKee, M Diane; Karasz, Alison; McKinley, Paula; Rehm, Colin D; Chambers, Earle C; Yeh, Ming-Chin; Wylie-Rosett, Judith

    2017-01-01

    Comparative effectiveness research trials in real-world settings may require participants to choose between preferred intervention options. A randomized clinical trial with parallel experimental and control arms is straightforward and regarded as a gold standard design, but by design it requires participants to comply with a randomly assigned intervention regardless of their preference. Therefore, the randomized clinical trial may impose impractical limitations when planning comparative effectiveness research trials. To accommodate participants' preferences if they are expressed, and to maintain randomization, we propose an alternative design that allows participants' preference after randomization, which we call a "preference option randomized design (PORD)". In contrast to other preference designs, which ask whether or not participants consent to the assigned intervention after randomization, the crucial feature of preference option randomized design is its unique informed consent process before randomization. Specifically, the preference option randomized design consent process informs participants that they can opt out and switch to the other intervention only if after randomization they actively express the desire to do so. Participants who do not independently express explicit alternate preference or assent to the randomly assigned intervention are considered to not have an alternate preference. In sum, preference option randomized design intends to maximize retention, minimize possibility of forced assignment for any participants, and to maintain randomization by allowing participants with no or equal preference to represent random assignments. This design scheme makes it possible to define five effects that are interconnected with each other through common design parameters (comparative, preference, selection, intent-to-treat, and overall/as-treated) to collectively guide decision making between interventions. Statistical power functions for testing

  11. Comparative Evolutionary Histories of the Fungal Chitinase Gene Family Reveal Non-Random Size Expansions and Contractions due to Adaptive Natural Selection

    Directory of Open Access Journals (Sweden)

    Jan Stenlid

    2008-01-01

    Full Text Available Gene duplication and loss play an important role in the evolution of novel functions and for shaping an organism’s gene content. Recently, it was suggested that stress-related genes frequently are exposed to duplications and losses, while growth-related genes show selection against change in copy number. The fungal chitinase gene family constitutes an interesting case study of gene duplication and loss, as their biological roles include growth and development as well as more stress-responsive functions. We used genome sequence data to analyze the size of the chitinase gene family in different fungal taxa, which range from 1 in Batrachochytrium dendrobatidis and Schizosaccharomyces pombe to 20 in Hypocrea jecorina and Emericella nidulans, and to infer their phylogenetic relationships. Novel chitinase subgroups are identified and their phylogenetic relationships with previously known chitinases are discussed. We also employ a stochastic birth and death model to show that the fungal chitinase gene family indeed evolves non-randomly, and we identify six fungal lineages where larger-than-expected expansions (Pezizomycotina, H. jecorina, Gibberella zeae, Uncinocarpus reesii, E. nidulans and Rhizopus oryzae), and two contractions (Coccidioides immitis and S. pombe) potentially indicate the action of adaptive natural selection. The results indicate that antagonistic fungal-fungal interactions are an important process for soil-borne ascomycetes, but not for fungal species that are pathogenic in humans. Unicellular growth is correlated with a reduction of chitinase gene copy numbers, which emphasizes the requirement of the combined action of several chitinases for filamentous growth.
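
    The abstract above names a stochastic birth and death model without giving its details. The following is only a minimal sketch of the general idea, assuming a linear birth-death process in which every gene copy duplicates at rate `birth` and is lost at rate `death`. The function names, rates and time scale are hypothetical illustrations and are not taken from the study.

```python
import random


def simulate_family_size(n0, birth, death, t_max, rng):
    """Simulate a linear birth-death process and return the family size at t_max.

    Assumption (not from the study): each of the n copies duplicates at rate
    `birth` and is lost at rate `death`, so the total event rate is
    n * (birth + death).
    """
    n, t = n0, 0.0
    while n > 0 and (birth + death) > 0:
        # Waiting time to the next duplication or loss event.
        t += rng.expovariate(n * (birth + death))
        if t > t_max:
            break
        # The event is a duplication with probability birth / (birth + death).
        if rng.random() < birth / (birth + death):
            n += 1
        else:
            n -= 1
    return n


def upper_tail_probability(observed, n0, birth, death, t_max, trials=2000, seed=0):
    """Estimate P(size >= observed) under the null model by Monte Carlo simulation."""
    rng = random.Random(seed)
    hits = sum(
        simulate_family_size(n0, birth, death, t_max, rng) >= observed
        for _ in range(trials)
    )
    return hits / trials
```

    Under these assumptions, a lineage whose observed family size has a very small upper-tail probability would be a candidate for a larger-than-expected expansion; the lower tail could be estimated in the same way for contractions.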

  12. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by

  13. Sequence-selective interaction of the minor-groove interstrand cross-linking agent SJG-136 with naked and cellular DNA: footprinting and enzyme inhibition studies.

    Science.gov (United States)

    Martin, Chris; Ellis, Tom; McGurk, Claire J; Jenkins, Terence C; Hartley, John A; Waring, Michael J; Thurston, David E

    2005-03-22

    SJG-136 (3) is a novel pyrrolobenzodiazepine (PBD) dimer that is predicted from molecular models to bind in the minor groove of DNA and to form sequence-selective interstrand cross-links at 5'-Pu-GATC-Py-3' (Pu = purine; Py = pyrimidine) sites through covalent bonding between each PBD unit and guanines on opposing strands. Footprinting studies have confirmed that high-affinity adducts do form at 5'-G-GATC-C-3' sequences and that these can inhibit RNA polymerase in a sequence-selective manner. At higher concentrations of SJG-136, bands that migrate more slowly than one of the 5'-G-GATC-C-3' footprint sites show significantly reduced intensity, concomitant with the appearance of higher molecular weight material near the gel origin. This phenomenon is attributed to interstrand cross-linking at the 5'-G-GATC-C-3' site and is the first report of DNA footprinting being used to detect interstrand cross-linked adducts. The control dimer GD113 (4), of similar structure to SJG-136 but unable to cross-link DNA due to its C7/C7'-linkage rather than C8/C8'-linkage, neither produces footprints with the same DNA sequence nor blocks transcription at comparable concentrations. In addition to the two high-affinity 5'-G-GATC-C-3' footprints on the MS2 DNA sequence, other SJG-136 adducts of lower affinity are observed that can still block transcription but with lower efficiency. All these sites contain the 5'-GXXC-3' motif (where XX includes AG, TA, GC, CT, TT, GG, and TC) and represent less-favored cross-link sites. In time-course experiments, SJG-136 blocks transcription if incubated with a double-stranded DNA template before the transcription components are added; addition after transcription is initiated fails to elicit blockage. Single-strand ligation PCR studies on a sequence from the c-jun gene show that SJG-136 binds to 5'-GAAC-3'/5'-GTTC-3' (preferred) or 5'-GAGC-3'/5'-GCTC-3' sequences. Significantly, adducts are obtained at the same sequences following extraction of DNA

  14. Single-chain lipopeptide vaccines for the induction of virus-specific cytotoxic T cell responses in randomly selected populations.

    Science.gov (United States)

    Gras-Masse, H

    2001-12-01

    Effective vaccine development is now taking advantage of the rapidly accumulating information concerning the molecular basis of a protective immune response. Analysts and medicinal chemists have joined forces with immunologists and taken up the clear challenge of identifying immunologically active structural elements and synthesizing them in pure, reproducible forms. Current literature reveals the growing interest for extremely reductionist approaches aiming at producing totally synthetic vaccines that would be fully defined at the molecular level and particularly safe. The sequential information contained in these formulations tends to be minimized to those epitopes which elicit neutralizing antibodies, or cell-mediated responses. In the following review, we describe some of our results in developing fully synthetic, clinically acceptable lipopeptide vaccines for inducing cytotoxic T lymphocytes (CTL) responses in randomly selected populations.

  15. Selecting Optimal Random Forest Predictive Models: A Case Study on Predicting the Spatial Distribution of Seabed Hardness

    Science.gov (United States)

    Li, Jin; Tran, Maggie; Siwabessy, Justy

    2016-01-01

    Spatially continuous predictions of seabed hardness are important baseline environmental information for sustainable management of Australia’s marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy and can be inferred from underwater video footage at limited locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models to predict seabed hardness using random forest (RF) based on the point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods that are variable importance (VI), averaged variable importance (AVI), knowledge informed AVI (KIAVI), Boruta and regularized RF (RRF) were tested based on predictive accuracy. Effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach used to pre-select predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of the important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) instead of the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to ‘small p and large n’ problems in environmental sciences. Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency and

  16. Selecting Optimal Random Forest Predictive Models: A Case Study on Predicting the Spatial Distribution of Seabed Hardness.

    Directory of Open Access Journals (Sweden)

    Jin Li

    Full Text Available Spatially continuous predictions of seabed hardness are important baseline environmental information for sustainable management of Australia's marine jurisdiction. Seabed hardness is often inferred from multibeam backscatter data with unknown accuracy and can be inferred from underwater video footage at limited locations. In this study, we classified the seabed into four classes based on two new seabed hardness classification schemes (i.e., hard90 and hard70). We developed optimal predictive models to predict seabed hardness using random forest (RF) based on the point data of hardness classes and spatially continuous multibeam data. Five feature selection (FS) methods that are variable importance (VI), averaged variable importance (AVI), knowledge informed AVI (KIAVI), Boruta and regularized RF (RRF) were tested based on predictive accuracy. Effects of highly correlated, important and unimportant predictors on the accuracy of RF predictive models were examined. Finally, spatial predictions generated using the most accurate models were visually examined and analysed. This study confirmed that: 1) hard90 and hard70 are effective seabed hardness classification schemes; 2) seabed hardness of four classes can be predicted with a high degree of accuracy; 3) the typical approach used to pre-select predictive variables by excluding highly correlated variables needs to be re-examined; 4) the identification of the important and unimportant predictors provides useful guidelines for further improving predictive models; 5) FS methods select the most accurate predictive model(s) instead of the most parsimonious ones, and AVI and Boruta are recommended for future studies; and 6) RF is an effective modelling method with high predictive accuracy for multi-level categorical data and can be applied to 'small p and large n' problems in environmental sciences. Additionally, automated computational programs for AVI need to be developed to increase its computational efficiency and

  17. Data for amino acid alignment of Japanese stingray melanocortin receptors with other gnathostome melanocortin receptor sequences, and the ligand selectivity of Japanese stingray melanocortin receptors

    Directory of Open Access Journals (Sweden)

    Akiyoshi Takahashi

    2016-06-01

    Full Text Available This article contains structure and pharmacological characteristics of melanocortin receptors (MCRs) related to research published in “Characterization of melanocortin receptors from stingray Dasyatis akajei, a cartilaginous fish” (Takahashi et al., 2016) [1]. The amino acid sequences of the stingray, D. akajei, MC1R, MC2R, MC3R, MC4R, and MC5R were aligned with the corresponding melanocortin receptor sequences from the elephant shark, Callorhinchus milii, the dogfish, Squalus acanthias, the goldfish, Carassius auratus, and the mouse, Mus musculus. These alignments provide the basis for phylogenetic analysis of these gnathostome melanocortin receptor sequences. In addition, the Japanese stingray melanocortin receptors were separately expressed in Chinese Hamster Ovary cells, and stimulated with stingray ACTH, α-MSH, β-MSH, γ-MSH, δ-MSH, and β-endorphin. The dose response curves reveal the order of ligand selectivity for each stingray MCR.

  18. EcmPred: Prediction of extracellular matrix proteins based on random forest with maximum relevance minimum redundancy feature selection

    KAUST Repository

    Kandaswamy, Krishna Kumar Umar

    2013-01-01

    The extracellular matrix (ECM) is a major component of tissues of multicellular organisms. It consists of secreted macromolecules, mainly polysaccharides and glycoproteins. Malfunctions of ECM proteins lead to severe disorders such as Marfan syndrome, osteogenesis imperfecta, numerous chondrodysplasias, and skin diseases. In this work, we report a random forest approach, EcmPred, for the prediction of ECM proteins from protein sequences. EcmPred was trained on a dataset containing 300 ECM and 300 non-ECM proteins and tested on a dataset containing 145 ECM and 4187 non-ECM proteins. EcmPred achieved 83% accuracy on the training and 77% on the test dataset. EcmPred predicted 15 out of 20 experimentally verified ECM proteins. By scanning the entire human proteome, we predicted novel ECM proteins validated with gene ontology and InterPro. The dataset and standalone version of the EcmPred software are available at http://www.inb.uni-luebeck.de/tools-demos/Extracellular_matrix_proteins/EcmPred. © 2012 Elsevier Ltd.

  19. Randomized trial of switching from prescribed non-selective non-steroidal anti-inflammatory drugs to prescribed celecoxib

    DEFF Research Database (Denmark)

    Macdonald, Thomas M; Hawkey, Chris J; Ford, Ian

    2017-01-01

    BACKGROUND: Selective cyclooxygenase-2 inhibitors and conventional non-selective non-steroidal anti-inflammatory drugs (nsNSAIDs) have been associated with adverse cardiovascular (CV) effects. We compared the CV safety of switching to celecoxib vs. continuing nsNSAID therapy in a European setting....... METHOD: Patients aged 60 years and over with osteoarthritis or rheumatoid arthritis, free from established CV disease and taking chronic prescribed nsNSAIDs, were randomized to switch to celecoxib or to continue their previous nsNSAID. The primary endpoint was hospitalization for non-fatal myocardial...... expected developed an on-treatment (OT) primary CV event and the rate was similar for celecoxib, 0.95 per 100 patient-years, and nsNSAIDs, 0.86 per 100 patient-years (HR = 1.12, 95% confidence interval, 0.81-1.55; P = 0.50). Comparable intention-to-treat (ITT) rates were 1.14 per 100 patient...

  20. Recovery and characterization of a Citrus clementina Hort. ex Tan. 'Clemenules' haploid plant selected to establish the reference whole Citrus genome sequence.

    Science.gov (United States)

    Aleza, Pablo; Juárez, José; Hernández, María; Pina, José A; Ollitrault, Patrick; Navarro, Luis

    2009-08-22

    In recent years, the development of structural genomics has generated a growing interest in obtaining haploid plants. The use of homozygous lines presents a significant advantage for the accomplishment of sequencing projects. Commercial citrus species are characterized by high heterozygosity, making it difficult to assemble large genome sequences. Thus, the International Citrus Genomic Consortium (ICGC) decided to establish a reference whole citrus genome sequence from a homozygous plant. Due to the existence of important molecular resources and previous success in obtaining haploid clementine plants, haploid clementine was selected as the target for the implementation of the reference whole genome citrus sequence. To obtain haploid clementine lines we used the technique of in situ gynogenesis induced by irradiated pollen. Flow cytometry, chromosome counts and SSR marker (Simple Sequence Repeats) analysis facilitated the identification of six different haploid lines (2n = x = 9), one aneuploid line (2n = 2x+4 = 22) and one doubled haploid plant (2n = 2x = 18) of 'Clemenules' clementine. One of the haploids, obtained directly from an original haploid embryo, grew vigorously and produced flowers after four years. This is the first haploid plant of clementine that has bloomed and we have, for the first time, characterized the histology of haploid and diploid flowers of clementine. Additionally a double haploid plant was obtained spontaneously from this haploid line. The first haploid plant of 'Clemenules' clementine produced directly by germination of a haploid embryo, which grew vigorously and produced flowers, has been obtained in this work. This haploid line has been selected and it is being used by the ICGC to establish the reference sequence of the nuclear genome of citrus.

  1. Recovery and characterization of a Citrus clementina Hort. ex Tan. 'Clemenules' haploid plant selected to establish the reference whole Citrus genome sequence

    Directory of Open Access Journals (Sweden)

    Navarro Luis

    2009-08-01

    Full Text Available Background: In recent years, the development of structural genomics has generated a growing interest in obtaining haploid plants. The use of homozygous lines presents a significant advantage for the accomplishment of sequencing projects. Commercial citrus species are characterized by high heterozygosity, making it difficult to assemble large genome sequences. Thus, the International Citrus Genomic Consortium (ICGC) decided to establish a reference whole citrus genome sequence from a homozygous plant. Due to the existence of important molecular resources and previous success in obtaining haploid clementine plants, haploid clementine was selected as the target for the implementation of the reference whole genome citrus sequence. Results: To obtain haploid clementine lines we used the technique of in situ gynogenesis induced by irradiated pollen. Flow cytometry, chromosome counts and SSR marker (Simple Sequence Repeats) analysis facilitated the identification of six different haploid lines (2n = x = 9), one aneuploid line (2n = 2x+4 = 22) and one doubled haploid plant (2n = 2x = 18) of 'Clemenules' clementine. One of the haploids, obtained directly from an original haploid embryo, grew vigorously and produced flowers after four years. This is the first haploid plant of clementine that has bloomed and we have, for the first time, characterized the histology of haploid and diploid flowers of clementine. Additionally a double haploid plant was obtained spontaneously from this haploid line. Conclusion: The first haploid plant of 'Clemenules' clementine produced directly by germination of a haploid embryo, which grew vigorously and produced flowers, has been obtained in this work. This haploid line has been selected and it is being used by the ICGC to establish the reference sequence of the nuclear genome of citrus.

  2. Analysis of genetic diversity of Sclerotinia sclerotiorum from eggplant by mycelial compatibility, random amplification of polymorphic DNA (RAPD) and simple sequence repeat (SSR) analyses

    Directory of Open Access Journals (Sweden)

    Fatih Mehmet Tok

    2016-09-01

    Full Text Available The genetic diversity and pathogenicity/virulence among 60 eggplant Sclerotinia sclerotiorum isolates collected from six different geographic regions of Turkey were analysed using mycelial compatibility groupings (MCGs), random amplified polymorphic DNA (RAPD) and simple sequence repeat (SSR) polymorphism. By MCG tests, the isolates were classified into 22 groups. Out of 22 MCGs, 36% were represented each by a single isolate. The isolates showed great variability for virulence regardless of MCG and geographic origin. Based on the results of RAPD and SSR analyses, 60 S. sclerotiorum isolates representing 22 MCGs were grouped in 2 and 3 distinct clusters, respectively. Analyses using RAPD and SSR markers illustrated that cluster groupings or genetic distance of S. sclerotiorum populations from eggplant were not distinctly relative to the MCG, geographical origin and virulence diversity. The patterns obtained revealed a high heterogeneity of genetic composition and suggested the occurrence of clonal and sexual reproduction of S. sclerotiorum on eggplant in the areas surveyed.

  3. Two-stage clustering (TSC): a pipeline for selecting operational taxonomic units for the high-throughput sequencing of PCR amplicons.

    Directory of Open Access Journals (Sweden)

    Xiao-Tao Jiang

    Full Text Available Clustering 16S/18S rRNA amplicon sequences into operational taxonomic units (OTUs) is a critical step for the bioinformatic analysis of microbial diversity. Here, we report a pipeline for selecting OTUs with a relatively low computational demand and a high degree of accuracy. This pipeline is referred to as two-stage clustering (TSC) because it divides tags into two groups according to their abundance and clusters them sequentially. The more abundant group is clustered using a hierarchical algorithm similar to that in ESPRIT, which has a high degree of accuracy but is computationally costly for large datasets. The rarer group, which includes the majority of tags, is then heuristically clustered to improve efficiency. To further improve the computational efficiency and accuracy, two preclustering steps are implemented. To maintain clustering accuracy, all tags are grouped into an OTU depending on their pairwise Needleman-Wunsch distance. This method not only improved the computational efficiency but also mitigated the spurious OTU estimation from 'noise' sequences. In addition, OTUs clustered using TSC showed comparable or improved performance in beta-diversity comparisons compared to existing OTU selection methods. This study suggests that the distribution of sequencing datasets is a useful property for improving the computational efficiency and increasing the clustering accuracy of the high-throughput sequencing of PCR amplicons. The software and user guide are freely available at http://hwzhoulab.smu.edu.cn/paperdata/.
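
    The two-stage idea described above (split tags by abundance, cluster the abundant group exhaustively, then place the rare group heuristically) can be illustrated with a small sketch. This is not the TSC software: the abundance cutoff, the distance threshold, the single-linkage rule in stage 1 and the use of a normalised edit distance in place of a Needleman-Wunsch distance are all simplifying assumptions, the preclustering steps are omitted, and every function name is hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def distance(a, b):
    """Edit distance normalised by the longer sequence (stand-in for an alignment distance)."""
    return edit_distance(a, b) / max(len(a), len(b))


def two_stage_cluster(tags, abundance_cutoff=2, max_dist=0.03):
    """Cluster tags into OTUs in two passes; returns a list of member lists."""
    counts = Counter(tags)
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    abundant = [s for s in ordered if counts[s] >= abundance_cutoff]
    rare = [s for s in ordered if counts[s] < abundance_cutoff]

    # Stage 1: exhaustive single-linkage clustering of the abundant tags.
    clusters = []
    for seq in abundant:
        linked = [c for c in clusters if any(distance(seq, m) <= max_dist for m in c)]
        merged = [seq] + [m for c in linked for m in c]
        clusters = [c for c in clusters if c not in linked] + [merged]

    # Stage 2: heuristic pass; each rare tag is compared only with one
    # representative (the most abundant member) of each existing OTU.
    reps = [max(c, key=lambda s: (counts[s], s)) for c in clusters]
    for seq in rare:
        for rep, cluster in zip(reps, clusters):
            if distance(seq, rep) <= max_dist:
                cluster.append(seq)
                break
        else:
            reps.append(seq)
            clusters.append([seq])
    return clusters
```

    In this sketch the expensive all-against-all comparison is confined to the small abundant group, while each rare tag costs at most one comparison per existing OTU, which is the trade-off the abstract attributes to the two-stage design.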

  4. Fast and cost-effective single nucleotide polymorphism (SNP) detection in the absence of a reference genome using semideep next-generation Random Amplicon Sequencing (RAMseq).

    Science.gov (United States)

    Bayerl, Helmut; Kraus, Robert H S; Nowak, Carsten; Foerster, Daniel W; Fickel, Joerns; Kuehn, Ralph

    2017-09-15

    Biodiversity has suffered a dramatic global decline during the past decades, and monitoring tools are urgently needed providing data for the development and evaluation of conservation efforts both on a species and on a genetic level. However, in wild species, the assessment of genetic diversity is often hampered by the lack of suitable genetic markers. In this article, we present Random Amplicon Sequencing (RAMseq), a novel approach for fast and cost-effective detection of single nucleotide polymorphisms (SNPs) in nonmodel species by semideep sequencing of random amplicons. By applying RAMseq to the Eurasian otter (Lutra lutra), we identified 238 putative SNPs after quality filtering of all candidate loci and were able to validate 32 of 77 loci tested. In a second step, we evaluated the genotyping performance of these SNP loci in noninvasive samples, one of the most challenging genotyping applications, by comparing it with genotyping results of the same faecal samples at microsatellite markers. We compared (i) polymerase chain reaction (PCR) success rate, (ii) genotyping errors and (iii) Mendelian inheritance (population parameters). SNPs produced a significantly higher PCR success rate (75.5% vs. 65.1%) and lower mean allelic error rate (8.8% vs. 13.3%) than microsatellites, but showed a higher allelic dropout rate (29.7% vs. 19.8%). Genotyping results showed no deviations from Mendelian inheritance in any of the SNP loci. Hence, RAMseq appears to be a valuable tool for the detection of genetic markers in nonmodel species, which is a common challenge in conservation genetic studies. © 2017 John Wiley & Sons Ltd.

  5. The faint end of the red sequence galaxy luminosity function: unveiling surface brightness selection effects with the CLASH clusters

    Science.gov (United States)

    Martinet, Nicolas; Durret, Florence; Adami, Christophe; Rudnick, Gregory

    2017-08-01

    Characterizing the evolution of the faint end of the cluster red sequence (RS) galaxy luminosity function (GLF) with redshift is a milestone in understanding galaxy evolution. However, the community is still divided in that respect, hesitating between an enrichment of the RS due to efficient quenching of blue galaxies from z ~ 1 to present-day or a scenario in which the RS is built at a higher redshift and does not evolve afterwards. Recently, it has been proposed that surface brightness (SB) selection effects could possibly solve the literature disagreement, accounting for the diminishing RS faint population in ground-based observations. We investigate this hypothesis by comparing the RS GLFs of 16 CLASH clusters computed independently from ground-based Subaru/Suprime-Cam V and Ip or Ic images and space-based HST/ACS F606W and F814W images in the redshift range 0.187 ≤ z ≤ 0.686. We stack individual cluster GLFs in two redshift bins (0.187 ≤ z ≤ 0.399 and 0.400 ≤ z ≤ 0.686) and two mass (6 × 10^14 M⊙ ≤ M200 ... space- and ground-based data, with a difference of 0.2σ in the faint end parameter α when stacking all clusters together and a maximum difference of 0.9σ in the case of the high-redshift stack, demonstrating a weak dependence on the type of observation in the probed range of redshift and mass. When considering the full sample, we estimate α = -0.76 ± 0.07 and α = -0.78 ± 0.06 with HST and Subaru, respectively. We note a mild variation of the faint end between the high- and low-redshift subsamples at a 1.7σ and 2.6σ significance. We investigate the effect of SB dimming by simulating our low-redshift galaxies at high redshift. We measure an evolution in the faint end slope of less than 1σ in this case, implying that the observed signature is larger than one would expect from SB dimming alone, and indicating a true evolution in the faint end slope. Finally, we find no variation with mass or radius in the probed range of these two parameters

  6. The κB transcriptional enhancer motif and signal sequences of V(D)J recombination are targets for the zinc finger protein HIVEP3/KRC: a site selection amplification binding study

    Directory of Open Access Journals (Sweden)

    Wu Lai-Chu

    2002-08-01

    Full Text Available Background: The ZAS family is composed of proteins that regulate transcription via specific gene regulatory elements. The amino-DNA binding domain (ZAS-N) and the carboxyl-DNA binding domain (ZAS-C) of a representative family member, named κB DNA binding and recognition component (KRC), were expressed as fusion proteins and their target DNA sequences were elucidated by site selection amplification binding assays, followed by cloning and DNA sequencing. The fusion protein-selected DNA sequences were analyzed by the MEME and MAST computer programs to obtain consensus motifs and DNA elements bound by the ZAS domains. Results: Both fusion proteins selected sequences that were similar to the κB motif or the canonical elements of the V(D)J recombination signal sequences (RSS) from a pool of degenerate oligonucleotides. Specifically, the ZAS-N domain selected sequences similar to the canonical RSS nonamer, while the ZAS-C domain selected sequences similar to the canonical RSS heptamer. In addition, both KRC fusion proteins selected oligonucleotides with sequences identical to heptamer and nonamer sequences within endogenous RSS. Conclusions: The RSS are cis-acting DNA motifs which are essential for V(D)J recombination of antigen receptor genes. Due to its specific binding affinity for RSS and κB-like transcription enhancer motifs, we hypothesize that KRC may be involved in the regulation of V(D)J recombination.

  7. Multi-species sequence comparison reveals dynamic evolution of the elastin gene that has involved purifying selection and lineage-specific insertions/deletions

    Directory of Open Access Journals (Sweden)

    Green Eric D

    2004-05-01

    Full Text Available Background: The elastin gene (ELN) is implicated as a factor in both supravalvular aortic stenosis (SVAS) and Williams Beuren Syndrome (WBS), two diseases involving pronounced complications in mental or physical development. Although the complete spectrum of functional roles of the processed gene product remains to be established, these roles are inferred to be analogous in human and mouse. This view is supported by genomic sequence comparison, in which there are no large-scale differences in the ~1.8 Mb sequence block encompassing the common region deleted in WBS, with the exception of an overall reversed physical orientation between human and mouse. Results: Conserved synteny around ELN does not translate to a high level of conservation in the gene itself. In fact, ELN orthologs in mammals show more sequence divergence than expected for a gene with a critical role in development. The pattern of divergence is non-conventional due to an unusually high ratio of gaps to substitutions. Specifically, multi-sequence alignments of eight mammalian sequences reveal numerous non-aligning regions caused by species-specific insertions and deletions, in spite of the fact that the vast majority of aligning sites appear to be conserved and undergoing purifying selection. Conclusions: The pattern of lineage-specific, in-frame insertions/deletions in the coding exons of ELN orthologous genes is unusual and has led to unique features of the gene in each lineage. These differences may indicate that the gene has a slightly different functional mechanism in mammalian lineages, or that the corresponding regions are functionally inert. Identified regions that undergo purifying selection reflect a functional importance associated with evolutionary pressure to retain those features.

  8. Selectivity of Chemoresistive Sensors Made of Chemically Functionalized Carbon Nanotube Random Networks for Volatile Organic Compounds (VOC)

    Directory of Open Access Journals (Sweden)

    Jean-François Feller

    2014-01-01

    Full Text Available Different grades of chemically functionalized carbon nanotubes (CNT) have been processed by spraying layer-by-layer (sLbL) to obtain an array of chemoresistive transducers for volatile organic compound (VOC) detection. The sLbL process led to random networks of CNT that were less conductive, but more sensitive to vapors, than those obtained by filtration under vacuum (bucky papers). Shorter CNT were also found to be more sensitive because the conducting networks they form are less entangled and more easily disconnected. Chemical functionalization of the CNT surface changes their selectivity towards VOC, which makes it possible to easily discriminate methanol, chloroform and tetrahydrofuran (THF) from toluene vapors after the assembly of CNT transducers into an array to make an e-nose. Interestingly, the amplitude of the CNT transducers’ responses can be enhanced by a factor of five (methanol) to 100 (chloroform) by dispersing them into a polymer matrix, such as poly(styrene) (PS), poly(carbonate) (PC) or poly(methyl methacrylate) (PMMA). COOH functionalization of CNT was found to penalize their dispersion in polymers and to decrease the sensors’ sensitivity. The resulting conductive polymer nanocomposites (CPCs) not only allow easier tuning of the sensors’ selectivity by changing the chemical nature of the matrix, but they also allow their sensitivity to be adjusted by changing the average gap between CNT (acting on quantum tunneling in the CNT network). Quantum resistive sensors (QRSs) appear promising for environmental monitoring and anticipated disease diagnostics that are both based on VOC analysis.

  9. Selection of G-quadruplex folding topology with LNA-modified human telomeric sequences in K+ solution

    DEFF Research Database (Denmark)

    Pradhan, Devranjan; Hansen, Lykke H; Vester, Birte

    2011-01-01

    G-rich nucleic acid oligomers can form G-quadruplexes built by G-tetrads stacked upon each other. Depending on the nucleotide sequence, G-quadruplexes fold mainly with two topologies: parallel, in which all G-tracts are oriented parallel to each other, or antiparallel, in which one or more G-tracts are oriented antiparallel to the other G-tracts. In the former topology, all glycosidic bond angles conform to anti conformations, while in the latter topology they adopt both syn and anti conformations. It is of interest to understand the molecular forces that govern G-quadruplex folding. Here, we approach this problem by examining the impact of LNA (locked nucleic acid) modifications on the folding topology of the dimeric model system of the human telomere sequence. In solution, this DNA G-quadruplex forms a mixture of G-quadruplexes with antiparallel and parallel topologies. Using CD and NMR spectroscopies, we...

  10. Generic-reference and generic-generic bioequivalence of forty-two, randomly-selected, on-market generic products of fourteen immediate-release oral drugs.

    Science.gov (United States)

    Hammami, Muhammad M; De Padua, Sophia J S; Hussein, Rajaa; Al Gaai, Eman; Khodr, Nesrine A; Al-Swayeh, Reem; Alvi, Syed N; Binhashim, Nada

    2017-12-08

    The extents of generic-reference and generic-generic average bioequivalence and intra-subject variation of on-market drug products have not been prospectively studied on a large scale. We assessed bioequivalence of 42 generic products of 14 immediate-release oral drugs with the highest number of generic products on the Saudi market. We conducted 14 four-sequence, randomized, crossover studies on the reference and three randomly-selected generic products of amlodipine, amoxicillin, atenolol, cephalexin, ciprofloxacin, clarithromycin, diclofenac, ibuprofen, fluconazole, metformin, metronidazole, paracetamol, omeprazole, and ranitidine. Geometric mean ratios of maximum concentration (Cmax) and area-under-the-concentration-time-curve, to last measured concentration (AUCT), extrapolated to infinity (AUCI), or truncated to Cmax time of reference product (AUCReftmax) were calculated using non-compartmental method and their 90% confidence intervals (CI) were compared to the 80.00%-125.00% bioequivalence range. Percentages of individual ratios falling outside the ±25% range were also determined. Mean (SD) age and body-mass-index of 700 healthy volunteers (28-80/study) were 32.2 (6.2) years and 24.4 (3.2) kg/m2, respectively. In 42 generic-reference comparisons, 100% of AUCT and AUCI CIs showed bioequivalence, 9.5% of Cmax CIs barely failed to show bioequivalence, and 66.7% of AUCReftmax CIs failed to show bioequivalence/showed bioinequivalence. Adjusting for 6 comparisons, 2.4% of AUCT and AUCI CIs and 21.4% of Cmax CIs failed to show bioequivalence. In 42 generic-generic comparisons, 2.4% of AUCT, AUCI, and Cmax CIs failed to show bioequivalence, and 66.7% of AUCReftmax CIs failed to show bioequivalence/showed bioinequivalence. Adjusting for 6 comparisons, 2.4% of AUCT and AUCI CIs and 14.3% of Cmax CIs failed to show bioequivalence. Average geometric mean ratio deviation from 100% was ≤3.2 and ≤5.4 percentage points for AUCI and Cmax, respectively, in both generic
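    The comparison described in the abstract above (a geometric mean ratio whose 90% confidence interval is checked against the 80.00%-125.00% range, plus the share of individual ratios outside ±25%) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function names are hypothetical, the confidence interval is taken as an input rather than derived from a crossover model, and treating an interval lying entirely outside the range as "bioinequivalent" is an assumption about how that term is used.

```python
import math

# Bioequivalence range quoted in the abstract, in percent.
BE_LOWER, BE_UPPER = 80.00, 125.00


def geometric_mean_ratio(test_values, reference_values):
    """Geometric mean of paired test/reference ratios, in percent."""
    log_ratios = [math.log(t / r) for t, r in zip(test_values, reference_values)]
    return 100.0 * math.exp(sum(log_ratios) / len(log_ratios))


def classify_ci(ci_low, ci_high):
    """Classify a 90% CI (in percent) against the bioequivalence range.

    Assumption: an interval entirely outside the range is labelled
    'bioinequivalent'; an interval that only partly overlaps it is
    labelled 'failed to show bioequivalence'.
    """
    if BE_LOWER <= ci_low and ci_high <= BE_UPPER:
        return "bioequivalent"
    if ci_high < BE_LOWER or ci_low > BE_UPPER:
        return "bioinequivalent"
    return "failed to show bioequivalence"


def percent_outside_25(test_values, reference_values):
    """Percentage of individual test/reference ratios outside 75%-125%."""
    ratios = [100.0 * t / r for t, r in zip(test_values, reference_values)]
    outside = sum(1 for x in ratios if x < 75.0 or x > 125.0)
    return 100.0 * outside / len(ratios)
```

    For example, an interval of 92.0 to 108.0 would be classified as bioequivalent, while 78.5 to 101.0 would fail to show bioequivalence because its lower bound falls below 80.00.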

  11. Optimal Subset Selection of Time-Series MODIS Images and Sample Data Transfer with Random Forests for Supervised Classification Modelling

    Directory of Open Access Journals (Sweden)

    Fuqun Zhou

    2016-10-01

    Full Text Available Nowadays, various time-series Earth Observation data with multiple bands are freely available, such as Moderate Resolution Imaging Spectroradiometer (MODIS) datasets including 8-day composites from NASA, and 10-day composites from the Canada Centre for Remote Sensing (CCRS). It is challenging to efficiently use these time-series MODIS datasets for long-term environmental monitoring due to their vast volume and information redundancy. This challenge will be greater when Sentinel 2–3 data become available. Another challenge that researchers face is the lack of in-situ data for supervised modelling, especially for time-series data analysis. In this study, we attempt to tackle the two important issues with a case study of land cover mapping using CCRS 10-day MODIS composites with the help of Random Forests’ features: variable importance, outlier identification. The variable importance feature is used to analyze and select optimal subsets of time-series MODIS imagery for efficient land cover mapping, and the outlier identification feature is utilized for transferring sample data available from one year to an adjacent year for supervised classification modelling. The results of the case study of agricultural land cover classification at a regional scale show that using only about a half of the variables we can achieve land cover classification accuracy close to that generated using the full dataset. The proposed simple but effective solution of sample transferring could make supervised modelling possible for applications lacking sample data.

  12. Biased random key genetic algorithm with insertion and gender selection for capacitated vehicle routing problem with time windows

    Science.gov (United States)

    Rochman, Auliya Noor; Prasetyo, Hari; Nugroho, Munajat Tri

    2017-06-01

    The Vehicle Routing Problem (VRP) often occurs when manufacturers need to distribute their product to customers/outlets. The distribution process is typically restricted by the capacity of the vehicle and the working hours at the distributor. This type of VRP is also known as the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW). A Biased Random Key Genetic Algorithm (BRKGA) was designed and coded in MATLAB to solve the CVRPTW case of soft drink distribution. The standard BRKGA was then modified by applying chromosome insertion into the initial population and defining chromosome gender for parents undergoing the crossover operation. The performance of the established algorithms was then compared to a heuristic procedure for solving a soft drink distribution. The findings show that (1) the total distribution cost of BRKGA with insertion (BRKGA-I) results in a cost saving of 39% compared to the total cost of the heuristic method, and (2) BRKGA with the gender selection (BRKGA-GS) could further improve the performance of the heuristic method. However, the BRKGA-GS tends to yield worse results compared to those obtained from the standard BRKGA.
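    The two generic building blocks of a biased random-key genetic algorithm can be sketched as follows: decoding a vector of random keys into routes, and the biased crossover that favours the elite parent. This is a hedged illustration in Python, not the authors' MATLAB code. The function names, the capacity-only route splitting (time windows are ignored) and the default `rho` value are assumptions made for the example, and the insertion and gender-selection modifications from the abstract are not modelled.

```python
import random


def decode(keys, demands, capacity):
    """Decode random keys into capacity-feasible routes.

    Customers are visited in ascending key order; a new route is started
    whenever adding the next customer would exceed the vehicle capacity.
    Time windows are deliberately ignored in this sketch.
    """
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    routes, current, load = [], [], 0
    for customer in order:
        if current and load + demands[customer] > capacity:
            routes.append(current)
            current, load = [], 0
        current.append(customer)
        load += demands[customer]
    if current:
        routes.append(current)
    return routes


def biased_crossover(elite, non_elite, rho=0.7, rng=random):
    """Take each gene from the elite parent with probability rho."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]


# Usage: four customers with demands 4, 3, 2, 5 and a vehicle capacity of 7.
routes = decode([0.9, 0.1, 0.5, 0.3], [4, 3, 2, 5], capacity=7)
```

    Because the chromosome is a plain vector of keys, any crossover result decodes to a valid customer ordering, which is the usual motivation for the random-key encoding.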

  13. A preliminary investigation of the jack-bean urease inhibition by randomly selected traditionally used herbal medicine.

    Science.gov (United States)

    Biglar, Mahmood; Soltani, Khadijeh; Nabati, Farzaneh; Bazl, Roya; Mojab, Faraz; Amanlou, Massoud

    2012-01-01

    Helicobacter pylori (H. pylori) infection leads to different clinical and pathological outcomes in humans, including chronic gastritis, peptic ulcer disease and gastric neoplasia and even gastric cancer, and its eradication depends upon multi-drug therapy. The most effective therapy is still unknown and prompts people to make great efforts to find better and more modern natural or synthetic anti-H. pylori agents. In this report 21 randomly selected herbal methanolic extracts were evaluated for their effect on inhibition of Jack-bean urease using the indophenol method as described by Weatherburn. The inhibition potency was measured by UV spectroscopy technique at 630 nm which attributes to released ammonium. Among these extracts, five showed potent inhibitory activities with IC50 ranges of 18-35 μg/mL. These plants are Matricaria disciforme (IC50:35 μg/mL), Nasturtium officinale (IC50:18 μg/mL), Punica granatum (IC50:30 μg/mL), Camelia sinensis (IC50:35 μg/mL), Citrus aurantifolia (IC50:28 μg/mL).

  14. A brief, web-based personalized feedback selective intervention for college student marijuana use: a randomized clinical trial.

    Science.gov (United States)

    Lee, Christine M; Neighbors, Clayton; Kilmer, Jason R; Larimer, Mary E

    2010-06-01

    Despite clear need, brief web-based interventions for marijuana-using college students have not been evaluated in the literature. The current study was designed to evaluate a brief, web-based personalized feedback intervention for at-risk marijuana users transitioning to college. All entering first-year students were invited to complete a brief questionnaire. Participants meeting criteria completed a baseline assessment (N = 341) and were randomly assigned to web-based personalized feedback or assessment-only control conditions. Participants completed 3-month (95.0%) and 6-month (94.4%) follow-up assessments. Results indicated that although there was no overall intervention effect, moderator analyses found promising effects for those with a family history of drug problems and, to a smaller extent, students who were higher in contemplation of changing marijuana use at baseline. Implications of these findings for selective intervention of college marijuana use and web-based interventions in general are discussed. (PsycINFO Database Record (c) 2010 APA, all rights reserved).

  15. Optimal Subset Selection of Time-Series MODIS Images and Sample Data Transfer with Random Forests for Supervised Classification Modelling.

    Science.gov (United States)

    Zhou, Fuqun; Zhang, Aining

    2016-10-25

    Nowadays, various time-series Earth Observation data with multiple bands are freely available, such as Moderate Resolution Imaging Spectroradiometer (MODIS) datasets including 8-day composites from NASA, and 10-day composites from the Canada Centre for Remote Sensing (CCRS). It is challenging to efficiently use these time-series MODIS datasets for long-term environmental monitoring due to their vast volume and information redundancy. This challenge will be greater when Sentinel 2-3 data become available. Another challenge that researchers face is the lack of in-situ data for supervised modelling, especially for time-series data analysis. In this study, we attempt to tackle the two important issues with a case study of land cover mapping using CCRS 10-day MODIS composites with the help of Random Forests' features: variable importance, outlier identification. The variable importance feature is used to analyze and select optimal subsets of time-series MODIS imagery for efficient land cover mapping, and the outlier identification feature is utilized for transferring sample data available from one year to an adjacent year for supervised classification modelling. The results of the case study of agricultural land cover classification at a regional scale show that using only about a half of the variables we can achieve land cover classification accuracy close to that generated using the full dataset. The proposed simple but effective solution of sample transferring could make supervised modelling possible for applications lacking sample data.

  16. Next generation sequencing gives an insight into the characteristics of highly selected breeds versus non-breed horses in the course of domestication.

    Science.gov (United States)

    Metzger, Julia; Tonda, Raul; Beltran, Sergi; Agueda, Lídia; Gut, Marta; Distl, Ottmar

    2014-07-04

    Domestication has shaped the horse and led to a group of many different types. Some have been under strong human selection while others developed in close relationship with nature. The aim of our study was to perform next generation sequencing of breed and non-breed horses to provide an insight into genetic influences on selective forces. Whole genome sequencing of five horses of four different populations revealed 10,193,421 single nucleotide polymorphisms (SNPs) and 1,361,948 insertion/deletion polymorphisms (indels). In comparison to horse variant databases and previous reports, we were able to identify 3,394,883 novel SNPs and 868,525 novel indels. We analyzed the distribution of individual variants and found significant enrichment of private mutations in coding regions of genes involved in primary metabolic processes, anatomical structures, morphogenesis and cellular components in non-breed horses and in contrast to that private mutations in genes affecting cell communication, lipid metabolic process, neurological system process, muscle contraction, ion transport, developmental processes of the nervous system and ectoderm in breed horses. Our next generation sequencing data constitute an important first step for the characterization of non-breed in comparison to breed horses and provide a large number of novel variants for future analyses. Functional annotations suggest specific variants that could play a role for the characterization of breed or non-breed horses.

  17. Seeing the trees through the forest : sequence-based homo- and heteromeric protein-protein interaction sites prediction using random forest

    NARCIS (Netherlands)

    Hou, Qingzhen; De Geest, Paul F.G.; Vranken, Wim F.; Heringa, Jaap; Feenstra, K. Anton

    2017-01-01

    Motivation: Genome sequencing is producing an ever-increasing amount of associated protein sequences. Few of these sequences have experimentally validated annotations, however, and computational predictions are becoming increasingly successful in producing such annotations. One key challenge remains

  18. Long-range correlations and charge transport properties of DNA sequences

    Science.gov (United States)

    Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

    2010-04-01

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5 [...] selected in our Letter are concerned, Ch22 sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.
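    The rescaled range (R/S) analysis named in the abstract above can be sketched in a few lines. This is a generic textbook-style estimator, not the authors' procedure: the function names and window sizes are hypothetical, and the input is assumed to be a numeric series already derived from the DNA sequence, since the abstract does not say how bases are mapped to numbers.

```python
import math


def rescaled_range(series):
    """R/S statistic: range of cumulative mean deviations over the std. dev."""
    n = len(series)
    mean = sum(series) / n
    cumulative, running = [], 0.0
    for x in series:
        running += x - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return spread / std if std > 0 else 0.0


def hurst_exponent(series, window_sizes):
    """Estimate H as the least-squares slope of log(R/S) against log(window)."""
    xs, ys = [], []
    for w in window_sizes:
        # Split the series into non-overlapping windows of length w.
        chunks = [series[i:i + w] for i in range(0, len(series) - w + 1, w)]
        values = [v for v in (rescaled_range(c) for c in chunks) if v > 0]
        if values:
            xs.append(math.log(w))
            ys.append(math.log(sum(values) / len(values)))
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator
```

    In this convention an estimate near 0.5 indicates an uncorrelated series, values above 0.5 indicate persistent (correlated) behavior, and values below 0.5 indicate anticorrelation. A strictly alternating series such as `[1, -1, 1, -1, ...]` is the extreme anticorrelated case, for which this estimator returns a slope of zero.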

  19. Long-range correlations and charge transport properties of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Liu Xiaoliang, E-mail: xlliucsu@yahoo.com.c [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China); Ren, Yi [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China); Xie, Qiong-tao [Key Laboratory of Low Dimensional Quantum Structures and Quantum Control of Ministry of Education (Hunan Normal University), Changsha 410081 (China); Deng, Chao-sheng; Xu, Hui [College of Physical Science and Technology and College of Metallurgical Science and Engineering, Central South University, Changsha 410083 (China)

    2010-04-26

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that lambda-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5 [...] selected in our Letter are concerned, Ch22 sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.

  20. A Spatially-selective Implementation of the Adiabatic T2Prep Sequence for Magnetic Resonance Angiography of the Coronary Arteries

    Science.gov (United States)

    Soleimanifard, Sahar; Schär, Michael; Hays, Allison G.; Prince, Jerry L.; Weiss, Robert G.; Stuber, Matthias

    2012-01-01

    In coronary magnetic resonance angiography, a magnetization-preparation scheme for T2-weighting (T2Prep) is widely used to enhance contrast between the coronary blood-pool and the myocardium. This pre-pulse is commonly applied without spatial selection to minimize flow sensitivity, but the non-selective implementation results in a reduced magnetization of the in-flowing blood and a related penalty in signal-to-noise-ratio (SNR). It is hypothesized that a spatially-selective T2Prep would leave the magnetization of blood outside the T2Prep volume unaffected, and thereby lower the SNR penalty. To test this hypothesis, a spatially-selective T2Prep was implemented where the user could freely adjust angulation and position of the T2Prep slab to avoid covering the ventricular blood-pool and saturating the in-flowing spins. A time gap of 150ms was further added between the T2Prep and other pre-pulses to allow for in-flow of a larger volume of unsaturated spins. Consistent with numerical simulation, the spatially-selective T2Prep increased in vivo human coronary artery SNR (42.3±2.9 vs. 31.4±2.2, n=22, p<0.0001) and contrast-to-noise-ratio (18.6±1.5 vs. 13.9±1.2, p=0.009) as compared to those of the non-selective T2Prep. Additionally, a segmental analysis demonstrated that the spatially-selective T2Prep was most beneficial in proximal and mid segments where the in-flowing blood volume was largest compared to the distal segments. PMID:22915337

  1. Enumeration of Escherichia coli cells on chicken carcasses as a potential measure of microbial process control in a random selection of slaughter establishments in the United States

    Science.gov (United States)

    The purpose of this study was to evaluate whether the measurement of Escherichia coli levels at two points during the chicken slaughter process has utility as a measure of quality control. A one year long survey was conducted during 2004 and 2005 in 20 randomly selected United States chicken slaught...

  2. Impact of amoxicillin therapy on resistance selection in patients with community-acquired lower respiratory tract infections : A randomized, placebo-controlled study

    NARCIS (Netherlands)

    Malhotra-Kumar, Surbhi; Van Heirstraeten, Liesbet; Coenen, Samuel; Lammens, Christine; Adriaenssens, Niels; Kowalczyk, Anna; Godycki-Cwirko, Maciek; Bielicka, Zuzana; Hupkova, Helena; Lannering, Christina; Mölstad, Sigvard; Fernandez-Vandellos, Patricia; Torres, Antoni; Parizel, Maxim; Ieven, Margareta; Butler, Chris C.; Verheij, Theo; Little, Paul; Goossens, Hermanon; Frimodt-Møller, Niels; Bruno, Pascale; Hering, Iris; Lemiengre, Marieke; Loens, Katherine; Malmvall, Bo Eric; Muras, Magdalena; Romano, Nuria Sanchez; Prat, Matteu Serra; Svab, Igor; Swain, Jackie; Tarsia, Paolo; Leus, Frank; Veen, Robert; Worby, Tricia

    2016-01-01

    Objectives: To determine the effect of amoxicillin treatment on resistance selection in patients with community-acquired lower respiratory tract infections in a randomized, placebo-controlled trial. Methods: Patients were prescribed amoxicillin 1 g, three times daily (n = 52) or placebo (n = 50) for

  3. The Long-Term Effectiveness of a Selective, Personality-Targeted Prevention Program in Reducing Alcohol Use and Related Harms: A Cluster Randomized Controlled Trial

    Science.gov (United States)

    Newton, Nicola C.; Conrod, Patricia J.; Slade, Tim; Carragher, Natacha; Champion, Katrina E.; Barrett, Emma L.; Kelly, Erin V.; Nair, Natasha K.; Stapinski, Lexine; Teesson, Maree

    2016-01-01

    Background: This study investigated the long-term effectiveness of Preventure, a selective personality-targeted prevention program, in reducing the uptake of alcohol, harmful use of alcohol, and alcohol-related harms over a 3-year period. Methods: A cluster randomized controlled trial was conducted to assess the effectiveness of Preventure.…

  4. [Whole-genome sequencing in German clinical practice : Economic impacts of its use in selected areas of application].

    Science.gov (United States)

    Plöthner, Marika; Frank, Martin; Graf von der Schulenburg, J-Matthias

    2017-02-01

    The diagnostic use of whole-genome sequencing (WGS) is a growing issue in medical care. Due to limited resources in public health service, budget-impact analyses are necessary prior to implementation. Budget-impact analyses were carried out for WGS of all newborns and for diagnostic investigation of tumor patients in different oncologic indications. A cost analysis of WGS based on a quality-assured process chart for WGS at the German Cancer Research Center (DKFZ), Heidelberg, constitutes the basis for this evaluation. Data from the National Association of Statutory Health Insurance Funds and the Robert-Koch-Institute, Berlin, were used for calculations of specific clinical applications. WGS in newborn screening leads to costs of € 2.85 bn and to an increase of total expenditure by 1.41%. Sequencing of all tumor patients would cost approximately € 0.84 bn, which corresponds to 0.42% of total expenditures. In all scenarios, the sole consideration of procedure costs results in increasing costs. However, in cost discussions potential savings (reduction of disease-related follow-up-costs, improved cost-effectiveness of medical measures etc.) should be considered. Such considerations are the subject of economic indication-specific evaluations. WGS has the potential to generate a large number of deterministic findings for which treatment options are limited. Hence, it is necessary to limit indications, in which WGS has proven medical evidence.

  5. Selection of functional 2A sequences within foot-and-mouth disease virus; requirements for the NPGP motif with a distinct codon bias

    DEFF Research Database (Denmark)

    Kjær, Jonas; Belsham, Graham J.

    2018-01-01

    Foot-and-mouth disease virus (FMDV) has a positive-sense ssRNA genome including a single, large, open reading frame. Splitting of the encoded polyprotein at the 2A/2B junction is mediated by the 2A peptide (18 residues long) which induces a non-proteolytic, co-translational, "cleavage" at its own......, surprisingly, a clear codon preference for the wt nucleotide sequence encoding the NPGP motif within these viruses was observed. Indeed, the codons selected to code for P17 and P19 within this motif were distinct; thus the synonymous codons are not equivalent....

  6. Surveillance for cancer recurrence in long-term young breast cancer survivors randomly selected from a statewide cancer registry.

    Science.gov (United States)

    Jones, Tarsha; Duquette, Debra; Underhill, Meghan; Ming, Chang; Mendelsohn-Victor, Kari E; Anderson, Beth; Milliron, Kara J; Copeland, Glenn; Janz, Nancy K; Northouse, Laurel L; Duffy, Sonia M; Merajver, Sofia D; Katapodi, Maria C

    2018-01-20

    This study examined clinical breast exam (CBE) and mammography surveillance in long-term young breast cancer survivors (YBCS) and identified barriers and facilitators to cancer surveillance practices. Data were collected with a self-administered survey from a statewide, randomly selected sample of YBCS diagnosed with invasive breast cancer or ductal carcinoma in situ younger than 45 years old, stratified by race (Black vs. White/Other). Multivariate logistic regression models identified predictors of annual CBEs and mammograms. Among 859 YBCS (n = 340 Black; n = 519 White/Other; mean age = 51.0 ± 5.9; diagnosed 11.0 ± 4.0 years ago), the majority (> 85%) reported an annual CBE and a mammogram. Black YBCS in the study were more likely to report lower rates of annual mammography and more barriers accessing care compared to White/Other YBCS. Having a routine source of care, confidence to use healthcare services, perceived expectations from family members and healthcare providers to engage in cancer surveillance, and motivation to comply with these expectations were significant predictors of having annual CBEs and annual mammograms. Cost-related lack of access to care was a significant barrier to annual mammograms. Routine source of post-treatment care facilitated breast cancer surveillance above national average rates. Persistent disparities regarding access to mammography surveillance were identified for Black YBCS, primarily due to lack of access to routine source of care and high out-of-pocket costs. Public health action targeting cancer surveillance in YBCS should ensure routine source of post-treatment care and address cost-related barriers. Clinical Trials Registration Number: NCT01612338.

  7. Evaluation of Randomly Selected Completed Medical Records Sheets in Teaching Hospitals of Jahrom University of Medical Sciences, 2009

    Directory of Open Access Journals (Sweden)

    Mohammad Parsa Mahjob

    2011-06-01

    Full Text Available Background and objective: Medical record documentation is often used to protect patients' legal rights and also provides information for medical researchers, general studies, education of health care staff and qualitative surveys. Because these sheets are often completed after service delivery to the patient has finished, the amount of data entered in patients' medical record sheets needs to be monitored. Therefore, this study analyzed the completeness of medical history, operation report, and physician order sheets, as filled in by different documenters, in Jahrom teaching hospitals during 2009. Methods and Materials: In this descriptive, retrospective study, 400 medical record sheets of patients from two teaching hospitals affiliated with Jahrom medical university were randomly selected. The data collection tool was a checklist based on the content of the medical history, operation report and physician order sheets. The data were analyzed with SPSS (Version 10) software and Microsoft Office Excel 2003. Results: The average completeness of personal (demographic) data entered by the departments' secretaries in the medical history, physician order and operation report sheets was 32.9, 35.8 and 40.18 percent, respectively. The average for clinical data entered by physicians in the medical history sheet was 38 percent. Surgical data entered by the surgeon in the operation report sheet reached 94.77 percent. The average for data entered by the operating room nurse in the operation report sheet was 36.78 percent, and the average for physician order data entered by physicians in the physician order sheet was 99.3 percent. Conclusion: According to this study, the rate of completed record sheets reviewed by documenter in Jahrom teaching hospitals was not desirable and in some cases was very weak and incomplete. This deficiency was due to different reasons such as negligence by medical record documenters, lack of adequate education for documenters, high work

  8. Sexual selection has minimal impact on effective population sizes in species with high rates of random offspring mortality: An empirical demonstration using fitness distributions.

    Science.gov (United States)

    Pischedda, Alison; Friberg, Urban; Stewart, Andrew D; Miller, Paige M; Rice, William R

    2015-10-01

    The effective population size (N(e)) is a fundamental parameter in population genetics that influences the rate of loss of genetic diversity. Sexual selection has the potential to reduce N(e) by causing the sex-specific distributions of individuals that successfully reproduce to diverge. To empirically estimate the effect of sexual selection on N(e), we obtained fitness distributions for males and females from an outbred, laboratory-adapted population of Drosophila melanogaster. We observed strong sexual selection in this population (the variance in male reproductive success was ∼14 times higher than that for females), but found that sexual selection had only a modest effect on N(e), which was 75% of the census size. This occurs because the substantial random offspring mortality in this population diminishes the effects of sexual selection on N(e), a result that necessarily applies to other high fecundity species. The inclusion of this random offspring mortality creates a scaling effect that reduces the variance/mean ratios for male and female reproductive success and causes them to converge. Our results demonstrate that measuring reproductive success without considering offspring mortality can underestimate N(e) and overestimate the genetic consequences of sexual selection. Similarly, comparing genetic diversity among different genomic components may fail to detect strong sexual selection. © 2015 The Author(s). Evolution © 2015 The Society for the Study of Evolution.

  9. Intrafamilial, Preferentially Mother-to-Child and Intraspousal, Helicobacter pylori Infection in Japan Determined by Multilocus Sequence Typing and Random Amplified Polymorphic DNA Fingerprinting.

    Science.gov (United States)

    Yokota, Shin-ichi; Konno, Mutsuko; Fujiwara, Shin-ichi; Toita, Nariaki; Takahashi, Michiko; Yamamoto, Soh; Ogasawara, Noriko; Shiraishi, Tsukasa

    2015-10-01

    The infection route of Helicobacter pylori has been recognized to be mainly intrafamilial, preferentially mother-to-child, especially in developed countries. To determine the transmission route, we examined whether multilocus sequence typing (MLST) was useful for analysis of intrafamilial infection. The possibility of intraspousal infection was also evaluated. Clonal relationships between strains derived from 35 index Japanese pediatric patients, and their family members were analyzed by two genetic typing procedures, MLST and random amplified polymorphic DNA (RAPD) fingerprinting. Mostly coincident results were obtained by MLST and RAPD. By MLST, the allele of loci in the isolates mostly matched between the index child and both the father and mother for 9 (25.7%) of the 35 patients, between the index child and the mother for 25 (60.0%) of the 35 patients. MLST is useful for analyzing the infection route of H. pylori as a highly reproducible method. Intrafamilial, especially mother-to-children and sibling, infection is the dominant transmission route. Intraspousal infection is also thought to occur in about a quarter in the Japanese families. © 2015 John Wiley & Sons Ltd.

  10. Effects of tetracycline and zinc on selection of methicillin-resistant Staphylococcus aureus (MRSA) sequence type 398 in pigs

    DEFF Research Database (Denmark)

    Moodley, Arshnee; Nielsen, Søren Saxmose; Guardabassi, Luca

    2011-01-01

    An in vivo experiment was conducted to evaluate the effects of tetracycline and zinc on pig colonization and transmission of methicillin-resistant Staphylococcus aureus (MRSA) sequence type (ST) 398. Eight piglets naturally colonized with MRSA ST398 and 8 MRSA-negative piglets of the same age and breed were assigned to three groups treated with tetracycline and zinc (Group 1), zinc (Group 2) or tetracycline alone (Group 3) and one non-treated group (Group 4), each containing two MRSA-positive and two MRSA-negative animals. Two additional non-treated control groups composed of only MRSA-positive (Group 5) and MRSA-negative (Group 6) animals were used to check for stability of MRSA carriage status. Nasal swabs and environmental wipes were collected on Days 0, 7, 14, and 21, and the occurrence of MRSA in each sample was quantified by bacteriological counts on Brilliance™ MRSA agar. Significantly...

  11. MHC class IIB gene sequences and expression in quails (Coturnix japonica) selected for high and low antibody responses.

    Science.gov (United States)

    Shimizu, Sayoko; Shiina, Takashi; Hosomichi, Kazuyoshi; Takahashi, Shinji; Koyama, Takumi; Onodera, Takashi; Kulski, Jerzy K; Inoko, Hidetoshi

    2004-07-01

    Two quail lines, H and L, which were developed for high (H) and low (L) antibody production against inactivated Newcastle disease virus antigen, were used to examine differences in the organization, structure and expression of the quail Mhc class IIB genes. Four Coja class IIB genes in the H line and ten Coja class IIB genes in the L line were identified by gene amplification using standard and long-range PCRs and sequencing of the amplified products. RFLP analysis, sequencing and gene mapping revealed that the H line was fixed for a single class IIB haplotype, which we have designated CojaII-02HL- CojaII-01HL. In contrast, evidence was found for two class IIB haplotypes segregating in the L line. Some individuals were found to be homozygous for haplotype CojaII-08L- CojaII-07L and others were found to be heterozygous CojaII-08L- CojaII-07L/ CojaII-02HL- CojaII-01HL. However, expression of CojaII-02HL- CojaII-01HL was not detected in the L line. SRBC immunization induced a measurable antibody response in the serum and a line-specific class IIB gene expression in the peripheral white blood cells. CojaII-01HL was expressed at the highest level in the H line and CojaII-07L in the L line. The expression of the class IIB mRNA reached the highest level at approximately 1 week after the primary antibody response and then declined exponentially. The antibody and class IIB gene expression data obtained in response to SRBC immunization provide further evidence that quails within the L line had reduced immunocompetence compared with those in the H line.

  12. Effects of strategy sequences and response-stimulus intervals on children's strategy selection and strategy execution: a study in computational estimation.

    Science.gov (United States)

    Lemaire, Patrick; Brun, Fleur

    2014-07-01

    The present study investigates how children's better strategy selection and strategy execution on a given problem are influenced by which strategy was used on the immediately preceding problem and by the duration between their answer to the previous problem and current problem display. These goals are pursued in the context of an arithmetic problem solving task. Third and fifth graders were asked to select the better strategy to find estimates to two-digit addition problems like 36 + 78. On each problem, children could choose rounding-down (i.e., rounding both operands down to the closest smaller decades, like doing 40 + 60 to solve 42 + 67) or rounding-up strategies (i.e., rounding both operands up to the closest larger decades, like doing 50 + 70 to solve 42 + 67). Children were tested under a short RSI condition (i.e., the next problem was displayed 900 ms after participants' answer) or under a long RSI condition (i.e., the next problem was displayed 1,900 ms after participants' answer). Results showed that both strategy selection (e.g., children selected the better strategy more often under long RSI condition and after selecting the poorer strategy on the immediately preceding problem) and strategy execution (e.g., children executed strategy more efficiently under long RSI condition and were slower when switching strategy over two consecutive problems) were influenced by RSI and which strategy was used on the immediately preceding problem. Moreover, data showed age-related changes in effects of RSI and strategy sequence on mean percent better strategy selection and on strategy performance. The present findings have important theoretical and empirical implications for our understanding of general and specific processes involved in strategy selection, strategy execution, and strategic development.
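
    The two estimation strategies described in this abstract can be made concrete with a short sketch. The function names below are illustrative assumptions, not part of the study; "better" is taken to mean the estimate closest to the exact sum, which is how the abstract's examples read.

```python
def round_down_estimate(a: int, b: int) -> int:
    """Round both operands down to the nearest decade, then add."""
    return (a // 10) * 10 + (b // 10) * 10


def round_up_estimate(a: int, b: int) -> int:
    """Round both operands up to the nearest decade, then add."""
    return ((a + 9) // 10) * 10 + ((b + 9) // 10) * 10


def better_strategy(a: int, b: int) -> str:
    """Name the strategy whose estimate is closest to the exact sum."""
    exact = a + b
    down_error = abs(exact - round_down_estimate(a, b))
    up_error = abs(exact - round_up_estimate(a, b))
    if down_error == up_error:
        return "tie"
    return "rounding-down" if down_error < up_error else "rounding-up"


# The two problems quoted in the abstract.
for a, b in [(42, 67), (36, 78)]:
    print(a, b, round_down_estimate(a, b), round_up_estimate(a, b), better_strategy(a, b))
```

    For 42 + 67 the estimates are 100 and 120 against an exact sum of 109, so rounding down is closer; for 36 + 78 they are 100 and 120 against 114, so rounding up is closer.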

  13. Gene selection tool (GST): A R-based tool for genetic disorders based on the sliding-window proportion test using whole-exome sequencing data.

    Science.gov (United States)

    Lee, Sugi; Jung, Minah; Jung, Jaeeun; Park, Kunhyang; Ryu, Jea-Woon; Kim, Jeongkil; Kim, Dae-Soo

    2017-01-01

    Whole-exome sequencing (WES) can identify causative mutations in hereditary diseases. However, WES data can yield a large candidate variant list, including false positives. Moreover, in families, it is more difficult to select disease-associated variants because many variants are shared among members. To reduce false positives and extract accurate candidates, we used a multilocus variant instead of a single-locus variant (SNV). We set up a specific window to analyze the multilocus variant and devised a sliding-window approach to observe all variants. We developed the gene selection tool (GST) based on proportion tests for linkage analysis using WES data. The tool is written in R and has high sensitivity. We tested our code to find the gene for hereditary spastic paraplegia using SNVs from a specific family and identified the gene known to cause the disease in a significant gene list. The list identified other genes that might be associated with the disease.
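
    The abstract does not spell out how the sliding-window proportion test is computed, so the following is only a rough sketch of one plausible reading, not GST's actual implementation (which is in R). It assumes each variant has already been flagged 1 if it segregates with the disease in the family and 0 otherwise, and it compares the flagged proportion inside each window with the proportion outside it using a one-sided two-proportion z-test. The window size, the flags, and all function names are hypothetical.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """One-sided two-proportion z-test (H1: x1/n1 > x2/n2). Returns (z, p)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0
    z = (x1 / n1 - x2 / n2) / se
    # Upper-tail probability of the standard normal distribution.
    return z, 0.5 * math.erfc(z / math.sqrt(2))


def sliding_window_scan(flags: list[int], window: int = 5) -> list[tuple[int, float, float]]:
    """Slide a fixed-size window over ordered variants.

    Returns (start index, flagged proportion in window, p-value) per window.
    """
    n = len(flags)
    if window >= n:
        raise ValueError("window must be smaller than the number of variants")
    total = sum(flags)
    results = []
    for start in range(n - window + 1):
        inside = sum(flags[start:start + window])
        _, p = two_proportion_z(inside, window, total - inside, n - window)
        results.append((start, inside / window, p))
    return results


# Hypothetical flags for 20 ordered variants; positions 6-10 form a cluster.
flags = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
best = min(sliding_window_scan(flags), key=lambda r: r[2])
print(best)
```

    Scoring windows rather than single variants is what lets a cluster of co-segregating variants stand out even when isolated shared variants are common across family members.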

  14. Whole-genome sequence analyses of Western Central African Pygmy hunter-gatherers reveal a complex demographic history and identify candidate genes under positive natural selection.

    Science.gov (United States)

    Hsieh, PingHsun; Veeramah, Krishna R; Lachance, Joseph; Tishkoff, Sarah A; Wall, Jeffrey D; Hammer, Michael F; Gutenkunst, Ryan N

    2016-03-01

    African Pygmies practicing a mobile hunter-gatherer lifestyle are phenotypically and genetically diverged from other anatomically modern humans, and they likely experienced strong selective pressures due to their unique lifestyle in the Central African rainforest. To identify genomic targets of adaptation, we sequenced the genomes of four Biaka Pygmies from the Central African Republic and jointly analyzed these data with the genome sequences of three Baka Pygmies from Cameroon and nine Yoruba farmers. To account for the complex demographic history of these populations that includes both isolation and gene flow, we fit models using the joint allele frequency spectrum and validated them using independent approaches. Our two best-fit models both suggest ancient divergence between the ancestors of the farmers and Pygmies, 90,000 or 150,000 yr ago. We also find that bidirectional asymmetric gene flow is statistically better supported than a single pulse of unidirectional gene flow from farmers to Pygmies, as previously suggested. We then applied complementary statistics to scan the genome for evidence of selective sweeps and polygenic selection. We found that conventional statistical outlier approaches were biased toward identifying candidates in regions of high mutation or low recombination rate. To avoid this bias, we assigned P-values for candidates using whole-genome simulations incorporating demography and variation in both recombination and mutation rates. We found that genes and gene sets involved in muscle development, bone synthesis, immunity, reproduction, cell signaling and development, and energy metabolism are likely to be targets of positive natural selection in Western African Pygmies or their recent ancestors. © 2016 Hsieh et al.; Published by Cold Spring Harbor Laboratory Press.

  15. Tuning the processability, morphology and biodegradability of clay incorporated PLA/LLDPE blends via selective localization of nanoclay induced by melt mixing sequence

    Directory of Open Access Journals (Sweden)

    S. H. Jafari

    2013-01-01

    Full Text Available Polylactic acid (PLA)/linear low density polyethylene (LLDPE) blend nanocomposites based on two different commercial-grade nanoclays, Cloisite® 30B and Cloisite® 15A, were produced via different melt mixing procedures in a counter-rotating twin screw extruder. The effects of mixing sequence and clay type on morphological and rheological behaviors as well as degradation properties of the blends were investigated. The X-ray diffraction (XRD) results showed that generally the level of exfoliation in 30B based nanocomposites was better than in 15A based nanocomposites. In addition, due to differences in hydrophilicity and in the kind of modifiers in these two clays, the effect of 30B on refinement of the dispersed phase and enhancement of biodegradability of the PLA/LLDPE blend was much more remarkable than that of 15A nanoclay. Unlike the one-step mixing process, preparation of nanocomposites via a two-step mixing process improved the morphology. Based on the XRD and TEM (transmission electron microscopic) results, it is found that the mixing sequence has a remarkable influence on dispersion and localization of the major part of 30B nanoclay in the PLA matrix. Owing to the induced selective localization of nanoclays in the PLA phase, the nanocomposites prepared through a two-step mixing sequence exhibited extraordinary biodegradability, a more refined morphology and better melt elasticity.

  16. CRISPR-Cas9-Edited Site Sequencing (CRES-Seq): An Efficient and High-Throughput Method for the Selection of CRISPR-Cas9-Edited Clones.

    Science.gov (United States)

    Veeranagouda, Yaligara; Debono-Lagneaux, Delphine; Fournet, Hamida; Thill, Gilbert; Didier, Michel

    2018-01-16

    The emergence of clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) gene editing systems has enabled the creation of specific mutants at low cost, in a short time and with high efficiency, in eukaryotic cells. Since a CRISPR-Cas9 system typically creates an array of mutations in targeted sites, a successful gene editing project requires careful selection of edited clones. This process can be very challenging, especially when working with multiallelic genes and/or polyploid cells (such as cancer and plant cells). Here we describe a next-generation sequencing method called CRISPR-Cas9 Edited Site Sequencing (CRES-Seq) for the efficient and high-throughput screening of CRISPR-Cas9-edited clones. CRES-Seq facilitates the precise genotyping of up to 96 CRISPR-Cas9-edited sites (CRES) in a single MiniSeq (Illumina) run with an approximate sequencing cost of $6/clone. CRES-Seq is particularly useful when multiple genes are simultaneously targeted by CRISPR-Cas9, and also for screening of clones generated from multiallelic genes/polyploid cells. © 2018 by John Wiley & Sons, Inc.

  17. Selective Whole-Genome Amplification Is a Robust Method That Enables Scalable Whole-Genome Sequencing of Plasmodium vivax from Unprocessed Clinical Samples.

    Science.gov (United States)

    Cowell, Annie N; Loy, Dorothy E; Sundararaman, Sesh A; Valdivia, Hugo; Fisch, Kathleen; Lescano, Andres G; Baldeviano, G Christian; Durand, Salomon; Gerbasi, Vince; Sutherland, Colin J; Nolder, Debbie; Vinetz, Joseph M; Hahn, Beatrice H; Winzeler, Elizabeth A

    2017-02-07

    Whole-genome sequencing (WGS) of microbial pathogens from clinical samples is a highly sensitive tool used to gain a deeper understanding of the biology, epidemiology, and drug resistance mechanisms of many infections. However, WGS of organisms which exhibit low densities in their hosts is challenging due to high levels of host genomic DNA (gDNA), which leads to very low coverage of the microbial genome. WGS of Plasmodium vivax, the most widely distributed form of malaria, is especially difficult because of low parasite densities and the lack of an ex vivo culture system. Current techniques used to enrich P. vivax DNA from clinical samples require significant resources or are not consistently effective. Here, we demonstrate that selective whole-genome amplification (SWGA) can enrich P. vivax gDNA from unprocessed human blood samples and dried blood spots for high-quality WGS, allowing genetic characterization of isolates that would otherwise have been prohibitively expensive or impossible to sequence. We achieved an average genome coverage of 24×, with up to 95% of the P. vivax core genome covered by ≥5 reads. The single-nucleotide polymorphism (SNP) characteristics and drug resistance mutations seen were consistent with those of other P. vivax sequences from a similar region in Peru, demonstrating that SWGA produces high-quality sequences for downstream analysis. SWGA is a robust tool that will enable efficient, cost-effective WGS of P. vivax isolates from clinical samples that can be applied to other neglected microbial pathogens. Malaria is a disease caused by Plasmodium parasites that caused 214 million symptomatic cases and 438,000 deaths in 2015. Plasmodium vivax is the most widely distributed species, causing the majority of malaria infections outside sub-Saharan Africa. Whole-genome sequencing (WGS) of Plasmodium parasites from clinical samples has revealed important insights into the epidemiology and mechanisms of drug resistance of malaria

  18. Effects of the sequence wildfire-harvesting-coppice sprout selection on nutrient export via streamflow in a small E. globulus watershed in Galicia (NW Spain)

    Energy Technology Data Exchange (ETDEWEB)

    Fernandez, C.; Vega, J. A.; Bara, S.; Alonso, M.; Fonturbel, T.

    2011-07-01

    An experimental study was carried out between 1987 and 1999, to assess the effect of the sequence wildfire-clear felling-coppice sprout selection thinning, on stream flow nutrient export in a Eucalyptus globulus Labill. watershed in Galicia (NW Spain). The effects of such a sequence on nutrient export via stream flow had not been previously evaluated. A wildfire in 1989 caused a significant increase in nutrient exports in stream flow during the following two years. No significant effect was observed the third year after wildfire. After clear felling in 1992, inputs via precipitation compensated for nutrient exports in stream flow, except for K the first year following harvest and NO₃⁻ during the three years after this operation. Coppice sprout selection thinning in 1995 had less effect on nutrient exports than wildfire or harvest. The results presented here may help in evaluating the effects of current intensive forest management and perturbations that affect eucalypt stands in NW Spain. (Author) 39 refs.

  19. Exome sequencing of germline DNA from non-BRCA1/2 familial breast cancer cases selected on the basis of aCGH tumor profiling.

    Directory of Open Access Journals (Sweden)

    Florentine S Hilbers

    Full Text Available The bulk of familial breast cancer risk (∼70%) cannot be explained by mutations in the known predisposition genes, primarily BRCA1 and BRCA2. Underlying genetic heterogeneity in these cases is the probable explanation for the failure of all attempts to identify further high-risk alleles. While exome sequencing of non-BRCA1/2 breast cancer cases is a promising strategy to detect new high-risk genes, rational approaches to the rigorous pre-selection of cases are needed to reduce heterogeneity. We selected six families in which the tumours of multiple cases showed a specific genomic profile on array comparative genomic hybridization (aCGH). Linkage analysis in these families revealed a region on chromosome 4 with a LOD score of 2.49 under homogeneity. We then analysed the germline DNA of two patients from each family using exome sequencing. Initially focusing on the linkage region, no potentially pathogenic variants could be identified in more than one family. Variants outside the linkage region were then analysed, and we detected multiple possibly pathogenic variants in genes that encode DNA integrity maintenance proteins. However, further analysis led to the rejection of all variants due to poor co-segregation or a relatively high allele frequency in a control population. We concluded that using CGH results to focus on a sub-set of families for sequencing analysis did not enable us to identify a common genetic change responsible for the aggregation of breast cancer in these families. Our data also support the emerging view that non-BRCA1/2 hereditary breast cancer families have a very heterogeneous genetic basis.

  20. Selective Whole-Genome Amplification Is a Robust Method That Enables Scalable Whole-Genome Sequencing of Plasmodium vivax from Unprocessed Clinical Samples

    Directory of Open Access Journals (Sweden)

    Annie N. Cowell

    2017-02-01

    Full Text Available Whole-genome sequencing (WGS) of microbial pathogens from clinical samples is a highly sensitive tool used to gain a deeper understanding of the biology, epidemiology, and drug resistance mechanisms of many infections. However, WGS of organisms which exhibit low densities in their hosts is challenging due to high levels of host genomic DNA (gDNA), which leads to very low coverage of the microbial genome. WGS of Plasmodium vivax, the most widely distributed form of malaria, is especially difficult because of low parasite densities and the lack of an ex vivo culture system. Current techniques used to enrich P. vivax DNA from clinical samples require significant resources or are not consistently effective. Here, we demonstrate that selective whole-genome amplification (SWGA) can enrich P. vivax gDNA from unprocessed human blood samples and dried blood spots for high-quality WGS, allowing genetic characterization of isolates that would otherwise have been prohibitively expensive or impossible to sequence. We achieved an average genome coverage of 24×, with up to 95% of the P. vivax core genome covered by ≥5 reads. The single-nucleotide polymorphism (SNP) characteristics and drug resistance mutations seen were consistent with those of other P. vivax sequences from a similar region in Peru, demonstrating that SWGA produces high-quality sequences for downstream analysis. SWGA is a robust tool that will enable efficient, cost-effective WGS of P. vivax isolates from clinical samples that can be applied to other neglected microbial pathogens.
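
    The two coverage figures quoted in this abstract (average coverage and the share of the core genome covered by ≥5 reads) are simple summaries of per-position read depth. The sketch below shows how such numbers are typically derived; the depth values are made up for illustration and the function name is an assumption, not something taken from the study.

```python
def coverage_summary(depths: list[int], min_depth: int = 5) -> tuple[float, float]:
    """Return (mean depth, fraction of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    breadth = sum(1 for d in depths if d >= min_depth) / len(depths)
    return mean_depth, breadth


# Toy per-position depths for a 10-base region (hypothetical values).
depths = [0, 3, 5, 12, 30, 40, 22, 8, 60, 60]
mean_depth, breadth = coverage_summary(depths)
print(f"mean coverage: {mean_depth:.1f}x, covered by >=5 reads: {breadth:.0%}")
```

    Reporting both numbers matters because a high mean can hide uneven amplification: in the toy data the mean is 24× even though two of the ten positions fall below the 5-read threshold.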

  1. Dynamical system modeling to simulate donor T cell response to whole exome sequencing-derived recipient peptides: Understanding randomness in alloreactivity incidence following stem cell transplantation.

    Directory of Open Access Journals (Sweden)

    Vishal Koparde

    Full Text Available The quantitative relationship between the magnitude of variation in minor histocompatibility antigens (mHA) and graft versus host disease (GVHD) pathophysiology in stem cell transplant (SCT) donor-recipient pairs (DRP) is not established. In order to elucidate this relationship, whole exome sequencing (WES) was performed on 27 HLA matched related (MRD), & 50 unrelated donors (URD), to identify nonsynonymous single nucleotide polymorphisms (SNPs). An average of 2,463 SNPs were identified in MRD, and 4,287 in URD DRP (p<0.01); resulting peptide antigens that may be presented on HLA class I molecules in each DRP were derived in silico (NetMHCpan ver2.0) and the tissue expression of proteins these were derived from determined (GTEx). MRD DRP had an average 3,670 HLA-binding-alloreactive peptides, putative mHA (pmHA) with an IC50 of <500 nM, and URD, had 5,386 (p<0.01). To simulate an alloreactive donor cytotoxic T cell response, the array of pmHA in each patient was considered as an operator matrix modifying a hypothetical cytotoxic T cell clonal vector matrix; each responding T cell clone's proliferation was determined by the logistic equation of growth, accounting for HLA binding affinity and tissue expression of each alloreactive peptide. The resulting simulated organ-specific alloreactive T cell clonal growth revealed marked variability, with the T cell count differences spanning orders of magnitude between different DRP. Despite an estimated, uniform set of constants used in the model for all DRP, and a heterogeneously treated group of patients, higher total and organ-specific T cell counts were associated with cumulative incidence of moderate to severe GVHD in recipients. In conclusion, exome wide sequence differences and the variable alloreactive peptide binding to HLA in each DRP yields a large range of possible alloreactive donor T cell responses. Our findings also help understand the apparent randomness observed in the development of alloimmune responses.
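
    The abstract states that each T cell clone's proliferation follows the logistic equation of growth, modulated by HLA binding affinity and tissue expression, but it does not give the exact parameterization. The sketch below is therefore only an illustration under stated assumptions: a discrete-time logistic update, and a hypothetical rule in which the carrying capacity grows with expression and shrinks with IC50 (lower IC50 meaning tighter binding). The constants and function names are invented for the example.

```python
def capacity_from_peptide(ic50_nm: float, expression: float, scale: float = 1e6) -> float:
    """Hypothetical carrying capacity: larger for tight binders and high expression."""
    return scale * expression / ic50_nm


def simulate_clone(n0: float, growth_rate: float, capacity: float, steps: int) -> list[float]:
    """Discrete logistic growth: N(t+1) = N(t) + r * N(t) * (1 - N(t) / K)."""
    sizes = [n0]
    for _ in range(steps):
        n = sizes[-1]
        sizes.append(n + growth_rate * n * (1 - n / capacity))
    return sizes


# Two hypothetical peptides with the same expression but different affinity.
strong = capacity_from_peptide(ic50_nm=50.0, expression=2.0)
weak = capacity_from_peptide(ic50_nm=500.0, expression=2.0)
for label, k in [("strong binder", strong), ("weak binder", weak)]:
    final = simulate_clone(n0=1.0, growth_rate=0.5, capacity=k, steps=100)[-1]
    print(f"{label}: K = {k:.0f}, clone size after 100 steps = {final:.0f}")
```

    Even in this toy version, a tenfold difference in IC50 produces a tenfold difference in steady-state clone size, which gives some intuition for why simulated T cell counts could span orders of magnitude across donor-recipient pairs.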

  2. The Impact of Whole-Genome Sequencing on the Primary Care and Outcomes of Healthy Adult Patients: A Pilot Randomized Trial.

    Science.gov (United States)

    Vassy, Jason L; Christensen, Kurt D; Schonman, Erica F; Blout, Carrie L; Robinson, Jill O; Krier, Joel B; Diamond, Pamela M; Lebo, Matthew; Machini, Kalotina; Azzariti, Danielle R; Dukhovny, Dmitry; Bates, David W; MacRae, Calum A; Murray, Michael F; Rehm, Heidi L; McGuire, Amy L; Green, Robert C

    2017-06-27

    Whole-genome sequencing (WGS) in asymptomatic adults might prevent disease but increase health care use without clinical value. To describe the effect on clinical care and outcomes of adding WGS to standardized family history assessment in primary care. Pilot randomized trial. (ClinicalTrials.gov: NCT01736566). Academic primary care practices. 9 primary care physicians (PCPs) and 100 generally healthy patients recruited at ages 40 to 65 years. Patients were randomly assigned to receive a family history report alone (FH group) or in combination with an interpreted WGS report (FH + WGS group), which included monogenic disease risk (MDR) results (associated with Mendelian disorders), carrier variants, pharmacogenomic associations, and polygenic risk estimates for cardiometabolic traits. Each patient met with his or her PCP to discuss the report. Clinical outcomes and health care use through 6 months were obtained from medical records and audio-recorded discussions between PCPs and patients. Patients' health behavior changes were surveyed 6 months after receiving results. A panel of clinician-geneticists rated the appropriateness of how PCPs managed MDR results. Mean age was 55 years; 58% of patients were female. Eleven FH + WGS patients (22% [95% CI, 12% to 36%]) had new MDR results. Only 2 (4% [CI, 0.01% to 15%]) had evidence of the phenotypes predicted by an MDR result (fundus albipunctatus due to RDH5 and variegate porphyria due to PPOX). Primary care physicians recommended new clinical actions for 16% (CI, 8% to 30%) of FH patients and 34% (CI, 22% to 49%) of FH + WGS patients. Thirty percent (CI, 17% to 45%) and 41% (CI, 27% to 56%) of FH and FH + WGS patients, respectively, reported making a health behavior change after 6 months. Geneticists rated PCP management of 8 MDR results (73% [CI, 39% to 99%]) as appropriate and 2 results (18% [CI, 3% to 52%]) as inappropriate. Limited sample size and ancestral and socioeconomic diversity. Adding WGS to primary care
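
    The percentages in this abstract are reported with 95% confidence intervals, for example 11 patients with new MDR results given as 22% (CI, 12% to 36%), which implies a group of 50. The abstract does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, found by bisection on the binomial tail probabilities, and the function names are illustrative.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Find the root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect_decreasing(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


low, high = clopper_pearson(11, 50)
print(f"11/50 = {11 / 50:.0%}, 95% CI {low:.1%} to {high:.1%}")
```

    With groups of about 50 patients the intervals are wide, which is consistent with the abstract listing limited sample size among the study's limitations.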

  3. Antibiotic selection of Escherichia coli sequence type 131 in a mouse intestinal colonization model

    DEFF Research Database (Denmark)

    Hertz, Frederik Boetius; Løbner-Olesen, Anders; Frimodt-Møller, Niels

    2014-01-01

    day, antibiotic treatment was initiated and given subcutaneously once a day for three consecutive days. CFU of E. coli ST131, Bacteroides, and Gram-positive aerobic bacteria in fecal samples were studied, with intervals, until day 8. Bacteroides was used as an indicator organism for impact on the Gram-negative.......05). Of these, only clindamycin suppressed Bacteroides, while the remaining two antibiotics had no negative impact on Bacteroides or Gram-positive organisms. Only clindamycin treatment resulted in prolonged colonization. The remaining six antibiotics, including ciprofloxacin, did not promote overgrowth of E. coli ST131 (P > 0.95), nor did they suppress Bacteroides or Gram-positive organisms. The results showed that antimicrobials both with and without an impact on Gram-negative anaerobes can select for ESBL-producing E. coli, indicating that not only Gram-negative anaerobes have a role in upholding...

  4. Profiling soil microbial communities with next-generation sequencing: the influence of DNA kit selection and technician technical expertise

    Directory of Open Access Journals (Sweden)

    Taha Soliman

    2017-12-01

    Full Text Available Structure and diversity of microbial communities are an important research topic in biology, since microbes play essential roles in the ecology of various environments. Different DNA isolation protocols can lead to data bias and can affect results of next-generation sequencing. To evaluate the impact of protocols for DNA isolation from soil samples and also the influence of individual handling of samples, we compared results obtained by two researchers (R and T) using two different DNA extraction kits: (1) MO BIO PowerSoil® DNA Isolation kit (MO_R and MO_T) and (2) NucleoSpin® Soil kit (MN_R and MN_T). Samples were collected from six different sites on Okinawa Island, Japan. For all sites, differences in the results of microbial composition analyses (bacteria, archaea, fungi, and other eukaryotes), obtained by the two researchers using the two kits, were analyzed. For both researchers, the MN kit gave significantly higher yields of genomic DNA at all sites compared to the MO kit (ANOVA; P < 0.006). In addition, operational taxonomic units for some phyla and classes were missed in some cases: Micrarchaea were detected only in the MN_T and MO_R analyses; the bacterial phylum Armatimonadetes was detected only in MO_R and MO_T; and WIM5 of the phylum Amoebozoa of eukaryotes was found only in the MO_T analysis. Our results suggest the possibility of handling bias; therefore, it is crucial that replicated DNA extraction be performed by at least two technicians for thorough microbial analyses and to obtain accurate estimates of microbial diversity.

  5. Elucidation of the sequence selective binding mode of the DNA minor groove binder adozelesin, by high-field ¹H NMR and restrained molecular dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Cameron, L

    1999-07-01

    Adozelesin (formerly U73-975, The Upjohn Co.) is a covalent, minor-groove binding analogue of the antitumour antibiotic (+)CC-1065. Adozelesin consists of a cyclopropapyrroloindole alkylating sub-unit identical to (+)CC-1065, plus indole and benzofuran sub-units which replace the more complex pyrroloindole B and C sub-units, respectively, of (+)CC-1065. Adozelesin is a clinically important drug candidate, since it does not contain the ethylene bridge moieties on the B and C sub-units which are thought to be responsible for the unusual delayed hepatotoxicity exhibited by (+)CC-1065. Sequencing techniques identified two consensus sequences for adozelesin binding as p(dA) and 5'(T/A)(T/A)T-A*(C/G)G. This suggests that adozelesin spans a total of five base-pairs and shows a preference for A=T base-pair rich sequences, thus avoiding steric crowding around the exocyclic NH₂ of guanine and a wide minor groove. In this project, the covalent modification of two DNA sequences, i.e. 5'-d(CGTAAGCGCTTA*CG)₂ and 5'-d(CGAAAAA*CGG)·5'-d(CCGTTTTTCG), by adozelesin was examined by high-field NMR and restrained molecular mechanics and dynamics. Previous studies of minor groove binding drugs, using techniques as diverse as NMR, X-ray crystallography and molecular modelling, indicate that the incorporation of a guanine into the consensus sequence sterically hinders binding and, more importantly, produces a wider minor groove which is a 'slack' fit for the ligand. The aim of this investigation was to provide an insight into the sequence selective binding of adozelesin to 5'-AAAAA*CG and 5'-GCTTA*CG. The ¹H NMR data revealed that, in both cases, β-helical structure and Watson-Crick base-pairing was maintained on adduct formation. The 5'-GCTTA*CG adduct displayed significant distortion of the guanine base on the non-covalently modified strand. This distortion resulted from an amalgamation of two factors. Firstly

  6. Development of a new method for detection and identification of Oenococcus oeni bacteriophages based on endolysin gene sequence and randomly amplified polymorphic DNA.

    Science.gov (United States)

    Doria, Francesca; Napoli, Chiara; Costantini, Antonella; Berta, Graziella; Saiz, Juan-Carlos; Garcia-Moruno, Emilia

    2013-08-01

    Malolactic fermentation (MLF) is a biochemical transformation conducted by lactic acid bacteria (LAB) that occurs in wine at the end of alcoholic fermentation. Oenococcus oeni is the main species responsible for MLF in most wines. As in other fermented foods, where bacteriophages represent a potential risk for the fermentative process, O. oeni bacteriophages have been reported to be a possible cause of unsuccessful MLF in wine. Thus, preparation of commercial starters that take into account the different sensitivities of O. oeni strains to different phages would be advisable. However, currently, no methods have been described to identify phages infecting O. oeni. In this study, two factors are addressed: detection and typing of bacteriophages. First, a simple PCR method was devised targeting a conserved region of the endolysin (lys) gene to detect temperate O. oeni bacteriophages. For this purpose, 37 O. oeni strains isolated from Italian wines during different phases of the vinification process were analyzed by PCR for the presence of the lys gene, and 25 strains gave a band of the expected size (1,160 bp). This is the first method to be developed that allows identification of lysogenic O. oeni strains without the need for time-consuming phage bacterial-lysis induction methods. Moreover, a phylogenetic analysis was conducted to type bacteriophages. After the treatment of bacteria with UV light, lysis was obtained for 15 strains, and the 15 phage DNAs isolated were subjected to two randomly amplified polymorphic DNA (RAPD)-PCRs. By combining the RAPD profiles and lys sequences, 12 different O. oeni phages were clearly distinguished.

  7. Genotype by sequencing identifies natural selection as a driver of intraspecific divergence in Atlantic populations of the high dispersal marine invertebrate, Macoma petalum.

    Science.gov (United States)

    Metivier, Stacy L; Kim, Jin-Hong; Addison, Jason A

    2017-10-01

    Mitochondrial DNA analyses indicate that the Bay of Fundy population of the intertidal tellinid bivalve Macoma petalum is genetically divergent from coastal populations in the Gulf of Maine and Nova Scotia. To further examine the evolutionary forces driving this genetic break, we performed double digest genotype by sequencing (GBS) to survey the nuclear genome for evidence of both neutral and selective processes shaping this pattern. The resulting reads were mapped to a partial transcriptome of its sister species, M. balthica, to identify single nucleotide polymorphisms (SNPs) in protein-coding genes. Population assignment tests, principal components analyses, analysis of molecular variance, and outlier tests all support differentiation between the Bay of Fundy genotype and the genotypes of the Gulf of Maine, Gulf of St. Lawrence, and Nova Scotia. Although both neutral and non-neutral patterns of genetic subdivision were significant, genetic structure among the regions was nearly 20 times higher for loci putatively under selection, suggesting a strong role for natural selection as a driver of genetic diversity in this species. Genetic differences were the greatest between the Bay of Fundy and all other population samples, and some outlier proteins were involved in immunity-related processes. Our results suggest that in combination with limited gene flow across the mouth of the Bay of Fundy, local adaptation is an important driver of intraspecific genetic variation in this marine species with high dispersal potential.

  8. Deep Sequencing of Influenza A Virus from a Human Challenge Study Reveals a Selective Bottleneck and Only Limited Intrahost Genetic Diversification.

    Science.gov (United States)

    Sobel Leonard, Ashley; McClain, Micah T; Smith, Gavin J D; Wentworth, David E; Halpin, Rebecca A; Lin, Xudong; Ransier, Amy; Stockwell, Timothy B; Das, Suman R; Gilbert, Anthony S; Lambkin-Williams, Robert; Ginsburg, Geoffrey S; Woods, Christopher W; Koelle, Katia

    2016-12-15

    Knowledge of influenza virus evolution at the point of transmission and at the intrahost level remains limited, particularly for human hosts. Here, we analyze a unique viral data set of next-generation sequencing (NGS) samples generated from a human influenza challenge study wherein 17 healthy subjects were inoculated with cell- and egg-passaged virus. Nasal wash samples collected from 7 of these subjects were successfully deep sequenced. From these, we characterized changes in the subjects' viral populations during infection and identified differences between the virus in these samples and the viral stock used to inoculate the subjects. We first calculated pairwise genetic distances between the subjects' nasal wash samples, the viral stock, and the influenza virus A/Wisconsin/67/2005 (H3N2) reference strain used to generate the stock virus. These distances revealed that considerable viral evolution occurred at various points in the human challenge study. Further quantitative analyses indicated that (i) the viral stock contained genetic variants that originated and likely were selected for during the passaging process, (ii) direct intranasal inoculation with the viral stock resulted in a selective bottleneck that reduced nonsynonymous genetic diversity in the viral hemagglutinin and nucleoprotein, and (iii) intrahost viral evolution continued over the course of infection. These intrahost evolutionary dynamics were dominated by purifying selection. Our findings indicate that rapid viral evolution can occur during acute influenza infection in otherwise healthy human hosts when the founding population size of the virus is large, as is the case with direct intranasal inoculation. Influenza viruses circulating among humans are known to rapidly evolve over time. However, little is known about how influenza virus evolves across single transmission events and over the course of a single infection. 
    To address these issues, we analyze influenza virus sequences from a human...

  9. Molecular evolution of a viral non-coding sequence under the selective pressure of amiRNA-mediated silencing.

    Directory of Open Access Journals (Sweden)

    Shih-Shun Lin

    2009-02-01

    Plant microRNAs (miRNAs) guide cleavage of target mRNAs by DICER-like proteins, thereby reducing mRNA abundance. Native precursor miRNAs can be redesigned to target RNAs of interest, and one application of such artificial microRNA (amiRNA) technology is to generate plants resistant to pathogenic viruses. Transgenic Arabidopsis plants expressing amiRNAs designed to target the genome of two unrelated viruses were resistant, in a highly specific manner, to the appropriate virus. Here, we pursued two different goals. First, we confirmed that the 21-nt target site of viral RNAs is both necessary and sufficient for resistance. Second, we studied the evolutionary stability of amiRNA-mediated resistance against a genetically plastic RNA virus, TuMV. To dissociate selective pressures acting upon protein function from those acting at the RNA level, we constructed a chimeric TuMV harboring a 21-nt amiRNA target site in a non-essential region. In the first set of experiments, designed to assess the likelihood of resistance breakdown, we explored the effect of single-nucleotide mutations within the 21-nt target on the ability of mutant viruses to successfully infect amiRNA-expressing plants. We found non-equivalency of the target nucleotides, which can be divided into three categories depending on their impact on virus pathogenicity. In the second set of experiments, we investigated the evolution of the virus mutants in amiRNA-expressing plants. The most common outcome was the deletion of the target. However, when the 21-nt target was retained, viruses accumulated additional substitutions in it, further reducing the binding/cleavage ability of the amiRNA. The pattern of substitutions within the viral target was largely dominated by G to A and C to U transitions.

  10. Rev1 and Polzeta influence toxicity and mutagenicity of Me-lex, a sequence selective N3-adenine methylating agent.

    Science.gov (United States)

    Monti, Paola; Ciribilli, Yari; Russo, Debora; Bisio, Alessandra; Perfumo, Chiara; Andreotti, Virginia; Menichini, Paola; Inga, Alberto; Huang, Xiaofen; Gold, Barry; Fronza, Gilberto

    2008-03-01

    The relative toxicity and mutagenicity of Me-lex, which selectively generates 3-methyladenine (3-MeA), is dependent on the nature of the DNA repair background. Base excision repair (BER)-defective S. cerevisiae strains mag1 and apn1apn2 were both significantly more sensitive to Me-lex toxicity, but only the latter is significantly more prone to Me-lex-induced mutagenesis. To examine the contribution of translesion synthesis (TLS) DNA polymerases in the bypass of Me-lex-induced lesions, the REV3 and REV1 genes were independently deleted in the parental yeast strain and in different DNA repair-deficient derivatives: the nucleotide excision repair (NER)-deficient rad14, and the BER-deficient mag1 or apn1apn2 strains. The strains contained an integrated ADE2 reporter gene under control of the transcription factor p53. A centromeric yeast expression vector containing the wild-type p53 cDNA was treated in vitro with increasing concentrations of Me-lex and transformed into the different yeast strains. The toxicity of Me-lex-induced lesions was evaluated based on the plasmid transformation efficiency compared to the untreated vector, while Me-lex mutagenicity was assessed using the p53 reporter assay. In the present study, we demonstrate that disruption of Polzeta (through deletion of its catalytic subunit coded by REV3) or Rev1 (by REV1 deletion) increased Me-lex lethality and decreased Me-lex mutagenicity in both the NER-defective (rad14) and BER-defective (mag1; apn1apn2) strains. Therefore, Polzeta and Rev1 contribute to resistance to the lethal effects of Me-lex-induced lesions (3-MeA and derived AP sites) by bypassing lesions and fixing some mutations.

  11. Selection strategy and the design of hybrid oligonucleotide primers for RACE-PCR: cloning a family of toxin-like sequences from Agelena orientalis

    Directory of Open Access Journals (Sweden)

    Lipkin Alexey

    2007-05-01

    Background: The use of specific but partially degenerate primers for nucleic acid hybridisations and PCR amplification of known or unknown gene families was first reported well over a decade ago and the technique has been used widely since then. Results: Here we report a novel and successful selection strategy for the design of hybrid partially degenerate primers for use with RT-PCR and RACE-PCR for the identification of unknown gene families. The technique (named PaBaLiS) has proven very effective as it allowed us to identify and clone a large group of mRNAs encoding neurotoxin-like polypeptide pools from the venom of the spider Agelena orientalis. Our approach differs radically from the generally accepted CODEHOP principle first reported in 1998. Most importantly, our method has proven very efficient by performing better than an independently generated high throughput EST cloning programme. Our method yielded nearly 130 non-identical sequences from Agelena orientalis, whilst the EST cloning technique yielded only 48 non-identical sequences from 2100 clones obtained from the same Agelena material. In addition to the primer design approach reported here, which is almost universally applicable to any PCR cloning application, our results also indicate that the venom of the Agelena orientalis spider contains a much larger family of related toxin-like sequences than previously thought. Conclusion: With upwards of 100,000 species of spider thought to exist, and a propensity for producing diverse peptide pools, many more peptides of pharmacological importance await discovery. We envisage that some of these peptides and their recombinant derivatives will provide a new range of tools for neuroscience research and could also facilitate the development of a new generation of analgesic drugs and insecticides.

  12. The evolutionary history of Xiphophorus fish and their sexually selected sword: a genome-wide approach using restriction site-associated DNA sequencing.

    Science.gov (United States)

    Jones, Julia C; Fan, Shaohua; Franchini, Paolo; Schartl, Manfred; Meyer, Axel

    2013-06-01

    Next-generation sequencing (NGS) techniques are now key tools in the detection of population genomic and gene expression differences in a large array of organisms. However, so far few studies have utilized such data for phylogenetic estimations. Here, we use NGS data obtained from genome-wide restriction site-associated DNA (RAD) (∼66000 SNPs) to estimate the phylogenetic relationships among all 26 species of swordtail and platyfish (genus Xiphophorus) from Central America. Past studies, both sequence and morphology-based, have differed in their inferences of the evolutionary relationships within this genus, particularly at the species-level and among monophyletic groupings. We show that using a large number of markers throughout the genome, we are able to infer the phylogenetic relationships with unparalleled resolution for this genus. The relationships among all three major clades and species within each of them are highly resolved and consistent under maximum likelihood, Bayesian inference and maximum parsimony. However, we also highlight the current cautions with this data type and analyses. This genus exhibits a particularly interesting evolutionary history where at least two species may have arisen through hybridization events. Here, we are able to infer the paternal lineages of these putative hybrid species. Using the RAD-marker-based tree we reconstruct the evolutionary history of the sexually selected sword trait and show that it may have been present in the common ancestor of the genus. Together our results highlight the outstanding capacity that RAD sequencing data has for resolving previously problematic phylogenetic relationships, particularly among relatively closely related species. © 2013 John Wiley & Sons Ltd.

  13. The Sequence of Acquisition of Personal Pronoun Case and Person Reference among 6 Year Old Children in Two Selected Malaysian Kindergartens

    Directory of Open Access Journals (Sweden)

    Arshad Abd Samad

    2017-03-01

    Pronoun case and person reference refer to the position of the pronoun in the sentence and the person the pronoun refers to, respectively. Examining the acquisition of pronoun case and person reference among young children can be insightful as, besides their obvious relevance to language development, both these constructs can have implications for other aspects of child development. Attention given by children to these various constructs may indicate the importance children place on the concept of ego and self as well as on social relations. The sequence of acquisition of personal pronouns among these children is therefore an important phenomenon to be examined as it can reflect linguistic and socio-cognitive development. This largely descriptive study examines the sequence of acquisition of the English pronouns among forty 6-year-old Malaysian children learning ESL in two kindergartens. The children in the study were presented with 33 drawings to assess their familiarity with case and person reference expressed through English personal pronouns. They were required to select the correct pronoun from three pronouns that were used to describe each drawing. This paper reports on the accuracy rates for each pronoun and assumes that high accuracy rates indicate a more complete acquisition of the pronoun. Error forms produced by the children were also identified and examined. Data obtained were compared to acquisition sequences in the literature, and general implications related to the acquisition of personal pronouns among children in an ESL setting in Malaysia are discussed.

  14. Preclinical evaluation of an mRNA HIV vaccine combining rationally selected antigenic sequences and adjuvant signals (HTI-TriMix).

    Science.gov (United States)

    Guardo, Alberto C; Joe, Patrick Tjok; Miralles, Laia; Bargalló, Manel E; Mothe, Beatriz; Krasniqi, Ahmet; Heirman, Carlo; García, Felipe; Thielemans, Kris; Brander, Christian; Aerts, Joeri L; Plana, Montserrat

    2017-01-28

    The development of a prophylactic vaccine against HIV-1 has so far not been successful. Therefore, attention has shifted more and more toward the development of novel therapeutic vaccines. Here, we evaluated a new mRNA-based therapeutic vaccine against HIV-1 encoding activation signals (TriMix: CD40L + CD70 + caTLR4) combined with rationally selected antigenic sequences [the HIVACAT T-cell immunogen (HTI) sequence, which comprises 16 joined fragments from Gag, Pol, Vif, and Nef]. For this purpose, peripheral blood mononuclear cells from HIV-1-infected individuals on cART, lymph node explants from noninfected humans, and splenocytes from immunized mice were collected and several immune functions were measured. Electroporation of immature monocyte-derived dendritic cells from HIV-infected patients with mRNA encoding HTI + TriMix potently activated dendritic cells, which resulted in upregulation of maturation markers and cytokine production and T-cell stimulation, as evidenced by enhanced proliferation and cytokine secretion (IFN-γ). Responses were HIV specific and were predominantly targeted against the sequences included in HTI. These findings were confirmed in human lymph node explants exposed to HTI + TriMix mRNA. Intranodal immunizations with HTI mRNA in a mouse model increased antigen-specific cytotoxic T-lymphocyte responses. The addition of TriMix further enhanced cytotoxic responses. Our results suggest that uptake of mRNA, encoding strong activation signals and a potent HIV antigen, confers a T-cell stimulatory capacity to dendritic cells and enhances their ability to stimulate antigen-specific immunity. These findings may pave the way for therapeutic HIV vaccine strategies based on antigen-encoding RNA to specifically target antigen-presenting cells.

  15. Post hoc Analysis for Detecting Individual Rare Variant Risk Associations Using Probit Regression Bayesian Variable Selection Methods in Case-Control Sequencing Studies.

    Science.gov (United States)

    Larson, Nicholas B; McDonnell, Shannon; Albright, Lisa Cannon; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham; MacInnis, Robert; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catolona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2016-09-01

    Rare variants (RVs) have been shown to be significant contributors to complex disease risk. By definition, these variants have very low minor allele frequencies and traditional single-marker methods for statistical analysis are underpowered for typical sequencing study sample sizes. Multimarker burden-type approaches attempt to identify aggregation of RVs across case-control status by analyzing relatively small partitions of the genome, such as genes. However, it is generally the case that the aggregative measure would be a mixture of causal and neutral variants, and these omnibus tests do not directly provide any indication of which RVs may be driving a given association. Recently, Bayesian variable selection approaches have been proposed to identify RV associations from a large set of RVs under consideration. Although these approaches have been shown to be powerful at detecting associations at the RV level, there are often computational limitations on the total quantity of RVs under consideration and compromises are necessary for large-scale application. Here, we propose a computationally efficient alternative formulation of this method using a probit regression approach specifically capable of simultaneously analyzing hundreds to thousands of RVs. We evaluate our approach to detect causal variation on simulated data and examine sensitivity and specificity in instances of high RV dimensionality as well as apply it to pathway-level RV analysis results from a prostate cancer (PC) risk case-control sequencing study. Finally, we discuss potential extensions and future directions of this work. © 2016 WILEY PERIODICALS, INC.

  16. Bias in the prediction of genetic gain due to mass and half-sib selection in random mating populations

    Directory of Open Access Journals (Sweden)

    José Marcelo Soriano Viana

    2009-01-01

    The prediction of gains from selection allows the comparison of breeding methods and selection strategies, although these estimates may be biased. The objective of this study was to investigate the extent of such bias in predicting genetic gain. For this, we simulated 10 cycles of a hypothetical breeding program that involved seven traits, three population classes, three experimental conditions and two breeding methods (mass and half-sib selection). Each combination of trait, population, heritability, method and cycle was repeated 10 times. The predicted gains were biased, even when the genetic parameters were estimated without error. Gain from selection in both genders is twice the gain from selection in a single gender only in the absence of dominance. The use of genotypic variance or broad sense heritability in the predictions represented an additional source of bias. Predictions based on additive variance and narrow sense heritability were equivalent, as were predictions based on genotypic variance and broad sense heritability. The predictions based on mass and family selection were suitable for comparing selection strategies, whereas those based on selection within progenies showed the largest bias and lower association with the realized gain.

  17. Bioequivalence of generic lamotrigine 100-mg tablets in healthy Thai male volunteers: a randomized, single-dose, two-period, two-sequence crossover study.

    Science.gov (United States)

    Srichaiya, Arunee; Longchoopol, Chaowanee; Oo-Puthinan, Sarawut; Sayasathid, Jarun; Sripalakit, Pattana; Viyoch, Jarupa

    2008-10-01

    Lamotrigine is an antiepileptic drug which has been used in the treatment of epilepsy and bipolar disorder. A search of the literature did not find previously published bioequivalence and pharmacokinetic evaluations of lamotrigine in healthy Thai male volunteers. The aim of this study was to compare the pharmacokinetic parameters between 2 brands of lamotrigine in healthy Thai male volunteers. A randomized, single-dose, 2-period, 2-sequence, crossover study design with a 2-week washout period was conducted in healthy Thai males. Subjects were randomized to receive either the test or reference formulation in the first period. All subjects were required to be nonsmokers and without a history of alcohol or drug abuse. Plasma samples were collected over a 120-hour period after 100-mg lamotrigine administration in each period. A validated high-performance liquid chromatography ultraviolet method was used to analyze lamotrigine concentration in plasma. Pharmacokinetic parameters were determined using a noncompartmental method. Bioequivalence between the test and reference products, as defined by the US Food and Drug Administration (FDA), is determined when the ratio for the 90% CIs of the difference in the means of the log-transformed AUC(0-t), AUC(0-infinity), and C(max) of the 2 products are within 0.80 and 1.25. Adverse events were determined by measuring vital signs after dosing. Subjects were also asked if they suffered from undesirable effects such as nausea, vomiting, dizziness, and headache. This bioequivalence study was performed in 24 healthy Thai males (mean [SD] age, 20.5 [1.3] years; range, 19-24 years; weight, 62.5 [7.4] kg; height, 172.8 [6.9] cm; body mass index, 20.9 [2.0] kg/m(2)). The mean (SD) C(max) and T(max) of the test formulation of lamotrigine were 1.7 (0.3) microg/mL and 1.2 (0.9) hours, respectively. The mean (SD) C(max) and T(max) of the reference formulation of lamotrigine were 1.7 (0.3) microg/mL and 1.4 (1.0) hours, respectively. 
    The mean...

  18. Bioequivalence of two tablet formulations of clopidogrel in healthy Argentinian volunteers: a single-dose, randomized-sequence, open-label crossover study.

    Science.gov (United States)

    Di Girolamo, Guillermo; Czerniuk, Paola; Bertuola, Roberto; Keller, Guillermo A

    2010-01-01

    Platelet activation is a major component in the pathogenesis of coronary thrombosis and myocardial infarction. Thienopyridines, particularly clopidogrel, are highly effective in reducing in-stent thrombosis and functional inhibition of adenosine diphosphate-induced platelet activation. The aim of this study was to evaluate the bioequivalence of a new generic formulation of clopidogrel 75-mg tablets (test) and the available branded formulation (reference) to meet regulatory criteria for marketing the test product in Argentina. This was a randomized-sequence, open-label, 2-period crossover study conducted in healthy white volunteers in the fasted state. A single oral dose of the test or reference formulation was followed by a 7-day washout period, after which subjects received the alternative formulation. Blood samples were collected at baseline and at 0.25, 0.5, 0.75, 1, 1.25, 1.5, 2, 2.5, 3, 4, 6, 8, and 12 hours after dosing. Clopidogrel concentrations were determined using an LC-MS/MS method. The formulations were considered bioequivalent if the 90% CI of the geometric mean ratios (test:reference) for C(max) and AUC(0-last) were within the range from 80% to 125%. Adverse events were monitored throughout the study based on clinical parameters and patient reports. Twenty-four volunteers (13 male, 11 female; mean [SD] age, 33.7 [5.2] years [range, 21-42 years]; weight, 72.4 [6.83] kg [range, 59-82 kg]) were enrolled in and completed the study. The geometric mean C(max) for the test and reference formulations was 877.76 and 913.49 pg/mL, respectively. The geometric mean AUC(0-t) was 1911.53 and 2053.09 pg . h/mL, and the geometric mean AUC(0-infinity) was 2021.33 and 2188.25 pg . h/mL. The geometric mean ratios (test:reference) for C(max), AUC(0-t), and AUC(0-infinity) were 96.09% (90% CI, 90.71-101.78), 93.10% (90% CI, 85.57-101.3), and 92.37% (90% CI, 85.06-100.31), respectively.
    There were no significant differences in pharmacokinetic parameters between groups...
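
    The acceptance rule stated in the abstract above (90% CI of the test:reference geometric mean ratio lying within 80% to 125%) can be illustrated with a minimal sketch. The ratios and intervals are the values reported in the abstract; the names RatioCI and within_limits are hypothetical helpers introduced only for illustration, and the sketch checks already-computed intervals rather than reproducing the study's statistical analysis.

```python
from typing import NamedTuple


class RatioCI(NamedTuple):
    """Geometric mean ratio (test:reference) and its 90% CI, all in percent."""
    parameter: str
    ratio_pct: float
    lower_pct: float
    upper_pct: float


def within_limits(ci: RatioCI, low: float = 80.0, high: float = 125.0) -> bool:
    """Return True when the whole 90% CI lies inside [low, high]."""
    return low <= ci.lower_pct and ci.upper_pct <= high


# Values as reported in the clopidogrel abstract above.
reported = [
    RatioCI("C(max)", 96.09, 90.71, 101.78),
    RatioCI("AUC(0-t)", 93.10, 85.57, 101.3),
    RatioCI("AUC(0-infinity)", 92.37, 85.06, 100.31),
]

# Per-parameter outcome, then the overall conclusion under this rule.
results = {ci.parameter: within_limits(ci) for ci in reported}
bioequivalent = all(results.values())

for name, ok in results.items():
    print(f"{name}: {'within' if ok else 'outside'} 80-125%")
print("All reported intervals within limits:", bioequivalent)
```

    Checking the interval bounds rather than the point estimate is the point of the rule: a ratio near 100% with a wide interval would still fail.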

  19. A Brief, Web-based Personalized Feedback Selective Intervention for College Student Marijuana Use: A Randomized Clinical Trial

    OpenAIRE

    Lee, Christine M.; Neighbors, Clayton; Kilmer, Jason R; Larimer, Mary E.

    2010-01-01

    Despite clear need, brief web-based interventions for marijuana using college students have not been evaluated in the literature. The current study was designed to evaluate a brief, web-based personalized feedback intervention for at-risk marijuana users transitioning to college. All entering first-year students were invited to complete a brief questionnaire. Participants meeting criteria completed a baseline assessment (N = 341) and were randomly assigned to web-based personalized feedback o...

  20. A Randomized Comparative Study of Pulsed Radiofrequency Treatment With or Without Selective Nerve Root Block for Chronic Cervical Radicular Pain.

    Science.gov (United States)

    Wang, Fei; Zhou, Qian; Xiao, Lizu; Yang, Juan; Xong, Donglin; Li, Disen; Liu, LiPing; Ancha, Sigdha; Cheng, Jianguo

    2017-06-01

    We demonstrated a combination of pulsed radiofrequency (PRF) and cervical nerve root block (CNRB) via a posterior approach was superior to a transforaminal epidural steroid injection through the anterolateral approach for cervical radicular pain in a previous study. This randomized trial was conducted to determine the comparative efficacy between CNRB, PRF, and CNRB + PRF for cervical radicular pain. A prospective and randomized design was used in this study. Sixty-two patients were randomized into three parallel groups: CNRB, PRF, or CNRB + PRF. Numeric Rating Scale (NRS) was used to measure pain intensity, and global perceived effect (GPE) was scored by the patient on a 7-point scale, ranging from much worse (-3), no change (0), to total improvement (+3). The outcomes were evaluated at 1 week, 1 month, 3 months, and 6 months. Side effects and complications were noted. The NRS was significantly reduced in all three groups 1 week after the treatments (P 0.05). No serious complications were observed in any of the patients. Combining CNRB and PRF appeared to be a safe and efficacious technique for cervical radicular pain. The combination therapy yielded better outcomes than either CNRB or PRF alone. © 2016 World Institute of Pain.

  1. Use of hyaluronan in the selection of sperm for intracytoplasmic sperm injection (ICSI): significant improvement in clinical outcomes--multicenter, double-blinded and randomized controlled trial.

    Science.gov (United States)

    Worrilow, K C; Eid, S; Woodhouse, D; Perloe, M; Smith, S; Witmyer, J; Ivani, K; Khoury, C; Ball, G D; Elliot, T; Lieberman, J

    2013-02-01

    Does the selection of sperm for ICSI based on their ability to bind to hyaluronan improve the clinical pregnancy rates (CPR) (primary end-point), implantation (IR) and pregnancy loss rates (PLR)? In couples where ≤ 65% of sperm bound hyaluronan, the selection of hyaluronan-bound (HB) sperm for ICSI led to a statistically significant reduction in PLR. HB sperm demonstrate enhanced developmental parameters which have been associated with successful fertilization and embryogenesis. Sperm selected for ICSI using a liquid source of hyaluronan achieved an improvement in IR. A pilot study by the primary author demonstrated that the use of HB sperm in ICSI was associated with improved CPR. The current study represents the single largest prospective, multicenter, double-blinded and randomized controlled trial to evaluate the use of hyaluronan in the selection of sperm for ICSI. Using the hyaluronan binding assay, an HB score was determined for the fresh or initial (I-HB) and processed or final semen specimen (F-HB). Patients were classified as >65% or ≤ 65% I-HB and stratified accordingly. Patients with I-HB scores ≤ 65% were randomized into control and HB selection (HYAL) groups whereas patients with I-HB >65% were randomized to non-participatory (NP), control or HYAL groups, in a ratio of 2:1:1. The NP group was included in the >65% study arm to balance the higher prevalence of patients with I-HB scores >65%. In the control group, oocytes received sperm selected via the conventional assessment of motility and morphology. In the HYAL group, HB sperm meeting the same visual criteria were selected for injection. Patient participants and clinical care providers were blinded to group assignment. Eight hundred two couples treated with ICSI in 10 private and hospital-based IVF programs were enrolled in this study. Of the 484 patients stratified to the I-HB > 65% arm, 115 participants were randomized to the control group, 122 participants were randomized to the HYAL group

  2. The prevalence and classification of chronic kidney disease in cats randomly selected within four age groups and in cats recruited for degenerative joint disease studies

    Science.gov (United States)

    Marino, Christina L; Lascelles, B Duncan X; Vaden, Shelly L; Gruen, Margaret E; Marks, Steven L

    2015-01-01

    Chronic kidney disease (CKD) and degenerative joint disease are both considered common in older cats. Information on the co-prevalence of these two diseases is lacking. This retrospective study was designed to determine the prevalence of CKD in two cohorts of cats: cats randomly selected from four evenly distributed age groups (RS group) and cats recruited for degenerative joint disease studies (DJD group), and to evaluate the concurrence of CKD and DJD in these cohorts. The RS group was randomly selected from four age groups from 6 months to 20 years, and the DJD group comprised cats recruited to four previous DJD studies, with the DJD group excluding cats with a blood urea nitrogen and/or serum creatinine concentration >20% (the upper end of normal) for two studies and cats with CKD stages 3 and 4 for the other two studies. The prevalence of CKD in the RS and DJD groups was higher than expected at 50% and 68.8%, respectively. CKD was common in cats between 1 and 15 years of age, with a similar prevalence of CKD stages 1 and 2 across age groups in both the RS and DJD cats, respectively. We found significant concurrence between CKD and DJD in cats of all ages, indicating the need for increased screening for CKD when selecting DJD treatments. Additionally, this study offers the idea of a relationship and causal commonality between CKD and DJD owing to the striking concurrence across age groups and life stages. PMID:24217707

  3. Blood Selenium Concentration and Blood Cystatin C Concentration in a Randomly Selected Population of Healthy Children Environmentally Exposed to Lead and Cadmium.

    Science.gov (United States)

    Gać, Paweł; Pawlas, Natalia; Wylężek, Paweł; Poręba, Rafał; Poręba, Małgorzata; Pawlas, Krystyna

    2017-01-01

    This study aimed at evaluation of a relationship between blood selenium concentration (Se-B) and blood cystatin C concentration (CST) in a randomly selected population of healthy children, environmentally exposed to lead and cadmium. The studies were conducted on 172 randomly selected children (7.98 ± 0.97 years). Among participants, the subgroups were distinguished, manifesting marginally low blood selenium concentration (Se-B 40-59 μg/l), suboptimal blood selenium concentration (Se-B: 60-79 μg/l) or optimal blood selenium concentration (Se-B ≥ 80 μg/l). At the subsequent stage, analogous subgroups of participants were selected separately in groups of children with BMI below median value (BMI selenium concentration and blood cystatin C concentration. On the other hand, in children with low body mass index, a negative non-linear relationship was present between blood selenium concentration and blood cystatin C concentration.

  4. Prevalence and classification of chronic kidney disease in cats randomly selected from four age groups and in cats recruited for degenerative joint disease studies.

    Science.gov (United States)

    Marino, Christina L; Lascelles, B Duncan X; Vaden, Shelly L; Gruen, Margaret E; Marks, Steven L

    2014-06-01

    Chronic kidney disease (CKD) and degenerative joint disease are both considered common in older cats. Information on the co-prevalence of these two diseases is lacking. This retrospective study was designed to determine the prevalence of CKD in two cohorts of cats: cats randomly selected from four evenly distributed age groups (RS group) and cats recruited for degenerative joint disease studies (DJD group), and to evaluate the concurrence of CKD and DJD in these cohorts. The RS group was randomly selected from four age groups from 6 months to 20 years, and the DJD group comprised cats recruited to four previous DJD studies, with the DJD group excluding cats with a blood urea nitrogen and/or serum creatinine concentration >20% (the upper end of normal) for two studies and cats with CKD stages 3 and 4 for the other two studies. The prevalence of CKD in the RS and DJD groups was higher than expected at 50% and 68.8%, respectively. CKD was common in cats between 1 and 15 years of age, with a similar prevalence of CKD stages 1 and 2 across age groups in both the RS and DJD cats, respectively. We found significant concurrence between CKD and DJD in cats of all ages, indicating the need for increased screening for CKD when selecting DJD treatments. Additionally, this study offers the idea of a relationship and causal commonality between CKD and DJD owing to the striking concurrence across age groups and life stages. © ISFM and AAFP 2013.

  5. SNPs selected by information content outperform randomly selected microsatellite loci for delineating genetic identification and introgression in the endangered dark European honeybee (Apis mellifera mellifera).

    Science.gov (United States)

    Muñoz, Irene; Henriques, Dora; Jara, Laura; Johnston, J Spencer; Chávez-Galarza, Julio; De La Rúa, Pilar; Pinto, M Alice

    2017-07-01

    The honeybee (Apis mellifera) has been threatened by multiple factors including pests and pathogens, pesticides and loss of locally adapted gene complexes due to replacement and introgression. In western Europe, the genetic integrity of the native A. m. mellifera (M-lineage) is endangered due to trading and intensive queen breeding with commercial subspecies of eastern European ancestry (C-lineage). Effective conservation actions require reliable molecular tools to identify pure-bred A. m. mellifera colonies. Microsatellites have been preferred for identification of A. m. mellifera stocks across conservation centres. However, owing to high throughput, easy transferability between laboratories and low genotyping error, SNPs promise to become popular. Here, we compared the resolving power of a widely utilized microsatellite set to detect structure and introgression with that of different sets that combine a variable number of SNPs selected for their information content and genomic proximity to the microsatellite loci. Contrary to every SNP data set, microsatellites did not discriminate between the two lineages in the PCA space. Mean introgression proportions were identical across the two marker types, although at the individual level, microsatellites' performance was relatively poor at the upper range of Q-values, a result reflected by their lower precision. Our results suggest that SNPs are more accurate and powerful than microsatellites for identification of A. m. mellifera colonies, especially when they are selected by information content. © 2016 John Wiley & Sons Ltd.

  6. Post-hoc Analysis for Detecting Individual Rare Variant Risk Associations using Probit Regression Bayesian Variable Selection Methods in Case-Control Sequencing Studies

    Science.gov (United States)

    Larson, Nicholas B.; McDonnell, Shannon; Albright, Lisa Cannon; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A.; Isaacs, William B.; Xu, Jianfeng; Cooney, Kathleen A.; Lange, Ethan; Schleutker, Johanna; Carpten, John D.; Powell, Isaac; Bailey-Wilson, Joan; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham; MacInnis, Robert; Maier, Christiane; Whittemore, Alice S.; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J.; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J.; Olson, Timothy M.; Klein, Christopher J.; Thibodeau, Stephen N.; Schaid, Daniel J.

    2016-01-01

    Rare variants have been shown to be significant contributors to complex disease risk. By definition, these variants have very low minor allele frequencies and traditional single-marker methods for statistical analysis are underpowered for typical sequencing study sample sizes. Multi-marker burden-type approaches attempt to identify aggregation of rare variants across case-control status by analyzing relatively small partitions of the genome, such as genes. However, it is generally the case that the aggregative measure would be a mixture of causal and neutral variants, and these omnibus tests do not directly provide any indication of which rare variants may be driving a given association. Recently, Bayesian variable selection approaches have been proposed to identify rare variant associations from a large set of rare variants under consideration. While these approaches have been shown to be powerful at detecting associations at the rare variant level, there are often computational limitations on the total quantity of rare variants under consideration and compromises are necessary for large-scale application. Here, we propose a computationally efficient alternative formulation of this method using a probit regression approach specifically capable of simultaneously analyzing hundreds to thousands of rare variants. We evaluate our approach to detect causal variation on simulated data and examine sensitivity and specificity in instances of high rare variant dimensionality as well as apply it to pathway-level rare variant analysis results from a prostate cancer risk case-control sequencing study. Finally, we discuss potential extensions and future directions of this work. PMID:27312771

  7. Differentially Expressed Genes in Endometrium and Corpus Luteum of Holstein Cows Selected for High and Low Fertility Are Enriched for Sequence Variants Associated with Fertility.

    Science.gov (United States)

    Moore, Stephen G; Pryce, Jennie E; Hayes, Ben J; Chamberlain, Amanda J; Kemper, Kathryn E; Berry, Donagh P; McCabe, Matt; Cormican, Paul; Lonergan, Pat; Fair, Trudee; Butler, Stephen T

    2016-01-01

    Despite the importance of fertility in humans and livestock, there has been little success dissecting the genetic basis of fertility. Our hypothesis was that genes differentially expressed in the endometrium and corpus luteum on Day 13 of the estrous cycle between cows with either good or poor genetic merit for fertility would be enriched for genetic variants associated with fertility. We combined a unique genetic model of fertility (cattle that have been selected for high and low fertility and show substantial difference in fertility) with gene expression data from these cattle and genome-wide association study (GWAS) results in ∼20,000 cattle to identify quantitative trait loci (QTL) regions and sequence variants associated with genetic variation in fertility. Two hundred and forty-five QTL regions and 17 sequence variants associated primarily with prostaglandin F2alpha, steroidogenesis, mRNA processing, energy status, and immune-related processes were identified. Ninety-three of the QTL regions were validated by two independent GWAS, with signals for fertility detected primarily on chromosomes 18, 5, 7, 8, and 29. Plausible causative mutations were identified, including one missense variant significantly associated with fertility and predicted to affect the protein function of EIF4EBP3. The results of this study enhance our understanding of 1) the contribution of the endometrium and corpus luteum transcriptome to phenotypic fertility differences and 2) the genetic architecture of fertility in dairy cattle. Including these variants in predictions of genomic breeding values may improve the rate of genetic gain for this critical trait. © 2016 by the Society for the Study of Reproduction, Inc.

  8. Relative bioavailability of generic and branded acetylcysteine effervescent tablets: A single-dose, open-label, randomized-sequence, two-period crossover study in fasting healthy Chinese male volunteers.

    Science.gov (United States)

    Liu, Yan-Mei; Liu, Yun; Lu, Chuan; Jia, Jing-Ying; Liu, Gang-Yi; Weng, Li-Ping; Wang, Jia-Yan; Li, Guo-Xiu; Wang, Wei; Li, Shui-Jun; Yu, Chen

    2010-11-01

    Acetylcysteine may be used as a mucolytic agent for the treatment of chronic bronchitis, chronic obstructive pulmonary disease, and other pulmonary diseases complicated by the production of viscous mucus. However, little is known of its pharmacokinetic properties when given orally in healthy volunteers, particularly in a Chinese Han population. This study was conducted to provide support for the marketing of a generic product in China. The purpose of this study was to compare the pharmacokinetics and relative bioavailability of a generic test formulation and a branded reference formulation of acetylcysteine in fasting healthy Chinese male volunteers. A single-dose, open-label, randomized-sequence, 2-period crossover design with a 7-day washout period between doses was used in this study. Healthy Chinese male nonsmokers aged 18 to 40 years with a body mass index (BMI) of 19 to 25 kg/m(2) were selected. Eligible volunteers were randomly assigned to receive acetylcysteine 600 mg PO as either the test formulation (3 tablets of 200 mg each) or reference formulation (1 tablet of 600 mg) under fasting conditions. A total of 15 serial blood samples were collected over a 24-hour interval, and total plasma acetylcysteine concentrations were analyzed by a validated liquid chromatography-isotopic dilution mass spectrometry method. Pharmacokinetic parameters (C(max), T(max), t(½), AUC(0-t), and AUC(0-∞)) were calculated and analyzed statistically. The 2 formulations were considered bioequivalent if the 90% CIs of the log-transformed ratios (test/reference) of C(max) and AUC were within the predetermined bioequivalence ranges (70%-143% for C(max); 80%-125% for AUC), as established by the State Food and Drug Administration of China. Tolerability was determined by vital signs, clinical laboratory tests, 12-lead ECGs, physical examinations, and interviews with the subjects about adverse events (AEs). A total of 24 healthy Chinese Han male volunteers were enrolled in and

  9. Sigma factor selectivity in Borrelia burgdorferi: RpoS recognition of the ospE/ospF/elp promoters is dependent on the sequence of the -10 region.

    Science.gov (United States)

    Eggers, Christian H; Caimano, Melissa J; Radolf, Justin D

    2006-03-01

    Members of the ospE/ospF/elp lipoprotein gene families of Borrelia burgdorferi, the Lyme disease agent, are transcriptionally upregulated in response to the influx of blood into the midgut of an infected tick. We recently have demonstrated that despite the high degree of similarity between the promoters of the ospF (P(ospF)) and ospE (P(ospE)) genes of B. burgdorferi strain 297, the differential expression of ospF is RpoS-dependent, while ospE is controlled by sigma(70). Herein we used wild-type and RpoS-deficient strains of B. burgdorferi and Escherichia coli to analyse transcriptional reporters consisting of a green fluorescent protein (gfp) gene fused to P(ospF), P(ospE), or two hybrid promoters in which the -10 regions of P(ospF) and P(ospE) were switched [P(ospF(E-10)) and P(ospE(F-10)), respectively]. We found that the P(ospF) -10 region is both necessary and sufficient for RpoS-dependent recognition in B. burgdorferi, while sigma(70) specificity for P(ospE) is dependent on elements outside of the -10 region. In E. coli, sigma factor selectivity for these promoters was much more permissive, with expression of each being primarily due to sigma(70). Alignment of the sequences upstream of each of the ospE/ospF/elp genes from B. burgdorferi strains 297 and B31 revealed that two B31 ospF paralogues [erpK (BBM38) and erpL (BBO39)] have -10 regions virtually identical to that of P(ospF). Correspondingly, expression of gfp reporters based on the erpK and erpL promoters was RpoS-dependent. Thus, the sequence of the P(ospF) -10 region appears to serve as a motif for RpoS recognition, the first described for any B. burgdorferi promoter. Taken together, our data support the notion that B. burgdorferi utilizes sequence differences at the -10 region as one mechanism for maintaining the transcriptional integrity of RpoS-dependent and -independent genes activated at the onset of tick feeding.

  10. Lineage-specific variations of congruent evolution among DNA sequences from three genomes, and relaxed selective constraints on rbcL in Cryptomonas (Cryptophyceae).

    Science.gov (United States)

    Hoef-Emden, Kerstin; Tran, Hoang-Dung; Melkonian, Michael

    2005-10-18

    Plastid-bearing cryptophytes like Cryptomonas contain four genomes in a cell, the nucleus, the nucleomorph, the plastid genome and the mitochondrial genome. Comparative phylogenetic analyses encompassing DNA sequences from three different genomes were performed on nineteen photosynthetic and four colorless Cryptomonas strains. Twenty-three rbcL genes and fourteen nuclear SSU rDNA sequences were newly sequenced to examine the impact of photosynthesis loss on codon usage in the rbcL genes, and to compare the rbcL gene phylogeny in terms of tree topology and evolutionary rates with phylogenies inferred from nuclear ribosomal DNA (concatenated SSU rDNA, ITS2 and partial LSU rDNA), and nucleomorph SSU rDNA. Largely congruent branching patterns and accelerated evolutionary rates were found in nucleomorph SSU rDNA and rbcL genes in a clade that consisted of photosynthetic and colorless species suggesting a coevolution of the two genomes. The extremely accelerated rates in the rbcL phylogeny correlated with a shift from selection to mutation drift in codon usage of two-fold degenerate NNY codons comprising the amino acids asparagine, aspartate, histidine, phenylalanine, and tyrosine. Cysteine was the sole exception. The shift in codon usage seemed to follow a gradient from early diverging photosynthetic to late diverging photosynthetic or heterotrophic taxa along the branches. In the early branching taxa, codon preferences were changed in one to two amino acids, whereas in the late diverging taxa, including the colorless strains, between four and five amino acids showed changes in codon usage. Nucleomorph and plastid gene phylogenies indicate that loss of photosynthesis in the colorless Cryptomonas strains examined in this study possibly was the result of accelerated evolutionary rates that started already in photosynthetic ancestors. Shifts in codon usage are usually considered to be caused by changes in functional constraints and in gene expression levels. Thus, the

  11. Genomic analysis of a sexually-selected character: EST sequencing and microarray analysis of eye-antennal imaginal discs in the stalk-eyed fly Teleopsis dalmanni (Diopsidae)

    Directory of Open Access Journals (Sweden)

    Wang Xianhui

    2009-08-01

    Full Text Available Background: Many species of stalk-eyed flies (Diopsidae) possess highly-exaggerated, sexually dimorphic eye-stalks that play an important role in the mating system of these flies. Eye-stalks are increasingly being used as a model system for studying sexual selection, but little is known about the genetic mechanisms producing variation in these ornamental traits. Therefore, we constructed an EST database of genes expressed in the developing eye-antennal imaginal disc of the highly dimorphic species Teleopsis dalmanni. We used this set of genes to construct microarray slides and compare patterns of gene expression between lines of flies with divergent eyespan. Results: We generated 33,229 high-quality ESTs from three non-normalized libraries made from the developing eye-stalk tissue at different developmental stages. EST assembly and annotation produced a total of 7,066 clusters comprising 3,424 unique genes with significant sequence similarity to a protein in either Drosophila melanogaster or Anopheles gambiae. Comparisons of the transcript profiles at different stages reveal a developmental shift in relative expression from genes involved in anatomical structure formation, transcription, and cell proliferation at the larval stage to genes involved in neurological processes and cuticle production during the pupal stages. Based on alignments of the EST fragments to homologous sequences in Drosophila and Anopheles, we identified 20 putative gene duplication events in T. dalmanni and numerous genes undergoing significantly faster rates of evolution in T. dalmanni relative to the other Dipteran species. Microarray experiments identified over 350 genes with significant differential expression between flies from lines selected for high and low relative eyespan but did not reveal any primary biological process or pathway that is driving the expression differences. Conclusion: The catalogue of genes identified in the EST database provides a valuable

  12. Relative bioavailability of levodropropizine 60 mg capsule and syrup formulations in healthy male Korean volunteers: a single-dose, randomized-sequence, open-label, two-way crossover study.

    Science.gov (United States)

    Jang, Jae-Won; Seo, Ji-Hyung; Jo, Min-Ho; Lee, Young-Joo; Cho, Young-Wuk; Yim, Sung-Vin; Lee, Kyung-Tae

    2013-02-01

    Levodropropizine is an oral non-opioid anti-tussive drug used in treatment of cough. A new generic 60 mg capsule formulation of levodropropizine has recently been developed. The aim of this study was to assess the pharmacokinetics and bioequivalence of the test (capsule) formulation and reference (syrup) formulation of levodropropizine (60 mg) in healthy, fasted, male Korean volunteers. This was a single-dose, randomized-sequence, open-label, 2-period crossover study conducted in healthy male Korean volunteers in the fasted state at Kyung Hee University Medical Center (Seoul, Republic of Korea). A single oral dose of the test or reference formulation was followed by a 1-week washout period, after which subjects received the alternative formulation. Blood samples were collected at 0 (predose), 0.17, 0.33, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6, 8, and 12 hours after study drug administration. Plasma concentration of levodropropizine was determined using a validated liquid chromatography tandem mass spectrometry (LC-MS/MS) method. The formulations were considered bioequivalent if the 90% CIs for C(max), AUC(0-12h) and AUC(0-∞) were within the predetermined bioequivalence range (80 - 125%, according to the guidelines of the Korea Food and Drug Administration (Korea FDA)). Tolerability was evaluated throughout the study based on vital sign measurements, laboratory analysis (blood biochemistry, hematology, hepatic function and urinalysis) and subject interviews concerning adverse events (AEs). A total of 36 male Korean subjects (mean (SD) age, 23.9 (2.4) years (range 19 - 30 years); height, 176.2 (6.1) cm (range 161 - 190 cm); weight, 69.8 (9.1) kg (range 54.0 - 92.2 kg); body mass index, 22.4 (2.1) kg/m2 (range 19.1 - 28.3 kg/m2)) were enrolled and completed the study. The mean values for C(max), t(max), AUC(0-12h), and AUC(0-∞) with the test formulation of levodropropizine were 331.51 ng/ml, 0.60 hours, 784.32 ng×h/ml, and 825.82 ng×h/ml, respectively; for the reference

  13. Acute changes of hip joint range of motion using selected clinical stretching procedures: A randomized crossover study.

    Science.gov (United States)

    Hammer, Adam M; Hammer, Roger L; Lomond, Karen V; O'Connor, Paul

    2017-09-01

    Hip adductor flexibility and strength are important components of athletic performance and many activities of daily living. Little research has been done on the acute effects of a single session of stretching on hip abduction range of motion (ROM). The aim of this study was to compare 3 clinical stretching procedures against passive static stretching and control on ROM and peak isometric maximal voluntary contraction (MVC). Using a randomized crossover study design, a total of 40 participants (20 male and 20 female) who had reduced hip adductor muscle length attended a familiarization session and 5 testing sessions on non-consecutive days. Following the warm-up and pre-intervention measures of ROM and MVC, participants were randomly assigned 1 of 3 clinical stretching procedures (modified lunge, multidirectional, and joint mobilization) or a static stretch or control condition. Post-intervention measures of ROM and MVC were taken immediately following completion of the assigned condition. An ANOVA using a repeated measure design with the change score was conducted. All interventions resulted in small but statistically significant (p stretching was greater than control (p = 0.031). These data suggest that a single session of stretching has only a minimal effect on acute changes of hip abduction ROM. Although hip abduction is a frontal plane motion, to effectively increase the extensibility of the structures that limit abduction, integrating multi-planar stretches may be indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Selepressin, a novel selective vasopressin V1A agonist, is an effective substitute for norepinephrine in a phase IIa randomized, placebo-controlled trial in septic shock patients

    DEFF Research Database (Denmark)

    Russell, James A; Vincent, Jean-Louis; Kjølbye, Anne Louise

    2017-01-01

    BACKGROUND: Vasopressin is widely used for vasopressor support in septic shock patients, but experimental evidence suggests that selective V1A agonists are superior. The initial pharmacodynamic effects, pharmacokinetics, and safety of selepressin, a novel V1A-selective vasopressin analogue......, was examined in a phase IIa trial in septic shock patients. METHODS: This was a randomized, double-blind, placebo-controlled multicenter trial in 53 patients in early septic shock (aged ≥18 years, fluid resuscitation, requiring vasopressor support) who received selepressin 1.25 ng/kg/minute (n = 10), 2.5 ng...... for selepressin 2.5 ng/kg/minute and placebo. Two patients were infused at 3.75 ng/kg/minute, one of whom had the study drug infusion discontinued for possible safety reasons, with subsequent discontinuation of this dose group. CONCLUSIONS: In septic shock patients, selepressin 2.5 ng/kg/minute was able...

  15. Comprehensive re-sequencing of adrenal aldosterone producing lesions reveal three somatic mutations near the KCNJ5 potassium channel selectivity filter.

    Directory of Open Access Journals (Sweden)

    Tobias Åkerström

    Full Text Available Aldosterone producing lesions are a common cause of hypertension, but genetic alterations for tumorigenesis have been unclear. Recently, either of two recurrent somatic missense mutations (G151R or L168R) was found in the potassium channel KCNJ5 gene in aldosterone producing adenomas. These mutations alter the channel selectivity filter and result in Na(+) conductance and cell depolarization, stimulating aldosterone production and cell proliferation. Because a similar mutation occurs in a mendelian form of primary aldosteronism, these mutations appear to be sufficient for cell proliferation and aldosterone production. The prevalence and spectrum of KCNJ5 mutations in different entities of adrenocortical lesions remain to be defined. The coding region and flanking intronic segments of KCNJ5 were subjected to Sanger DNA sequencing in 351 aldosterone producing lesions, from patients with primary aldosteronism and 130 other adrenocortical lesions. The specimens had been collected from 10 different worldwide referral centers. G151R or L168R somatic mutations were identified in 47% of aldosterone producing adenomas, each with similar frequency. A previously unreported somatic mutation near the selectivity filter, E145Q, was observed twice. Somatic G151R or L168R mutations were also found in 40% of aldosterone producing adenomas associated with marked hyperplasia, but not in specimens with merely unilateral hyperplasia. Mutations were absent in 130 non-aldosterone secreting lesions. KCNJ5 mutations were overrepresented in aldosterone producing adenomas from female compared to male patients (63 vs. 24%). Males with KCNJ5 mutations were significantly younger than those without (45 vs. 54, respectively; p<0.005) and their APAs with KCNJ5 mutations were larger than those without (27.1 mm vs. 17.1 mm; p<0.005). Either of two somatic KCNJ5 mutations are highly prevalent and specific for aldosterone producing lesions. These findings provide new insight into the

  16. Early routine versus late selective surfactant in preterm neonates with respiratory distress syndrome on nasal continuous positive airway pressure: a randomized controlled trial.

    Science.gov (United States)

    Kandraju, Hemasree; Murki, Srinivas; Subramanian, Sreeram; Gaddam, Pramod; Deorari, Ashok; Kumar, Praveen

    2013-01-01

    Preterm neonates with respiratory distress syndrome (RDS) benefit from early application of nasal continuous positive airway pressure (nCPAP). However, it is not clear whether surfactant should be administered early as a routine to all such infants or later in a selective manner. It was the aim of this study to compare the efficacy of early routine versus late selective surfactant treatment in reducing the need for mechanical ventilation (MV) during the first week of life among moderate-sized preterm infants with RDS being supported by nCPAP. Infants born at 28(0/7) to 33(6/7) weeks of gestation with RDS and on nCPAP were randomly assigned within the first 2 h of life to early routine surfactant administration by the InSurE technique (early surfactant group) or to late selective administration of surfactant (late surfactant group). The primary outcome was need for MV in the first 7 days of life. Among 153 infants randomized to early (n = 74) or late surfactant (n = 79) groups, the need for MV was significantly lower in the early surfactant group (16.2 vs. 31.6%; relative risk 0.41, 95% confidence interval 0.19-0.91). The incidence of pneumothorax (1.9 vs. 2.3%) and the need for supplemental O2 at 28 days (2.7 vs. 8.9%) were similar in the two groups. Early routine surfactant administration within 2 h of life as compared to late selective administration significantly reduced the need for MV in the first week of life among preterm infants with RDS on nCPAP. Copyright © 2012 S. Karger AG, Basel.

  17. Pharmacokinetics and bioavailability comparison of generic and branded citalopram 20 mg tablets: an open-label, randomized-sequence, two-period crossover study in healthy Chinese CYP2C19 extensive metabolizers.

    Science.gov (United States)

    Jiang, Tao; Rong, Zhengxing; Xu, Yiping; Chen, Bing; Xie, Yifan; Chen, Congying; Lu, Yang; Shen, Yifeng; Li, Huafang; Sun, Jing; Chen, Hongzhuan

    2013-01-01

    Citalopram is a selective serotonin reuptake inhibitor (SSRI) mainly prescribed to treat major depression. The aim of this study was to compare the pharmacokinetic characteristics of a new and a branded citalopram 20 mg formulation to support the marketing authorization of the test formulation in China. A single-dose, open-label, randomized-sequence, two-period crossover design was used in this study. Healthy Chinese male cytochrome P450 (CYP) 2C19 extensive metabolizers, aged 18-40 years, were eligible to participate. CYP2C19 poor metabolizers were excluded, based on genotyping of genomic DNA from blood samples. Twenty-four subjects were randomly assigned to receive the test formulation followed by the reference formulation, and then vice versa. A 2-week washout occurred between study periods. Blood samples were collected for up to 144 h post-dose. Quantification was carried out using a validated high-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) method. Pharmacokinetic parameters were calculated and analysed statistically. The two formulations were considered pharmacokinetically equivalent if the 90 % confidence intervals (CIs) of the log-transformed ratios (test/reference) of the maximum plasma concentration (C(max)), area under the plasma concentration-time curve from time zero to the last measurable concentration (AUC(last)), and area under the plasma concentration-time curve from time zero to infinity (AUC(∞)) were within the predetermined acceptance range (70-143 % for C(max); 80-125 % for AUC(last) and AUC(∞)) according to China State Food and Drug Administration bioequivalence guidelines. Tolerability was monitored by clinical assessment, vital signs, laboratory analysis and interviews with participants about adverse events. A total of 24 participants, with a mean (SD) age of 26 (3) years (range 22-32 years), body weight of 65.2 (5.0) kg (range 53-73 kg), and height of 172.7 (4.9) cm (range 159-182 cm), were enrolled in this
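
    The acceptance rule described in this abstract (each 90% CI of the test/reference ratio must lie entirely inside its predetermined range) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary keys, and the idea of passing intervals as percentages are hypothetical and not taken from the study. Only the two ranges (70-143% for C(max), 80-125% for AUC(last) and AUC(∞)) come from the abstract.

```python
# Acceptance ranges in percent, as quoted in the abstract.
# The key names are hypothetical labels chosen for this sketch.
ACCEPTANCE = {
    "Cmax": (70.0, 143.0),
    "AUC_last": (80.0, 125.0),
    "AUC_inf": (80.0, 125.0),
}


def within_acceptance_range(ci_low, ci_high, lower, upper):
    """Return True if the whole interval [ci_low, ci_high] lies inside [lower, upper]."""
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    return lower <= ci_low and ci_high <= upper


def is_equivalent(cis):
    """Check every parameter's 90% CI (a (low, high) tuple in percent) against its range."""
    return all(
        within_acceptance_range(*cis[name], *ACCEPTANCE[name])
        for name in ACCEPTANCE
    )
```

    A caller would supply the back-transformed confidence limits for each parameter. Note that an interval is rejected as soon as either limit falls outside the range, even when the point estimate itself is inside it.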

  18. Does Multimodal Analgesia with Acetaminophen, Nonsteroidal Antiinflammatory Drugs, or Selective Cyclooxygenase-2 Inhibitors and Patient-controlled Analgesia Morphine Offer Advantages over Morphine Alone?: Meta-analyses of Randomized Trials

    National Research Council Canada - National Science Library

    Elia, Nadia; Lysakowski, Christopher; Tramèr, Martin R

    2005-01-01

    The authors analyzed data from 52 randomized placebo-controlled trials (4,893 adults) testing acetaminophen, nonsteroidal antiinflammatory drugs, or selective cyclooxygenase-2 inhibitors given in conjunction with morphine after surgery...

  19. Population structure of Atlantic Mackerel inferred from RAD-seq derived SNP markers: effects of sequence clustering parameters and hierarchical SNP selection

    KAUST Repository

    Rodríguez-Ezpeleta, Naiara

    2016-03-03

    Restriction-site associated DNA sequencing (RAD-seq) and related methods are revolutionizing the field of population genomics in non-model organisms as they allow generating an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet, RAD-seq data analyses rely on assumptions on nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under- or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as case study, here we explore the sensitivity of population structure inferences to two crucial aspects in RAD-seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel, but, most importantly, provides insights into the effects of alternative RAD-seq data analysis strategies on population structure inferences that are directly applicable to other species.

  20. Physical self-concept changes in a selective sport high school: a longitudinal cohort-sequence analysis of the big-fish-little-pond effect.

    Science.gov (United States)

    Marsh, Herbert W; Morin, Alexandre J; Parker, Philip D

    2015-04-01

    Elite athletes and nonathletes (N = 1,268) attending the same selective sport high school (4 high school age cohorts, grades 7-10, mean ages varying from 10.9 to 14.1) completed the same physical self-concept instrument 4 times over a 2-year period (multiple waves). We introduce a latent cohort-sequence analysis that provides a stronger basis for assessing developmental stability/change than either cross-sectional (multicohort, single occasion) or longitudinal (single-cohort, multiple occasion) designs, allowing us to evaluate latent means across 10 waves spanning a 5-year period (grades 7-11), although each participant contributed data for only 4 waves, spanning 2 of the 5 years. Consistent with the frame-of-reference effects embodied in the big-fish-little-pond effect (BFLPE), physical self-concepts at the start of high school were much higher for elite athletes than for nonathlete classmates, but the differences declined over time so that by the end of high school there were no differences in the 2 groups. Gender differences in favor of males had a negative linear and quadratic trajectory over time, but the consistently smaller gender differences for athletes than for nonathletes did not vary with time.