WorldWideScience

Sample records for base sequence effects

  1. Main-Sequence Effective Temperatures from a Revised Mass-Luminosity Relation Based on Accurate Properties

    CERN Document Server

    Eker, Z; Soydugan, E; Bilir, S; Gokce, E Yaz; Steer, I; Tuysuz, M; Senyuz, T; Demircan, O

    2015-01-01

    The mass-luminosity (M-L), mass-radius (M-R) and mass-effective temperature ($M-T_{eff}$) diagrams for a subset of nearby galactic main-sequence stars with masses and radii accurate to $\leq 3\%$ and luminosities accurate to $\leq 30\%$ (268 stars) have led to a putative discovery. Four distinct mass domains have been identified, which we have tentatively associated with low, intermediate, high, and very high mass main-sequence stars, but which nevertheless are clearly separated by three distinct break points at 1.05, 2.4, and 7 $M_{\odot}$ within the studied mass range of $0.38-32 M_{\odot}$. Further, a revised mass-luminosity relation (MLR) is found based on linear fits for each of the mass domains identified. The revised, mass-domain based MLRs, which are classical ($L \propto M^{\alpha}$), are shown to be preferable to a single linear, quadratic or cubic equation as an alternative MLR. Stellar radius evolution within the main-sequence for stars with $M>1M_{\odot}$ is clearly evident on the M-R d...
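
    As a rough illustration of the mass domains described in this record, the sketch below assigns a stellar mass (in solar masses) to one of the four domains using the three quoted break points. The function name and the choice to place a star sitting exactly on a break point in the upper domain are assumptions made for the example; the fitted exponents of the per-domain MLRs are not given in the abstract and are deliberately left out.

```python
import bisect

# Break points (solar masses) and domain labels quoted in the abstract.
BREAK_POINTS = (1.05, 2.4, 7.0)
DOMAINS = ("low", "intermediate", "high", "very high")
MASS_RANGE = (0.38, 32.0)  # mass range studied, per the abstract


def mass_domain(mass):
    """Return the mass-domain label for a main-sequence star.

    Assumption: a mass exactly equal to a break point is assigned to the
    upper domain; the abstract does not say which side it belongs to.
    """
    low, high = MASS_RANGE
    if not low <= mass <= high:
        raise ValueError("mass lies outside the range studied")
    return DOMAINS[bisect.bisect_right(BREAK_POINTS, mass)]
```

    For example, `mass_domain(2.0)` falls between the first two break points and is labelled "intermediate".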

  2. Studies of base pair sequence effects on DNA solvation based on all-atom molecular dynamics simulations

    Indian Academy of Sciences (India)

    Surjit B Dixit; Mihaly Mezei; David L Beveridge

    2012-07-01

    Detailed analyses of the sequence-dependent solvation and ion atmosphere of DNA are presented based on molecular dynamics (MD) simulations on all the 136 unique tetranucleotide steps obtained by the ABC consortium using the AMBER suite of programs. Significant sequence effects on solvation and ion localization were observed in these simulations. The results were compared to essentially all known experimental data on the subject. Proximity analysis was employed to highlight the sequence-dependent differences in solvation and ion localization properties in the grooves of DNA. Comparison of the MD-calculated DNA structure with canonical A- and B-forms supports the idea that the G/C-rich sequences are closer to canonical A- than B-form structures, while the reverse is true for the poly A sequences, with the exception of the alternating ATAT sequence. Analysis of hydration density maps reveals that the flexibility of the solute molecule has a significant effect on the nature of observed hydration. Energetic analysis of solute–solvent interactions based on proximity analysis of solvent reveals that the GC or CG base pairs interact more strongly with water molecules in the minor groove of DNA than the AT or TA base pairs, while the interactions of the AT or TA pairs in the major groove are stronger than those of the GC or CG pairs. Computation of solvent-accessible surface area of the nucleotide units in the simulated trajectories reveals that the similarity with results derived from analysis of a database of crystallographic structures is excellent. The MD trajectories tend to follow Manning’s counterion condensation theory, presenting a region of condensed counterions within a radius of about 17 Å from the DNA surface independent of sequence. The GC and CG pairs tend to associate with cations in the major groove of the DNA structure to a greater extent than the AT and TA pairs. Cation association is more frequent in the minor groove of AT than the GC pairs. In general

  3. Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies.

    Directory of Open Access Journals (Sweden)

    Patrick D Schloss

    The advent of next generation sequencing has coincided with a growth in interest in using these approaches to better understand the role of the structure and function of the microbial communities in human, animal, and environmental health. Yet, use of next generation sequencing to perform 16S rRNA gene sequence surveys has resulted in considerable controversy surrounding the effects of sequencing errors on downstream analyses. We analyzed 2.7×10^6 reads distributed among 90 identical mock community samples, which were collections of genomic DNA from 21 different species with known 16S rRNA gene sequences; we observed an average error rate of 0.0060. To improve this error rate, we evaluated numerous methods of identifying bad sequence reads, identifying regions within reads of poor quality, and correcting base calls and were able to reduce the overall error rate to 0.0002. Implementation of the PyroNoise algorithm provided the best combination of error rate, sequence length, and number of sequences. Perhaps more problematic than sequencing errors was the presence of chimeras generated during PCR. Because we knew the true sequences within the mock community and the chimeras they could form, we identified 8% of the raw sequence reads as chimeric. After quality filtering the raw sequences and using the Uchime chimera detection program, the overall chimera rate decreased to 1%. The chimeras that could not be detected were largely responsible for the identification of spurious operational taxonomic units (OTUs) and genus-level phylotypes. The number of spurious OTUs and phylotypes increased with sequencing effort, indicating that comparison of communities should be made using an equal number of sequences. Finally, we applied our improved quality-filtering pipeline to several benchmarking studies and observed that even with our stringent data curation pipeline, biases in the data generation pipeline and batch effects were observed that could potentially
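
    Because a mock community has known reference sequences, an error rate of the kind quoted in this record can be computed by comparing each read with its reference. The sketch below is a minimal illustration, assuming reads have already been aligned to equal length against the matching reference; the function names are hypothetical and this is not the pipeline used in the study.

```python
def error_rate(read, reference):
    """Fraction of aligned positions at which the read differs from the reference."""
    if len(read) != len(reference):
        raise ValueError("read and reference must be aligned to the same length")
    if not read:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for r, t in zip(read, reference) if r != t)
    return mismatches / len(read)


def overall_error_rate(pairs):
    """Pooled error rate over (read, reference) pairs: total mismatches / total bases."""
    total_mismatches = 0
    total_bases = 0
    for read, reference in pairs:
        total_mismatches += round(error_rate(read, reference) * len(read))
        total_bases += len(read)
    return total_mismatches / total_bases
```

    Pooling mismatches over all bases, rather than averaging per-read rates, keeps long reads from being under-weighted; which of the two the study reports is not stated in the abstract.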

  4. Genotyping-in-Thousands by sequencing (GT-seq): A cost effective SNP genotyping method based on custom amplicon sequencing.

    Science.gov (United States)

    Campbell, Nathan R; Harmon, Stephanie A; Narum, Shawn R

    2015-07-01

    Genotyping-in-Thousands by sequencing (GT-seq) is a method that uses next-generation sequencing of multiplexed PCR products to generate genotypes from relatively small panels (50-500) of targeted single-nucleotide polymorphisms (SNPs) for thousands of individuals in a single Illumina HiSeq lane. This method uses only unlabelled oligos and PCR master mix in two thermal cycling steps for amplification of targeted SNP loci. During this process, sequencing adapters and dual barcode sequence tags are incorporated into the amplicons, enabling thousands of individuals to be pooled into a single sequencing library. Post sequencing, reads from individual samples are split into individual files using their unique combination of barcode sequences. Genotyping is performed with a simple perl script which counts amplicon-specific sequences for each allele, and allele ratios are used to determine the genotypes. We demonstrate this technique by genotyping 2068 individual steelhead trout (Oncorhynchus mykiss) samples with a set of 192 SNP markers in a single library sequenced in a single Illumina HiSeq lane. Genotype data were 99.9% concordant to previously collected TaqMan™ genotypes at the same 192 loci, but call rates were slightly lower with GT-seq (96.4%) relative to TaqMan (99.0%). Of the 192 SNPs, 187 were genotyped in ≥90% of the individual samples and only 3 SNPs were genotyped in <70% of samples. This study demonstrates that amplicon sequencing with GT-seq greatly reduces the cost of genotyping hundreds of targeted SNPs relative to existing methods by utilizing a simple library preparation method and massive efficiency of scale.
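
    The counting-and-ratio idea described in this record can be sketched as follows. This is an illustrative Python version, not the authors' perl script: the probe sequences, the minimum read depth, and the allele-fraction thresholds are all hypothetical values chosen for the example.

```python
def count_alleles(reads, probe_a1, probe_a2):
    """Count reads containing the allele-1 or allele-2 specific probe sequence."""
    count_a1 = sum(1 for read in reads if probe_a1 in read)
    count_a2 = sum(1 for read in reads if probe_a2 in read)
    return count_a1, count_a2


def call_genotype(count_a1, count_a2, min_reads=10,
                  hom_fraction=0.9, het_range=(0.3, 0.7)):
    """Call a genotype from allele counts; return None when no call is made.

    All thresholds are assumptions for illustration only.
    """
    total = count_a1 + count_a2
    if total < min_reads:
        return None  # too few reads to call
    fraction_a1 = count_a1 / total
    if fraction_a1 >= hom_fraction:
        return "A1A1"
    if fraction_a1 <= 1 - hom_fraction:
        return "A2A2"
    if het_range[0] <= fraction_a1 <= het_range[1]:
        return "A1A2"
    return None  # ambiguous allele ratio
```

    Leaving low-depth and ambiguous-ratio samples uncalled is one plausible reason a ratio-based method could show a lower call rate than a probe-based assay, though the abstract does not attribute the difference to any specific cause.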

  5. Effective Simulation of Quantum Entanglement Based on Classical Fields Modulated with Pseudorandom Phase Sequences

    CERN Document Server

    Fu, Jian; Xu, Yingying; Dong, Hongtao

    2010-01-01

    We demonstrate that n classical fields modulated with n different pseudorandom phase sequences can constitute a 2^n-dimensional Hilbert space that contains tensor product structure. By using classical fields modulated with pseudorandom phase sequences, we discuss effective simulation of Bell states and the GHZ state, and apply both correlation analysis and von Neumann entropy to characterize the simulation. We obtain results similar to those in quantum mechanics and find that the conclusions can be easily generalized to n quantum particles. The research on simulation of quantum entanglement may be important, for it not only provides useful insights into fundamental features of quantum entanglement, but also yields new insights into quantum computation.

  6. Effect of base sequence on the DNA cross-linking properties of pyrrolobenzodiazepine (PBD) dimers.

    Science.gov (United States)

    Rahman, Khondaker M; James, Colin H; Thurston, David E

    2011-07-01

    Pyrrolo[2,1-c][1,4]benzodiazepine (PBD) dimers are synthetic sequence-selective DNA minor-groove cross-linking agents that possess two electrophilic imine moieties (or their equivalent) capable of forming covalent aminal linkages with guanine C2-NH2 functionalities. The PBD dimer SJG-136, which has a C8-O-(CH2)3-O-C8' central linker joining the two PBD moieties, is currently undergoing phase II clinical trials and current research is focused on developing analogues of SJG-136 with different linker lengths and substitution patterns. Using a reversed-phase ion pair HPLC/MS method to evaluate interaction with oligonucleotides of varying length and sequence, we recently reported (JACS, 2009, 131, 13756) that SJG-136 can form three different types of adducts: inter- and intrastrand cross-linked adducts, and mono-alkylated adducts. These studies have now been extended to include PBD dimers with a longer central linker (C8-O-(CH2)5-O-C8'), demonstrating that the type and distribution of adducts appear to depend on (i) the length of the C8/C8'-linker connecting the two PBD units, (ii) the positioning of the two reactive guanine bases on the same or opposite strands, and (iii) their separation (i.e. the number of base pairs, usually ATs, between them). Based on these data, a set of rules is emerging that can be used to predict the DNA-interaction behaviour of a PBD dimer of particular C8-C8' linker length towards a given DNA sequence. These observations suggest that it may be possible to design PBD dimers to target specific DNA sequences.

  7. Effects of Sequenced Kodaly Literacy-Based Music Instruction on the Spatial Reasoning Skills of Kindergarten Students

    Science.gov (United States)

    Hanson, Marlene

    2003-01-01

    This study was an investigation of the effects of sequenced Kodaly literacy-based music instruction on the spatial reasoning skills of kindergarten students. Subjects in the pretest-posttest control group design were 54 kindergarten students who were enrolled in three kindergarten classes in a rural elementary community school. Experimental group…

  8. Molecular simulations of polycation-DNA binding exploring the effect of peptide chemistry and sequence in nuclear localization sequence based polycations.

    Science.gov (United States)

    Elder, Robert M; Jayaraman, Arthi

    2013-10-10

    Gene therapy relies on the delivery of DNA into cells, and polycations are one class of vectors enabling efficient DNA delivery. Nuclear localization sequences (NLS), cationic oligopeptides that target molecules for nuclear entry, can be incorporated into polycations to improve their gene delivery efficiency. We use simulations to study the effect of peptide chemistry and sequence on the DNA-binding behavior of NLS-grafted polycations by systematically mutating the residues in the grafts, which are based on the SV40 NLS (peptide sequence PKKKRKV). Replacing arginine (R) with lysine (K) reduces binding strength by eliminating arginine-DNA interactions, but placing R in a less hindered location (e.g., farther from the grafting point to the polycation backbone) has surprisingly little effect on polycation-DNA binding strength. Changing the positions of the hydrophobic proline (P) and valine (V) residues relative to the polycation backbone changes hydrophobic aggregation within the polycation and, consequently, changes the conformational entropy loss that occurs upon polycation-DNA binding. Since conformational entropy loss affects the free energy of binding, the positions of P and V in the grafts affect DNA binding affinity. The insight from this work guides synthesis of polycations with tailored DNA binding affinity and, in turn, efficient DNA delivery.

  9. Classification of Base Sequences BS(n+1,n)

    Directory of Open Access Journals (Sweden)

    Dragomir Ž. Ðoković

    2010-01-01

    Base sequences BS(n+1,n) are quadruples of {±1}-sequences (A;B;C;D), with A and B of length n+1 and C and D of length n, such that the sum of their nonperiodic autocorrelation functions is a δ-function. The base sequence conjecture, asserting that BS(n+1,n) exist for all n, is stronger than the famous Hadamard matrix conjecture. We introduce a new definition of equivalence for base sequences BS(n+1,n) and construct a canonical form. By using this canonical form, we have enumerated the equivalence classes of BS(n+1,n) for n ≤ 30. As the number of equivalence classes grows rapidly (but not monotonically) with n, the tables in the paper cover only the cases n ≤ 13.
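
    The defining property of base sequences can be checked directly: for every nonzero shift, the nonperiodic autocorrelations of the four {±1}-sequences must sum to zero. The sketch below is a minimal checker under that definition; the function names are chosen for the example, and it only tests the autocorrelation condition for given lengths rather than searching for sequences.

```python
def nonperiodic_autocorrelation(seq, shift):
    """Sum of seq[i] * seq[i + shift] over all valid i (zero if shift >= len(seq))."""
    return sum(seq[i] * seq[i + shift] for i in range(len(seq) - shift))


def is_base_sequence(a, b, c, d):
    """Check that (a; b; c; d) has a delta-function autocorrelation sum.

    a and b must share one length, c and d another; all entries must be +1 or -1.
    """
    if len(a) != len(b) or len(c) != len(d):
        return False
    if any(x not in (1, -1) for seq in (a, b, c, d) for x in seq):
        return False
    longest = max(len(a), len(c))
    return all(
        sum(nonperiodic_autocorrelation(seq, shift) for seq in (a, b, c, d)) == 0
        for shift in range(1, longest)
    )
```

    The smallest case, BS(2,1), is satisfied by A = (1, 1), B = (1, -1), C = (1), D = (1): the only nonzero shift is 1, where A contributes +1 and B contributes -1.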

  10. Sequence-based Association Analysis Reveals an MGST1 eQTL with Pleiotropic Effects on Bovine Milk Composition

    Science.gov (United States)

    Littlejohn, Mathew D.; Tiplady, Kathryn; Fink, Tania A.; Lehnert, Klaus; Lopdell, Thomas; Johnson, Thomas; Couldrey, Christine; Keehan, Mike; Sherlock, Richard G.; Harland, Chad; Scott, Andrew; Snell, Russell G.; Davis, Stephen R.; Spelman, Richard J.

    2016-01-01

    The mammary gland is a prolific lipogenic organ, synthesising copious amounts of triglycerides for secretion into milk. The fat content of milk varies widely both between and within species, and recent independent genome-wide association studies have highlighted a milk fat percentage quantitative trait locus (QTL) of large effect on bovine chromosome 5. Although both EPS8 and MGST1 have been proposed to underlie these signals, the causative status of these genes has not been functionally confirmed. To investigate this QTL in detail, we report genome sequence-based imputation and association mapping in a population of 64,244 taurine cattle. This analysis reveals a cluster of 17 non-coding variants spanning MGST1 that are highly associated with milk fat percentage, and a range of other milk composition traits. Further, we exploit a high-depth mammary RNA sequence dataset to conduct expression QTL (eQTL) mapping in 375 lactating cows, revealing a strong MGST1 eQTL underpinning these effects. These data demonstrate the utility of DNA and RNA sequence-based association mapping, and implicate MGST1, a gene with no obvious mechanistic relationship to milk composition regulation, as causally involved in these processes. PMID:27146958

  11. On the base sequence conjecture

    CERN Document Server

    Djokovic, Dragomir Z

    2010-01-01

    Let BS(m,n) denote the set of base sequences (A;B;C;D), with A and B of length m and C and D of length n. The base sequence conjecture (BSC) asserts that BS(n+1,n) exist (i.e., are non-empty) for all n. This is known to be true for n <= 36 and when n is a Golay number. We show that it is also true for n=37 and n=38. It is worth pointing out that BSC is stronger than the famous Hadamard matrix conjecture. In order to demonstrate the abundance of base sequences, we have previously attached to BS(n+1,n) a graph Gamma_n and computed the Gamma_n for n <= 27. We now extend these computations and determine the Gamma_n for n=28,...,35. We also propose a conjecture describing these graphs in general.

  12. Field effect in graphene-based van der Waals heterostructures: Stacking sequence matters.

    Science.gov (United States)

    Stradi, Daniele; Papior, Nick; Hansen, Ole; Brandbyge, Mads

    2017-03-06

    Stacked van der Waals (vdW) heterostructures where semi-conducting two-dimensional (2D) materials are contacted by overlaid graphene electrodes enable atomically-thin, flexible electronics. We use first-principles quantum transport simulations of graphene-contacted MoS2 devices to show how the transistor effect critically depends on the stacking configuration relative to the gate electrode. We can trace this behavior to the stacking-dependent response of the contact region to the capacitive electric field induced by the gate. The contact resistance is a central parameter and our observation establishes an important design rule for ultra-thin devices based on 2D atomic crystals.

  13. The Effect of Haptic Cues on Motor and Perceptual Based Implicit Sequence Learning

    Directory of Open Access Journals (Sweden)

    Dongwon Kim

    2014-03-01

    We introduced haptic cues to the serial reaction time (SRT) sequence learning task alongside the standard visual cues to assess the relative contributions of haptic and visual stimuli to the formation of motor and perceptual memories. We used motorized keys to deliver brief pulse-like displacements to the resting fingers, expecting that the proximity and similarity of these cues to the subsequent response motor actions (finger-activated key presses) would strengthen the motor memory trace in particular. We adopted the experimental protocol developed by Willingham in 1999 to explore whether haptic cues contribute differently than visual cues to the balance of motor and perceptual learning. We found that sequence learning occurs with haptic stimuli as with visual stimuli, and that irrespective of the stimuli (visual or haptic), the serial reaction time task leads to a greater amount of motor learning than perceptual learning.

  14. Effect of selection and sequencing of representative wave conditions on process-based predictions of equilibrium embayed beach morphology

    Science.gov (United States)

    Daly, Christopher J.; Bryan, Karin R.; Gonzalez, Mauricio R.; Klein, Antonio H. F.; Winter, Christian

    2014-06-01

    In order to decrease the simulation time of morphodynamic models, often-complex wave climates are reduced to a few representative wave conditions (RWC). When applied to embayed beaches, a test of whether a reduced wave climate is representative or not is to see whether it can recreate the observed equilibrium (long-term averaged) bathymetry of the bay. In this study, the wave climate experienced at Milagro Beach, Tarragona, Spain was discretized into 'average' and 'extreme' RWCs. Process-based morphodynamic simulations were sequenced and merged based on 'persistent' and 'transient' forcing conditions, the results of which were used to estimate the equilibrium bathymetry of the bay. Results show that extreme wave events appeared to have less influence on the equilibrium of the bay than average conditions of longer overall duration. Additionally, the persistent seasonal variation of the wave climate produces pronounced beach rotation and tends to accumulate sediment at the extremities of the beach, rather than in the central sections. It is, therefore, important to account for directional variability and persistence in the selection and sequencing of representative wave conditions, as it is essential for accurately balancing the effects of beach rotation events.

  15. EFFECT OF SEQUENCE STRUCTURE ON THE THERMOTROPIC LIQUID CRYSTALLINE PROPERTIES OF POLYESTERAMIDES BASED ON DIMETHYLBENZIDINE, BISPHENOL-A AND p-TEREPHTHALYL CHLORIDE

    Institute of Scientific and Technical Information of China (English)

    1998-01-01

    A series of thermotropic liquid crystalline polyesteramides (PEAs) with different sequence structures based on dimethylbenzidine (DMBD), bisphenol-A (BPA) and p-terephthalyl chloride (TPC) was synthesized by changing the feeding order of monomers in a low temperature solution polycondensation system. The sequence structure parameters were measured by means of NMR and a computer program. The effect of sequence structure on the liquid crystalline phase transition temperature of the PEAs obtained was investigated.

  16. cis sequence effects on gene expression

    Directory of Open Access Journals (Sweden)

    Jacobs Kevin

    2007-08-01

    Background: Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results: We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion: Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.

  17. Effective inhibition of human cytomegalovirus gene expression by DNA-based external guide sequences

    Institute of Scientific and Technical Information of China (English)

    Zhifeng Zeng; Hongjian Li; Yueqing Li; Yanwei Cui; Qi Zhou; Yi Zou; Guang Yang; Tianhong Zhou

    2009-01-01

    To investigate whether 12-nucleotide DNA-based miniEGSs can silence the expression of the human cytomegalovirus (HCMV) UL49 gene efficiently, a HeLa cell line stably expressing the UL49 gene was constructed and the putative miniEGSs (UL49-miniEGSs) were assayed in the stable cell line. Quantitative RT-PCR and western blot results showed a reduction of 67% in UL49 expression level in HeLa cells that were transfected with UL49-miniEGSs. It was significantly different from that of mock and control miniEGSs (TK-miniEGSs), which were 1 and 7%, respectively. To further confirm the gene silencing directed by UL49-miniEGSs with human RNase P, a mutant of UL49-miniEGSs was constructed and a modified 5' RACE was carried out. Data showed that the inhibition of UL49 gene expression directed by UL49-miniEGSs was RNase P-dependent and the cleavage of UL49 mRNA by RNase P was site-specific. As a result, the length of DNA-based miniEGSs that could silence gene expression efficiently was only 12 nt. That is significantly less than any other oligonucleotide-based method of gene inactivation known so far. MiniEGSs may represent novel gene-targeting agents for the inhibition of viral genes and other human disease-related gene expression.

  18. Next-generation sequencing-based genome diagnostics across clinical genetics centers : implementation choices and their effects

    NARCIS (Netherlands)

    Vrijenhoek, Terry; Kraaijeveld, Ken; Elferink, Martin; de Ligt, Joep; Kranendonk, Elcke; Santen, Gijs; Nijman, Isaac J.; Butler, Derek; Claes, Godelieve; Costessi, Adalberto; Dorlijn, Wim; van Eyndhoven, Winfried; Halley, Dicky J. J.; van den Hout, Mirjam C. G. N.; van Hove, Steven; Johansson, Lennart F.; Jongbloed, Jan D. H.; Kamps, Rick; Kockx, Christel E. M.; de Koning, Bart; Kriek, Marjolein; Deprez, Ronald Lekanne Dit; Lunstroo, Hans; Mannens, Marcel; Mook, Olaf R.; Nelen, Marcel; Ploem, Corrette; Rijnen, Marco; Saris, Jasper J.; Sinke, Richard; Sistermans, Erik; van Slegtenhorst, Marjon; Sleutels, Frank; van der Stoep, Nienke; van Tienhoven, Marianne; Vermaat, Martijn; Vogel, Maartje; Waisfisz, Quinten; Weiss, Janneke Marjan; van den Wijngaard, Arthur; van Workum, Wilbert; Ijntema, Helger; van der Zwaag, Bert; van IJcken, Wilfred F. J.; den Dunnen, Johan T.; Veltman, Joris A.; Hennekam, Raoul; Cuppen, Edwin

    2015-01-01

    Implementation of next-generation DNA sequencing (NGS) technology into routine diagnostic genome care requires strategic choices. Instead of theoretical discussions on the consequences of such choices, we compared NGS-based diagnostic practices in eight clinical genetic centers in the Netherlands, b

  19. Effect of Addition Sequence during Neutralization and Precipitation on Iron-based Catalysts for High Temperature Shift Reaction

    Institute of Scientific and Technical Information of China (English)

    Li Wei; Zhu Jianhua; Mou Zhanjun

    2007-01-01

    The preparation of the iron-based catalysts promoted by cobalt with a small amount of copper and aluminum for the high temperature shift reaction (HTS) with different sequences of adding catalyst raw materials during neutralization and precipitation was investigated. XRD, BET and particle size distribution (PSD) were used to characterize the prepared catalysts. It was found that the catalyst crystals were all γ-Fe2O3, and the intermediate of the catalyst after aging was Fe3O4. The crystallographic form of the catalyst and its intermediate was not affected by the addition sequence in the neutralization and precipitation process. The results showed that the specific surface area and the particle size of the catalysts depended on the addition sequence to the mother liquor. Cobalt with a small amount of copper and aluminum could increase the specific surface area and decrease the particle size of catalysts.

  20. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.;

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  1. The effects of alignment quality, distance calculation method, sequence filtering, and region on the analysis of 16S rRNA gene-based studies.

    Directory of Open Access Journals (Sweden)

    Patrick D Schloss

    Pyrosequencing of PCR-amplified fragments that target variable regions within the 16S rRNA gene has quickly become a powerful method for analyzing the membership and structure of microbial communities. This approach has revealed and introduced questions that were not fully appreciated by those carrying out traditional Sanger sequencing-based methods. These include the effects of alignment quality, the best method of calculating pairwise genetic distances for 16S rRNA genes, whether it is appropriate to filter variable regions, and how the choice of variable region relates to the genetic diversity observed in full-length sequences. I used a diverse collection of 13,501 high-quality full-length sequences to assess each of these questions. First, alignment quality had a significant impact on distance values and downstream analyses. Specifically, the greengenes alignment, which does a poor job of aligning variable regions, predicted higher genetic diversity, richness, and phylogenetic diversity than the SILVA and RDP-based alignments. Second, the effect of different gap treatments in determining pairwise genetic distances was strongly affected by the variation in sequence length for a region; however, the effect of different calculation methods was subtle when determining the sample's richness or phylogenetic diversity for a region. Third, applying a sequence mask to remove variable positions had a profound impact on genetic distances by muting the observed richness and phylogenetic diversity. Finally, the genetic distances calculated for each of the variable regions did a poor job of correlating with the full-length gene. Thus, while it is tempting to apply traditional cutoff levels derived for full-length sequences to these shorter sequences, it is not advisable. Analysis of beta-diversity metrics showed that each of these factors can have a significant impact on the comparison of community membership and structure. Taken together, these results
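
    To make the point about gap treatments concrete, the sketch below computes an uncorrected pairwise distance between two aligned sequences under two hypothetical treatments: ignoring any column that contains a gap, or counting every gapped column as a mismatch. The mode names and the function are assumptions for illustration; the abstract does not specify which treatments were compared.

```python
GAP = "-"


def pairwise_distance(seq_a, seq_b, gap_mode="ignore"):
    """Uncorrected distance (mismatches / compared columns) for aligned sequences.

    gap_mode="ignore": columns with a gap in either sequence are skipped.
    gap_mode="each":   a gap opposite a base counts as one mismatch.
    Columns where both sequences have a gap are always skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    if gap_mode not in ("ignore", "each"):
        raise ValueError("unknown gap_mode")
    mismatches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == GAP and b == GAP:
            continue
        if a == GAP or b == GAP:
            if gap_mode == "ignore":
                continue
            mismatches += 1
        elif a != b:
            mismatches += 1
        compared += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return mismatches / compared
```

    For the aligned pair `AC-GT` and `ACTGA`, ignoring gaps gives 1 mismatch in 4 columns (0.25), while counting the gap gives 2 in 5 (0.4), which shows how length variation within a region can change the distances.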

  2. A novel sequence-based antigenic distance measure for H1N1, with application to vaccine effectiveness and the selection of vaccine strains

    Science.gov (United States)

    Pan, Keyao; Subieta, Krystina C.; Deem, Michael W.

    2011-01-01

    H1N1 influenza causes substantial seasonal illness and was the subtype of the 2009 influenza pandemic. Precise measures of antigenic distance between the vaccine and circulating virus strains help researchers design influenza vaccines with high vaccine effectiveness. We here introduce a sequence-based method to predict vaccine effectiveness in humans. Historical epidemiological data show that this sequence-based method is as predictive of vaccine effectiveness as hemagglutination inhibition assay data from ferret animal model studies. Interestingly, the expected vaccine effectiveness is greater against H1N1 than H3N2, suggesting a stronger immune response against H1N1 than H3N2. The evolution rate of hemagglutinin in H1N1 is also shown to be greater than that in H3N2, presumably due to greater immune selection pressure. PMID:21123189

  3. NGS-based deep bisulfite sequencing.

    Science.gov (United States)

    Lee, Suman; Kim, Joomyeong

    2016-01-01

    We have developed an NGS-based deep bisulfite sequencing protocol for the DNA methylation analysis of genomes. This approach allows the rapid and efficient construction of NGS-ready libraries with a large number of PCR products that have been individually amplified from bisulfite-converted DNA. This approach also employs a bioinformatics strategy to sort the raw sequence reads generated from NGS platforms and subsequently to derive DNA methylation levels for individual loci. The results demonstrated that this NGS-based deep bisulfite sequencing approach provides not only DNA methylation levels but also informative DNA methylation patterns that have not been seen through other existing methods.
    • This protocol provides an efficient method for generating NGS-ready libraries from individually amplified PCR products.
    • This protocol provides a bioinformatics strategy for sorting NGS-derived raw sequence reads.
    • This protocol provides deep bisulfite sequencing results that can measure DNA methylation levels and patterns of individual loci.
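
    Bisulfite treatment converts unmethylated cytosines so that they read as T, while methylated cytosines still read as C, so a per-site methylation level can be estimated as C / (C + T) across reads. The sketch below illustrates that calculation and a per-read methylation pattern; it is not the protocol's own bioinformatics pipeline, and the function names and the assumption that reads come from the converted top strand are hypothetical.

```python
def methylation_level(bases_at_site):
    """Estimate methylation at one cytosine from the bases observed across reads.

    Returns C / (C + T), or None if no informative read covers the site.
    """
    methylated = bases_at_site.count("C")
    unmethylated = bases_at_site.count("T")
    informative = methylated + unmethylated
    if informative == 0:
        return None
    return methylated / informative


def methylation_pattern(read, cytosine_positions):
    """Encode one read as M (methylated), U (unmethylated) or ? per cytosine site."""
    symbols = {"C": "M", "T": "U"}
    return "".join(symbols.get(read[pos], "?") for pos in cytosine_positions)
```

    The level summarizes a locus as one number, whereas the pattern strings keep the per-read combination of sites, which is the kind of information the abstract refers to as methylation patterns.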

  4. Deep Illumina-based shotgun sequencing reveals dietary effects on the structure and function of the fecal microbiome of growing kittens.

    Directory of Open Access Journals (Sweden)

    Oliver Deusch

    Previously, we demonstrated that dietary protein:carbohydrate ratio dramatically affects the fecal microbial taxonomic structure of kittens using targeted 16S gene sequencing. The present study, using the same fecal samples, applied deep Illumina shotgun sequencing to identify the diet-associated functional potential and analyze taxonomic changes of the feline fecal microbiome. Fecal samples from kittens fed one of two diets differing in protein and carbohydrate content (high-protein, low-carbohydrate, HPLC; and moderate-protein, moderate-carbohydrate, MPMC) were collected at 8, 12 and 16 weeks of age (n = 6 per group). A total of 345.3 gigabases of sequence were generated from 36 samples, with 99.75% of annotated sequences identified as bacterial. At the genus level, 26% and 39% of reads were annotated for HPLC- and MPMC-fed kittens, with HPLC-fed cats showing greater species richness and microbial diversity. Two phyla, ten families and fifteen genera were responsible for more than 80% of the sequences at each taxonomic level for both diet groups, consistent with the previous taxonomic study. Significantly different abundances between diet groups were observed for 324 genera (56% of all genera identified), demonstrating widespread diet-induced changes in microbial taxonomic structure. Diversity was not affected over time. Functional analysis identified 2,013 putative enzyme function groups that were different (p<0.000007) between the two dietary groups and were associated with 194 pathways, which formed five discrete clusters based on average relative abundance. Of those, ten contained more (p<0.022) enzyme functions with significant diet effects than expected by chance. Six pathways were related to amino acid biosynthesis and metabolism, linking changes in dietary protein with functional differences of the gut microbiome. These data indicate that feline feces-derived microbiomes have large structural and functional differences relating to the dietary

  5. The Effect of a Classroom-Based Intensive Robotics and Programming Workshop on Sequencing Ability in Early Childhood

    Science.gov (United States)

    Kazakoff, Elizabeth R.; Sullivan, Amanda; Bers, Marina U.

    2013-01-01

    This paper examines the impact of programming robots on sequencing ability during a 1-week intensive robotics workshop at an early childhood STEM magnet school in the Harlem area of New York City. Children participated in computer programming activities using a developmentally appropriate tangible programming language CHERP, specifically designed…

  6. SNAD: sequence name annotation-based designer

    Directory of Open Access Journals (Sweden)

    Gorbalenya Alexander E

    2009-08-01

    Full Text Available Background: A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which can be a tedious exercise prone to mistakes and omissions. Results: Here we introduce SNAD (Sequence Name Annotation-based Designer), which mediates automatic conversion of sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion: A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between quality of sequence annotation and efficiency of communication and knowledge dissemination among researchers.

  7. Probabilistic model based error correction in a set of various mutant sequences analyzed by next-generation sequencing.

    Science.gov (United States)

    Aita, Takuyo; Ichihashi, Norikazu; Yomo, Tetsuya

    2013-12-01

    To analyze the evolutionary dynamics of a mutant population in an evolutionary experiment, it is necessary to sequence a vast number of mutants by high-throughput (next-generation) sequencing technologies, which enable rapid and parallel analysis of multikilobase sequences. However, the observed sequences include many base-calling errors. Therefore, if next-generation sequencing is applied to the analysis of a heterogeneous population of various mutant sequences, it is necessary to discriminate between true bases that are point mutations and base-calling errors in the observed sequences, and to subject the sequences to error-correction processes. To address this issue, we have developed a novel method of error correction based on the Potts model and a maximum a posteriori probability (MAP) estimate of its parameters corresponding to the "true sequences". Our method of error correction utilizes (1) the "quality scores" which are assigned to individual bases in the observed sequences and (2) the neighborhood relationship among the observed sequences mapped in sequence space. Computer experiments on error correction of artificially generated sequences supported the effectiveness of our method, showing that 50-90% of errors were removed. Interestingly, this method is analogous to a probabilistic model-based method of image restoration developed in the field of information engineering.
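The abstract above does not say how the per-base "quality scores" are encoded. As an illustrative sketch only, assuming the widely used Phred convention (Q = -10 log10 p) and the Phred+33 ASCII encoding found in FASTQ files, a quality score can be turned into a base-call error probability as shown below. The function names are hypothetical, and this is not the authors' Potts-model procedure; it only shows the kind of per-base evidence such a method could start from.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q into an error probability p = 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


def decode_phred33(quality_string):
    """Decode a Phred+33 quality string (assumed encoding) into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def error_probs(quality_string):
    """Per-base error probabilities for one read's quality string."""
    return [phred_to_error_prob(q) for q in decode_phred33(quality_string)]


if __name__ == "__main__":
    # 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0 under Phred+33.
    for q, p in zip(decode_phred33("I5!"), error_probs("I5!")):
        print(q, p)
```

Under this convention a base with Q20 has a 1% chance of being a base-calling error, which is the kind of prior a MAP estimate could weigh against the neighborhood evidence.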

  8. Deciphering Seed Sequence Based Off-Target Effects in a Large-Scale RNAi Reporter Screen for E-Cadherin Expression.

    Directory of Open Access Journals (Sweden)

    Robert Adams

    Full Text Available Functional RNAi based screening is affected by large numbers of false positive and negative hits due to prevalent sequence based off-target effects. We performed a druggable genome targeting siRNA screen intended to identify novel regulators of E-cadherin (CDH1) expression, a known key player in epithelial mesenchymal transition (EMT). Analysis of primary screening results indicated a large number of false-positive hits. To address these crucial difficulties we developed an analysis method, SENSORS, which, similar to published methods, is a seed enrichment strategy for analyzing siRNA off-targets in RNAi screens. Using our approach, we were able to demonstrate that accounting for seed based off-target effects stratifies primary screening results and enables the discovery of additional screening hits. While traditional hit detection methods are prone to false positive results which are undetected, we were able to identify false positive hits robustly. Transcription factor MYBL1 was identified as a putative novel target required for CDH1 expression and verified experimentally. No siRNA pool targeting MYBL1 was present in the used siRNA library. Instead, MYBL1 was identified as a putative CDH1 regulating target solely based on the SENSORS off-target score, i.e. as a gene that is a cause for off-target effects downregulating E-cadherin expression.
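The abstract does not describe how SENSORS scores seeds, so the following is only a rough sketch of the general idea behind seed-based analysis. It assumes the commonly used definition of the seed as positions 2 to 8 of the guide strand (written 5' to 3') and simply averages a per-siRNA readout over all siRNAs that share a seed. The function names and the averaging step are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def seed_of(guide, start=2, end=8):
    """Return the seed (1-based positions start..end, inclusive) of a guide strand.

    The 2-8 window is an assumed, commonly used definition of the seed region.
    """
    return guide.upper().replace("T", "U")[start - 1:end]


def mean_score_by_seed(screen):
    """Group (guide_sequence, score) pairs by seed and average the scores.

    A seed whose siRNAs consistently shift the readout, regardless of their
    intended target, points to a seed-mediated off-target effect.
    """
    by_seed = defaultdict(list)
    for guide, score in screen:
        by_seed[seed_of(guide)].append(score)
    return {seed: mean(scores) for seed, scores in by_seed.items()}


if __name__ == "__main__":
    toy_screen = [
        ("UAGCUUAUCAGACUGAUGUUGA", -2.0),
        ("CAGCUUAUGGGGGGGGGGGGG", -1.0),
        ("UUUUUUUUUUUUUUUUUUUUU", 0.5),
    ]
    print(mean_score_by_seed(toy_screen))
```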

  9. Base sequence effects on DNA replication influenced by bulky adducts. Final report, March 1, 1995--February 28, 1997

    Energy Technology Data Exchange (ETDEWEB)

    Geacintov, N.E.

    1997-05-31

    Polycyclic aromatic hydrocarbons (PAH) are environmental pollutants that are present in air, food, and water. While PAH compounds are chemically inert and are sparingly soluble in aqueous solutions, in living cells they are metabolized to a variety of oxygenated derivatives, including the highly mutagenic and tumorigenic diol epoxide derivatives. The diol epoxides of the sterically hindered fjord region compound benzo[c]phenanthrene (B[c]PhDE) are among the most powerful tumorigenic compounds in animal model test systems. In this project, site-specifically modified oligonucleotides containing single B[c]PhDE-$N^{6}$-dA lesions derived from the reactions of the 1S,2R,3R,4S and 1R,2S,3S,4R diol epoxides of B[c]PhDE with dA residues were synthesized. The replication of DNA catalyzed by a prokaryotic DNA polymerase (the exonuclease-free Klenow fragment of E. coli Pol I) in the vicinity of the lesion at base-specific sites on B[c]PhDE-modified template strands was investigated in detail. The Michaelis-Menten parameters for the insertion of single deoxynucleotide triphosphates into growing DNA (primer) strands, using the modified dA* and the bases just before and after the dA* residue as templates, depend markedly on the stereochemistry of the B[c]PhDE-modified dA residues. These observations provide novel insights into the mechanisms by which bulky PAH-DNA adducts affect normal DNA replication.

  10. Thermodynamics-based models of transcriptional regulation with gene sequence.

    Science.gov (United States)

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are used to derive the binding affinity from statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model makes more biological sense.

  11. Stream cipher based on GSS sequences

    Institute of Scientific and Technical Information of China (English)

    HU Yupu; XIAO Guozhen

    2004-01-01

    Generalized self-shrinking sequences, simply named GSS sequences, are novel periodic sequences that have many advantages in cryptography. In this paper, we give several results about the application of GSS sequences to cryptography. First, we give a simple method for selecting those GSS sequences whose least periods reach the maximum. Second, we give a method for describing and computing the auto-correlation coefficients of GSS sequences. Finally, we point out that some GSS sequences, when used as stream ciphers, have a security weakness.
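The abstract does not restate how a GSS sequence is built. The sketch below assumes the usual construction of the generalized self-shrinking generator: take a binary m-sequence a of period 2^n - 1, pick a vector G = (g0, ..., g(n-1)), form v(k) = g0·a(k) XOR g1·a(k+1) XOR ... XOR g(n-1)·a(k+n-1), and output v(k) only when a(k) = 1. The function names, the small n = 3 register and the chosen recurrence are illustrative assumptions, not material from the paper.

```python
def lfsr_sequence(taps, state, length):
    """Generate `length` bits of a binary linear recurring sequence.

    taps:  indices j such that a[k+n] = XOR of a[k+j] over j in taps
    state: the initial n bits a[0..n-1]
    """
    bits = list(state)
    n = len(state)
    while len(bits) < length:
        k = len(bits) - n
        new_bit = 0
        for j in taps:
            new_bit ^= bits[k + j]
        bits.append(new_bit)
    return bits[:length]


def gss_output(a, g, period):
    """One period of GSS output: emit v[k] whenever a[k] == 1.

    a:      one full period of the m-sequence
    g:      selection vector G (assumed definition, see lead-in)
    period: period of a (indices wrap around modulo the period)
    """
    out = []
    for k in range(period):
        if a[k] == 1:
            v = 0
            for j, g_j in enumerate(g):
                v ^= g_j & a[(k + j) % period]
            out.append(v)
    return out


# Toy example: n = 3, recurrence a[k+3] = a[k+1] XOR a[k],
# i.e. characteristic polynomial x^3 + x + 1 (primitive over GF(2)).
N = 3
PERIOD = 2 ** N - 1
A = lfsr_sequence(taps=(0, 1), state=(1, 0, 0), length=PERIOD)

if __name__ == "__main__":
    print(A)
    print(gss_output(A, (0, 1, 0), PERIOD))
```

Because an m-sequence of period 2^n - 1 contains 2^(n-1) ones, each choice of G yields 2^(n-1) output bits per period of a, which is why the least period of a GSS sequence is at most 2^(n-1).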

  12. Chip-based sequencing nucleic acids

    Science.gov (United States)

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  13. Steganalytic method based on short and repeated sequence distance statistics

    Institute of Scientific and Technical Information of China (English)

    WANG GuoXin; PING XiJian; XU ManKun; ZHANG Tao; BAO XiRui

    2008-01-01

    According to the distribution characteristics of short and repeated sequence (SRS), a steganalytic method based on the correlation of image bit planes is proposed. Firstly, we provide the conception of SRS distance statistics and deduce its statistical distribution. Because the SRS distance statistics can effectively reflect the correlation of the sequence, SRS has statistical features when the image bit plane sequence equals the image width. Using this characteristic, the steganalytic method is realized by the distinct test of Poisson distribution. Experimental results show a good performance for detecting the LSB matching steganographic method in still images. Moreover, the proposed method is not designed for specific steganographic algorithms and has good generality.

  14. Quick Trickle Permutation Based on Quick Trickle Characteristic Sequence

    Institute of Scientific and Technical Information of China (English)

    Wang Li-na; Fei Ru-chun; Liu Zhu

    2003-01-01

    The concept of quick trickle characteristic sequence is presented, the properties and count of quick trickle characteristic sequences are studied, and the mapping relationship between quick trickle characteristic sequence and quick trickle permutation is discussed. Finally, an efficient construction of quick trickle permutation based on quick trickle characteristic sequence is given, by which a quick trickle permutation can be obtained after constructing a quick trickle characteristic sequence. Quick trickle permutation has good cryptographic properties.

  15. Solid-State Nanopore-Based DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Zewen Liu

    2016-01-01

    Full Text Available Solid-state nanopore-based DNA sequencing technology is becoming more and more attractive for its promising future in the gene detection field. The challenges that need to be addressed are diverse: effective methods to detect base-specific signatures, control of the nanopore's size and surface properties, and modulation of the translocation velocity and behavior of the DNA molecules. Among these challenges, the realization of high-quality nanopores with the help of modern micro/nanofabrication technologies is a crucial one. In this paper, typical technologies applied in the field of solid-state nanopore-based DNA sequencing are reviewed.

  16. Period of the d-Sequence Based Random Number Generator

    OpenAIRE

    Thippireddy, Suresh; Chalasani, Sandeep

    2007-01-01

    This paper presents an expression to compute the exact period of a recursive random number generator based on d-sequences. Using the multi-recursive version of this generator we can produce a large number of pseudorandom sequences.
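The paper's expression for the period of the recursive generator is not given in the abstract and is not reproduced here. As background only, the sketch below treats a plain binary d-sequence, taken here to mean the binary expansion of 1/q for an odd q > 1, whose period equals the multiplicative order of 2 modulo q. The function names are hypothetical.

```python
from math import gcd


def multiplicative_order(base, modulus):
    """Smallest k >= 1 with base**k == 1 (mod modulus)."""
    if modulus < 2 or gcd(base, modulus) != 1:
        raise ValueError("base and modulus must be coprime, modulus >= 2")
    order, value = 1, base % modulus
    while value != 1:
        value = (value * base) % modulus
        order += 1
    return order


def binary_expansion_of_reciprocal(q, length):
    """First `length` binary digits of 1/q, computed by long division."""
    digits = []
    remainder = 1
    for _ in range(length):
        remainder *= 2
        digits.append(remainder // q)
        remainder %= q
    return digits


if __name__ == "__main__":
    q = 19
    period = multiplicative_order(2, q)
    print(period, binary_expansion_of_reciprocal(q, 2 * period))
```

When 2 is a primitive root modulo a prime q, the period reaches its maximum of q - 1, which is the case for q = 19 in the usage example.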

  17. Effect of dephasing on DNA sequencing via transverse electronic transport

    Energy Technology Data Exchange (ETDEWEB)

    Zwolak, Michael [Los Alamos National Laboratory; Krems, Matt [NON LANL; Pershin, Yuriy V [NON LANL; Di Ventra, Massimiliano [NON LANL

    2009-01-01

    We study theoretically the effects of dephasing on DNA sequencing in a nanopore via transverse electronic transport. To do this, we couple classical molecular dynamics simulations with transport calculations using scattering theory. Previous studies, which did not include dephasing, have shown that by measuring the transverse current of a particular base multiple times, one can get distributions of currents for each base that are distinguishable. We introduce a dephasing parameter into transport calculations to simulate the effects of the ions and other fluctuations. These effects lower the overall magnitude of the current, but have little effect on the current distributions themselves. The results of this work further indicate that distinguishing DNA bases via transverse electronic transport has potential as a sequencing tool.

  18. Effects of impregnation sequence on the microstructure and performances of Cu-Co based catalysts for the synthesis of higher alcohols

    Institute of Scientific and Technical Information of China (English)

    Siyu Deng; Wei Chu; Huiyuan Xu; Limin Shi; Lihong Huang

    2008-01-01

    Silica-supported CuCo catalysts were prepared by the impregnation method with different impregnation sequences for higher alcohols synthesis. These catalysts were characterized by H2-TPR, XRD, N2 adsorption, XPS techniques and CO selective hydrogenation reaction measurement. The effects of impregnation sequence on the structure and performance of the catalysts were investigated, and the sequence had an important influence on the selectivity to higher alcohols. There was a strong synergistic effect between copper and cobalt for the co-impregnated sample. The CuCo/SiO2 catalyst prepared by co-impregnation showed a better yield of total alcohols, and a higher selectivity to total alcohols, which reached 51.5%.

  19. Spot-Based Generations for Meta-Fibonacci Sequences

    CERN Document Server

    Dalton, Barnaby; Tanny, Stephen

    2011-01-01

    For many meta-Fibonacci sequences it is possible to identify a partition of the sequence into successive intervals (sometimes called blocks) with the property that the sequence behaves "similarly" in each block. This partition provides insights into the sequence properties. To date, for any given sequence, only ad hoc methods have been available to identify this partition. We apply a new concept - the spot-based generation sequence - to derive a general methodology for identifying this partition for a large class of meta-Fibonacci sequences. This class includes the Conolly and Conway sequences and many of their well-behaved variants, and even some highly chaotic sequences, such as Hofstadter's famous Q-sequence.
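For readers unfamiliar with the sequences named above, the following sketch generates three of them from their standard nested recurrences, all with initial values 1, 1: Hofstadter's Q(n) = Q(n - Q(n-1)) + Q(n - Q(n-2)), the Conway (Hofstadter-Conway) sequence a(n) = a(a(n-1)) + a(n - a(n-1)), and the Conolly sequence C(n) = C(n - C(n-1)) + C(n-1 - C(n-2)). It does not implement the spot-based generation structure itself, which the abstract does not define. The helper names are hypothetical.

```python
def hofstadter_q(n_terms):
    """First n_terms of Hofstadter's Q-sequence (1-based, Q(1) = Q(2) = 1)."""
    q = [0, 1, 1]  # index 0 is unused padding
    for n in range(3, n_terms + 1):
        q.append(q[n - q[n - 1]] + q[n - q[n - 2]])
    return q[1:n_terms + 1]


def conway(n_terms):
    """First n_terms of the Conway sequence a(n) = a(a(n-1)) + a(n - a(n-1))."""
    a = [0, 1, 1]
    for n in range(3, n_terms + 1):
        a.append(a[a[n - 1]] + a[n - a[n - 1]])
    return a[1:n_terms + 1]


def conolly(n_terms):
    """First n_terms of the Conolly sequence C(n) = C(n-C(n-1)) + C(n-1-C(n-2))."""
    c = [0, 1, 1]
    for n in range(3, n_terms + 1):
        c.append(c[n - c[n - 1]] + c[n - 1 - c[n - 2]])
    return c[1:n_terms + 1]


if __name__ == "__main__":
    print(hofstadter_q(10))
    print(conway(11))
    print(conolly(11))
```

The Conway and Conolly sequences grow in slow, monotone steps, while Q already shows irregular jumps within a few dozen terms, which illustrates the contrast between "well-behaved" and "chaotic" drawn in the abstract.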

  20. Short sequence effect of ancient DNA on mammoth phylogenetic analyses

    Institute of Scientific and Technical Information of China (English)

    Guilian SHENG; Lianjuan WU; Xindong HOU; Junxia YUAN; Shenghong CHENG; Bojian ZHONG; Xulong LAI

    2009-01-01

    The evolution of Elephantidae has been intensively studied in the past few years, especially after 2006. Molecular approaches have made a great contribution to the assumption that the extinct woolly mammoth has a close relationship with the Asian elephant instead of the African elephant. In this study, partial ancient DNA sequences of the cytochrome b (cyt b) gene in the mitochondrial genome were successfully retrieved from Late Pleistocene Mammuthus primigenius bones collected from Heilongjiang Province in Northeast China. Both the partial and complete homologous cyt b gene sequences and the whole mitochondrial genome sequences extracted from GenBank were aligned and used as datasets for phylogenetic analyses. All of the phylogenetic trees, based on either the partial or the complete cyt b gene, reject the relationship constructed by the whole mitochondrial genome, showing the occurrence of an effect of sequence length of the cyt b gene on mammoth phylogenetic analyses.

  1. Movement Pattern Analysis Based on Sequence Signatures

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Chavoshi

    2015-09-01

    Full Text Available Increased affordability and deployment of advanced tracking technologies have led researchers from various domains to analyze the resulting spatio-temporal movement data sets for the purpose of knowledge discovery. Two different approaches can be considered in the analysis of moving objects: quantitative analysis and qualitative analysis. This research focuses on the latter and uses the qualitative trajectory calculus (QTC), a type of calculus that represents qualitative data on moving point objects (MPOs), and establishes a framework to analyze the relative movement of multiple MPOs. A visualization technique called sequence signature (SESI) is used, which makes it possible to map QTC patterns in a 2D indexed rasterized space in order to evaluate the similarity of relative movement patterns of multiple MPOs. The applicability of the proposed methodology is illustrated by means of two practical examples of interacting MPOs: cars on a highway and body parts of a samba dancer. The results show that the proposed method can be effectively used to analyze interactions of multiple MPOs in different domains.

  2. A sequence based synteny map between soybean and Arabidopsis thaliana

    Directory of Open Access Journals (Sweden)

    Lightfoot David A

    2007-01-01

    Full Text Available Background: Soybean (Glycine max (L.) Merr.) is one of the world's most important crops; however, its complete genomic sequence has yet to be determined. Nonetheless, a large body of sequence information exists, particularly in the form of expressed sequence tags (ESTs). Herein, we report the use of the model organism Arabidopsis thaliana (thale cress), for which the entire genomic sequence is available, as a framework to align thousands of short soybean sequences. Results: A series of JAVA-based programs were created that processed and compared 341,619 soybean DNA sequences against A. thaliana chromosomal DNA. A. thaliana DNA was probed for short, exact matches (15 bp) to each soybean sequence, and then checked for the number of additional 7 bp matches in the adjacent 400 bp region. The position of these matches was used to order soybean sequences in relation to the A. thaliana genome. Conclusion: Reported associations between soybean sequences and A. thaliana were within a 95% confidence interval of e-30 – e-100. In addition, the clustering of soybean expressed sequence tags (ESTs) based on A. thaliana sequence was accurate enough to identify potential single nucleotide polymorphisms (SNPs) within the soybean sequence clusters. An EST, bacterial artificial chromosome (BAC) end sequence and marker amplicon sequence synteny map of soybean and A. thaliana is presented. In addition, all JAVA programs used to create this map are available upon request and on the WEB.
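The two-stage matching described in the Results (a 15 bp exact match, then a count of additional 7 bp matches in the adjacent 400 bp region) can be sketched as follows. The original programs were written in JAVA and are not shown in the abstract, so this Python version is only an approximation under stated assumptions: every 15-mer of the query is tried as a seed, "adjacent" is taken to mean the 400 bp downstream of the seed hit, and the 7-mers come from the part of the query after the seed. All function names are hypothetical.

```python
def find_exact_hits(genome, query, seed_len=15):
    """Return (query_offset, genome_position) for every exact seed_len-mer match."""
    index = {}
    for pos in range(len(genome) - seed_len + 1):
        index.setdefault(genome[pos:pos + seed_len], []).append(pos)
    hits = []
    for off in range(len(query) - seed_len + 1):
        for pos in index.get(query[off:off + seed_len], []):
            hits.append((off, pos))
    return hits


def count_short_matches(genome, query, hit, seed_len=15, short_len=7, window=400):
    """Count short_len-mer matches in the window that follows a seed hit.

    Assumption: the window is the `window` bases immediately downstream of the
    seed in the genome, and candidate k-mers are taken from the query sequence
    downstream of the seed.
    """
    off, pos = hit
    region = genome[pos + seed_len:pos + seed_len + window]
    rest = query[off + seed_len:]
    kmers = {rest[i:i + short_len] for i in range(len(rest) - short_len + 1)}
    return sum(
        1
        for i in range(len(region) - short_len + 1)
        if region[i:i + short_len] in kmers
    )


if __name__ == "__main__":
    seed = "ACGGTCAATGCCGTA"
    genome = "T" * 20 + seed + "G" * 10 + "CATCATC" + "T" * 20
    query = seed + "CATCATC"
    for hit in find_exact_hits(genome, query):
        print(hit, count_short_matches(genome, query, hit))
```

Hits with many supporting 7 bp matches would then be ranked above isolated seed matches when ordering soybean sequences along the reference genome.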

  3. Speeding disease gene discovery by sequence based candidate prioritization

    Directory of Open Access Journals (Sweden)

    Porteous David J

    2005-03-01

    Full Text Available Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning. Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the time and twenty-fold 11% of the time. Conclusion: PROSPECTR is a simple and effective way to identify genes involved in Mendelian and oligogenic disorders. It performs markedly better than the single existing sequence-based classifier on novel data. PROSPECTR could save investigators looking at large regions of interest time and effort by prioritizing positional candidate genes for mutation detection and case-control association studies.

  4. Caption detection from video sequence based on fuzzy neural networks

    Science.gov (United States)

    Gao, Xinbo; Xin, Hong; Li, Jie

    2001-09-01

    Captions graphically superimposed on video frames can provide important indexing information. The automatic detection and recognition of video captions can be of great help in querying topics of interest in a digital news library. To detect captions in a video sequence, we present algorithms based on fuzzy clustering neural networks. Since neural networks offer learning, self-organization and parallel computation, and since digital image and video databases are growing rapidly, neural network based techniques have become more efficient and popular tools for multimedia processing. Experimental results show that our caption detection scheme is effective and robust.

  5. Simulation-Based Evaluation of Learning Sequences for Instructional Technologies

    Science.gov (United States)

    McEneaney, John E.

    2016-01-01

    Instructional technologies critically depend on systematic design, and learning hierarchies are a commonly advocated tool for designing instructional sequences. But hierarchies routinely allow numerous sequences and choosing an optimal sequence remains an unsolved problem. This study explores a simulation-based approach to modeling learning…

  6. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  7. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Directory of Open Access Journals (Sweden)

    Chun-Tien Chang

    2012-01-01

    Full Text Available The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  8. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  9. Efficient Quantum Private Communication Based on Dynamic Control Code Sequence

    Science.gov (United States)

    Cao, Zheng-Wen; Feng, Xiao-Yi; Peng, Jin-Ye; Zeng, Gui-Hua; Qi, Jin

    2017-04-01

    Based on chaos and quantum properties, we propose a quantum private communication scheme with dynamic control code sequence. The initial sequence is obtained via chaotic systems, and the control code sequence is derived by grouping, XOR and extracting. A shift cycle algorithm is designed to enable the dynamic change of control code sequence. Analysis shows that transmission efficiency could reach 100 % with high dynamics and security.

  10. Efficient Quantum Private Communication Based on Dynamic Control Code Sequence

    Science.gov (United States)

    Cao, Zheng-Wen; Feng, Xiao-Yi; Peng, Jin-Ye; Zeng, Gui-Hua; Qi, Jin

    2016-12-01

    Based on chaos and quantum properties, we propose a quantum private communication scheme with dynamic control code sequence. The initial sequence is obtained via chaotic systems, and the control code sequence is derived by grouping, XOR and extracting. A shift cycle algorithm is designed to enable the dynamic change of control code sequence. Analysis shows that transmission efficiency could reach 100 % with high dynamics and security.

  11. More on the Cause-Effect Sequence

    Science.gov (United States)

    Janik, Jerzy A.

    2007-06-01

    Does every event have a cause? An answer is not simple. The notion of cause contains a particular being y acting on being x plus everything that may be called the boundary conditions. These may form necessary and sufficient conditions giving rise to a strong cause, or only necessary conditions, giving rise to a weak cause. These matters are discussed in this article with particular attention being paid to the argumentation of Thomas Aquinas known as prima via. Prima via is the analysis of a cause-effect sequence which leads (according to Thomas) to a First Cause (First Mover). It seems that the extrapolation of the cause-effect sequence to infinity is permissible from the logical point of view. But the possibility of weak causes seems to destroy the cause-effect "line". Here it is perhaps useful to "escape" to the metaphysical abstraction which looks at things sub ratione entitatis. If we ignore space and time (which is characteristic of this abstraction) we are led to believe that the IS of cause is finally unavoidable, which means that from the vantage point of this abstraction, i.e. from the point of view of IS, all causes are strong.

  12. Spatial Based Integrated Assessment of Bedrock and Ground Motions, Fault Offsets, and Their Effects for the October-November 2002 Earthquake Sequence on the Denali Fault, Alaska

    Science.gov (United States)

    Vinson, T. S.; Carlson, R.; Hansen, R.; Hulsey, L.; Ma, J.; White, D.; Barnes, D.; Shur, Y.

    2003-12-01

    A National Science Foundation (NSF) Small Grant Exploratory Research Grant was awarded to the University of Alaska Fairbanks to archive bedrock and ground motions and fault offsets and their effects for the October-November 2002 earthquake sequence on the Denali Fault, Alaska. The scope of work included the accumulation of all strong motion records, satellite imagery, satellite remote sensing data, aerial and ground photographs, and structural response (both measured and anecdotal) that would be useful to achieve the objective. Several interesting data sets were archived including ice cover, lateral movement of stream channels, landslides, avalanches, glacial fracturing, "felt" ground motions, and changes in water quantity and quality. The data sources may be spatially integrated to provide a comprehensive assessment of the bedrock and ground motions and fault offsets for the October-November 2002 earthquake sequence. In the aftermath of the October-November 2002 earthquake sequence on the Denali fault, the Alaskan engineering community expressed a strong interest in understanding why their structures and infrastructure were not substantially damaged by the ground motions they experienced during the October-November 2002 Earthquake Sequence on the Denali Fault. The research work proposed under this NSF Grant is a necessary prerequisite to this understanding. Furthermore, the proposed work will facilitate a comparison of Denali events with the Loma Prieta and recent Kocaeli and Düzce events in Turkey, all of which were associated with strike-slip faulting. Finally, the spatially integrated data will provide the basis for research work that is truly innovative. For example, it may be possible to predict the observed (1) landsliding and avalanches, (2) changes in water quantity and quality, (3) glacial fracturing, and (4) the widespread liquefaction and lateral spreading, which occurred along the Tok cutoff and Northway airport, with the bedrock and ground motions and

  13. A random effects epidemic-type aftershock sequence model.

    Science.gov (United States)

    Lin, Feng-Chang

    2011-04-01

    We consider an extension of the temporal epidemic-type aftershock sequence (ETAS) model with random effects as a special case of a well-known doubly stochastic self-exciting point process. The new model arises from a deterministic function that is randomly scaled by a nonnegative random variable, which is unobservable but assumed to follow either positive stable or one-parameter gamma distribution with unit mean. Both random effects models are of interest although the one-parameter gamma random effects model is more popular when modeling associated survival times. Our estimation is based on the maximum likelihood approach with marginalized intensity. The methods are shown to perform well in simulation experiments. When applied to an earthquake sequence on the east coast of Taiwan, the extended model with positive stable random effects provides a better model fit, compared to the original ETAS model and the extended model with one-parameter gamma random effects.
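    As a rough illustration of the model family described above, the sketch below evaluates the standard temporal ETAS conditional intensity and draws a unit-mean gamma random effect. The parameter names (mu, K, alpha, c, p, m0), the choice to let the random effect z multiply the triggering term, and any example values are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def etas_intensity(t, events, mu, K, alpha, c, p, m0, z=1.0):
    """Temporal ETAS conditional intensity at time t.

    events: list of (time, magnitude) pairs for past earthquakes.
    z: nonnegative random effect; scaling the triggering term with it
       is an illustrative assumption, not the paper's exact model.
    """
    triggered = 0.0
    for t_i, m_i in events:
        if t_i < t:
            # Magnitude-dependent productivity times Omori-type decay.
            triggered += K * math.exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
    return mu + z * triggered


def gamma_unit_mean(theta, rng):
    """Draw a one-parameter gamma variate with mean 1 and variance theta."""
    # gammavariate(shape, scale): mean = shape * scale = 1.
    return rng.gammavariate(1.0 / theta, theta)
```

    With no past events the intensity reduces to the background rate mu; setting z = 1 recovers the ordinary ETAS intensity.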

  14. Watermarking scheme of colour image based on chaotic sequences

    Institute of Scientific and Technical Information of China (English)

    LIU Nian-sheng; GUO Dong-hui

    2009-01-01

    The proposed perceptual mask is based on the singularity of the cover image and matches very well with the properties of the human visual system. The cover colour image is decomposed into several subbands by the wavelet transform. The watermark, composed of a chaotic sequence and the covert image, is embedded into the subband with the largest energy. The chaos system plays an important role in the security, invisibility and robustness of the proposed scheme. The parameter and initial state of the chaos system serve as a key and directly influence the generation of the watermark information. Moreover, the chaotic sequence gives the watermark information the property of a spread spectrum signal, which improves the invisibility and security of the watermarked image. Experimental results and comparisons with other watermarking techniques show that the proposed algorithm is effective and feasible, and improves the security, invisibility and robustness of the watermarking information.

  15. [Segmentation Method for Liver Organ Based on Image Sequence Context].

    Science.gov (United States)

    Zhang, Meiyun; Fang, Bin; Wang, Yi; Zhong, Nanchang

    2015-10-01

    To address the heavy manual intervention and segmentation defects of existing two-dimensional segmentation methods, and the errors that three-dimensional methods make on abnormal livers, this paper presents a semi-automatic liver segmentation method based on image sequence context. The method uses the similarity between the contexts of images in the sequence as prior knowledge about the liver, and combines region growing with the level set method to segment the liver semi-automatically, with a small amount of manual intervention to handle liver mutation situations. Experimental results showed that the proposed algorithm achieves high precision, segments livers with large variability well, and can meet clinical application demands.

  16. Identification of protein superfamily from structure-based sequence motif

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Taking the evolutionarily distant protein tyrosine phosphatase (PTP) Ⅰ and Ⅱ superfamilies as an example, a structure-based sequence motif has been defined by structural comparison, structure-based sequence alignment and analysis of the substitution patterns of residues in the common conserved sequence regions. Phosphatases Ⅰ and Ⅱ can be correctly identified together in the SWISS-PROT and TrEMBL databases by this structure-based PTP sequence motif. The results show that the correct identification rates are over 98%. This is the first time that PTP Ⅰ and Ⅱ have been identified together by a single motif.

  17. Image-based temporal alignment of echocardiographic sequences

    Science.gov (United States)

    Danudibroto, Adriyana; Bersvendsen, Jørn; Mirea, Oana; Gerard, Olivier; D'hooge, Jan; Samset, Eigil

    2016-04-01

    Temporal alignment of echocardiographic sequences enables fair comparisons of multiple cardiac sequences by showing corresponding frames at given time points in the cardiac cycle. It is also essential for spatial registration of echo volumes where several acquisitions are combined to enhance image quality or to form a larger field of view. In this study, three different image-based temporal alignment methods were investigated: first, a method based on dynamic time warping (DTW); second, a spline-based method that optimized the similarity between temporal characteristic curves of the cardiac cycle using 1D cubic B-spline interpolation; third, a variant of the spline-based method with piecewise modification. These methods were tested on in-vivo data sets of 19 echo sequences. For each sequence, the mitral valve opening (MVO) time was manually annotated. The results showed that the average MVO timing error for all methods is well under the time resolution of the sequences.
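    The first of the three methods relies on dynamic time warping. The sketch below is a textbook DTW on two one-dimensional curves, not the authors' implementation; treating each sequence as a scalar characteristic curve sampled per frame, and using the absolute difference as local cost, are assumptions made for illustration. Both inputs are assumed to be non-empty.

```python
def dtw_path(a, b):
    """Classic dynamic time warping between two non-empty 1D curves.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching frame i of a to frame j of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    # Backtrack from the end to recover the frame correspondence.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = []
        if i > 1 and j > 1:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 1:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 1:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path
```

    The returned path gives, for every frame of one sequence, the frame of the other sequence it is displayed against; a curve that dwells longer in one phase simply maps several of its frames to the same frame of the other.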

  18. Asynchronous symmetry-based sequences for homonuclear dipolar recoupling in solid-state nuclear magnetic resonance

    Energy Technology Data Exchange (ETDEWEB)

    Tan, Kong Ooi; Ernst, Matthias, E-mail: maer@ethz.ch [Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich (Switzerland)]; Rajeswari, M. [Department of Chemical Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai 400 005 (India)]; Madhu, P. K., E-mail: madhu@tifr.res.in [Department of Chemical Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai 400 005 (India); TIFR Centre for Interdisciplinary Sciences, 21 Brundavan Colony, Narsingi, Hyderabad 500 075 (India)]

    2015-02-14

    We show a theoretical framework, based on triple-mode Floquet theory, to analyze recoupling sequences derived from symmetry-based pulse sequences, which have a non-vanishing effective field and are not rotor synchronized. We analyze the properties of one such sequence, a homonuclear double-quantum recoupling sequence derived from the $C7_2^1$ sequence. The new asynchronous sequence outperforms the rotor-synchronized version for spin pairs with small dipolar couplings in the presence of large chemical-shift anisotropy. The resonance condition of the new sequence is analyzed using triple-mode Floquet theory. Analytical calculations of the second-order effective Hamiltonian are performed to compare the efficiency in suppressing second-order cross terms. Experiments and numerical simulations are shown to corroborate the results of the theoretical analysis.

  19. Sequence Alignment with Dynamic Divisor Generation for Keystroke Dynamics Based User Authentication

    OpenAIRE

    Jiacang Ho; Dae-Ki Kang

    2015-01-01

    Keystroke dynamics based authentication is one of the prevention mechanisms used to protect one’s account from criminals’ illegal access. In this authentication mechanism, keystroke dynamics are used to capture patterns in a user’s typing behavior. Sequence alignment has been shown to be an effective algorithm for keystroke dynamics based authentication, comparing sequences of keystroke data to detect an imposter’s anomalous sequences. In previous research, static divisor has been used for s...

  20. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors

    OpenAIRE

    2009-01-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein–DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are esse...

  1. Prediction of potential drug targets based on simple sequence properties

    Directory of Open Access Journals (Sweden)

    Lai Luhua

    2007-09-01

    Background: During the past decades, research and development in drug discovery have attracted much attention and effort. However, only 324 drug targets are known for clinical drugs up to now. Identifying potential drug targets is the first step in the process of modern drug discovery for developing novel therapeutic agents. Therefore, the identification and validation of new and effective drug targets are of great value for drug discovery in both academia and the pharmaceutical industry. If a protein can be predicted in advance for its potential application as a drug target, the drug discovery process targeting this protein will be greatly sped up. In the current study, based on the properties of known drug targets, we have developed a sequence-based drug target prediction method for fast identification of novel drug targets. Results: Based on simple physicochemical properties extracted from protein sequences of known drug targets, several support vector machine models have been constructed in this study. The best model can distinguish currently known drug targets from non-drug targets at an accuracy of 84%. Using this model, potential protein drug targets of human origin from Swiss-Prot were predicted, some of which have already attracted much attention as potential drug targets in pharmaceutical research. Conclusion: We have developed a drug target prediction method based solely on protein sequence information without the knowledge of family/domain annotation, or the protein 3D structure. This method can be applied in novel drug target identification and validation, as well as genome scale drug target predictions.

  2. An Incremental Algorithm of Text Clustering Based on Semantic Sequences

    Institute of Scientific and Technical Information of China (English)

    FENG Zhonghui; SHEN Junyi; BAO Junpeng

    2006-01-01

    This paper proposes an incremental text clustering algorithm based on semantic sequences. Using the similarity relation of semantic sequences and calculating the cover of the set of similar semantic sequences, the algorithm selects the candidate cluster with the minimum entropy overlap value as a result cluster at each step. Comparison of experimental results shows that the precision of the algorithm is higher than that of other algorithms under the same conditions, especially on sets of long documents.

  3. RNA-RNA interaction prediction based on multiple sequence alignments

    CERN Document Server

    Li, Andrew X; Qin, Jing; Reidys, Christian M

    2010-01-01

    Recently, $O(N^6)$ time and $O(N^4)$ space dynamic programming algorithms have become available that compute the partition function of RNA-RNA interaction complexes for pairs of RNA sequences. These algorithms, together with the biological need for more reliable interaction predictions, motivate using the additional information contained in multiple sequence alignments and generalizing the above framework to the partition function and base pairing probabilities for multiple sequence alignments.

  4. Feature-based Image Sequence Compression Coding

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    A novel compressing method for video teleconference applications is presented. Semantic-based coding based on human image feature is realized, where human features are adopted as parameters. Model-based coding and the concept of vector coding are combined with the work on image feature extraction to obtain the result.

  5. Rare variant detection using family-based sequencing analysis.

    Science.gov (United States)

    Peng, Gang; Fan, Yu; Palculict, Timothy B; Shen, Peidong; Ruteshouser, E Cristy; Chi, Aung-Kyaw; Davis, Ronald W; Huff, Vicki; Scharfe, Curt; Wang, Wenyi

    2013-03-05

    Next-generation sequencing is revolutionizing genomic analysis, but this analysis can be compromised by high rates of missing true variants. To develop a robust statistical method capable of identifying variants that would otherwise not be called, we conducted sequence data simulations and both whole-genome and targeted sequencing data analysis of 28 families. Our method (Family-Based Sequencing Program, FamSeq) integrates Mendelian transmission information and raw sequencing reads. Sequence analysis using FamSeq reduced the number of false negative variants by 14-33% as assessed by HapMap sample genotype confirmation. In a large family affected with Wilms tumor, 84% of variants uniquely identified by FamSeq were confirmed by Sanger sequencing. In children with early-onset neurodevelopmental disorders from 26 families, de novo variant calls in disease candidate genes were corrected by FamSeq as mendelian variants, and the number of uniquely identified variants in affected individuals increased proportionally as additional family members were included in the analysis. To gain insight into maximizing variant detection, we studied factors impacting actual improvements of family-based calling, including pedigree structure, allele frequency (common vs. rare variants), prior settings of minor allele frequency, sequence signal-to-noise ratio, and coverage depth (∼20× to >200×). These data will help guide the design, analysis, and interpretation of family-based sequencing studies to improve the ability to identify new disease-associated genes.

  6. Streaming Support for Data Intensive Cloud-Based Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Shadi A. Issa

    2013-01-01

    Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.

  7. Measurement of word frequencies in genomic DNA sequences based on partial alignment and fuzzy set.

    Science.gov (United States)

    Shida, Fumiya; Mizuta, Satoshi

    2014-08-01

    With the rapid increase in the amount of data registered in biological sequence databases, the need for a fast method of sequence comparison applicable to large sequences is also increasing. In general, alignment is used for sequence comparison. However, alignment may not be appropriate for comparing large sequences such as whole genome sequences because of its high time complexity. In this article, we propose a semi alignment-free method of sequence comparison based on word frequency distributions, in which we partially use alignment to measure word frequencies, along with ideas from fuzzy set theory. Experiments with ten bacterial genome sequences demonstrated that the fuzzy measurements facilitate discrimination between close relatives and distant relatives.
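    The word-frequency idea underlying this approach can be sketched with exact word matching. The example below counts overlapping words of length k and compares two frequency distributions by Euclidean distance; it deliberately omits the partial alignment and fuzzy membership steps that the article adds, and the function names and the choice of distance are assumptions made for illustration.

```python
from collections import Counter
import math


def word_frequencies(seq, k):
    """Relative frequencies of overlapping length-k words in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def frequency_distance(f1, f2):
    """Euclidean distance between two word-frequency distributions."""
    words = set(f1) | set(f2)
    return math.sqrt(sum((f1.get(w, 0.0) - f2.get(w, 0.0)) ** 2
                         for w in words))
```

    Because each sequence is scanned once, the cost grows linearly with sequence length, which is the practical advantage over full alignment for genome-scale inputs.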

  8. An Ant-Based Model for Multiple Sequence Alignment

    CERN Document Server

    Guinand, Frédéric

    2008-01-01

    Multiple sequence alignment is a key process in today's biology, and finding a relevant alignment of several sequences is much more challenging than just optimizing some improbable evaluation functions. Our approach for addressing multiple sequence alignment focuses on the building of structures in a new graph model: the factor graph model. This model relies on block-based formulation of the original problem, formulation that seems to be one of the most suitable ways for capturing evolutionary aspects of alignment. The structures are implicitly built by a colony of ants laying down pheromones in the factor graphs, according to relations between blocks belonging to the different sequences.

  9. Facility Layout Based on Sequence Analysis: Design of Flowshops

    Institute of Scientific and Technical Information of China (English)

    ZHOU Jin; WU Zhi-ming

    2009-01-01

    A computer-aided method for designing a hybrid layout of tree-shaped planar flowlines is presented. In this new type of flowshop layout, the common machines shared by several flowlines can be located together in functional sections. The approach combines traditional cell formation techniques with sequence alignment algorithms. First, a cell formation procedure based on sequence analysis is adopted; then the operation sequences for parts are aligned to maximize machine adjacency in hyperedge representations; finally, a tree-shaped planar flowline is obtained for each part family. The algorithm is illustrated with a sample of operation sequences obtained from industry.

  10. Swarm-based Sequencing Recommendations in E-learning

    NARCIS (Netherlands)

    Van den Berg, Bert; Tattersall, Colin; Janssen, José; Brouns, Francis; Kurvers, Hub; Koper, Rob

    2005-01-01

    Van den Berg, B., Tattersall, C., Janssen, J., Brouns, F., Kurvers, H., & Koper, R. (2006). Swarm-based Sequencing Recommendations in E-learning. International Journal of Computer Science & Applications, III(III), 1-11.

  11. Analysis of chimpanzee history based on genome sequence alignments.

    Directory of Open Access Journals (Sweden)

    Jennifer L Caswell

    2008-04-01

    Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated approximately 1,290,000 years ago, western and other common chimpanzees approximately 510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.

  12. Immune and Genetic Algorithm Based Assembly Sequence Planning

    Institute of Scientific and Technical Information of China (English)

    YANG Jian-guo; LI Bei-zhi; YU Lei; JIN Yu-song

    2004-01-01

    In this paper, an assembly sequence planning model inspired by natural immune and genetic algorithms (ASPIG), based on the part degrees of freedom matrix (PDFM), is proposed, and a prototype system, DSFAS, built on ASPIG is introduced to solve the assembly sequence problem. The concept and generation of PDFM and DSFAS are also discussed. DSFAS can prevent premature convergence, promote population diversity, and accelerate learning and convergence in the behavior evolution problem.

  13. Novel Frequency Hopping Sequences Generator Based on AES Algorithm

    Institute of Scientific and Technical Information of China (English)

    李振荣; 庄奕琪; 张博; 张超

    2010-01-01

    A novel frequency hopping (FH) sequences generator based on the advanced encryption standard (AES) iterated block cipher is proposed for FH communication systems. The analysis shows that the FH sequences based on the AES algorithm have good performance in uniformity, correlation, complexity and security. A high-speed, low-power and low-cost ASIC of the FH sequences generator is implemented by optimizing the structure of the S-Box and MixColumns of the AES algorithm, proposing a hierarchical power management strategy, and applying ...

  14. Molecular Systematics of Polygonum minus Huds. Based on ITS Sequences

    Directory of Open Access Journals (Sweden)

    Normah Mohd Noor

    2011-11-01

    Plastid trnL-trnF and nuclear ribosomal ITS sequences were obtained from selected wild-type individuals of Polygonum minus Huds. in Peninsular Malaysia. The 380 bp trnL-trnF sequences of the Polygonum minus accessions were identical. Therefore, the trnL-trnF failed to distinguish between the Polygonum minus accessions. However, the divergence of ITS sequences (650 bp) among the Polygonum minus accessions was 1%, indicating that these accessions could be distinguished by the ITS sequences. A phylogenetic relationship based on the ITS sequences was inferred using neighbor-joining, maximum parsimony and Bayesian inference. All of the tree topologies indicated that Polygonum minus from Peninsular Malaysia is unique and different from the synonymous Persicaria minor (Huds.) Opiz and Polygonum kawagoeanum Makino.

  15. An optical CDMA system based on chaotic sequences

    Science.gov (United States)

    Liu, Xiao-lei; En, De; Wang, Li-guo

    2014-03-01

    In this paper, a coherent asynchronous optical code division multiple access (OCDMA) system is proposed, whose encoder/decoder is an all-optical generator. This all-optical generator can generate analog and bipolar chaotic sequences satisfying the logistic maps. The formula of bit error rate (BER) is derived, and the relationship of BER and the number of simultaneous transmissions is analyzed. Due to the good property of correlation, this coherent OCDMA system based on these bipolar chaotic sequences can support a large number of simultaneous users, which shows that these chaotic sequences are suitable for asynchronous OCDMA system.
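    The paper generates its sequences optically, but the logistic map itself is easy to reproduce in software. The sketch below iterates the map and thresholds the analog values into a bipolar (+1/-1) sequence; the control parameter r = 4, the threshold of 0.5 and the function names are assumptions made for illustration and are not taken from the paper.

```python
def logistic_sequence(x0, n, r=4.0):
    """Return n iterates of the logistic map x -> r * x * (1 - x)."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def to_bipolar(values, threshold=0.5):
    """Map analog chaotic values to a +1 / -1 spreading sequence."""
    return [1 if v >= threshold else -1 for v in values]
```

    Two users would be assigned different initial values x0; the sensitivity of the map to its initial condition is what makes the resulting codes differ after only a few iterations.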

  16. New chaos-based encryption scheme for digital sequence

    Institute of Scientific and Technical Information of China (English)

    Zhang Zhengwei; Fan Yangyu; Zeng Li

    2007-01-01

    To enhance the resistance of private information to cracking, this article proposes a new encryption method utilizing the leaping peculiarity of the periodic orbits of chaos systems. This method maps the secret sequence to several chaos periodic orbits, and a short sequence obtained by evolving the system parameters of the periodic orbits in another nonlinear system serves as the key to reconstruct these periodic orbits. At the decryption end, the shadowing method of chaos trajectory based on the modified Newton-Raphson algorithm is adopted to restore these system parameters. By deciding which orbit each coordinate pair falls on, the original digital sequence can be decrypted.

  17. Markov chaotic sequences for correlation based watermarking schemes

    Energy Technology Data Exchange (ETDEWEB)

    Tefas, A.; Nikolaidis, A.; Nikolaidis, N.; Solachidis, V.; Tsekeridou, S.; Pitas, I. E-mail: pitas@zeus.csd.auth.gr

    2003-07-01

    In this paper, statistical analysis of watermarking schemes based on correlation detection is presented. Statistical properties of watermark sequences generated by piecewise-linear Markov maps are exploited, resulting in superior watermark detection reliability. Correlation/spectral properties of such sequences are easily controllable, a fact that affects the watermarking system performance. A family of chaotic maps, namely the skew tent map family, is proposed for use in watermarking schemes.
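    A minimal sketch of correlation detection with a skew tent map watermark follows. The map definition is the standard one; the additive embedding rule, the centring of the sequence on 0.5, the map parameter a = 0.7 and all function names are assumptions made for illustration rather than the scheme analysed in the paper.

```python
def skew_tent_sequence(x0, n, a):
    """Iterate the skew tent map n times from x0; a in (0, 1) is the peak."""
    values = []
    x = x0
    for _ in range(n):
        # Rising branch on [0, a], falling branch on (a, 1].
        x = x / a if x <= a else (1.0 - x) / (1.0 - a)
        values.append(x)
    return values


def make_watermark(key, n, a=0.7):
    """Zero-centred watermark; the initial value acts as the secret key."""
    # The map's invariant density is uniform on [0, 1], so its mean is 0.5.
    return [x - 0.5 for x in skew_tent_sequence(key, n, a)]


def embed(host, watermark, strength):
    """Additive embedding (an assumed rule): host plus scaled watermark."""
    return [h + strength * w for h, w in zip(host, watermark)]


def correlate(signal, watermark):
    """Correlation detector output: mean of the sample-wise product."""
    return sum(s * w for s, w in zip(signal, watermark)) / len(watermark)
```

    A detector would compare the correlator output against a threshold: with the correct key the output is close to the embedding strength times the watermark variance, while with a wrong key it stays near zero.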

  18. Peptide based diagnostics: are random-sequence peptides more useful than tiling proteome sequences?

    Science.gov (United States)

    Navalkar, Krupa Arun; Johnston, Stephan Albert; Stafford, Phillip

    2015-02-01

    Diagnostics using peptide ligands have been available for decades. However, their adoption in diagnostics has been limited, not because of poor sensitivity but in many cases due to diminished specificity. Numerous reports suggest that protein-based rather than peptide-based disease detection is more specific. We examined two different approaches to peptide-based diagnostics using Coccidioides (aka Valley Fever) as the disease model. Although the pathogen was discovered more than a century ago, a highly sensitive diagnostic remains unavailable. We present a case study where two different approaches to diagnosing Valley Fever were used: first, overlapping Valley Fever epitopes representing immunodominant Coccidioides antigens were tiled using a microarray format of presynthesized peptides. Second, a set of random sequence peptides identified using a 10,000 peptide immunosignaturing microarray was compared for sensitivity and specificity. The scientific hypothesis tested was that actual epitope peptides from Coccidioides would provide sufficient sensitivity and specificity as a diagnostic. Results demonstrated that random sequence peptides exhibited higher accuracy when classifying different stages of Valley Fever infection vs. epitope peptides. The epitope peptide array did provide better performance than the existing immunodiffusion array, but when directly compared to the random sequence peptides, reported lower overall accuracy. This study suggests that there are competing aspects of antibody recognition that involve conservation of pathogen sequence and aspects of mimotope recognition and amino acid substitutions. These factors may prove critical when developing the next generation of high-performance immunodiagnostics.

  19. Co-barcoded sequence reads from long DNA fragments: A cost-effective solution for Perfect Genome sequencing

    Directory of Open Access Journals (Sweden)

    Brock A Peters

    2015-01-01

    Next generation sequencing (NGS) technologies, primarily based on massively parallel sequencing (MPS), have touched and radically changed almost all aspects of research worldwide. These technologies have allowed for the rapid analysis, to date, of the genomes of more than 2,000 different species. In humans, NGS has arguably had the largest impact. Over 100,000 genomes of individual humans (based on various estimates) have been sequenced, allowing for deep insights into what makes individuals and families unique and what causes disease in each of us. Despite all of this progress, the current state of the art in sequence technology is far from generating a perfect genome sequence and much remains to be understood in the biology of human and other organisms’ genomes. In the article that follows we outline why the perfect genome in humans is important, what is lacking from current human whole genome sequences, and a potential strategy for achieving the perfect genome in a cost effective manner.

  20. MutS recognition: Multiple mismatches and sequence context effects

    Indian Academy of Sciences (India)

    Amita Joshi; Basuthkar J Rao

    2001-12-01

    Escherichia coli MutS is a versatile repair protein that specifically recognizes not only various types of mismatches but also single stranded loops of up to 4 nucleotides in length. Specific binding, followed by the next step of tracking the DNA helix that locates hemi-methylated sites, is regulated by the conformational state of the protein as a function of ATP binding/hydrolysis. Here, we study how various molecular determinants of a heteroduplex regulate mismatch recognition by MutS, the critical first step of mismatch repair. Using classical DNase I footprinting assays, we demonstrate that the hierarchy of MutS binding to various types of mismatches is identical whether the mismatches are present singly or in multiples. Moreover, this unique hierarchy is indifferent both to the differential level of DNA helical flexibility and to the unpaired status of the mismatched bases in a heteroduplex. Surprisingly, multiple mismatches exhibit reduced affinity of binding to MutS, compared to that of a similar single mismatch. Such a reduction in the affinity might be due to sequence context effects, which we established more directly by studying two identical single mismatches in an altered sequence background. A mismatch, upon simply being flipped at the same location, elicits changes in MutS specific contacts, thereby underscoring the importance of sequence context in modulating MutS binding to mismatches.

  1. Transfer in motor sequence learning: effects of practice schedule and sequence context

    OpenAIRE

    Diana Margit Müssgens; Fredrik Ullén

    2015-01-01

    Transfer (i.e., the application of a learned skill in a novel context) is an important and desirable outcome of motor skill learning. While much research has been devoted to understanding transfer of explicit skills the mechanisms of skill transfer after incidental learning remain poorly understood. The aim of this study was to 1) examine the effect of practice schedule on transfer and 2) investigate whether sequence-specific knowledge can transfer to an unfamiliar sequence context. We traine...

  2. Accuracy of structure-based sequence alignment of automatic methods

    Directory of Open Access Journals (Sweden)

    Lee Byungkook

    2007-09-01

    similarity is low, structure-based methods produce better sequence alignments than by using sequence similarities alone. However, current structure-based methods still mis-align 11–19% of the conserved core residues when compared to the human-curated CDD alignments. The alignment quality of each program depends on the protein structural type and similarity, with DaliLite showing the most agreement with CDD on average.

  3. Repeat Sequences and Base Correlations in Human Y Chromosome Palindromes

    Institute of Scientific and Technical Information of China (English)

    Neng-zhi Jin; Zi-xian Liu; Yan-jiao Qi; Wen-yuan Qiu

    2009-01-01

    On the basis of information theory and statistical methods, we use mutual information, n-tuple entropy and conditional entropy, combined with biological characteristics, to analyze the long range and short range correlations in human Y chromosome palindromes. The magnitude of the long range correlation, as reflected by the mutual information, follows the order P5>P5a>P5b (P5a and P5b are the sequences obtained by replacing only the Alu repeats and all interspersed repeats, respectively, with random uncorrelated sequences in human Y chromosome palindrome 5); the magnitude of the short range correlation, as reflected by the n-tuple entropy and the conditional entropy, follows the order P5>P5a>P5b>random uncorrelated sequence. In other words, when the Alu repeats and then all interspersed repeats are replaced with random uncorrelated sequence, the long range and short range correlations decrease gradually, whereas the random uncorrelated sequence itself shows no correlation. This research indicates that more repeat sequences result in stronger correlation between bases in the human Y chromosome. The analyses may help in understanding the special structures of human Y chromosome palindromes more deeply.

  4. Generating Multiple Base-Resolution DNA Methylomes Using Reduced Representation Bisulfite Sequencing.

    Science.gov (United States)

    Chatterjee, Aniruddha; Rodger, Euan J; Stockwell, Peter A; Le Mée, Gwenn; Morison, Ian M

    2017-01-01

    Reduced representation bisulfite sequencing (RRBS) is an effective technique for profiling genome-wide DNA methylation patterns in eukaryotes. RRBS couples size selection, bisulfite conversion, and second-generation sequencing to enrich for CpG-dense regions of the genome. The progressive improvement of second-generation sequencing technologies and reduction in cost provided an opportunity to examine the DNA methylation patterns of multiple genomes. Here, we describe a protocol for sequencing multiple RRBS libraries in a single sequencing reaction to generate base-resolution methylomes. Furthermore, we provide a brief guideline for base-calling and data analysis of multiplexed RRBS libraries. These strategies will be useful to perform large-scale, genome-wide DNA methylation analysis.

  5. Recursive organizer (ROR): an analytic framework for sequence-based association analysis.

    Science.gov (United States)

    Zhao, Lue Ping; Huang, Xin

    2013-07-01

    The advent of next-generation sequencing technologies affords the ability to sequence thousands of subjects cost-effectively, and is revolutionizing the landscape of genetic research. With the evolving genotyping/sequencing technologies, it is not unrealistic to expect that we will soon obtain a pair of diploid, fully phased genome sequences from each subject. Here, in light of this potential, we propose an analytic framework called recursive organizer (ROR), which recursively groups sequence variants, based upon sequence similarities and their empirical disease associations, into fewer and potentially more interpretable super sequence variants (SSV). As an illustration, we applied ROR to assess an association between HLA-DRB1 and type 1 diabetes (T1D), discovering SSVs of HLA-DRB1 with sequence data from the Wellcome Trust Case Control Consortium. Specifically, ROR reduces 36 observed unique HLA-DRB1 sequences into 8 SSVs that empirically associate with T1D, a fourfold reduction of sequence complexity. Using HLA-DRB1 data from Type 1 Diabetes Genetics Consortium as cases and data from Fred Hutchinson Cancer Research Center as controls, we are able to validate associations of these SSVs with T1D. Further, SSVs consist of nine nucleotides, and each associates with its corresponding amino acids. Detailed examination of these selected amino acids reveals their potential functional roles in protein structures and possible implications for the mechanism of T1D.

  6. Transgenerational inheritance: Models and mechanisms of non-DNA sequence-based inheritance.

    Science.gov (United States)

    Miska, Eric A; Ferguson-Smith, Anne C

    2016-10-07

    Heritability has traditionally been thought to be a characteristic feature of the genetic material of an organism-notably, its DNA. However, it is now clear that inheritance not based on DNA sequence exists in multiple organisms, with examples found in microbes, plants, and invertebrate and vertebrate animals. In mammals, the molecular mechanisms have been challenging to elucidate, in part due to difficulties in designing robust models and approaches. Here we review some of the evidence, concepts, and potential mechanisms of non-DNA sequence-based transgenerational inheritance. We highlight model systems and discuss whether phenotypes are replicated or reconstructed over successive generations, as well as whether mechanisms operate at transcriptional and/or posttranscriptional levels. Finally, we explore the short- and long-term implications of non-DNA sequence-based inheritance. Understanding the effects of non-DNA sequence-based mechanisms is key to a full appreciation of heritability in health and disease.

  7. Identifying Affinity Classes of Inorganic Materials Binding Sequences via a Graph-Based Model.

    Science.gov (United States)

    Du, Nan; Knecht, Marc R; Swihart, Mark T; Tang, Zhenghua; Walsh, Tiffany R; Zhang, Aidong

    2015-01-01

    Rapid advances in bionanotechnology have recently generated growing interest in identifying peptides that bind to inorganic materials and classifying them based on their inorganic material affinities. However, there are some distinct characteristics of inorganic materials binding sequence data that limit the performance of many widely-used classification methods when applied to this problem. In this paper, we propose a novel framework to predict the affinity classes of peptide sequences with respect to an associated inorganic material. We first generate a large set of simulated peptide sequences based on an amino acid transition matrix tailored for the specific inorganic material. Then the probability of test sequences belonging to a specific affinity class is calculated by minimizing an objective function. In addition, the objective function is minimized through iterative propagation of probability estimates among sequences and sequence clusters. Results of computational experiments on two real inorganic material binding sequence data sets show that the proposed framework is highly effective for identifying the affinity classes of inorganic material binding sequences. Moreover, the experiments on the structural classification of proteins (SCOP) data set show that the proposed framework is general and can be applied to traditional protein sequences.

  8. How effective is graphene nanopore geometry on DNA sequencing?

    CERN Document Server

    Satarifard, Vahid; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on the process of pulling homopolymer ssDNA through a nanopore, using steered molecular dynamics (SMD) simulations. Different graphene nanopores are examined, including axially symmetric and asymmetric monolayer graphene nanopores as well as five-layer graphene polyhedral crystals (GPC). The pulling force profile, the manner in which the ssDNA moves, the work done in irreversible DNA pulling and the orientations of DNA bases near the nanopore are assessed. Simulation results demonstrate the strong effect of the pore shape as well as geometrical symmetry on the free energy barrier, orientations and dynamics of DNA translocation through a graphene nanopore. Our study proposes that the symmetric circular geometry of a monolayer graphene nanopore with high pulling velocity can be used for DNA sequencing.

  9. Protein Function Prediction Based on Sequence and Structure Information

    KAUST Repository

    Smaili, Fatima Z.

    2016-05-25

    The number of available protein sequences in public databases is increasing exponentially. However, a significant fraction of these sequences lack functional annotation which is essential to our understanding of how biological systems and processes operate. In this master thesis project, we worked on inferring protein functions based on the primary protein sequence. In the approach we follow, 3D models are first constructed using I-TASSER. Functions are then deduced by structurally matching these predicted models, using global and local similarities, through three independent enzyme commission (EC) and gene ontology (GO) function libraries. The method was tested on 250 “hard” proteins, which lack homologous templates in both structure and function libraries. The results show that this method outperforms the conventional prediction methods based on sequence similarity or threading. Additionally, our method could be improved even further by incorporating protein-protein interaction information. Overall, the method we use provides an efficient approach for automated functional annotation of non-homologous proteins, starting from their sequence.

  10. Which Microbial Communities Are Present? Sequence-Based Metagenomics

    Science.gov (United States)

    Caffrey, Sean M.

    The use of metagenomic methods that directly sequence environmental samples has revealed the extraordinary microbial diversity missed by traditional culture-based methodologies. Therefore, to develop a complete and representative model of an environment's microbial community and activities, metagenomic analysis is an essential tool.

  11. Constructing Breaker Sequence based System Restoration Strategy with Graph Theory

    OpenAIRE

    Peng, C.; Qin, Z.; Wang, C.; Hou, Y

    2014-01-01

    This paper proposes a mapping approach to serve as an interface between the branch-bus model and the breaker-based model. In order to find the specific optimal operation for breakers in substations according to the restoration strategies, the paper first establishes the breaker-based model for the substation by using graph theory, and then determines the optimal operation sequence for breakers by adopting the Dijkstra algorithm. Finally, a case study for a realistic power s...
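
    The Dijkstra step mentioned in this abstract can be sketched in Python as a shortest-path search over a weighted graph. This is a minimal sketch under stated assumptions: the node names, the edge weights (standing in for a hypothetical per-breaker operation cost) and the function name `dijkstra` are illustrative and are not taken from the paper.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (total_cost, path) of a cheapest path from source to target.

    graph maps each node to a dict {neighbour: non-negative edge cost}.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical substation: nodes are bus sections, edges are breakers,
# weights are an assumed cost of operating each breaker.
substation = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
cost, sequence = dijkstra(substation, "A", "D")
```

    In this toy graph the cheapest route from A to D passes through B and C with a total cost of 4, so the breakers along that path would be the ones to operate, in that order.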

  12. A New Images Hiding Scheme Based on Chaotic Sequences

    Institute of Scientific and Technical Information of China (English)

    LIU Nian-sheng; GUO Dong-hui; WU Bo-xi; Parr G

    2005-01-01

    We propose a data hiding technique for still images. The technique is based on chaotic sequences in the transform domain of the covert image. We multiply multiple sensitive images by different chaotic random sequences, respectively, to spread the spectrum of the sensitive images. The multiple sensitive images are then hidden in a covert image in the form of noise. The results of theoretical analysis and computer simulation show that the new hiding technique has better properties, with high security, imperceptibility and capacity for hidden information, in comparison with conventional schemes such as LSB (Least Significant Bit).
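
    The spreading idea in this abstract can be illustrated with a short Python sketch in which a logistic map generates a +1/-1 chip sequence that multiplies the payload. The logistic map, its parameter r = 3.99, the burn-in length and all function names are assumptions chosen for illustration; the abstract does not say which chaotic map or transform the authors use, and the transform-domain embedding itself is omitted here.

```python
def chaotic_chips(x0, n, r=3.99, burn_in=100):
    """Generate n chips in {+1, -1} from a logistic map seeded with x0."""
    x = x0
    for _ in range(burn_in):          # discard the initial transient
        x = r * x * (1.0 - x)
    chips = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        chips.append(1 if x >= 0.5 else -1)
    return chips


def spread(bits, chips_per_bit, x0):
    """Multiply each payload bit (mapped to +1/-1) by its block of chips."""
    chips = chaotic_chips(x0, len(bits) * chips_per_bit)
    out = []
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        block = chips[i * chips_per_bit:(i + 1) * chips_per_bit]
        out.extend(sign * c for c in block)
    return out


def despread(signal, chips_per_bit, x0):
    """Recover bits by correlating each block with the same chip sequence."""
    chips = chaotic_chips(x0, len(signal))
    bits = []
    for i in range(0, len(signal), chips_per_bit):
        corr = sum(s * c for s, c in zip(signal[i:i + chips_per_bit],
                                         chips[i:i + chips_per_bit]))
        bits.append(1 if corr > 0 else 0)
    return bits
```

    The initial value x0 acts as the key in this sketch: a receiver that regenerates the chips from the same x0 recovers the payload exactly, while the spread signal on its own looks noise-like.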

  13. 3D Motion Parameters Determination Based on Binocular Sequence Images

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Exactly capturing three dimensional (3D) motion information of an object is an essential and important task in computer vision, and is also one of the most difficult problems. In this paper, a binocular vision system and a method for determining 3D motion parameters of an object from binocular sequence images are introduced. The main steps include camera calibration, the matching of motion and stereo images, 3D feature point correspondences and resolving the motion parameters. Finally, the experimental results of acquiring the motion parameters of the objects with uniform velocity and acceleration in the straight line based on the real binocular sequence images by the mentioned method are presented.

  14. A comparative analysis of HIV drug resistance interpretation based on short reverse transcriptase sequences versus full sequences

    Directory of Open Access Journals (Sweden)

    Stevens Wendy

    2010-10-01

    Background: As second-line antiretroviral treatment (ART) becomes more accessible in resource-limited settings (RLS), the need for more affordable monitoring tools such as point-of-care viral load assays and simplified genotypic HIV drug resistance (HIVDR) tests increases substantially. The prohibitive expenses of genotypic HIVDR assays could partly be addressed by focusing on a smaller region of the HIV reverse transcriptase gene (RT) that encompasses the majority of HIVDR mutations for people on ART in RLS. In this study, an in silico analysis of 125,329 RT sequences was performed to investigate the effect of submitting short RT sequences (codon 41 to 238) to the commonly used virco®TYPE and Stanford genotype interpretation tools. Results: Pair-wise comparisons between full-length and short RT sequences were performed. Additionally, a non-inferiority approach with a concordance limit of 95% and two-sided 95% confidence intervals was used to demonstrate concordance between HIVDR calls based on full-length and short RT sequences. The results of this analysis showed that HIVDR interpretations based on full-length versus short RT sequences, using the Stanford algorithms, had concordance significantly above 95%. When using the virco®TYPE algorithm, similar concordance was demonstrated (>95%), but some differences were observed for d4T, AZT and TDF, where predictions were affected in more than 5% of the sequences. Most differences in interpretation, however, were due to shifts from fully susceptible to reduced susceptibility (d4T) or from reduced response to minimal response (AZT, TDF) or vice versa, as compared to the predicted full RT sequence. The virco®TYPE prediction uses many more mutations outside the RT 41-238 amino acid domain, which significantly contribute to the HIVDR prediction for these 3 antiretroviral agents. Conclusions: This study illustrates the acceptability of using shortened RT sequences (codon 41-238) to obtain reliable

  15. Different Sequences of Feedback Types: Effectiveness, Attitudes, and Preferences

    Science.gov (United States)

    Wanchid, Raveewan

    2015-01-01

    The purposes of this research were: 1) to compare the effects of different sequences of feedback types on the students' writing ability and their effect size; 2) to compare the effects of the levels of general English proficiency (high, moderate, and low) on the students' writing ability and their effect size; 3) to investigate the interaction…

  16. Genetic interaction mapping with microfluidic-based single cell sequencing

    Science.gov (United States)

    Haliburton, John R.; Shao, Wenjun; Deutschbauer, Adam; Arkin, Adam; Abate, Adam R.

    2017-01-01

    Genetic interaction mapping is useful for understanding the molecular basis of cellular decision making, but elucidating interactions genome-wide is challenging due to the massive number of gene combinations that must be tested. Here, we demonstrate a simple approach to thoroughly map genetic interactions in bacteria using microfluidic-based single cell sequencing. Using single cell PCR in droplets, we link distinct genetic information into single DNA sequences that can be decoded by next generation sequencing. Our approach is scalable and theoretically enables the pooling of entire interaction libraries to interrogate multiple pairwise genetic interactions in a single culture. The speed, ease, and low-cost of our approach makes genetic interaction mapping viable for routine characterization, allowing the interaction network to be used as a universal read out for a variety of biology experiments, and for the elucidation of interaction networks in non-model organisms. PMID:28170417

  17. Spectroscopic investigation on the telomeric DNA base sequence repeat

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Telomeres are protein-DNA complexes at the termini of linear chromosomes, which protect chromosomal integrity and maintain cellular replicative capacity. From single-cell organisms to advanced animals and plants, the structures and functions of telomeres are both highly conserved. In cells of humans and vertebrate animals, telomeric DNA base sequences are all (TTAGGG)n. In the present work, we have obtained absorption and fluorescence spectra measured from seven synthesized oligonucleotides to simulate the telomeric DNA system and calculated their relative fluorescence quantum yields on which not only telomeric DNA characteristics are predicted but also possibly the shortened telomeric sequences during cell division are im[...] relative fluorescence quantum yield and remarkable excitation energy internal conversion, which tallies with the telomeric sequence of (TTAGGG)n. This result shows that telomeric DNA has a strong non-radiative or internal-conversion capability.

  18. Phylogenetic relationships of Malassezia species based on multilocus sequence analysis.

    Science.gov (United States)

    Castellá, Gemma; Coutinho, Selene Dall' Acqua; Cabañes, F Javier

    2014-01-01

    Members of the genus Malassezia are lipophilic basidiomycetous yeasts, which are part of the normal cutaneous microbiota of humans and other warm-blooded animals. Currently, this genus consists of 14 species that have been characterized by phenetic and molecular methods. Although several molecular methods have been used to identify and/or differentiate Malassezia species, the sequencing of the rRNA genes and the chitin synthase-2 gene (CHS2) are the most widely employed. There is little information about the β-tubulin gene in the genus Malassezia, a gene that has been used for the analysis of complex species groups. The aim of the present study was to sequence a fragment of the β-tubulin gene of Malassezia species and analyze their phylogenetic relationship using a multilocus sequence approach based on two rRNA genes (ITS including 5.8S rRNA and D1/D2 region of 26S rRNA) together with two protein encoding genes (CHS2 and β-tubulin). The phylogenetic study of the partial β-tubulin gene sequences indicated that this molecular marker can be used to assess diversity and identify new species. The multilocus sequence analysis of the four loci provides robust support to delineate species at the terminal nodes and could help to estimate divergence times for the origin and diversification of Malassezia species.

  19. DUK - A Fast and Efficient Kmer Based Sequence Matching Tool

    Energy Technology Data Exchange (ETDEWEB)

    Li, Mingkun; Copeland, Alex; Han, James

    2011-03-21

    A new tool, DUK, has been developed to perform the matching task. Matching determines whether a query sequence partially or totally matches given reference sequences. Matching is similar to alignment; indeed, many traditional analysis tasks such as contaminant removal use alignment tools. For matching, however, there is no need to know which bases of a query sequence match which positions of a reference sequence; it is only necessary to know whether a match exists. This subtle difference can make the matching task much faster than alignment. DUK is accurate, versatile and fast, and uses memory efficiently. It uses a Kmer hashing method to index reference sequences and a Poisson model to calculate p-values. DUK is carefully implemented in C++ with an object-oriented design, and the resulting classes can also be used to develop other tools quickly. DUK has been widely used at JGI for a wide range of applications such as contaminant removal, organelle genome separation, and assembly refinement. Many real applications and simulated datasets demonstrate its power.
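
    The general idea of k-mer matching with a Poisson significance estimate can be sketched in Python as follows. This is not DUK's implementation (DUK is written in C++ and its internals are not described in this abstract): the function names, the use of a Python set as the hash index, and the choice of the Poisson mean as (number of query k-mers) x (number of indexed k-mers) / 4^k are all assumptions made for illustration.

```python
from math import exp


def build_index(references, k):
    """Hash every k-mer of every reference sequence into a set."""
    index = set()
    for ref in references:
        ref = ref.upper()
        for i in range(len(ref) - k + 1):
            index.add(ref[i:i + k])
    return index


def poisson_tail(hits, lam):
    """P(X >= hits) for X ~ Poisson(lam)."""
    term = exp(-lam)   # P(X = 0)
    cdf = 0.0
    for i in range(hits):
        cdf += term
        term *= lam / (i + 1)
    return max(0.0, 1.0 - cdf)


def match(query, index, k):
    """Count query k-mers found in the index and estimate a p-value
    for seeing at least that many hits by chance (assumed null model:
    each query k-mer hits a random 4**k-sized k-mer space uniformly)."""
    query = query.upper()
    n_kmers = max(0, len(query) - k + 1)
    hits = sum(1 for i in range(n_kmers) if query[i:i + k] in index)
    lam = n_kmers * len(index) / 4 ** k
    return hits, poisson_tail(hits, lam)
```

    No alignment coordinates are computed anywhere in this sketch, which mirrors the distinction the abstract draws between matching and alignment: only the hit count and its significance are reported.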

  20. Detailed protein sequence alignment based on Spectral Similarity Score (SSS)

    Directory of Open Access Journals (Sweden)

    Thomas Dina

    2005-04-01

    Background: The chemical property and biological function of a protein is a direct consequence of its primary structure. Several algorithms have been developed which determine alignment and similarity of primary protein sequences. However, character based similarity cannot provide insight into the structural aspects of a protein. We present a method based on spectral similarity to compare subsequences of amino acids that behave similarly but are not aligned well by considering amino acids as mere characters. This approach finds a similarity score between sequences based on any given attribute, like hydrophobicity of amino acids, on the basis of spectral information after partial conversion to the frequency domain. Results: Distance matrices of various branches of the human kinome, that is the full complement of human kinases, were developed that matched the phylogenetic tree of the human kinome establishing the efficacy of the global alignment of the algorithm. PKCd and PKCe kinases share close biological properties and structural similarities but do not give high scores with character based alignments. Detailed comparison established close similarities between subsequences that do not have any significant character identity. We compared their known 3D structures to establish that the algorithm is able to pick subsequences that are not considered similar by character based matching algorithms but share structural similarities. Similarly many subsequences with low character identity were picked between xyna-theau and xyna-clotm F/10 xylanases. Comparison of 3D structures of the subsequences confirmed the claim of similarity in structure. Conclusion: An algorithm is developed which is inspired by successful application of spectral similarity applied to music sequences. The method captures subsequences that do not align by traditional character based alignment tools but give rise to similar secondary and tertiary structures. The Spectral

  1. Repeat-based Sequence Typing of Carnobacterium maltaromaticum.

    Science.gov (United States)

    Rahman, Abdur; El Kheir, Sara M; Back, Alexandre; Mangavel, Cécile; Revol-Junelles, Anne-Marie; Borges, Frédéric

    2016-06-01

    Carnobacterium maltaromaticum is a Lactic Acid Bacterium (LAB) of technological interest for the food industry, especially the dairy industry, as a bioprotection and ripening flora. The industrial use of this LAB requires accurate, high-resolution typing tools. A new typing method for C. maltaromaticum, inspired by MLVA analysis and called Repeat-based Sequence Typing (RST), is described. Rather than electrophoresis analysis, our RST method is based on sequence analysis of multiple loci containing Variable-Number Tandem-Repeats (VNTRs). The method described here for C. maltaromaticum relies on the analysis of three VNTR loci, and was applied to a collection of 24 strains. For each strain, a PCR product corresponding to the amplification of each VNTR locus was sequenced. Sequence analysis allowed delineating 11, 11, and 12 alleles for loci VNTR-A, VNTR-B, and VNTR-C, respectively. Considering the allele combination exhibited by each strain allowed defining 15 genotypes, resulting in a discriminatory index of 0.94. Comparison with MLST revealed that both methods were complementary for strain typing in C. maltaromaticum.
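
    A discriminatory index of this kind is commonly computed with the Hunter-Gaston formula, D = 1 - sum(n_j (n_j - 1)) / (N (N - 1)), where N is the number of strains and n_j the number of strains sharing genotype j. The abstract does not state which formula was used, so the Python sketch below assumes Hunter-Gaston; the function name and the allele labels in the usage note are hypothetical.

```python
from collections import Counter


def discriminatory_index(genotypes):
    """Hunter-Gaston discriminatory index for a list of genotype labels.

    Each label can be any hashable value, for example a tuple of the
    alleles observed at the typed loci for one strain.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(genotypes).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))
```

    As a usage example with made-up data, four strains typed at three loci that fall into two genotypes of two strains each, such as ("A1", "B1", "C1") twice and ("A2", "B1", "C3") twice, give D = 1 - 4/12, or about 0.67; a collection in which every strain has a distinct genotype gives D = 1.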

  2. Revision of Begomovirus taxonomy based on pairwise sequence comparisons

    KAUST Repository

    Brown, Judith K.

    2015-04-18

    Viruses of the genus Begomovirus (family Geminiviridae) are emergent pathogens of crops throughout the tropical and subtropical regions of the world. By virtue of having a small DNA genome that is easily cloned, and due to the recent innovations in cloning and low-cost sequencing, there has been a dramatic increase in the number of available begomovirus genome sequences. Even so, most of the available sequences have been obtained from cultivated plants and are likely a small and phylogenetically unrepresentative sample of begomovirus diversity, a factor constraining taxonomic decisions such as the establishment of operationally useful species demarcation criteria. In addition, problems in assigning new viruses to established species have highlighted shortcomings in the previously recommended mechanism of species demarcation. Based on the analysis of 3,123 full-length begomovirus genome (or DNA-A component) sequences available in public databases as of December 2012, a set of revised guidelines for the classification and nomenclature of begomoviruses are proposed. The guidelines primarily consider a) genus-level biological characteristics and b) results obtained using a standardized classification tool, Sequence Demarcation Tool, which performs pairwise sequence alignments and identity calculations. These guidelines are consistent with the recently published recommendations for the genera Mastrevirus and Curtovirus of the family Geminiviridae. Genome-wide pairwise identities of 91 % and 94 % are proposed as the demarcation threshold for begomoviruses belonging to different species and strains, respectively. Procedures and guidelines are outlined for resolving conflicts that may arise when assigning species and strains to categories wherever the pairwise identity falls on or very near the demarcation threshold value.
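
    The two demarcation thresholds proposed in this abstract (91 % for species, 94 % for strains) can be applied to a pairwise identity value as in the Python sketch below. This is only an illustrative sketch and not the Sequence Demarcation Tool: the function names, the treatment of a value exactly on a threshold as belonging to the same species or strain, the handling of gap columns, and the 0.5-point "borderline" margin are all assumptions.

```python
SPECIES_THRESHOLD = 91.0  # percent identity, from the abstract
STRAIN_THRESHOLD = 94.0   # percent identity, from the abstract


def pairwise_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are skipped; a gap opposite
    a base counts as a mismatch (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def classify(identity, margin=0.5):
    """Return (label, borderline) for a percent identity value."""
    if identity >= STRAIN_THRESHOLD:
        label = "same strain"
    elif identity >= SPECIES_THRESHOLD:
        label = "same species, different strain"
    else:
        label = "different species"
    # Flag values close to either threshold for manual review.
    borderline = min(abs(identity - SPECIES_THRESHOLD),
                     abs(identity - STRAIN_THRESHOLD)) <= margin
    return label, borderline
```

    The borderline flag reflects the abstract's remark that separate procedures are needed when an identity value falls on or very near a threshold; what those procedures are is not covered by this sketch.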

  3. Effects of Sequence on Transmission Properties of DNA Molecules

    Institute of Scientific and Technical Information of China (English)

    DONG Rui-Xin; YAN Xun-Ling; YANG Bing

    2008-01-01

    A double-helix model of charge transport in DNA molecules is given, and the transmission spectra of four DNA sequences are obtained. The calculated results show that the transmission characteristics of DNA are related not only to the longitudinal transport but also to the transverse transport of the molecule. For the same composition, a periodic sequence has stronger conduction ability. As the base composition increases, the conductive ability decreases, but the weight of the θ direction in charge transfer rises.

  4. Antibiotic Selection Pressure Determination through Sequence-Based Metagenomics.

    Science.gov (United States)

    Willmann, Matthias; El-Hadidi, Mohamed; Huson, Daniel H; Schütz, Monika; Weidenmaier, Christopher; Autenrieth, Ingo B; Peter, Silke

    2015-12-01

    The human gut forms a dynamic reservoir of antibiotic resistance genes (ARGs). Treatment with antimicrobial agents has a significant impact on the intestinal resistome and leads to enhanced horizontal transfer and selection of resistance. We have monitored the development of intestinal ARGs over a 6-day course of ciprofloxacin (Cp) treatment in two healthy individuals by using sequence-based metagenomics and different ARG quantification methods. Fixed- and random-effect models were applied to determine the change in ARG abundance per defined daily dose of Cp as an expression of the respective selection pressure. Among various shifts in the composition of the intestinal resistome, we found in one individual a strong positive selection for class D beta-lactamases which were partly located on a mobile genetic element. Furthermore, a trend to a negative selection has been observed with class A beta-lactamases (-2.66 hits per million sample reads/defined daily dose; P = 0.06). By 4 weeks after the end of treatment, the composition of ARGs returned toward their initial state but to a different degree in both subjects. We present here a novel analysis algorithm for the determination of antibiotic selection pressure which can be applied in clinical settings to compare therapeutic regimens regarding their effect on the intestinal resistome. This information is of critical importance for clinicians to choose antimicrobial agents with a low selective force on their patients' intestinal ARGs, likely resulting in a diminished spread of resistance and a reduced burden of hospital-acquired infections with multidrug-resistant pathogens.

  5. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley.

    Directory of Open Access Journals (Sweden)

    Martin Mascher

    The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data; the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, paucity in the number of SNP that were jointly discovered by the different pipelines indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new

  6. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley.

    Science.gov (United States)

    Mascher, Martin; Wu, Shuangye; Amand, Paul St; Stein, Nils; Poland, Jesse

    2013-01-01

    The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data; the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, paucity in the number of SNP that were jointly discovered by the different pipelines indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new sequencing

  7. Spike-Based Bayesian-Hebbian Learning of Temporal Sequences.

    Science.gov (United States)

    Tully, Philip J; Lindén, Henrik; Hennig, Matthias H; Lansner, Anders

    2016-05-01

    Many cognitive and motor functions are enabled by the temporal representation and processing of stimuli, but it remains an open issue how neocortical microcircuits can reliably encode and replay such sequences of information. To better understand this, a modular attractor memory network is proposed in which meta-stable sequential attractor transitions are learned through changes to synaptic weights and intrinsic excitabilities via the spike-based Bayesian Confidence Propagation Neural Network (BCPNN) learning rule. We find that the formation of distributed memories, embodied by increased periods of firing in pools of excitatory neurons, together with asymmetrical associations between these distinct network states, can be acquired through plasticity. The model's feasibility is demonstrated using simulations of adaptive exponential integrate-and-fire model neurons (AdEx). We show that the learning and speed of sequence replay depends on a confluence of biophysically relevant parameters including stimulus duration, level of background noise, ratio of synaptic currents, and strengths of short-term depression and adaptation. Moreover, sequence elements are shown to flexibly participate multiple times in the sequence, suggesting that spiking attractor networks of this type can support an efficient combinatorial code. The model provides a principled approach towards understanding how multiple interacting plasticity mechanisms can coordinate hetero-associative learning in unison.

  8. Spike-Based Bayesian-Hebbian Learning of Temporal Sequences.

    Directory of Open Access Journals (Sweden)

    Philip J Tully

    2016-05-01

    Many cognitive and motor functions are enabled by the temporal representation and processing of stimuli, but it remains an open issue how neocortical microcircuits can reliably encode and replay such sequences of information. To better understand this, a modular attractor memory network is proposed in which meta-stable sequential attractor transitions are learned through changes to synaptic weights and intrinsic excitabilities via the spike-based Bayesian Confidence Propagation Neural Network (BCPNN) learning rule. We find that the formation of distributed memories, embodied by increased periods of firing in pools of excitatory neurons, together with asymmetrical associations between these distinct network states, can be acquired through plasticity. The model's feasibility is demonstrated using simulations of adaptive exponential integrate-and-fire model neurons (AdEx). We show that the learning and speed of sequence replay depends on a confluence of biophysically relevant parameters including stimulus duration, level of background noise, ratio of synaptic currents, and strengths of short-term depression and adaptation. Moreover, sequence elements are shown to flexibly participate multiple times in the sequence, suggesting that spiking attractor networks of this type can support an efficient combinatorial code. The model provides a principled approach towards understanding how multiple interacting plasticity mechanisms can coordinate hetero-associative learning in unison.

  9. Leveraging Sequence Classification by Taxonomy-Based Multitask Learning

    Science.gov (United States)

    Widmer, Christian; Leiva, Jose; Altun, Yasemin; Rätsch, Gunnar

    In this work we consider an inference task that biologists are very good at: deciphering biological processes by bringing together knowledge that has been obtained by experiments using various organisms, while respecting the differences and commonalities of these organisms. We look at this problem from a sequence analysis point of view, where we aim at solving the same classification task in different organisms. We investigate the challenge of combining information from several organisms, where we consider the relation between the organisms to be defined by a tree structure derived from their phylogeny. Multitask learning, a machine learning technique that has recently received considerable attention, considers the problem of learning across tasks that are related to each other. We treat each organism as one task and present three novel multitask learning methods to handle situations in which the relationships among tasks can be described by a hierarchy. These algorithms are designed for large-scale applications and are therefore applicable to problems with a large number of training examples, which are frequently encountered in sequence analysis. We perform experimental analyses on synthetic data sets in order to illustrate the properties of our algorithms. Moreover, we consider a problem from genomic sequence analysis, namely splice site recognition, to illustrate the usefulness of our approach. We show that intelligently combining data from 15 eukaryotic organisms can indeed significantly improve the prediction performance compared to traditional learning approaches. From a broader perspective, we expect that algorithms like the ones presented in this work have the potential to complement and enrich the strategy of homology-based sequence analysis that is currently the quasi-standard in biological sequence analysis.

  10. Sequence Alignment with Dynamic Divisor Generation for Keystroke Dynamics Based User Authentication

    Directory of Open Access Journals (Sweden)

    Jiacang Ho

    2015-01-01

    Keystroke dynamics based authentication is one of the prevention mechanisms used to protect one’s account from criminals’ illegal access. In this authentication mechanism, keystroke dynamics are used to capture patterns in a user's typing behavior. Sequence alignment is shown to be one of the effective algorithms for keystroke dynamics based authentication, by comparing the sequences of keystroke data to detect an imposter’s anomalous sequences. In previous research, a static divisor has been used for sequence generation from the keystroke data, which is a number used to divide a time difference of keystroke data into equal-length subintervals. After the division, the subintervals are mapped to alphabet letters to form sequences. One major drawback of this static divisor is that the amount of data for this subinterval generation is often insufficient, which leads to premature termination of subinterval generation and consequently causes inaccurate sequence alignment. To alleviate this problem, we introduce sequence alignment of dynamic divisor (SADD) in this paper. In SADD, we use the mean of Horner’s rule technique to generate dynamic divisors and apply them to produce subintervals with different lengths. The comparative experimental results with SADD and other existing algorithms indicate that SADD is usually comparable to and often outperforms other existing algorithms.
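
    The static-divisor step described above (divide each keystroke time difference by a fixed number, then map the resulting subinterval to a letter) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the 50 ms divisor, the clamping of long pauses to the last letter, and the sample timings are all hypothetical, and the sketch does not implement SADD or its dynamic divisors.

```python
import string


def intervals_to_sequence(time_diffs_ms, divisor_ms, alphabet=string.ascii_uppercase):
    """Map keystroke time differences to letters using equal-width bins.

    Hypothetical helper: each time difference is integer-divided by a
    static divisor, and the resulting bin index selects a letter.
    """
    letters = []
    for dt in time_diffs_ms:
        bin_index = int(dt // divisor_ms)
        # Assumption: pauses beyond the last bin are clamped to the final letter.
        bin_index = min(bin_index, len(alphabet) - 1)
        letters.append(alphabet[bin_index])
    return "".join(letters)


# Made-up time differences (milliseconds) between consecutive key events.
sample = [35, 120, 80, 410, 95]
print(intervals_to_sequence(sample, divisor_ms=50))  # prints "ACBIB"
```

    Two typing samples converted this way become strings that any sequence alignment routine can compare.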

  11. Entamoeba histolytica: observations on metabolism based on the genome sequence

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain J.; Loftus, Brendan J.

    2005-07-01

    The sequencing of the genome of Entamoeba histolytica has allowed a reconstruction of its metabolic pathways, many of which are unusual for a eukaryote. Based on the genome sequence, it appears that amino acids may play a larger role than previously thought in energy metabolism, with roles in both ATP synthesis and NAD regeneration. Arginine decarboxylase may be involved in survival of E. histolytica during its passage through the stomach. The usual pyrimidine synthesis pathway is absent, but a partial pyrimidine degradation pathway could be part of a novel pyrimidine synthesis pathway. Ribonucleotide reductase was not found in the E. histolytica genome, but it was found in the close relatives Entamoeba invadens and Entamoeba moshkovskii, suggesting a recent loss from E. histolytica. The usual eukaryotic glucose transporters are not present, but members of a prokaryotic monosaccharide transporter family are present.

  12. MOST: a modified MLST typing tool based on short read sequencing

    Science.gov (United States)

    Dallman, Timothy; Schaefer, Ulf; Sheppard, Carmen L.; Ashton, Philip; Pichon, Bruno; Ellington, Matthew; Swift, Craig; Green, Jonathan; Underwood, Anthony

    2016-01-01

    Multilocus sequence typing (MLST) is an effective method to describe bacterial populations. Conventionally, MLST involves Polymerase Chain Reaction (PCR) amplification of housekeeping genes followed by Sanger DNA sequencing. Public Health England (PHE) is in the process of replacing the conventional MLST methodology with a method based on short read sequence data derived from Whole Genome Sequencing (WGS). This paper reports the comparison of the reliability of MLST results derived from WGS data, comparing mapping and assembly-based approaches to conventional methods using 323 bacterial genomes of diverse species. The sensitivity of the two WGS based methods was further investigated with 26 mixed and 29 low coverage genomic data sets from Salmonella enteridis and Streptococcus pneumoniae. Of the 323 samples, 92.9% (n = 300), 97.5% (n = 315) and 99.7% (n = 322) full MLST profiles were derived by the conventional method, assembly- and mapping-based approaches, respectively. The concordance between samples that were typed by conventional (92.9%) and both WGS methods was 100%. From the 55 mixed and low coverage genomes, 89.1% (n = 49) and 67.3% (n = 37) full MLST profiles were derived from the mapping and assembly based approaches, respectively. In conclusion, deriving MLST from WGS data is more sensitive than the conventional method. When comparing WGS based methods, the mapping based approach was the most sensitive. In addition, the mapping based approach described here derives quality metrics, which are difficult to determine quantitatively using conventional and WGS-assembly based approaches. PMID:27602279
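
    The percentages quoted above follow directly from the reported counts. The short check below recomputes them. The helper name `pct` is hypothetical, and rounding to one decimal place is an assumption about how the abstract's figures were produced.

```python
def pct(count, total):
    """Percentage of `count` out of `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Counts taken from the abstract: full MLST profiles out of 323 samples.
for label, n in [("conventional", 300), ("assembly", 315), ("mapping", 322)]:
    print(label, pct(n, 323))

# Mixed and low coverage data sets: full profiles out of 55 genomes.
for label, n in [("mapping", 49), ("assembly", 37)]:
    print(label, pct(n, 55))
```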

  13. Model-based quality assessment and base-calling for second-generation sequencing data.

    Science.gov (United States)

    Bravo, Héctor Corrada; Irizarry, Rafael A

    2010-09-01

    Second-generation sequencing (sec-gen) technology can sequence millions of short fragments of DNA in parallel, making it capable of assembling complex genomes for a small fraction of the price and time of previous technologies. In fact, a recently formed international consortium, the 1000 Genomes Project, plans to fully sequence the genomes of approximately 1200 people. The prospect of comparative analysis at the sequence level of a large number of samples across multiple populations may be achieved within the next five years. These data present unprecedented challenges in statistical analysis. For instance, analysis operates on millions of short nucleotide sequences, or reads (strings of A, C, G, or T's, between 30 and 100 characters long), which are the result of complex processing of noisy continuous fluorescence intensity measurements known as base-calling. The complexity of the base-calling discretization process results in reads of widely varying quality within and across sequence samples. This variation in processing quality results in infrequent but systematic errors that we have found to mislead downstream analysis of the discretized sequence read data. For instance, a central goal of the 1000 Genomes Project is to quantify across-sample variation at the single nucleotide level. At this resolution, small error rates in sequencing prove significant, especially for rare variants. Sec-gen sequencing is a relatively new technology for which potential biases and sources of obscuring variation are not yet fully understood. Therefore, modeling and quantifying the uncertainty inherent in the generation of sequence reads is of utmost importance. In this article, we present a simple model to capture uncertainty arising in the base-calling procedure of the Illumina/Solexa GA platform. Model parameters have a straightforward interpretation in terms of the chemistry of base-calling allowing for informative and easily interpretable metrics that capture the variability in

  14. Diagnosis of stator faults in induction motor based on zero sequence voltage after switch-off

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    To improve the accuracy of the stator winding fault diagnosis in induction motor, a new diagnostic method based on the Hilbert-Huang transform (HHT) was proposed. The ratio of fundamental zero sequence voltage to positive sequence voltage after switch-off was selected as the stator fault characteristic, which could effectively avoid the influence of the supply unbalance and the load fluctuation, and directly represent the asymmetry in the stator. Using the empirical mode decomposition (EMD) based on HHT, the zero sequence voltage after switch-off was decomposed and the fundamental component was extracted. Then, the fault characteristic can be acquired. Experimental results on a 4-kW induction motor demonstrate the feasibility and effectiveness of this method.
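
    The fault characteristic above is a ratio of zero sequence to positive sequence voltage. As a rough illustration, the sketch below computes that ratio with the standard symmetrical-component (Fortescue) transform. It assumes the fundamental phasors of the three phase voltages are already known. The paper obtains the fundamental component through EMD/HHT, which is not reproduced here, and the function names and sample phasors are hypothetical.

```python
import cmath
import math

# Rotation operator a = 1 at an angle of 120 degrees.
A = cmath.exp(2j * math.pi / 3)


def symmetrical_components(va, vb, vc):
    """Return (zero, positive, negative) sequence phasors of three phase phasors."""
    v0 = (va + vb + vc) / 3
    v1 = (va + A * vb + A**2 * vc) / 3
    v2 = (va + A**2 * vb + A * vc) / 3
    return v0, v1, v2


def zero_to_positive_ratio(va, vb, vc):
    """Magnitude ratio of zero sequence to positive sequence voltage."""
    v0, v1, _ = symmetrical_components(va, vb, vc)
    return abs(v0) / abs(v1)


# Balanced a-b-c set: the ratio is zero up to rounding error.
print(zero_to_positive_ratio(1.0, A**2, A))
# Phase a reduced by 10 % (made-up asymmetry): the ratio becomes nonzero.
print(zero_to_positive_ratio(0.9, A**2, A))
```

    A symmetric winding gives a ratio near zero, so a growing ratio is what would flag stator asymmetry in this simplified picture.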

  15. ParticleCall: A particle filter for base calling in next-generation sequencing systems

    Directory of Open Access Journals (Sweden)

    Shen Xiaohu

    2012-07-01

    Background: Next-generation sequencing systems are capable of rapid and cost-effective DNA sequencing, thus enabling routine sequencing tasks and taking us one step closer to personalized medicine. Accuracy and lengths of their reads, however, are yet to surpass those provided by the conventional Sanger sequencing method. This motivates the search for computationally efficient algorithms capable of reliable and accurate detection of the order of nucleotides in short DNA fragments from the acquired data. Results: In this paper, we consider Illumina’s sequencing-by-synthesis platform which relies on reversible terminator chemistry and describe the acquired signal by reformulating its mathematical model as a Hidden Markov Model. Relying on this model and sequential Monte Carlo methods, we develop a parameter estimation and base calling scheme called ParticleCall. ParticleCall is tested on a data set obtained by sequencing phiX174 bacteriophage using Illumina’s Genome Analyzer II. The results show that the developed base calling scheme is significantly more computationally efficient than the best performing unsupervised method currently available, while achieving the same accuracy. Conclusions: The proposed ParticleCall provides more accurate calls than Illumina’s base calling algorithm, Bustard. At the same time, ParticleCall is significantly more computationally efficient than other recent schemes with similar performance, rendering it more feasible for high-throughput sequencing data analysis. Improvement of base calling accuracy will have immediate beneficial effects on the performance of downstream applications such as SNP and genotype calling. ParticleCall is freely available at https://sourceforge.net/projects/particlecall.

  16. Analysis of Sequence Based Classifier Prediction for HIV Subtypes

    Directory of Open Access Journals (Sweden)

    S. Santhosh Kumar

    2012-10-01

    Human immunodeficiency virus (HIV) is a lentivirus that causes acquired immunodeficiency syndrome (AIDS). The main drawback in the HIV treatment process is its subtype prediction. The subtype and group classification of HIV is based on its genetic variability and location. HIV can be divided into two major types, HIV type 1 (HIV-1) and HIV type 2 (HIV-2). Many classifier approaches have been used to classify HIV subtypes based on their group, but some cases have two groups in one; in such cases the classification becomes more complex. The methodology used in this paper is based on the HIV sequences. For this work several classifier approaches are used to classify HIV-1 and HIV-2. For implementation of the work a real-time patient database is taken, the patient records are experimented on, and the final best classifier is identified with quick response time and least error rate.

  17. Next-Generation Sequencing-Based Molecular Diagnosis of Choroideremia

    Directory of Open Access Journals (Sweden)

    Kayo Shimizu

    2015-07-01

    We screened patients with choroideremia using next-generation sequencing (NGS) and identified a novel mutation and a known mutation in the CHM gene. One patient presented an atypical fundus appearance for choroideremia. Another patient presented macular hole retinal detachment in the left eye. The present case series shows the utility of NGS-based screening in patients with choroideremia. In addition, the presence of macular hole in 1 of the 2 patients, together with a previous report, indicated the susceptibility of patients with choroideremia to macular hole.

  18. All-optical pseudorandom bit sequences generator based on TOADs

    Science.gov (United States)

    Sun, Zhenchao; Wang, Zhi; Wu, Chongqing; Wang, Fu; Li, Qiang

    2016-03-01

    A scheme for an all-optical pseudorandom bit sequence (PRBS) generator is demonstrated with an optical logic gate 'XNOR' and an all-optical wavelength converter based on cascaded Tera-Hertz Optical Asymmetric Demultiplexers (TOADs). Its feasibility is verified by generation of a return-to-zero on-off keying (RZ-OOK) $2^{63}-1$ PRBS at the speed of 1 Gb/s with 10% duty ratio. The high randomness of the ultra-long cycle PRBS is validated by successfully passing the standard benchmark test.
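
    The role of the XNOR gate in a PRBS generator can be illustrated with a software shift register. The sketch below is an electronic analogue only, not the optical scheme: it assumes a much shorter 7-bit register with feedback taps at stages 7 and 6 (the common PRBS7 polynomial $x^7 + x^6 + 1$), since the abstract does not give the tap positions. All function names are hypothetical.

```python
def xnor_lfsr_step(state, n_bits=7, taps=(7, 6)):
    """Advance a Fibonacci shift register by one bit using XNOR feedback."""
    # Starting from 1 complements the XOR of the tapped bits;
    # for two taps that is exactly their XNOR.
    feedback = 1
    for tap in taps:
        feedback ^= (state >> (tap - 1)) & 1
    next_state = ((state << 1) | feedback) & ((1 << n_bits) - 1)
    return next_state, feedback


def prbs(length, n_bits=7, taps=(7, 6), state=0):
    """Return `length` output bits; the all-zero start is valid with XNOR feedback."""
    bits = []
    for _ in range(length):
        state, bit = xnor_lfsr_step(state, n_bits, taps)
        bits.append(bit)
    return bits


def period(n_bits=7, taps=(7, 6), start=0):
    """Count steps until the register returns to its starting state."""
    state, steps = start, 0
    while True:
        state, _ = xnor_lfsr_step(state, n_bits, taps)
        steps += 1
        if state == start:
            return steps


print(period())  # prints 127, i.e. 2**7 - 1
```

    With XNOR feedback the forbidden lock-up state is all ones rather than all zeros, which is why the register may start empty.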

  19. Rapid sequencing of DNA based on single-molecule detection

    Science.gov (United States)

    Soper, Steven A.; Davis, Lloyd M.; Fairfield, Frederick R.; Hammond, Mark L.; Harger, Carol A.; Jett, James H.; Keller, Richard A.; Marrone, Babetta L.; Martin, John C.; Nutter, Harvey L.; Shera, E. Brooks; Simpson, Daniel J.

    1991-07-01

    Sequencing the human genome is a major undertaking considering the large number of nucleotides present in the genome and the slow methods currently available to perform the task. The authors have recently reported on a scheme to sequence DNA rapidly using a non-gel based technique. The concept is based upon the incorporation of fluorescently labeled nucleotides into a strand of DNA, isolation and manipulation of a labeled DNA fragment and the detection of single nucleotides using ultra-sensitive laser-induced fluorescence detection following their cleavage from the fragment. Detection of individual fluorophores in the liquid phase was accomplished with time-gated detection following pulsed-laser excitation. The photon bursts from individual rhodamine 6G (R6G) molecules travelling through a laser beam have been observed, as have bursts from single fluorescently modified nucleotides. Using two different biotinylated nucleotides as a model system for fluorescently labeled nucleotides, the authors have observed synthesis of the complementary copy of M13 bacteriophage. Work with fluorescently labeled nucleotides is underway. Individual molecules of DNA attached to a microbead have been observed and manipulated with an epifluorescence microscope.

  20. Security Analysis of a Block Encryption Algorithm Based on Dynamic Sequences of Multiple Chaotic Systems

    Institute of Scientific and Technical Information of China (English)

    DU Mao-Kang; HE Bo; WANG Yong

    2011-01-01

    Recently, cryptosystems based on chaos have attracted much attention. Wang and Yu (Commun. Nonlin. Sci. Numer. Simulat. 14 (2009) 574) proposed a block encryption algorithm based on dynamic sequences of multiple chaotic systems. We analyze the potential flaws in the algorithm. Then, a chosen-plaintext attack is presented. Some remedial measures are suggested to avoid the flaws effectively. Furthermore, an improved encryption algorithm is proposed to resist the attacks and to keep all the merits of the original cryptosystem.

  1. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach

    Science.gov (United States)

    Hofmann, Hansjörg; Sakti, Sakriani; Hori, Chiori; Kashioka, Hideki; Nakamura, Satoshi; Minker, Wolfgang

    The performance of English automatic speech recognition systems decreases when recognizing spontaneous speech mainly due to multiple pronunciation variants in the utterances. Previous approaches address this problem by modeling the alteration of the pronunciation on a phoneme to phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, the sequence-based pronunciation variation is modeled using a noisy channel approach where the spontaneous phoneme sequence is considered as a “noisy” string and the goal is to recover the “clean” string of the word sequence. Hereby, the whole word sequence and its effect on the alteration of the phonemes will be taken into consideration. Moreover, the system not only learns the phoneme transformation but also the mapping from the phoneme to the word directly. In this study, first the phonemes will be recognized with the present recognition system and afterwards the pronunciation variation model based on the noisy channel approach will map from the phoneme to the word level. Two well-known natural language processing approaches are adopted and derived from the noisy channel model theory: Joint-sequence models and statistical machine translation. Both of them are applied and various experiments are conducted using microphone and telephone of spontaneous speech.
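
    The noisy channel idea described above is conventionally written as a Bayes decision rule. In the sketch below, $S$ stands for the observed ("noisy") spontaneous phoneme sequence and $W$ for a candidate ("clean") word sequence. These symbols are notation chosen here for illustration and are not taken from the article.

```latex
\hat{W} = \arg\max_{W} P(W \mid S)
        = \arg\max_{W} \frac{P(S \mid W)\, P(W)}{P(S)}
        = \arg\max_{W} P(S \mid W)\, P(W)
```

    The denominator $P(S)$ drops out because it does not depend on $W$. What remains is a channel model $P(S \mid W)$ and a source (language) model $P(W)$.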

  2. Sequence-based classification using discriminatory motif feature selection.

    Directory of Open Access Journals (Sweden)

    Hao Xiong

    Most existing methods for sequence-based classification use exhaustive feature generation, employing, for example, all k-mer patterns. The motivation behind such (enumerative) approaches is to minimize the potential for overlooking important features. However, there are shortcomings to this strategy. First, practical constraints limit the scope of exhaustive feature generation to patterns of length ≤ k, such that potentially important, longer (> k) predictors are not considered. Second, features so generated exhibit strong dependencies, which can complicate understanding of derived classification rules. Third, and most importantly, numerous irrelevant features are created. These concerns can compromise prediction and interpretation. While remedies have been proposed, they tend to be problem-specific and not broadly applicable. Here, we develop a generally applicable methodology, and an attendant software pipeline, that is predicated on discriminatory motif finding. In addition to the traditional training and validation partitions, our framework entails a third level of data partitioning, a discovery partition. A discriminatory motif finder is used on sequences and associated class labels in the discovery partition to yield a (small) set of features. These features are then used as inputs to a classifier in the training partition. Finally, performance assessment occurs on the validation partition. Important attributes of our approach are its modularity (any discriminatory motif finder and any classifier can be deployed) and its universality (all data, including sequences that are unaligned and/or of unequal length, can be accommodated). We illustrate our approach on two nucleosome occupancy datasets and a protein solubility dataset, previously analyzed using enumerative feature generation. Our method achieves excellent performance results, with and without optimization of classifier tuning parameters. A Python pipeline implementing the approach is
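
    The three-level data partitioning described above (discovery, training, validation) can be sketched as a simple disjoint split. This is a minimal illustration, not the authors' pipeline: the function name, the 20/50/30 split fractions, and the fixed random seed are assumptions, and the motif finder and classifier that would consume each partition are omitted.

```python
import random


def three_way_partition(items, fractions=(0.2, 0.5, 0.3), seed=0):
    """Split items into disjoint discovery, training and validation partitions.

    Hypothetical helper: `fractions` gives the discovery and training shares;
    whatever remains goes to validation.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_discovery = round(fractions[0] * n)
    n_training = round(fractions[1] * n)
    discovery = shuffled[:n_discovery]
    training = shuffled[n_discovery:n_discovery + n_training]
    validation = shuffled[n_discovery + n_training:]
    return discovery, training, validation


# Placeholder sequence identifiers standing in for labeled sequences.
ids = [f"seq{i}" for i in range(100)]
discovery, training, validation = three_way_partition(ids)
print(len(discovery), len(training), len(validation))  # prints 20 50 30
```

    Keeping the discovery partition separate means the features are chosen on data the classifier never trains or is scored on.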

  3. Persisting viral sequences shape microbial CRISPR-based immunity.

    Directory of Open Access Journals (Sweden)

    Ariel D Weinberger

    Well-studied innate immune systems exist throughout bacteria and archaea, but a more recently discovered genomic locus may offer prokaryotes surprising immunological adaptability. Mediated by a cassette-like genomic locus termed Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), the microbial adaptive immune system differs from its eukaryotic immune analogues by incorporating new immunities unidirectionally. CRISPR thus stores genomically recoverable timelines of virus-host coevolution in natural organisms refractory to laboratory cultivation. Here we combined a population genetic mathematical model of CRISPR-virus coevolution with six years of metagenomic sequencing to link the recoverable genomic dynamics of CRISPR loci to the unknown population dynamics of virus and host in natural communities. Metagenomic reconstructions in an acid-mine drainage system document CRISPR loci conserving ancestral immune elements to the base-pair across thousands of microbial generations. This 'trailer-end conservation' occurs despite rapid viral mutation and despite rapid prokaryotic genomic deletion. The trailer-ends of many reconstructed CRISPR loci are also largely identical across a population. 'Trailer-end clonality' occurs despite predictions of host immunological diversity due to negative frequency dependent selection (kill-the-winner dynamics). Statistical clustering and model simulations explain this lack of diversity by capturing rapid selective sweeps by highly immune CRISPR lineages. Potentially explaining 'trailer-end conservation,' we record the first example of a viral bloom overwhelming a CRISPR system. The polyclonal viruses bloom even though they share sequences previously targeted by host CRISPR loci. Simulations show how increasing random genomic deletions in CRISPR loci purges immunological controls on long-lived viral sequences, allowing polyclonal viruses to bloom and depressing host fitness. Our results thus link

  4. Development of Sequence-Based Microsatellite Marker for Phalaenopsis Orchid

    Directory of Open Access Journals (Sweden)

    FATIMAH

    2011-06-01

    Phalaenopsis is one of the most interesting genera of orchids because its members are often used as parents to produce hybrids. The establishment and development of highly reliable and discriminatory methods for identifying species and cultivars has become increasingly important to plant breeders and members of the nursery industry. The aim of this research was to develop sequence-based microsatellite (eSSR) markers for the Phalaenopsis orchid designed from sequences in GenBank NCBI. Seventeen primers were designed and thirteen primer pairs could amplify the DNA, giving the expected PCR product with polymorphism. A total of 51 alleles, with an average of 3 alleles per locus and a polymorphism information content (PIC) value of 0.674, were detected at the 16 SSR loci. Therefore, these markers could be used for identification of the Phalaenopsis orchids used in this study. Genetic similarity and principal coordinate analysis identified five major groups of Phalaenopsis sp. The first group consisted of P. amabilis, P. fuscata, P. javanica, and P. zebrine. The second group consisted of P. amabilis, P. amboinensis, P. bellina, P. floresens, and P. mannii. The third group consisted of P. bellina, P. cornucervi, P. cornucervi, P. violaceae sumatra, and P. modesta. The fourth group consisted of P. cornucervi and P. lueddemanniana, and the fifth group was P. amboinensis.
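
    The abstract reports a polymorphism information content (PIC) value without saying how it was computed. A minimal sketch follows, assuming the widely used definition $PIC = 1 - \sum_i p_i^2 - \sum_{i<j} 2 p_i^2 p_j^2$, where $p_i$ is the frequency of allele $i$ at a locus. The function name and the example frequencies are hypothetical and are not data from the study.

```python
def pic(allele_freqs):
    """Polymorphism information content for one locus.

    Assumes PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    Frequencies are normalised so that they sum to 1.
    """
    total = sum(allele_freqs)
    p = [f / total for f in allele_freqs]
    homozygosity = sum(x * x for x in p)
    cross_term = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross_term


print(pic([0.5, 0.5]))  # two equally frequent alleles: 0.375
print(pic([1, 1, 1]))   # three equally frequent alleles, about 0.593
```

    Under this definition a locus with a single allele scores 0, and the value rises as alleles become more numerous and more evenly distributed.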

  5. Research progress of plant population genomics based on high-throughput sequencing.

    Science.gov (United States)

    Yunsheng, Wang

    2016-08-01

    Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific effects and genome-wide effects using genome-wide polymorphic site genotyping. With the appearance and improvement of next generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly and large scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effect, demographic history and molecular mechanism of complex traits of relevant plant populations at a genomic level. In this review, I briefly introduce the concept and research methods of population genomics and summarize the research progress of plant population genomics based on high-throughput sequencing. I also discuss the prospects as well as existing problems of plant population genomics in order to provide references for related studies.

  6. A sequence-based variation map of zebrafish.

    Science.gov (United States)

    Patowary, Ashok; Purkanti, Ramya; Singh, Meghna; Chauhan, Rajendra; Singh, Angom Ramcharan; Swarnkar, Mohit; Singh, Naresh; Pandey, Vikas; Torroja, Carlos; Clark, Matthew D; Kocher, Jean-Pierre; Clark, Karl J; Stemple, Derek L; Klee, Eric W; Ekker, Stephen C; Scaria, Vinod; Sivasubbu, Sridhar

    2013-03-01

    Zebrafish (Danio rerio) is a popular vertebrate model organism largely deployed using outbred laboratory animals. The nonisogenic nature of the zebrafish as a model system offers the opportunity to understand natural variations and their effect in modulating phenotype. In an effort to better characterize the range of natural variation in this model system and to complement the zebrafish reference genome project, the whole genome sequence of a wild zebrafish at 39-fold genome coverage was determined. Comparative analysis with the zebrafish reference genome revealed approximately 5.2 million single nucleotide variations and over 1.6 million insertion-deletion variations. This dataset thus represents a new catalog of genetic variations in the zebrafish genome. Further analysis revealed selective enrichment for variations in genes involved in immune function and response to the environment, suggesting genome-level adaptations to environmental niches. We also show that human disease gene orthologs in the sequenced wild zebrafish genome show a lower ratio of nonsynonymous to synonymous single nucleotide variations.

  7. Similarity Measurement of Web Sessions Based on Sequence Alignment

    Institute of Scientific and Technical Information of China (English)

    LI Chaofeng; LU Yansheng

    2007-01-01

    The task of clustering Web sessions is to group Web sessions based on similarity and consists of maximizing the intra-group similarity while minimizing the inter-group similarity. The first and foremost question to be considered in clustering Web sessions is how to measure the similarity between Web sessions. However, there are many shortcomings in traditional measurements. This paper introduces a new method for measuring similarities between Web pages that takes into account not only the URL but also the viewing time of the visited Web page. Then we give a new method to measure the similarity of Web sessions using sequence alignment and the similarity of Web page access in detail. Experiments have shown that our method is valid and efficient.
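
    The abstract does not give the page similarity formula or the alignment scoring, so the sketch below only illustrates the general idea: a page-level similarity that combines URL and viewing time, fed into a global (Needleman-Wunsch style) alignment of two sessions. Every function name, the shared-path-prefix URL measure, the viewing-time ratio, the gap penalty, and the example sessions are assumptions made for illustration.

```python
def url_similarity(u1, u2):
    """Share of leading path segments two URLs have in common (assumed measure)."""
    a, b = u1.strip("/").split("/"), u2.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))


def page_similarity(p1, p2):
    """Combine URL similarity with a viewing-time ratio (assumed combination)."""
    (u1, t1), (u2, t2) = p1, p2
    time_sim = min(t1, t2) / max(t1, t2) if max(t1, t2) > 0 else 1.0
    return url_similarity(u1, u2) * time_sim


def session_similarity(s1, s2, gap=-0.5):
    """Global alignment score of two sessions, normalised by the longer session."""
    n, m = len(s1), len(s2)
    if n == 0 or m == 0:
        return 0.0
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Rescale page similarity from [0, 1] to a match score in [-1, 1].
            match = score[i - 1][j - 1] + 2 * page_similarity(s1[i - 1], s2[j - 1]) - 1
            score[i][j] = max(match, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[n][m] / max(n, m)


# Made-up sessions: lists of (URL path, viewing time in seconds).
shop = [("/shop/books", 30), ("/shop/cart", 10)]
news = [("/news/world", 5), ("/news/sport", 60)]
print(session_similarity(shop, shop))  # prints 1.0
print(session_similarity(shop, news))
```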

  8. Spike-Based Bayesian-Hebbian Learning of Temporal Sequences

    DEFF Research Database (Denmark)

    Tully, Philip J; Lindén, Henrik; Hennig, Matthias H

    2016-01-01

    Many cognitive and motor functions are enabled by the temporal representation and processing of stimuli, but it remains an open issue how neocortical microcircuits can reliably encode and replay such sequences of information. To better understand this, a modular attractor memory network is proposed in which meta-stable sequential attractor transitions are learned through changes to synaptic weights and intrinsic excitabilities via the spike-based Bayesian Confidence Propagation Neural Network (BCPNN) learning rule. We find that the formation of distributed memories, embodied by increased periods of firing in pools of excitatory neurons, together with asymmetrical associations between these distinct network states, can be acquired through plasticity. The model's feasibility is demonstrated using simulations of adaptive exponential integrate-and-fire model neurons (AdEx). We show that the learning...

  9. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors.

    Science.gov (United States)

    Chu, Wen-Yi; Huang, Yu-Feng; Huang, Chun-Chin; Cheng, Yi-Sheng; Huang, Chien-Kang; Oyang, Yen-Jen

    2009-07-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein-DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are essential for correct gene regulation. In this respect, ProteDNA is distinctive since it has been designed to identify sequence-specific binding residues. In order to accommodate users with different application needs, ProteDNA has been designed to operate under two modes, namely, the high-precision mode and the balanced mode. According to the experiments reported in this article, under the high-precision mode, ProteDNA has been able to deliver precision of 82.3%, specificity of 99.3%, sensitivity of 49.8% and accuracy of 96.5%. Meanwhile, under the balanced mode, ProteDNA has been able to deliver precision of 60.8%, specificity of 97.6%, sensitivity of 60.7% and accuracy of 95.4%. ProteDNA is available at the following websites: http://protedna.csbb.ntu.edu.tw/, http://protedna.csie.ntu.edu.tw/, http://bio222.esoe.ntu.edu.tw/ProteDNA/.
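
    The four figures reported for each operating mode are standard binary classification metrics. The sketch below shows how they are computed from confusion-matrix counts. The function name is hypothetical and the counts are invented for illustration; they are not the ProteDNA results.

```python
def binary_metrics(tp, fp, tn, fn):
    """Precision, sensitivity, specificity and accuracy from confusion counts."""
    return {
        "precision": tp / (tp + fp),      # predicted binding residues that are correct
        "sensitivity": tp / (tp + fn),    # true binding residues that are found
        "specificity": tn / (tn + fp),    # non-binding residues correctly rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Made-up counts with far more non-binding than binding residues.
metrics = binary_metrics(tp=50, fp=10, tn=900, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    With strongly imbalanced classes like these, specificity and accuracy stay high even when sensitivity is modest, which matches the pattern of the figures quoted above.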

  10. Marker-Based Human Motion Capture in Multiview Sequences

    Directory of Open Access Journals (Sweden)

    Canton-Ferrer Cristian

    2010-01-01

    This paper presents a low-cost real-time alternative to available commercial human motion capture systems. First, a set of distinguishable markers is placed on several human body landmarks, and the scene is captured by a number of calibrated and synchronized cameras. In order to establish a physical relation among markers, a human body model is defined. Markers are detected on all camera views and delivered as the input of an annealed particle filter scheme where every particle encodes an instance of the pose of the body model to be estimated. The likelihood between particles and input data is evaluated through the robust generalized symmetric epipolar distance, and kinematic constraints are enforced in the propagation step to avoid impossible poses. Tests over the HumanEva annotated data set yield quantitative results showing the effectiveness of the proposed algorithm. Results over sequences involving fast and complex motions are also presented.

  11. BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations.

    Science.gov (United States)

    Bahr, A; Thompson, J D; Thierry, J C; Poch, O

    2001-01-01

    BAliBASE is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The database contains high quality, manually constructed multiple sequence alignments together with detailed annotations. The alignments are all based on three-dimensional structural superpositions, with the exception of the transmembrane sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition and large N/C-terminal extensions and internal insertions. Here we describe version 2.0 of the database, which incorporates three new reference sets of alignments containing structural repeats, trans-membrane sequences and circular permutations to evaluate the accuracy of detection/prediction and alignment of these complex sequences. BAliBASE can be viewed at the web site http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE2/index.html or can be downloaded from ftp://ftp-igbmc.u-strasbg.fr/pub/BAliBASE2/.

  12. MOST: a modified MLST typing tool based on short read sequencing

    Directory of Open Access Journals (Sweden)

    Rediat Tewolde

    2016-08-01

    Multilocus sequence typing (MLST) is an effective method to describe bacterial populations. Conventionally, MLST involves Polymerase Chain Reaction (PCR) amplification of housekeeping genes followed by Sanger DNA sequencing. Public Health England (PHE) is in the process of replacing the conventional MLST methodology with a method based on short read sequence data derived from Whole Genome Sequencing (WGS). This paper reports the comparison of the reliability of MLST results derived from WGS data, comparing mapping and assembly-based approaches to conventional methods using 323 bacterial genomes of diverse species. The sensitivity of the two WGS based methods was further investigated with 26 mixed and 29 low coverage genomic data sets from Salmonella enteridis and Streptococcus pneumoniae. Of the 323 samples, 92.9% (n = 300), 97.5% (n = 315) and 99.7% (n = 322) full MLST profiles were derived by the conventional method, assembly- and mapping-based approaches, respectively. The concordance between samples that were typed by conventional (92.9%) and both WGS methods was 100%. From the 55 mixed and low coverage genomes, 89.1% (n = 49) and 67.3% (n = 37) full MLST profiles were derived from the mapping and assembly based approaches, respectively. In conclusion, deriving MLST from WGS data is more sensitive than the conventional method. When comparing WGS based methods, the mapping based approach was the most sensitive. In addition, the mapping based approach described here derives quality metrics, which are difficult to determine quantitatively using conventional and WGS-assembly based approaches.

  13. Effects of the Ion PGM™ Hi-Q™ sequencing chemistry on sequence data quality.

    Science.gov (United States)

    Churchill, Jennifer D; King, Jonathan L; Chakraborty, Ranajit; Budowle, Bruce

    2016-09-01

    Massively parallel sequencing (MPS) offers substantial improvements over current forensic DNA typing methodologies such as increased resolution, scalability, and throughput. The Ion PGM™ is a promising MPS platform for analysis of forensic biological evidence. The system employs a sequencing-by-synthesis chemistry on a semiconductor chip that measures a pH change due to the release of hydrogen ions as nucleotides are incorporated into the growing DNA strands. However, implementation of MPS into forensic laboratories requires a robust chemistry. Ion Torrent's Hi-Q™ Sequencing Chemistry was evaluated to determine if it could improve on the quality of the generated sequence data in association with selected genetic marker targets. The whole mitochondrial genome and the HID-Ion STR 10-plex panel were sequenced on the Ion PGM™ system with the Ion PGM™ Sequencing 400 Kit and the Ion PGM™ Hi-Q™ Sequencing Kit. Concordance, coverage, strand balance, noise, and deletion ratios were assessed in evaluating the performance of the Ion PGM™ Hi-Q™ Sequencing Kit. The results indicate that reliable, accurate data are generated and that sequencing through homopolymeric regions can be improved with the use of Ion Torrent's Hi-Q™ Sequencing Chemistry. Overall, the quality of the generated sequencing data supports the potential for use of the Ion PGM™ in forensic genetic laboratories.

  14. Multiple sequence alignment based on combining genetic algorithm with chaotic sequences.

    Science.gov (United States)

    Gao, C; Wang, B; Zhou, C J; Zhang, Q

    2016-06-24

    In bioinformatics, sequence alignment is one of the most common problems. Multiple sequence alignment is an NP (nondeterministic polynomial time) problem, which requires further study and exploration. The chaos optimization algorithm draws on chaos theory and can be combined with the genetic algorithm (GA), exploiting the ergodicity and inherent randomness of chaotic iteration. It is an efficient way to address the premature convergence of the basic GA. Applying the Logistic map to the GA and using chaotic sequences to carry out the chaotic perturbation can improve the convergence of the basic GA. In addition, the random tournament selection and optimal preservation strategy are used in the GA. Experimental evidence indicates good results for this process.
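
    A minimal sketch of the two ingredients named above: a chaotic sequence generated with the Logistic map, and a perturbation that nudges GA gene values with it. The abstract does not give the authors' update rule, so the convex-blend perturbation, the `strength` parameter and both function names are assumptions made for illustration.

```python
def logistic_sequence(x0, n, r=4.0):
    """Iterate the Logistic map x <- r * x * (1 - x) and return n values.
    For r = 4 and x0 in (0, 1) the orbit stays inside [0, 1]."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_perturb(genes, chaos, strength=0.1):
    """Blend each gene (assumed normalised to [0, 1]) with a chaotic value.
    The blend rule is an illustrative assumption, not the paper's rule."""
    return [
        min(1.0, max(0.0, (1.0 - strength) * g + strength * c))
        for g, c in zip(genes, chaos)
    ]


genes = [0.20, 0.50, 0.90]                 # hypothetical normalised genes
chaos = logistic_sequence(0.3, len(genes))
print(chaotic_perturb(genes, chaos))
```

    A small `strength` keeps the individual close to its current position while still letting the chaotic sequence push it away from a local optimum.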

  15. Pseudo-Random Sequences Generator Based on Discrete Hyperchaotic Systems

    Institute of Scientific and Technical Information of China (English)

    李昌刚; 韩正之

    2003-01-01

    We first design a discrete hyperchaotic system via piecewise linear state feedback. The states of the closed loop system are locally expanding in two directions but absolutely bounded on the whole, which implies hyperchaos. Then, we use three such hyperchaotic systems with different feedback gain matrices to design a pseudo-random sequence generator (PRSG). Through a threshold function, three sub-sequences generated from the output of piecewise linear functions are changed into 0-1 sequences. Then, followed by XOR operation, an unpredictable pseudo-random sequence (PRS) is ultimately obtained. The analysis and simulation results indicate that the PRS, generated with hyperchaotic systems, has desirable statistical features.
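
    The threshold-then-XOR construction can be sketched as follows. The hyperchaotic systems and feedback gain matrices of the paper are not given in the abstract, so a one-dimensional skew tent map is swapped in as a simple piecewise linear stand-in. The seeds, map parameters and function names are assumptions, and the result should not be treated as a secure generator.

```python
def skew_tent(x, p):
    """One step of the skew tent map on [0, 1] (a piecewise linear map)."""
    return x / p if x < p else (1.0 - x) / (1.0 - p)


def chaotic_bits(seed, p, n, threshold=0.5):
    """Iterate the map n times and threshold each state into a 0/1 bit."""
    x = seed
    bits = []
    for _ in range(n):
        x = skew_tent(x, p)
        bits.append(1 if x >= threshold else 0)
    return bits


def xor_prs(n, params=((0.123, 0.37), (0.456, 0.49), (0.789, 0.61))):
    """XOR three thresholded sub-sequences, one per (seed, p) pair."""
    streams = [chaotic_bits(seed, p, n) for seed, p in params]
    return [a ^ b ^ c for a, b, c in zip(*streams)]


print("".join(str(b) for b in xor_prs(32)))
```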

  16. Effective automated feature construction and selection for classification of biological sequences.

    Directory of Open Access Journals (Sweden)

    Uday Kamath

    Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features. We present an algorithmic framework (EFFECT) for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and to involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not. To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or modification

  17. Sequence-Dependent Effects on the Properties of Semiflexible Biopolymers

    CERN Document Server

    Zicong, Bela

    2008-01-01

    Using the path integral technique, we show exactly that for a semiflexible biopolymer in the constant extension ensemble, no matter how long the polymer and how large the external force, the effects of short range correlations in the sequence-dependent spontaneous curvatures and torsions can be incorporated into a model with well-defined mean spontaneous curvature and torsion as well as a renormalized persistence length. Moreover, for a long biopolymer with large mean persistence length, the sequence-dependent persistence lengths can be replaced by their mean. However, for a short biopolymer or for a biopolymer with small persistence lengths, inhomogeneity in persistence lengths tends to make physical observables very sensitive to details and therefore less predictable.

  18. Effects of interlinker sequences on the biological properties of bispecific single-chain antibodies

    Institute of Scientific and Technical Information of China (English)

    FANG Min; JIANG Xin; YANG Zhi; YIN Changcheng; LI Hua; ZHAO Rui; ZHANG Zhong; LIN Qing; HUANG Hualiang

    2003-01-01

    Single-chain bispecific antibody (scBsAb) is one of the promising genetic engineering antibody formats for clinical application, but the effects of interlinker sequences on the biological properties of bispecific single-chain antibodies have not been studied in detail. Three interlinker sequences were designed and synthesized, and denominated as Fc, HSA and 205C′, respectively. Universal vectors with these different interlinker sequences for scBsAb expression in E. coli were constructed. A model scBsAb based on a reshaped single-chain antibody (scFv) against human CD3 and a scFv directed against human ovarian carcinoma was generated and expressed in E. coli. The results of SDS-PAGE and Western blot showed that the different interlinker sequences did not affect the expression level of scBsAb. However, as demonstrated by ELISA and pharmacokinetics studies performed in mice, scBsAbs with different interlinker sequences differed in their antigen-binding activities and terminal half-life (T1/2β) in vivo; the interlinker HSA could remarkably prolong the retention time of scBsAb in blood. These results indicated that the peptide sequence of the interlinker could affect important biological properties of scBsAb, such as antigen-binding properties and stability in vivo. Selection of an appropriate interlinker sequence is therefore very important for scBsAb construction. An optimal interlinker can make the biological properties of scBsAb more suitable for clinical application.

  19. PCR-based assays versus direct sequencing for evaluating the effect of KRAS status on anti-EGFR treatment response in colorectal cancer patients: a systematic review and meta-analysis.

    Directory of Open Access Journals (Sweden)

    Lianfeng Shan

    BACKGROUND: The survival rate of colorectal cancer (CRC) patients carrying wild-type KRAS is significantly increased by combining anti-EGFR monoclonal antibody (mAb) with standard chemotherapy. However, conflicting data exist in both the wild-type KRAS and mutant KRAS groups, which strongly challenge CRC anti-EGFR treatment. Here we conducted a meta-analysis in an effort to provide more reliable information regarding anti-EGFR treatment in CRC patients. METHODS: We searched full reports of randomized clinical trials using Medline, the American Society of Clinical Oncology (ASCO), and the European Society for Medical Oncology (ESMO). Two investigators independently screened the published literature according to our inclusion and exclusion criteria and the relevant data were extracted. We used Review Manager 5.2 software to analyze the data. RESULTS: The addition of anti-EGFR mAb to standard chemotherapy significantly improved both progression-free survival (PFS) and median overall survival (mOS) in the wild-type KRAS group; hazard ratios (HRs) for PFS and mOS were 0.70 [95% confidence interval (CI), 0.58-0.84] and 0.83 [95% CI, 0.75-0.91], respectively. In sub-analyses of the wild-type KRAS group, when PCR-based assays are employed, PFS and mOS notably increase: the HRs were 0.74 [95% CI, 0.62-0.88] and 0.87 [95% CI, 0.78-0.96], respectively. In sub-analyses of the mutant KRAS group, neither PCR-based assays nor direct sequencing enhance PFS or mOS. CONCLUSION: Our data suggest that PCR-based assays with high sensitivity and specificity allow accurate identification of patients with wild-type KRAS and thus increase PFS and mOS. Furthermore, such assays liberate patients with mutant KRAS from unnecessary drug side effects, and provide them an opportunity to receive appropriate treatment. Thus, establishing a precise standard reference test will substantially optimize CRC-targeted therapies.

  20. Effects of priming goal pursuit on implicit sequence learning.

    Science.gov (United States)

    Gamble, Katherine R; Lee, Joanna M; Howard, James H; Howard, Darlene V

    2014-11-01

    Implicit learning, the type of learning that occurs without intent to learn or awareness of what has been learned, has been thought to be insensitive to the effects of priming, but recent studies suggest this is not the case. One study found that learning in the serial reaction time (SRT) task was improved by nonconscious goal pursuit, primed via a word search task (Eitam et al. in Psychol Sci 19:261-267, 2008). In two studies, we used the goal priming word search task from Eitam et al., but with a different version of the SRT, the alternating serial reaction time task (ASRT). Unlike the SRT, which often results in explicit knowledge and assesses sequence learning at one point in time, the ASRT has been shown to be implicit through sensitive measures of judgment, and it enables sequence learning to be measured continuously. In both studies, we found that implicit learning was superior in the groups that were primed for goal achievement compared to control groups, but the effect was transient. We discuss possible reasons for the observed time course of the positive effects of goal priming, as well as some future areas of investigation to better understand the mechanisms that underlie this effect, which could lead to methods to prolong the positive effects.

  1. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure tha...

  2. Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine.

    Science.gov (United States)

    Green, Robert C; Goddard, Katrina A B; Jarvik, Gail P; Amendola, Laura M; Appelbaum, Paul S; Berg, Jonathan S; Bernhardt, Barbara A; Biesecker, Leslie G; Biswas, Sawona; Blout, Carrie L; Bowling, Kevin M; Brothers, Kyle B; Burke, Wylie; Caga-Anan, Charlisse F; Chinnaiyan, Arul M; Chung, Wendy K; Clayton, Ellen W; Cooper, Gregory M; East, Kelly; Evans, James P; Fullerton, Stephanie M; Garraway, Levi A; Garrett, Jeremy R; Gray, Stacy W; Henderson, Gail E; Hindorff, Lucia A; Holm, Ingrid A; Lewis, Michelle Huckaby; Hutter, Carolyn M; Janne, Pasi A; Joffe, Steven; Kaufman, David; Knoppers, Bartha M; Koenig, Barbara A; Krantz, Ian D; Manolio, Teri A; McCullough, Laurence; McEwen, Jean; McGuire, Amy; Muzny, Donna; Myers, Richard M; Nickerson, Deborah A; Ou, Jeffrey; Parsons, Donald W; Petersen, Gloria M; Plon, Sharon E; Rehm, Heidi L; Roberts, J Scott; Robinson, Dan; Salama, Joseph S; Scollon, Sarah; Sharp, Richard R; Shirts, Brian; Spinner, Nancy B; Tabor, Holly K; Tarczy-Hornoch, Peter; Veenstra, David L; Wagle, Nikhil; Weck, Karen; Wilfond, Benjamin S; Wilhelmsen, Kirk; Wolf, Susan M; Wynn, Julia; Yu, Joon-Ho

    2016-06-02

    Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publically available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine.

  3. Extracting flat-field images from scene-based image sequences using phase correlation

    Science.gov (United States)

    Caron, James N.; Montes, Marcos J.; Obermark, Jerome L.

    2016-06-01

    Flat-field image processing is an essential step in producing high-quality and radiometrically calibrated images. Flat-fielding corrects for variations in the gain of focal plane array electronics and unequal illumination from the system optics. Typically, a flat-field image is captured by imaging a radiometrically uniform surface. The flat-field image is normalized and removed from the images. There are circumstances, such as with remote sensing, where a flat-field image cannot be acquired in this manner. For these cases, we developed a phase-correlation method that allows the extraction of an effective flat-field image from a sequence of scene-based displaced images. The method uses sub-pixel phase correlation image registration to align the sequence and estimate the static scene. The scene is removed from the sequence, producing a sequence of misaligned flat-field images. An average flat-field image is derived from the realigned flat-field sequence.
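
    The registration step rests on phase correlation: the normalised cross-power spectrum of two displaced copies of a signal inverse-transforms to a peak at the displacement. The sketch below is a deliberately reduced version under stated assumptions: it is one-dimensional, recovers only integer circular shifts, and uses a naive DFT so that it needs nothing beyond the standard library. The paper's method works on 2D images with sub-pixel accuracy. All function names are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_correlation_shift(ref, moved):
    """Estimate the integer circular shift d such that moved[t] == ref[t - d]."""
    cross = []
    for a, b in zip(dft(moved), dft(ref)):
        c = a * b.conjugate()          # cross-power spectrum
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)  # keep phase only
    corr = [v.real for v in dft(cross, inverse=True)]
    n = len(corr)
    peak = max(range(n), key=corr.__getitem__)
    return peak if peak <= n // 2 else peak - n   # map to a signed shift


row = [0.5 ** t for t in range(16)]    # toy "scene" row
shifted = row[-3:] + row[:-3]          # circular shift by +3 samples
print(phase_correlation_shift(row, shifted))
```

    In the 2D case the same idea is applied with a two-dimensional transform, and the peak is interpolated to obtain the sub-pixel displacement.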

  4. High-Throughput Sequencing Based Methods of RNA Structure Investigation

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz Jan

    describe several computational methods. One that alleviates PCR bias by estimating number of unique molecules existing before the amplification, and two methods for data normalization: one applicable when the paired end sequencing is performed, and the other that works with the single read sequencing...

  5. Effects of cloning and root-tip size on observations of fungal ITS sequences from Picea glauca roots.

    Science.gov (United States)

    Lindner, Daniel L; Banik, Mark T

    2009-01-01

    To better understand the effects of cloning on observations of fungal ITS sequences from Picea glauca (white spruce) roots two techniques were compared: (i) direct sequencing of fungal ITS regions from individual root tips without cloning and (ii) cloning and sequencing of fungal ITS regions from individual root tips. Effect of root tip size was investigated by selecting 20 small root tips (SRT, 1.0-2.0 mm long) and 20 large root tips (LRT, 5.0-6.0 mm long). DNA was isolated from each tip and PCR-amplified with fungal-specific primers. PCR reactions were divided into two portions, one of which was sequenced directly and one of which was cloned first followed by sequencing of 12 random clones. With direct sequencing all 20 SRT produced an identifiable sequence, while only 13 of 20 LRT (65%) yielded an identifiable sequence. With cloning and sequencing all 40 tips produced identifiable fungal ITS sequences regardless of size. Failure of direct sequencing in LRT was associated with the presence of multispecies assemblages. Cloning identified 18 taxa overall while direct sequencing identified four. Cloning was not affected by tip size and identified more taxa relative to direct sequencing, although cost and probability of observing lab-based contaminants (e.g., airborne or reagent-based) were higher. We suggest that standardized controls be run whenever clones are sequenced from environmental samples, including positive controls derived from pure cultures and negative controls that cover the entire extraction, amplification and cloning process. Additional studies on larger root segments and bulked samples are needed to determine whether cloning can detect fungi accurately and cost-effectively in complex environmental samples.

  6. IQSeq: integrated isoform quantification analysis based on next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    With the recent advances in high-throughput RNA sequencing (RNA-Seq), biologists are able to measure transcription with unprecedented precision. One problem that can now be tackled is that of isoform quantification: here one tries to reconstruct the abundances of isoforms of a gene. We have developed a statistical solution for this problem, based on analyzing a set of RNA-Seq reads, and a practical implementation, available from archive.gersteinlab.org/proj/rnaseq/IQSeq, in a tool we call IQSeq (Isoform Quantification in next-generation Sequencing). Here, we present theoretical results which IQSeq is based on, and then use both simulated and real datasets to illustrate various applications of the tool. In order to measure the accuracy of an isoform-quantification result, one would try to estimate the average variance of the estimated isoform abundances for each gene (based on resampling the RNA-seq reads), and IQSeq has a particularly fast algorithm (based on the Fisher Information Matrix) for calculating this, achieving a speedup of ~ 500 times compared to brute-force resampling. IQSeq also calculates an information theoretic measure of overall transcriptome complexity to describe isoform abundance for a whole experiment. IQSeq has many features that are particularly useful in RNA-Seq experimental design, allowing one to optimally model the integration of different sequencing technologies in a cost-effective way. In particular, the IQSeq formalism integrates the analysis of different sample (i.e. read) sets generated from different technologies within the same statistical framework. It also supports a generalized statistical partial-sample-generation function to model the sequencing process. This allows one to have a modular, "plugin-able" read-generation function to support the particularities of the many evolving sequencing technologies.

  7. Protein sequence for clustering DNA based on Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Gamal. F. Elhadi

    2012-01-01

    DNA is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while that among the objects in different clusters is low. In this paper, we propose an approach for clustering DNA sequences using the Self-Organizing Map (SOM) algorithm and protein sequences. The main objective is to analyze biological data and to group DNA into clusters more easily and efficiently. We use the proposed approach to analyze both large and small amounts of input DNA sequences. The results show that the similarity of the sequences does not depend on the amount of input sequences. Our approach depends on evaluating the degree of DNA sequence similarity using the hierarchical dendrogram representation. Representing a large amount of data as a hierarchical tree makes it possible to compare large sequences efficiently.

  8. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    Directory of Open Access Journals (Sweden)

    Dobbs Drena

    2011-06-01

    Background: Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results: We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of

  9. A novel DNA sequence similarity calculation based on simplified pulse-coupled neural network and Huffman coding

    Science.gov (United States)

    Jin, Xin; Nie, Rencan; Zhou, Dongming; Yao, Shaowen; Chen, Yanyan; Yu, Jiefu; Wang, Quan

    2016-11-01

    A novel method for the calculation of DNA sequence similarity is proposed based on simplified pulse-coupled neural network (S-PCNN) and Huffman coding. In this study, we propose a coding method based on Huffman coding, where the triplet code was used as a code bit to transform DNA sequence into numerical sequence. The proposed method uses the firing characters of S-PCNN neurons in DNA sequence to extract features. Besides, the proposed method can deal with different lengths of DNA sequences. First, according to the characteristics of S-PCNN and the DNA primary sequence, the latter is encoded using Huffman coding method, and then using the former, the oscillation time sequence (OTS) of the encoded DNA sequence is extracted. Simultaneously, relevant features are obtained, and finally the similarities or dissimilarities of the DNA sequences are determined by Euclidean distance. In order to verify the accuracy of this method, different data sets were used for testing. The experimental results show that the proposed method is effective.
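
    The encoding step can be illustrated with a generic Huffman code built over the triplets of a DNA sequence, followed by a Euclidean distance between feature vectors. This is a sketch only: the abstract does not specify how the authors build their code table or the S-PCNN features, so the non-overlapping triplet split, the function names and the sample inputs are all assumptions.

```python
import heapq
import math
from collections import Counter
from itertools import count


def triplets(seq):
    """Split a sequence into non-overlapping triplets, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique counter so the dicts are never compared
    heap = [(n, next(tiebreak), {s: ""}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)   # two least frequent subtrees
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (n1 + n2, next(tiebreak), merged))
    return heap[0][2]


codes = huffman_code(triplets("ATGATGATGCCCATGTTT"))
print(codes)                             # frequent triplets get shorter codes
print(math.dist([1.0, 2.0], [4.0, 6.0]))  # Euclidean distance of two vectors
```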

  10. PHARMACOGENETIC TESTING OPPORTUNITIES IN CARDIOLOGY BASED ON EXOME SEQUENCING

    Directory of Open Access Journals (Sweden)

    N. V. Shcherbakova

    2014-01-01

    Aim. To study which cardiac drugs currently have annotations concerning biomarkers and what information can be obtained by pharmacogenetic testing using exome sequencing data in patients with cardiac diseases. Material and methods. Exome sequencing in a random participant of the ATEROGEN IVANOVO study and bioinformatics analysis of the data were performed. Point mutations were annotated using the ANNOVAR program, and a comparison with a number of specialized databases was done on the basis of user protocols. Results. 11 cardiac drugs and 7 genes whose variants can influence cardiac drug metabolism were analyzed. According to exome sequencing of the participant, we did not reveal allelic variants that require dose regimen correction and careful efficacy control. Conclusion. The application of exome sequencing is the next step towards wide-ranging personalized therapy. Future opportunities for improvement of the risk-benefit ratio in each patient are the main purpose of the collection and analysis of pharmacogenetic data.

  11. Phylogenetic relationships of Salmonella based on rRNA sequences

    DEFF Research Database (Denmark)

    Christensen, H.; Nordentoft, Steen; Olsen, J.E.

    1998-01-01

    To establish the phylogenetic relationships between the subspecies of Salmonella enterica (official name Salmonella choleraesuis), Salmonella bongori and related members of Enterobacteriaceae, sequence comparison of rRNA was performed by maximum-likelihood analysis. The two Salmonella species wer...

  12. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    Science.gov (United States)

    2008-07-01

    ...coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. Dates covered: 6 Jul 08 to 11 Jul 08.

  13. Changes in DNA base sequences in the mutant of Arabidopsis thaliana induced by low-energy N+ implantation

    Institute of Scientific and Technical Information of China (English)

    常凤启; 刘选明; 李银心; 贾庚祥; 马晶晶; 刘公社; 朱至清

    2003-01-01

    To reveal the mutation effect of low-energy ion implantation on Arabidopsis thaliana in vivo, T80II, a stable dwarf mutant derived from seeds irradiated by 30 keV N+ at a dose of 80×10^15 ions/cm^2, was used for Random Amplified Polymorphic DNA (RAPD) and base sequence analysis. The results indicated that among the 397 RAPD bands observed in total, 52 bands in T80II differed from those of the wild type, a variation frequency of 13.1%. In comparison with the sequences of A. thaliana in GenBank, the RAPD fragments in T80II were changed greatly in base sequence, with an average rate of one base change per 16.8 bases. The types of base changes included base transition, transversion, deletion and insertion. Among the 275 base changes detected, single base substitutions (97.09%) occurred more frequently than base deletions and insertions (2.91%), and the frequency of base transitions (66.55%) was higher than that of base transversions (30.55%). Adenine, thymine, guanine or cytosine could be replaced by any of the other three bases in cloned DNA fragments in T80II. Thymine appeared to be more sensitive to the irradiation than the other bases. The flanking sequences of the base changes in RAPD fragments in T80II were analyzed and the mutational "hotspot" induced by low-energy ion implantation was discussed.

  14. Multifunctional hybrid networks based on self assembling peptide sequences

    Science.gov (United States)

    Sathaye, Sameer

    The overall aim of this dissertation is to achieve a comprehensive correlation between the molecular level changes in primary amino acid sequences of amphiphilic beta-hairpin peptides and their consequent solution-assembly properties and bulk network hydrogel behavior. This has been accomplished using two broad approaches. In the first approach, amino acid substitutions were made to peptide sequence MAX1 such that the hydrophobic surfaces of the folded beta-hairpins from the peptides demonstrate shape specificity in hydrophobic interactions with other beta-hairpins during the assembly process, thereby causing changes to the peptide nanostructure and bulk rheological properties of hydrogels formed from the peptides. Steric lock and key complementary hydrophobic interactions were designed to occur between two beta-hairpin molecules of a single peptide, LNK1, during beta-sheet fibrillar assembly of LNK1. Experimental results from circular dichroism, transmission electron microscopy and oscillatory rheology collectively indicate that the molecular design of the LNK1 peptide can be assigned as the cause of the drastically different behavior of the networks relative to MAX1. The results indicate elimination or significant reduction of fibrillar branching due to steric complementarity in LNK1 that does not exist in MAX1, thus supporting the original hypothesis. As an extension of the designed steric lock and key complementarity between two beta-hairpin molecules of the same peptide, LNK1, three new pairs of peptide molecules, LP1-KP1, LP2-KP2 and LP3-KP3, that resemble complementary 'wedge' and 'trough' shapes when folded into beta-hairpins were designed and studied. All six peptides, individually and when blended with their corresponding shape complement, formed fibrillar nanostructures with non-uniform thickness values. Loose packing in the assembled structures was observed by SANS analysis in all the new peptides, as compared to the uniform tight packing in MAX1. 
This

  15. Prediction of peptide drift time in ion mobility mass spectrometry from sequence-based features

    KAUST Repository

    Wang, Bing

    2013-05-09

    Background: Ion mobility-mass spectrometry (IMMS), an analytical technique which combines the features of ion mobility spectrometry (IMS) and mass spectrometry (MS), can rapidly separate ions on a millisecond time-scale. IMMS has become a powerful tool for analyzing complex mixtures, especially for the analysis of peptides in proteomics. The high-throughput nature of this technique poses a challenge for the identification of peptides in complex biological samples. As an important parameter, peptide drift time can be used to enhance downstream data analysis in IMMS-based proteomics. Results: In this paper, a model based on the least squares support vector regression (LS-SVR) method is presented to predict peptide ion drift time in IMMS from sequence-based features of the peptide. Four descriptors were extracted from the peptide sequence to represent peptide ions by a 34-component vector. The parameters of LS-SVR were selected by a grid search strategy, and a 10-fold cross-validation approach was employed for model training and testing. Our proposed method was tested on three datasets with different charge states. The high prediction performance achieved demonstrates the effectiveness and efficiency of the prediction model. Conclusions: Our proposed LS-SVR model can predict peptide drift time from sequence information with relatively high prediction accuracy, as shown by a test on a dataset of 595 peptides. This work can enhance the confidence of protein identification when combined with current protein searching techniques. 2013 Wang et al.; licensee BioMed Central Ltd.
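
    The modelling steps named in this abstract (LS-SVR, grid search, 10-fold cross-validation) can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: the RBF kernel, the hyperparameter names gamma and sigma, the pure-Python linear solver, and all function names are assumptions made for the example, and the paper's four sequence descriptors are not reproduced.

```python
import math
import random

def rbf(u, v, sigma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def lssvr_fit(X, y, gamma, sigma):
    """LS-SVR training: solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma          # ridge term on the kernel diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]                 # bias, alphas

def lssvr_predict(X_train, bias, alphas, sigma, x):
    """Predict the target for one feature vector x."""
    return bias + sum(a * rbf(x, xt, sigma) for a, xt in zip(alphas, X_train))

def cv_mse(X, y, gamma, sigma, k=10, seed=0):
    """Mean squared error over k cross-validation folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total = 0.0
    for fold in folds:
        held = set(fold)
        tr = [i for i in idx if i not in held]
        Xtr = [X[i] for i in tr]
        b, al = lssvr_fit(Xtr, [y[i] for i in tr], gamma, sigma)
        total += sum((lssvr_predict(Xtr, b, al, sigma, X[i]) - y[i]) ** 2 for i in fold)
    return total / len(y)

def grid_search(X, y, gammas, sigmas):
    """Return (best CV error, gamma, sigma) over the candidate grid."""
    return min((cv_mse(X, y, g, s), g, s) for g in gammas for s in sigmas)
```

    In use, X would hold one 34-component vector per peptide and y the measured drift times; grid_search(X, y, gammas, sigmas) then returns the pair of hyperparameters with the lowest 10-fold error.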

  16. Novel Sequence Number Based Secure Authentication Scheme for Wireless LANs

    Institute of Scientific and Technical Information of China (English)

    Rajeev Singh; Teek Parval Sharma

    2015-01-01

    Authentication per frame is an implicit necessity for security in wireless local area networks (WLANs). We propose a novel per-frame secure authentication scheme which provides authentication to data frames in WLANs. The scheme involves no cryptographic overheads for authentication of frames. It utilizes the sequence number of the frame along with the authentication stream generators for authentication. Hence, it requires no extra bits or messages for authentication, and no change to the existing frame format. The scheme provides authentication by modifying the sequence number of the frame at the sender; the modification is verified at the receiver. The modified sequence number is protected by an XOR operation with a random number selected from the random stream. The authentication is lightweight because it requires only trivial arithmetic operations such as subtraction and XOR.
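
    The idea described in this abstract, masking the frame sequence number with a value from a shared random stream and checking it with a subtraction at the receiver, can be illustrated with a toy model. This is a sketch under stated assumptions, not the authors' protocol: the Sender and Receiver classes, the seeded random.Random generator standing in for the authentication stream generator, and the "next expected number" check are all hypothetical, and random.Random is not cryptographically secure.

```python
import random

SEQ_BITS = 12                     # the 802.11 sequence number field is 12 bits wide
MASK = (1 << SEQ_BITS) - 1

class Sender:
    """Toy sender: masks each new sequence number with the next stream value."""
    def __init__(self, seed):
        self.stream = random.Random(seed)   # stand-in for the shared random stream
        self.seq = 0

    def next_field(self):
        self.seq = (self.seq + 1) & MASK
        return self.seq ^ self.stream.getrandbits(SEQ_BITS)

class Receiver:
    """Toy receiver: unmasks the field and accepts only the expected successor."""
    def __init__(self, seed):
        self.stream = random.Random(seed)
        self.seq = 0
        self.pad = self.stream.getrandbits(SEQ_BITS)   # pad for the next frame

    def accept(self, field):
        candidate = field ^ self.pad                   # XOR removes the mask
        if (candidate - self.seq) & MASK != 1:         # subtraction check
            return False                               # reject; keep the same pad
        self.seq = candidate
        self.pad = self.stream.getrandbits(SEQ_BITS)   # advance only on success
        return True
```

    In this sketch a forged or corrupted field is rejected without advancing the receiver's stream, so the next genuine frame still verifies. Lost frames would desynchronize the two ends; handling that case is outside the scope of the example.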

  17. A method to prioritize quantitative traits and individuals for sequencing in family-based studies.

    Directory of Open Access Journals (Sweden)

    Kaanan P Shah

    Full Text Available Owing to recent advances in DNA sequencing, it is now technically feasible to evaluate the contribution of rare variation to complex traits and diseases. However, it is still cost prohibitive to sequence the whole genome (or exome) of all individuals in each study. For quantitative traits, one strategy to reduce cost is to sequence individuals in the tails of the trait distribution. However, the next challenge becomes how to prioritize traits and individuals for sequencing since individuals are often characterized for dozens of medically relevant traits. In this article, we describe a new method, the Rare Variant Kinship Test (RVKT), which leverages relationship information in family-based studies to identify quantitative traits that are likely influenced by rare variants. Conditional on nuclear families and extended pedigrees, we evaluate the power of the RVKT via simulation. Not unexpectedly, the power of our method depends strongly on effect size, and to a lesser extent, on the frequency of the rare variant and the number and type of relationships in the sample. As an illustration, we also apply our method to data from two genetic studies in the Old Order Amish, a founder population with extensive genealogical records. Remarkably, we implicate the presence of a rare variant that lowers fasting triglyceride levels in the Heredity and Phenotype Intervention (HAPI) Heart study (p = 0.044), consistent with the presence of a previously identified null mutation in the APOC3 gene that lowers fasting triglyceride levels in HAPI Heart study participants.

  18. Whole-genome sequence-based analysis of thyroid function

    OpenAIRE

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J.; Traglia, Michela; Brown, Suzanne J.; Mullin, Benjamin H; Shihab, Hashem A.; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R.; Beilby, John P.; Charoen, Pimphen

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 1...

  19. Spike-Based Bayesian-Hebbian Learning of Temporal Sequences

    DEFF Research Database (Denmark)

    Tully, Philip J; Lindén, Henrik; Hennig, Matthias H;

    2016-01-01

    of firing in pools of excitatory neurons, together with asymmetrical associations between these distinct network states, can be acquired through plasticity. The model's feasibility is demonstrated using simulations of adaptive exponential integrate-and-fire model neurons (AdEx). We show that the learning...... and speed of sequence replay depends on a confluence of biophysically relevant parameters including stimulus duration, level of background noise, ratio of synaptic currents, and strengths of short-term depression and adaptation. Moreover, sequence elements are shown to flexibly participate multiple times...

  20. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung

    2011-04-30

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach that generates sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcription factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns, following the biological intuition that TFBSs regulate gene expression and that domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  1. Tentative revision of the global Pliocene-Pleistocene sequences based on the sequence stratigraphy in the Gulf of Mexico

    Energy Technology Data Exchange (ETDEWEB)

    Wornardt, W.W. Jr. (Micro-Strat, Inc., Houston, TX (United States) Rice Univ., Houston, TX (United States)); Vail, P.R. (Rice Univ., Houston, TX (United States))

    1991-08-01

    The Pliocene-Pleistocene sequence chronostratigraphy presented in this paper is based on a vast amount of data obtained from more than 100 wells drilled over the past eight years in the south additions and deep-water areas in offshore Texas and Louisiana, Gulf of Mexico. This high-resolution biostratigraphic data base consists of individual checklists with the abundance, diversity, occurrences, and ranges of planktonic and benthic foraminifers and calcareous nannofossils plotted against depth. The benthic foraminifers have been interpreted largely for their paleobathymetric significance and result in a water-depth curve for each well studied. These wells have been further calibrated by having a portion of the study wells tied to sequence stratigraphic interpretations of seismic record sections through a two-way-time log or synthetic seismogram. The Pliocene and Pleistocene are tentatively divided into 14 fourth-order sequences from 3.0 to 0 Ma and three third-order cycles from 5.5 to 3.0 Ma. Each of the cycles is bounded by a sequence boundary and has an age-dateable maximum flooding surface. Depending on location, each sequence may have lowstand, transgressive, and highstand systems tracts within the basin. The ages (Ma) of the sequence boundaries are: third-order, 5.5, 4.2, and 3.8; fourth-order, 3.0, 2.6, 2.4, 1.86, 1.4, 1.0, 0.82, 0.72, 0.62, 0.52, 0.42, 0.32, 0.22, 0.12, and 0.02. Within these sequence boundaries are the 5.0, 4.0, 3.4, 2.7, 2.45, 2.0, 1.47, 1.3, 0.92, 0.76, 0.66, 0.56, 0.46, 0.36, 0.26, 0.16, and 0.06 Ma maximum flooding surfaces, respectively. All of the condensed sections associated with the maximum flooding surfaces, systems tract boundaries, and sequence boundaries in the Pliocene-Pleistocene can be recognized and traced on well logs and seismic record sections in the offshore Texas and Louisiana areas.
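
    The ages quoted in this abstract can be cross-checked mechanically: 18 boundary ages should delimit 17 sequences (3 third-order, 14 fourth-order), each containing exactly one maximum flooding surface. The short sketch below does that pairing. The ages are taken from the abstract; the function name and the rule "older boundary above 3.0 Ma means third-order" are assumptions introduced for the illustration.

```python
# Sequence-boundary ages (Ma), oldest first, as listed in the abstract.
BOUNDARIES = [5.5, 4.2, 3.8, 3.0, 2.6, 2.4, 1.86, 1.4, 1.0,
              0.82, 0.72, 0.62, 0.52, 0.42, 0.32, 0.22, 0.12, 0.02]

# Maximum flooding surface ages (Ma), oldest first, as listed in the abstract.
FLOODING = [5.0, 4.0, 3.4, 2.7, 2.45, 2.0, 1.47, 1.3, 0.92,
            0.76, 0.66, 0.56, 0.46, 0.36, 0.26, 0.16, 0.06]

def build_sequences(boundaries, flooding):
    """Pair consecutive boundaries with the flooding surface lying between them."""
    sequences = []
    for older, younger, mfs in zip(boundaries, boundaries[1:], flooding):
        if not (older > mfs > younger):
            raise ValueError(f"{mfs} Ma is not between {older} and {younger} Ma")
        order = "third" if older > 3.0 else "fourth"   # assumed classification rule
        sequences.append({"older": older, "younger": younger,
                          "mfs": mfs, "order": order})
    return sequences

sequences = build_sequences(BOUNDARIES, FLOODING)
third = sum(1 for s in sequences if s["order"] == "third")
fourth = sum(1 for s in sequences if s["order"] == "fourth")
```

    With the listed ages every flooding surface falls inside its bounding pair, and the counts come out as three third-order cycles and 14 fourth-order sequences, matching the abstract.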

  2. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    Full Text Available DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
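
    The first step mentioned in this abstract, separating a symbolic DNA sequence into four indicator channels and recombining them with per-channel weights, can be shown concretely. The sketch below covers only that symbol-to-number mapping; the function names and the example weights are assumptions, and the adaptive weight selection and the wavelet transform itself are not reproduced.

```python
def indicator_channels(seq):
    """Split a DNA string into four binary indicator sequences, one per base."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}

def weighted_signal(seq, weights):
    """Combine the four channels into one numeric signal using per-base weights."""
    channels = indicator_channels(seq)
    return [sum(weights[b] * channels[b][i] for b in "ACGT")
            for i in range(len(seq))]

# Hypothetical weights; in the paper they are adapted to the sequence structure.
example_weights = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}
signal = weighted_signal("ACGTA", example_weights)
```

    The resulting numeric signal is what a wavelet transform would then operate on; changing the weights changes how much each base channel contributes to the output energy.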

  3. Angiosperm phylogeny based on matK sequence information

    NARCIS (Netherlands)

    Hilu, K.W.; Borsch, T.; Müller, K.; Soltis, D.E.; Savolainen, V.; Chase, M.W.; Powell, M.; Alice, L.A.; Evans, R.; Sauquet, H.; Neinhuis, C.; Slotta, T.A.B.; Rohwer, J.G.; Campbell, C.; Chatrou, L.W.

    2003-01-01

    Plastid matK gene sequences for 374 genera representing all angiosperm orders and 12 genera of gymnosperms were analyzed using parsimony (MP) and Bayesian inference (BI) approaches. Traditionally, slowly evolving genomic regions have been preferred for deep-level phylogenetic inference in angiosperm

  4. Customer Clustering Based on Customer Purchasing Sequence Data

    Directory of Open Access Journals (Sweden)

    Yen-Chung Liu

    2017-01-01

    Full Text Available Customer clustering has become a priority for enterprises because of the importance of customer relationship management. Customer clustering can improve understanding of the composition and characteristics of customers, thereby enabling the creation of appropriate marketing strategies for each customer group. Previously, different customer clustering approaches have been proposed according to data type, namely customer profile data, customer value data, customer transaction data, and customer purchasing sequence data. This paper considers the customer clustering problem in the context of customer purchasing sequence data. However, two major aspects distinguish this paper from past research: (1) in our model, a customer sequence contains itemsets, which is a more realistic configuration than previous models, which assume a customer sequence would merely consist of items; and (2) in our model, a customer may belong to multiple clusters or no cluster, whereas in existing models a customer is limited to only one cluster. The second difference implies that each cluster discovered using our model represents a crucial type of customer behavior and that a customer can exhibit several types of behavior simultaneously. Finally, extensive experiments are conducted on a retail data set, and the results show that the clusters obtained by our model can provide more accurate descriptions of customer purchasing behaviors.
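
    The two modelling points in this abstract, sequences made of itemsets and customers belonging to several clusters or none, are easy to picture with a small data-structure sketch. The code below is illustrative only and is not the clustering algorithm of the paper: it assumes each cluster is summarized by a representative pattern (a hypothetical simplification) and uses the standard sequential-pattern notion of containment.

```python
def contains(sequence, pattern):
    """True if pattern is a subsequence of sequence, where each pattern itemset
    must be a subset of a distinct, later-or-equal-ordered sequence itemset."""
    matched = 0
    for itemset in sequence:
        if matched < len(pattern) and pattern[matched] <= itemset:
            matched += 1
    return matched == len(pattern)

def memberships(sequence, cluster_patterns):
    """Return every cluster whose pattern the customer sequence contains.
    The result may be empty (no cluster) or hold several names (overlap)."""
    return [name for name, pattern in cluster_patterns.items()
            if contains(sequence, pattern)]

# Hypothetical cluster patterns: each is a list of itemsets in purchase order.
clusters = {
    "coffee_then_filters": [{"coffee"}, {"filters"}],
    "baby_basket": [{"diapers", "wipes"}],
}

# One customer: three visits, each visit an itemset.
customer = [{"coffee", "milk"}, {"diapers", "wipes", "bread"}, {"filters"}]
```

    Here memberships(customer, clusters) places the customer in both clusters, while a customer whose visits match neither pattern would get an empty list.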

  5. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  6. Identification of the new HLA-DRB1*0812 allele detected by sequencing based typing

    Energy Technology Data Exchange (ETDEWEB)

    Versluis, L.F.; Zwan, A.W. van der; Tilanus, M.G.J. [Univ. Hospital Utrecht (Netherlands); Savelkoul, P.H.M.; Berg-Loonen, E.M. van den [Univ. Hospital Maastricht (Netherlands)] [and others

    1996-12-31

    HLA-DRB typing by polymerase chain reaction-sequence specific priming (PCR-SSP) and sequencing based typing (SBT) was studied within the framework of the Antigen and Haplotype Society 11 and the Sequencing Based Typing Component of the Twelfth International HLA workshop. Sequencing was performed as described by McGinnis and co-workers in 1995 on coded samples, including most DR2 subtypes, resulting in high resolution HLA-DR typing. Sequences were compared with a database containing 107 DRB1, four DRB3, and five DRB5 alleles in a similar way as described for HLA-DPB. One sample showed a new DR8 sequence, indicating the presence of a new allele. This individual (4390) is of Indonesian origin. The specific amplification of the DR8 allele and subsequent sequencing resulted in a sequence which did not match the database and new polymorphism was identified. The complementary strand was sequenced and confirmed the presence of a new DRB1 allele. Cloning and subsequent sequencing of the polymerase chain reaction fragment resulted in confirmation of the direct sequence data. Later this variant was officially named DRB1*0812. The complete nucleotide sequence of exon 2 of this new allele is shown. This allele differs from DRB1*0810 by one nucleotide at codon 85, resulting in an alanine (GTT), whereas DRB1*0810 carries a valine (GCT). 5 refs., 1 fig.

  7. Genomic Variance Estimation Based on Genotyping-by-Sequencing with Different Coverage in Perennial Ryegrass

    DEFF Research Database (Denmark)

    Ashraf, Bilal; Fé, Dario; Jensen, Just

    2014-01-01

    at each SNP in family pools or polyploids. There are, however, several statistical challenges associated with this method, including low sequencing depth and missing values. Low sequencing depth results in inaccuracies in estimates of allele frequencies for each SNP. In this work we have focused...... on optimizing methods and models utilizing F2 family phenotype records and NGS information from F2 family pools in perennial ryegrass. Genomic variance was estimated using genomic relationship matrices based on different coverage depths to verify effects of coverage depth. Example traits were seed yield, rust...... score and heading date. A total of 995 F2 families were genotyped via GBS, resulting in allele frequency estimates at 1 million SNPs in each family, the coverage within family ranging from 0 to 60. Results from both real and simulated data show that genomic variance is overestimated at lower coverage...

  8. A Hybrid Time Synchronization Algorithm Based on Broadcast Sequencing for Wireless Sensor Networks

    Science.gov (United States)

    2014-09-01

    sequence per the flow charts detailed in Figures 43–45 located in Appendix A. The input 1 in Figure 12 is a recursive step from some of the... Master's thesis by Sung C. Park, September 2014.

  9. MR-based attenuation correction in brain PET based on UTE sequences

    Energy Technology Data Exchange (ETDEWEB)

    Cabello, Jorge; Nekolla, Stephan G; Ziegler, Sibylle I [Department of Nuclear Medicine, Klinikum rechts der Isar, Technische Universität München (Germany)

    2014-07-29

    Attenuation correction (AC) in brain PET/MR has recently emerged as one of the challenging tasks in the PET/MR field. It has been shown that ignoring the attenuation produced by bone can lead to errors of 5-30% in regions close to bone structures. Since the information provided by the MR signal is not directly related to tissue attenuation, alternative methods have to be developed. Signal from bone tissue is difficult to measure given its short transverse relaxation time (T2). Ultrashort-echo time (UTE) pulse sequences were developed to measure signal from tissues with short T2. A combination of two consecutive UTE echoes has been used in several works to measure signal from bone tissue. The first echo is able to measure signal from bone tissue in addition to soft tissue, while the second echo contains most of the soft tissue contained in the first echo but not bone. In this work we extract the attenuation information from the difference between the logarithms of two images obtained after applying two consecutive UTE pulse sequences using the mMR scanner (Siemens Healthcare). Subsequently, image processing techniques are applied to reduce the noise and extract air cavities within the head. The resulting image is converted to linear attenuation coefficients, generating what is known as a µ-map, to be used during reconstruction. For comparison purposes PET/CT scans of the same patients were acquired prior to the PET/MR scan. Additional µ-maps obtained for comparison were extracted from a Dixon sequence (used in clinical routine) and an additional µ-map calculated by the scanner based on UTE pulse sequences. Preliminary quantitative results measured in the cerebellum, using the value obtained with CT-based AC as reference, show differences of 34% without AC, 13% using the Dixon-based and UTE-based µ-maps provided by the scanner, and 0.8% with the AC strategy presented here.
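
    The core computation in this abstract is a voxel-wise difference of the logarithms of two UTE echo images, followed by a conversion to attenuation coefficients. A minimal sketch of that idea follows. It is not the authors' pipeline: the thresholds air_level and bone_cut, the three-class segmentation, and the coefficient values are illustrative placeholders, and the noise-reduction and air-cavity steps are omitted.

```python
import math

# Illustrative linear attenuation coefficients at 511 keV, in 1/cm (assumed values).
MU_AIR, MU_SOFT, MU_BONE = 0.0, 0.096, 0.151

def log_difference(echo1, echo2, eps=1e-6):
    """Voxel-wise ln(echo1) - ln(echo2); large where signal decays fast (bone)."""
    return [math.log(max(a, eps)) - math.log(max(b, eps))
            for a, b in zip(echo1, echo2)]

def to_mu_map(echo1, echo2, air_level=50.0, bone_cut=0.5):
    """Classify each voxel as air, bone or soft tissue and assign a coefficient.
    air_level and bone_cut are hypothetical thresholds chosen for the example."""
    mu = []
    for first, diff in zip(echo1, log_difference(echo1, echo2)):
        if first < air_level:
            mu.append(MU_AIR)        # almost no signal in either echo
        elif diff > bone_cut:
            mu.append(MU_BONE)       # signal present early, gone in second echo
        else:
            mu.append(MU_SOFT)       # signal persists across both echoes
    return mu

# Three voxels (flattened image): air, soft tissue, bone.
mu_map = to_mu_map([10.0, 400.0, 400.0], [8.0, 380.0, 100.0])
```

    Real images would be 3D arrays and the mapping from the log difference to attenuation would be calibrated rather than thresholded, but the sketch shows why the second echo is needed: only the difference separates bone from soft tissue.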

  10. Enrichment, Amplification, and Sequence-Based Typing of Salmonella enterica and Other Foodborne Pathogens.

    Science.gov (United States)

    Edlind, Tom; Brewster, Jeffrey D; Paoli, George C

    2017-01-01

    Detection of Salmonella enterica in foods typically involves microbiological enrichment, molecular-based assay, and subsequent isolation and identification of a pure culture. This is ideally followed by strain typing, which provides information critical to the investigation of outbreaks and the attribution of their sources. Pulsed-field gel electrophoresis is the "gold standard" for S. enterica strain typing, but its limitations have encouraged the search for alternative methods, including whole genome sequencing. Both methods typically require a pure culture, which adds to the cost and turnaround time. A more rapid and cost-effective method with sufficient discriminatory power would benefit food industries, regulatory agencies, and public health laboratories. To address this need, a novel enrichment, amplification, and sequence-based typing (EAST) approach was developed involving (i) overnight enrichment and total DNA preparation, (ii) amplification of polymorphic tandem repeat-containing loci with electrophoretic detection, and (iii) DNA sequencing and bioinformatic analysis to identify related strains. EAST requires 3 days or less and provides a strain resolution that exceeds serotyping and is comparable to pulsed-field gel electrophoresis. Evaluation with spiked ground turkey demonstrated its sensitivity (with a starting inoculum of ≤1 CFU/g) and specificity (with unique or nearly unique alleles relative to databases of >1,000 strains). In tests with unspiked retail chicken parts, 3 of 11 samples yielded S. enterica-specific PCR products. Sequence analysis of three distinct typing targets (SeMT1, SeCRISPR1, and SeCRISPR2) revealed consistent similarities to specific serotype Schwarzengrund, Montevideo, and Typhimurium strains. EAST provides a time-saving and cost-effective approach for detecting and typing foodborne S. enterica, and postenrichment steps can be commercially outsourced to facilitate its implementation. Initial studies with Listeria

  11. The contrasting structures of mismatched DNA sequences containing looped-out bases (bulges) and multiple mismatches (bubbles).

    Science.gov (United States)

    Bhattacharyya, A; Lilley, D M

    1989-09-12

    We have studied the structure and reactivities of two kinds of mismatched DNA sequences--unopposed bases, or bulges, and multiple mismatched pairs of bases. These were generated in a constant sequence environment, in relatively long DNA fragments, using a technique based on heteroduplex formation between sequences cloned into single-stranded M13 phage. The mismatched sequences were studied from two points of view, viz 1. The mobility of the fragments on gel electrophoresis in polyacrylamide was studied in order to examine possible bending of the DNA due to the presence of the mismatch defect. Such bending would constitute a global effect on the conformation of the molecule. 2. Sequences in and around the mismatches were studied using enzyme and chemical probes of DNA structure. This would reveal more local structural effects of the mismatched sequences. We observed that the structures of the bulges and the multiple mismatches appear to be fundamentally different. The bulged sequences exhibited a large gel retardation, consistent with a significant bending of the DNA at the bulge, and whose magnitude depends on the number of mismatched bases. The larger bulges were sensitive to cleavage by single-strand specific nucleases, and modified by diethyl pyrocarbonate (adenines) or osmium tetroxide (thymines) in a non-uniform way, suggesting that the bulges have a precise structure that leads to exposure of some, but not all, of the bases. In contrast the multiple mismatches ('bubbles') cause very much less bending of the DNA fragment in which they occur, and uniform patterns of chemical reactivity along the length of the mismatched sequences, suggesting a less well defined, and possibly flexible, structure. The precise structure of the bulges suggests that such features may be especially significant for recognition by proteins.

  12. ESTPiper – a web-based analysis pipeline for expressed sequence tags

    Directory of Open Access Journals (Sweden)

    Tang Zuojian

    2009-04-01

    Full Text Available Background: EST sequencing projects are increasing in scale and scope as the genome sequencing technologies migrate from core sequencing centers to individual research laboratories. Effectively, generating EST data is no longer a bottleneck for investigators. However, processing large amounts of EST data remains a non-trivial challenge for many. Web-based EST analysis tools are proving to be the most convenient option for biologists when performing their analysis, so these tools must continuously improve on their utility to keep in step with the growing needs of research communities. We have developed a web-based EST analysis pipeline called ESTPiper, which streamlines typical large-scale EST analysis components. Results: The intuitive web interface guides users through each step of base calling, data cleaning, assembly, genome alignment, annotation, analysis of gene ontology (GO), and microarray oligonucleotide probe design. Each step is modularized, so a user can execute the steps separately or together in batch mode. In addition, the user has control over the parameters used by the underlying programs. Extensive documentation of ESTPiper's functionality is embedded throughout the web site to facilitate understanding of the required input and interpretation of the computational results. The user can also download intermediate results and port files to separate programs for further analysis. In addition, our server provides a time-stamped description of the run history for reproducibility. The pipeline can also be installed locally, allowing researchers to modify ESTPiper to suit their own needs. Conclusion: ESTPiper streamlines the typical process of EST analysis. The pipeline was initially designed in part to support the Daphnia pulex cDNA sequencing project. A web server hosting ESTPiper is provided at http://estpiper.cgb.indiana.edu/ to support projects of all sizes. The software is also freely available from the authors for

  13. Spatially Enhanced Differential RNA Methylation Analysis from Affinity-Based Sequencing Data with Hidden Markov Model.

    Science.gov (United States)

    Zhang, Yu-Chen; Zhang, Shao-Wu; Liu, Lian; Liu, Hui; Zhang, Lin; Cui, Xiaodong; Huang, Yufei; Meng, Jia

    2015-01-01

    With the development of new sequencing technology, the entire N6-methyl-adenosine (m(6)A) RNA methylome can now be profiled in an unbiased manner with the methylated RNA immunoprecipitation sequencing technique (MeRIP-Seq), making it possible to detect differential methylation states of RNA between two conditions, for example, between normal and cancerous tissue. However, as an affinity-based method, MeRIP-Seq has not yet provided base-pair resolution; that is, a single methylation site determined from MeRIP-Seq data can in practice contain multiple RNA methylation residues, some of which can be regulated by different enzymes and thus be differentially methylated between two conditions. Since existing peak-based methods cannot effectively differentiate multiple methylation residues located within a single methylation site, we propose a hidden Markov model (HMM) based approach to address this issue. Specifically, the detected RNA methylation site is further divided into multiple adjacent small bins and then scanned at higher resolution using a hidden Markov model to model the dependency between spatially adjacent bins for improved accuracy. We tested the proposed algorithm on both simulated data and real data. Results suggest that the proposed algorithm clearly outperforms the existing peak-based approach on simulated systems and detects differential methylation regions with higher statistical significance on the real dataset.
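
    The benefit of modelling dependency between adjacent bins can be seen in a tiny two-state example. The sketch below runs standard Viterbi decoding over per-bin evidence; it is a generic illustration, not the published algorithm. The two states (0 = not differentially methylated, 1 = differentially methylated), the per-bin probabilities, and the transition probabilities are all made-up values for the example.

```python
import math

def viterbi(log_emis, log_trans, log_init):
    """Most likely state path given per-bin log-likelihoods for each state."""
    n_states = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(log_emis)):
        new_score, pointers = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best)
            new_score.append(score[best] + log_trans[best][s] + log_emis[t][s])
        score = new_score
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):      # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical per-bin probability that a bin is differentially methylated.
p_diff = [0.01, 0.01, 0.99, 0.45, 0.99, 0.01]
log_emis = [[math.log(1 - p), math.log(p)] for p in p_diff]

# "Sticky" transitions: neighbouring bins tend to share a state (assumed 0.9).
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_init = [math.log(0.5), math.log(0.5)]

hmm_path = viterbi(log_emis, log_trans, log_init)
independent = [1 if p > 0.5 else 0 for p in p_diff]   # bin-by-bin calls, no HMM
```

    Calling each bin on its own labels the weak fourth bin as unchanged, splitting one region in two, whereas the HMM path keeps bins three to five together because switching states twice is less likely than one ambiguous observation.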

  14. Knowledge-Based Approach to Assembly Sequence Planning for Wind-Driven Generator

    Directory of Open Access Journals (Sweden)

    Meiping Wu

    2013-01-01

    Full Text Available Assembly sequence planning plays an essential role in the manufacturing industry. However, some challenges remain in assembly planning research, one of which is the lack of an effective description of assembly knowledge and information. In order to reduce the computational task, this paper presents a novel approach to the assembly sequence planning problem based on engineering assembly knowledge and provides an appropriate way to express both geometric information and nongeometric knowledge. In order to increase sequence planning efficiency, the assembly connection graph is built according to knowledge from the engineering, design, and manufacturing fields. The product semantic information model can offer much useful information to help the designer finish the assembly (process) design and make the right decisions in that process. Therefore, complex and inefficient computation in the assembly design process can be avoided. Finally, a product assembly planning example is presented to illustrate the effectiveness of the proposed approach. Initial experience with the approach indicates the potential to reduce lead times and thereby help complete new product launch projects on time.

  15. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology, are easily amenable to string matching and pattern recognition algorithms. The growing need to analyse whole genome sequences more efficiently and thoroughly has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures that are well known in many other areas and are highly suited to sequence analysis too. Here we report an improvement to the design of construction of suffix arrays. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for the coverage of the possible peptide space, which indicates that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. In addition, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a ‘meaning’ for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv, have been found to contain a repeating sequence of 300 amino acids.
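
To make the data structure mentioned above concrete, here is a minimal sketch of a suffix array and an n-gram count query. It assumes the simplest possible construction (sorting all suffixes directly), which is not the improved construction the authors report and would not scale to whole genomes; the function names `build_suffix_array` and `count_occurrences` are hypothetical.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix they begin."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, pattern):
    """Count occurrences of pattern using two binary searches over the suffix array."""
    m = len(pattern)

    def first_index(strictly_greater):
        # First suffix whose length-m prefix is >= pattern (or > pattern).
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[sa[mid]:sa[mid] + m]
            if prefix < pattern or (strictly_greater and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return first_index(True) - first_index(False)


genome = "GATTACAGATTACA"
sa = build_suffix_array(genome)
print(count_occurrences(genome, sa, "TTAC"))
```

Because all suffixes that start with a given n-gram are adjacent in the sorted order, each count costs only a logarithmic number of string comparisons once the array is built.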

  16. MetaSeq: privacy preserving meta-analysis of sequencing-based association studies.

    Science.gov (United States)

    Singh, Angad Pal; Zafer, Samreen; Pe'er, Itsik

    2013-01-01

    Human genetics recently transitioned from GWAS to studies based on NGS data. For GWAS, small effects dictated large sample sizes, typically made possible through meta-analysis by exchanging summary statistics across consortia. NGS studies groupwise-test for association of multiple potentially-causal alleles along each gene. They are subject to similar power constraints and therefore likely to resort to meta-analysis as well. The problem arises when considering privacy of the genetic information during the data-exchange process. Many scoring schemes for NGS association rely on the frequency of each variant, thus requiring the exchange of the identity of the sequenced variant. As such variants are often rare, this can potentially reveal the identity of their carriers and jeopardize privacy. We have thus developed MetaSeq, a protocol for meta-analysis of genome-wide sequencing data by multiple collaborating parties, scoring association for rare variants pooled per gene across all parties. We tackle the challenge of tallying frequency counts of rare, sequenced alleles for meta-analysis of sequencing data without disclosing the allele identity and counts, thereby protecting sample identity. This apparently paradoxical exchange of information is achieved through cryptographic means. The key idea is that parties encrypt the identity of genes and variants. When they transfer information about frequency counts in cases and controls, the exchanged data does not convey the identity of a mutation and therefore does not expose carrier identity. The exchange relies on a 3rd party, trusted to follow the protocol although not trusted to learn about the raw data. We show applicability of this method to publicly available exome-sequencing data from multiple studies, simulating phenotypic information for powerful meta-analysis. The MetaSeq software is publicly available as open source.

  17. Does order matter? Investigating the effect of sequence on glance duration during on-road driving

    Science.gov (United States)

    Roberts, Shannon C.; Reimer, Bryan; Mehler, Bruce

    2017-01-01

    Previous literature has shown that vehicle crash risk increases as drivers’ off-road glance duration increases. Many factors influence drivers’ glance duration, such as individual differences, driving environment, or task characteristics. Theories and past studies suggest that glance duration increases as the task progresses, but the exact relationship between glance sequence and glance duration is not fully understood. The purpose of this study was to examine the effect of glance sequence on glance duration among drivers completing a visual-manual radio tuning task and an auditory-vocal based multi-modal navigation entry task. Eighty participants drove a vehicle on urban highways while completing radio tuning and navigation entry tasks. Forty participants drove under an experimental protocol that required three button presses followed by rotation of a tuning knob to complete the radio tuning task, while the other forty participants completed the task with one less button press. Multiple statistical analyses were conducted to measure the effect of glance sequence on glance duration. Results showed that across both tasks and a variety of statistical tests, glance sequence had inconsistent effects on glance duration: the effects varied according to the number of glances, task type, and data set that was being evaluated. Results suggest that other aspects of the task as well as interface design affect glance duration and should be considered in the context of examining driver attention or lack thereof. All in all, interface design and task characteristics have a more influential impact on glance duration than glance sequence, suggesting that classical design considerations impacting driver attention, such as the size and location of buttons, remain fundamental in designing in-vehicle interfaces. PMID:28158301

  18. BAC-end sequence-based SNP mining in Allotetraploid Cotton (Gossypium) utilizing re-sequencing data, phylogenetic inferences and perspectives for genetic mapping

    Science.gov (United States)

    A bacterial artificial chromosome (BAC) library and BAC-end sequences for Gossypium hirsutum L. have recently been developed. Here we report on genomic-based genome-wide SNP mining utilizing re-sequencing data with a BAC-end sequence reference for twelve G. hirsutum L. lines, one G. barbadense L. li...

  19. DNA Lossless Differential Compression Algorithm based on Similarity of Genomic Sequence Database

    CERN Document Server

    Afify, Heba; Wahed, Manal Abdel

    2011-01-01

    Modern biological science produces vast amounts of genomic sequence data. This is fuelling the need for efficient algorithms for sequence compression and analysis. Data compression and the associated techniques coming from information theory are often perceived as being of interest for data communication and storage. In recent years, a substantial effort has been made to apply textual data compression techniques to various computational biology tasks, ranging from storage and indexing of large datasets to comparison of genomic databases. This paper presents a differential compression algorithm that is based on the production of difference sequences according to an op-code table in order to optimize the compression of homologous sequences in a dataset. The stored data are therefore composed of a reference sequence, the set of differences, and the difference locations, instead of each sequence being stored individually. This algorithm does not require a priori knowledge about the statistics of the sequence set. The...
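
The storage scheme described above (a reference sequence plus differences and their locations) can be sketched as follows. This is an illustration under assumptions, not the paper's algorithm: the paper's op-code table is not given in the abstract, so the sketch borrows the edit operations reported by Python's standard `difflib.SequenceMatcher`, and the names `encode_differences` and `decode_differences` are hypothetical.

```python
import difflib


def encode_differences(reference, target):
    """Describe target as a list of (op, ref_start, ref_end, replacement) edits."""
    matcher = difflib.SequenceMatcher(None, reference, target, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # unchanged stretches are implied by the reference
            diffs.append((tag, i1, i2, target[j1:j2]))
    return diffs


def decode_differences(reference, diffs):
    """Rebuild the target losslessly from the reference and the edit list."""
    out, pos = [], 0
    for _tag, i1, i2, replacement in diffs:
        out.append(reference[pos:i1])  # copy the unchanged stretch
        out.append(replacement)        # empty string for a deletion
        pos = i2
    out.append(reference[pos:])
    return "".join(out)


reference = "ACGTACGTTAGCTAGCTA"
homolog = "ACGTACCTTAGCTAGGCTA"
edits = encode_differences(reference, homolog)
print(edits)
print(decode_differences(reference, edits) == homolog)
```

For closely related sequences the edit list is much shorter than the sequence itself, which is where the saving over storing each sequence individually comes from.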

  20. Sequence-dependent elasticity and electrostatics of single-stranded DNA: signatures of base-stacking.

    Science.gov (United States)

    McIntosh, Dustin B; Duggan, Gina; Gouil, Quentin; Saleh, Omar A

    2014-02-04

    Base-stacking is a key factor in the energetics that determines nucleic acid structure. We measure the tensile response of single-stranded DNA as a function of sequence and monovalent salt concentration to examine the effects of base-stacking on the mechanical and thermodynamic properties of single-stranded DNA. By comparing the elastic response of highly stacked poly(dA) and that of a polypyrimidine sequence with minimal stacking, we find that base-stacking in poly(dA) significantly enhances the polymer's rigidity. The unstacking transition of poly(dA) at high force reveals that the intrinsic electrostatic tension on the molecule varies significantly more weakly with salt concentration than mean-field predictions suggest. Further, we provide a model-independent estimate of the free energy difference between stacked poly(dA) and unstacked polypyrimidine, finding it to be ∼-0.25 kBT/base and nearly constant over three orders of magnitude in salt concentration.

  1. SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues.

    Directory of Open Access Journals (Sweden)

    Xiaoxia Yang

    Protein-nucleic acid interactions are central to various fundamental biological processes. Automated methods capable of reliably identifying DNA- and RNA-binding residues in protein sequence are assuming ever-increasing importance. The majority of current algorithms rely on feature-based prediction, but their accuracy remains to be further improved. Here we propose a sequence-based hybrid algorithm, SNBRFinder (Sequence-based Nucleic acid-Binding Residue Finder), by merging a feature predictor SNBRFinderF and a template predictor SNBRFinderT. SNBRFinderF was established using the support vector machine whose inputs include sequence profile and other complementary sequence descriptors, while SNBRFinderT was implemented with the sequence alignment algorithm based on profile hidden Markov models to capture the weakly homologous template of the query sequence. Experimental results show that SNBRFinderF was clearly superior to the commonly used sequence profile-based predictor and that SNBRFinderT can achieve comparable performance to the structure-based template methods. Leveraging the complementary relationship between these two predictors, SNBRFinder reasonably improved the performance of both DNA- and RNA-binding residue predictions. More importantly, the sequence-based hybrid prediction reached competitive performance relative to our previous structure-based counterpart. Our extensive and stringent comparisons show that SNBRFinder has obvious advantages over the existing sequence-based prediction algorithms. The value of our algorithm is highlighted by establishing an easy-to-use web server that is freely accessible at http://ibi.hzau.edu.cn/SNBRFinder.

  2. H.264 MOTION ESTIMATION ALGORITHM BASED ON VIDEO SEQUENCES ACTIVITY

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    Motion estimation is an important part of the H.264/AVC encoding process and has high computational complexity. It is therefore necessary to find a fast motion estimation algorithm for real-time applications. The algorithm proposed in this letter first judges the activity degree of the macroblocks, then classifies the video sequences and applies different search strategies according to the result. Experiments show that this method obtains almost the same video quality as the Full Search (FS) algorithm while reducing the computation cost by more than 95%.

  3. Comparison of sequence-based and structure-based phylogenetic trees of homologous proteins: Inferences on protein evolution

    Indian Academy of Sciences (India)

    S Balaji; N Srinivasan

    2007-01-01

    Several studies based on the known three-dimensional (3-D) structures of proteins show that two homologous proteins with insignificant sequence similarity could adopt a common fold and may perform the same or similar biochemical functions. Hence, it is appropriate to use similarities in the 3-D structure of proteins rather than amino acid sequence similarities in modelling the evolution of distantly related proteins. Here we present an assessment of using 3-D structures in modelling the evolution of homologous proteins. Using a dataset of 108 protein domain families of known structures with at least 10 members per family, we present a comparison of the extent of structural and sequence dissimilarities among pairs of proteins, which are inputs into the construction of phylogenetic trees. We find that the correlation between the structure-based dissimilarity measures and the sequence-based dissimilarity measures is usually good if the sequence similarity among the homologues is about 30% or more. For protein families with low sequence similarity among the members, the correlation coefficient between the sequence-based and the structure-based dissimilarities is poor. In these cases the structure-based dendrogram clusters proteins with the most similar biochemical functional properties better than the sequence-similarity based dendrogram. In multi-domain protein families and disulphide-rich protein families the correlation coefficient for the match of sequence-based and structure-based dissimilarity (SDM) measures can be poor even though the sequence identity could be higher than 30%. Hence it is suggested that protein evolution is best modelled using 3-D structures if the sequence similarities (SSM) of the homologues are very low.

  4. Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs

    Science.gov (United States)

    Choi, Woo-Yong; Chatterjee, Mainak

    2015-03-01

    With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that end, we propose a real-time cluster-based multipolling sequencing algorithm that eliminates more than 90% of the polling overhead, particularly when the dynamic search algorithm fails to derive the multipolling sequence in real time.

  5. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Science.gov (United States)

    Satoh, Soichirou; Mimuro, Mamoru; Tanaka, Ayumi

    2013-01-01

    Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  6. Fault Tree Based Diagnosis with Optimal Test Sequencing for Field Service Engineers

    Science.gov (United States)

    Iverson, David L.; George, Laurence L.; Patterson-Hine, F. A.; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    When field service engineers go to customer sites to service equipment, they want to diagnose and repair failures quickly and cost effectively. Symptoms exhibited by failed equipment frequently suggest several possible causes which require different approaches to diagnosis. This can lead engineers to follow several fruitless paths in the diagnostic process before they find the actual failure. To assist in this situation, we have developed the Fault Tree Diagnosis and Optimal Test Sequence (FTDOTS) software system that performs automated diagnosis and ranks diagnostic hypotheses based on failure probability and the time or cost required to isolate and repair each failure. FTDOTS first finds a set of possible failures that explain exhibited symptoms by using a fault tree reliability model as diagnostic knowledge, and then ranks the hypothesized failures based on how likely they are and how long it would take or how much it would cost to isolate and repair them. This ordering suggests an optimal sequence for the field service engineer to investigate the hypothesized failures in order to minimize the time or cost required to accomplish the repair task. Previously, field service personnel would arrive at the customer site and choose which components to investigate based on past experience and service manuals. Using FTDOTS running on a portable computer, they can now enter a set of symptoms and get a list of possible failures ordered in an optimal test sequence to help them in their decisions. If facilities are available, the field engineer can connect the portable computer to the malfunctioning device for automated data gathering. FTDOTS is currently being applied to field service of medical test equipment. The techniques are flexible enough to use for many different types of devices. If a fault tree model of the equipment and information about component failure probabilities and isolation times or costs are available, a diagnostic knowledge base for that device can be

  7. DNA sequence-based analysis of the Pseudomonas species.

    Science.gov (United States)

    Mulet, Magdalena; Lalucat, Jorge; García-Valdés, Elena

    2010-06-01

    Partial sequences of four core 'housekeeping' genes (16S rRNA, gyrB, rpoB and rpoD) of the type strains of 107 Pseudomonas species were analysed in order to obtain a comprehensive view regarding the phylogenetic relationships within the Pseudomonas genus. Gene trees allowed the discrimination of two lineages or intrageneric groups (IG), called IG P. aeruginosa and IG P. fluorescens. The first, IG P. aeruginosa, was divided into three main groups, represented by the species P. aeruginosa, P. stutzeri and P. oleovorans. The second IG was divided into six groups, represented by the species P. fluorescens, P. syringae, P. lutea, P. putida, P. anguilliseptica and P. straminea. The P. fluorescens group was the most complex and included nine subgroups, represented by the species P. fluorescens, P. gessardi, P. fragi, P. mandelii, P. jesseni, P. koreensis, P. corrugata, P. chlororaphis and P. asplenii. Pseudomonas rhizospherae was affiliated with the P. fluorescens IG in the phylogenetic analysis but was independent of any group. Some species were located on phylogenetic branches that were distant from defined clusters, such as those represented by the P. oryzihabitans group and the type strains P. pachastrellae, P. pertucinogena and P. luteola. Additionally, 17 strains of P. aeruginosa, 'P. entomophila', P. fluorescens, P. putida, P. syringae and P. stutzeri, for which genome sequences have been determined, have been included to compare the results obtained in the analysis of four housekeeping genes with those obtained from whole genome analyses.

  8. Dissecting the roles of local packing density and longer-range effects in protein sequence evolution

    CERN Document Server

    Shahmoradi, Amir

    2015-01-01

    What are the structural determinants of protein sequence evolution? A number of site-specific structural characteristics have been proposed, most of which are broadly related to either the density of contacts or the solvent accessibility of individual residues. Most importantly, there has been disagreement in the literature over the relative importance of solvent accessibility and local packing density for explaining site-specific sequence variability in proteins. We show here that this discussion has been confounded by the definition of local packing density. The most commonly used measures of local packing, such as the contact number and the weighted contact number, represent by definition the combined effects of local packing density and longer-range effects. As an alternative, we here propose a truly local measure of packing density around a single residue, based on the Voronoi cell volume. We show that the Voronoi cell volume, when calculated relative to the geometric center of amino-acid side chains, be...

  9. Neural network predicts sequence of TP53 gene based on DNA chip

    DEFF Research Database (Denmark)

    Spicker, J.S.; Wikman, F.; Lu, M.L.;

    2002-01-01

    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and four errors in the predicted 1300 bp sequence when tested on wild-type TP53 sequence.

  10. Parallel divide and conquer bio-sequence comparison based on Smith-Waterman algorithm

    Institute of Scientific and Technical Information of China (English)

    ZHANG Fa; QIAO Xiangzhen; LIU Zhiyong

    2004-01-01

    Tools for pair-wise bio-sequence alignment have long played a central role in computational biology. Several algorithms for bio-sequence alignment have been developed. The Smith-Waterman algorithm, based on dynamic programming, is considered the most fundamental alignment algorithm in bioinformatics. However, the existing parallel Smith-Waterman algorithm needs a large memory space, and this disadvantage limits the size of the sequences that can be handled. As the volume of biological sequence data expands rapidly, the memory requirement of the existing parallel Smith-Waterman algorithm has become a critical problem. To solve this problem, we develop a new parallel bio-sequence alignment algorithm using the divide and conquer strategy, named the PSW-DC algorithm. In our algorithm, we first partition the query sequence into several subsequences and distribute them to the processors, then compare each subsequence with the whole subject sequence in parallel, using the Smith-Waterman algorithm, to get an interim result, and finally obtain the optimal alignment between the query sequence and the subject sequence through a special combination and extension method. The memory space required by our algorithm is reduced significantly in comparison with existing ones. We also develop a key technique of combination and extension, named the C&E method, to manipulate the interim results and obtain the final sequence alignment. We implement the new parallel bio-sequence alignment algorithm, PSW-DC, on a cluster parallel system.
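
For readers unfamiliar with the underlying dynamic program, here is a minimal sketch of the sequential Smith-Waterman local alignment score. It is not the PSW-DC algorithm or its C&E method; the scoring values (match 2, mismatch -1, linear gap -1) and the function name are assumptions made for illustration. The sketch keeps only two rows of the score matrix, which reduces memory for the score alone but does not by itself recover the alignment.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))   # shared core "ACG"
print(smith_waterman_score("ACGTACGT", "ACGACGT"))   # one gap bridges two matches
```

The full matrix has one cell per pair of positions, which is the memory cost the abstract refers to when both sequences are long.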

  11. A Priori Knowledge and Probability Density Based Segmentation Method for Medical CT Image Sequences

    Directory of Open Access Journals (Sweden)

    Huiyan Jiang

    2014-01-01

    This paper briefly introduces a novel segmentation strategy for CT image sequences. As the first step of our strategy, we extract a priori intensity statistical information from the object region, which is manually segmented by radiologists. Then we define a search scope for the object and calculate the probability density for each pixel in the scope using a voting mechanism. Moreover, we generate an optimal initial level set contour based on the a priori shape of the object in the previous slice. Finally, the modified distance regularization level set method uses boundary features and probability density to determine the final object. The main contributions of this paper are as follows: a priori knowledge is effectively used to guide the determination of objects, and a modified distance regularization level set method can accurately extract the actual contour of the object in a short time. The proposed method is compared to seven other state-of-the-art medical image segmentation methods on abdominal CT image sequence datasets. The evaluation results demonstrate that our method performs better and has potential for segmentation in CT image sequences.

  12. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

    OpenAIRE

    Chen, Chunxian; Gmitter Jr, Fred G

    2013-01-01

    Background Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for...

  13. Spatiotemporal Super-Resolution Reconstruction Based on Robust Optical Flow and Zernike Moment for Video Sequences

    Directory of Open Access Journals (Sweden)

    Meiyu Liang

    2013-01-01

    In order to improve the spatiotemporal resolution of video sequences, a novel spatiotemporal super-resolution reconstruction model (STSR) based on robust optical flow and Zernike moment is proposed in this paper, which integrates spatial resolution reconstruction and temporal resolution reconstruction into a unified framework. The model does not rely on accurate estimation of subpixel motion and is robust to noise and rotation. Moreover, it can effectively overcome the problems of hole and block artifacts. First, we propose an efficient robust optical flow motion estimation model that preserves motion details; then we introduce a biweighted fusion strategy to implement spatiotemporal motion compensation. Next, combining a self-adaptive region correlation judgment strategy, we construct a fast fuzzy registration scheme based on Zernike moment for better STSR with higher efficiency, and the final video sequences with high spatiotemporal resolution are then obtained by fusing the complementary and redundant information, with nonlocal self-similarity, between adjacent video frames. Experimental results demonstrate that the proposed method outperforms the existing methods in terms of both subjective visual and objective quantitative evaluations.

  14. Sequence-Length Requirement of Distance-Based Phylogeny Reconstruction: Breaking the Polynomial Barrier

    CERN Document Server

    Roch, Sebastien

    2009-01-01

    We introduce a new distance-based phylogeny reconstruction technique which provably achieves, at sufficiently short branch lengths, a polylogarithmic sequence-length requirement -- improving significantly over previous polynomial bounds for distance-based methods. The technique is based on an averaging procedure that implicitly reconstructs ancestral sequences. By the same token, we extend previous results on phase transitions in phylogeny reconstruction to general time-reversible models. More precisely, we show that in the so-called Kesten-Stigum zone (roughly, a region of the parameter space where ancestral sequences are well approximated by "linear combinations" of the observed sequences) sequences of length $\mathrm{poly}(\log n)$ suffice for reconstruction when branch lengths are discretized. Here $n$ is the number of extant species. Our results challenge, to some extent, the conventional wisdom that estimates of evolutionary distances alone carry significantly less information about phylogenies than full sequ...

  15. Evaluation of cleaved amplified polymorphic sequence markers for Chamaecyparis obtusa based on expressed sequence tag information from Cryptomeria japonica.

    Science.gov (United States)

    Matsumoto, A; Tsumura, Y

    2004-12-01

    We have developed and evaluated sequence-tagged site (STS) primers based on expressed sequence-tag information derived from sugi (Cryptomeria japonica) for use in hinoki (Chamaecyparis obtusa), a species that belongs to a different family (although it appears to be fairly closely related to sugi). Of the 417 C. japonica STS primer pairs we screened, 120 (approximately 30%) were transferable and provided specific PCR amplification products from 16 C. obtusa plus trees. We used haploid megagametophytes to investigate the homology of 80 STS fragments between C. obtusa and C. japonica and to identify orthologous loci. Nearly 90% of the fragments showed high (>70%) degrees of similarity between the species, and 35 STSs indicated homology to entries with the same putative function in a public DNA database. Of the 120 STS fragments amplified, 72 showed restriction fragment length polymorphisms; in addition, the CC2430 primers detected amplicon length polymorphism. We assessed the inheritance pattern of 27 cleaved amplified polymorphic sequence markers, using 20 individuals from the segregation population. All the markers analyzed were consistent with the marker inheritance patterns obtained from the screening panel, and no markers (except CC2716) showed significant (Pobtusa. Most of the markers should also provide reliable anchor loci for comparative mapping studies of the C. obtusa and C. japonica genomes.

  16. Phylogeny of Pelargonium (Geraniaceae) based on DNA sequences from three genomes

    NARCIS (Netherlands)

    Bakker, F.T.; Culham, A.; Hettiarachi, P.; Touloumendidou, T.; Gibby, M.

    2004-01-01

    Phylogenetic hypotheses for the largely South African genus Pelargonium L'Hér. (Geraniaceae) were derived based on DNA sequence data from nuclear, chloroplast and mitochondrial encoded regions. The datasets were unequally represented and comprised cpDNA trnL-F sequences for 152 taxa, nrDNA ITS seque

  17. Rapid detection of a norovirus pseudo-outbreak by using real-time sequence based information

    NARCIS (Netherlands)

    Rahamat-Langendoen, J. C.; Lokate, M.; Scholvinck, E. H.; Friedrich, A. W.; Niesters, H. G. M.

    2013-01-01

    Background: Sequence based information is increasingly used to study the epidemiology of viruses, not only to provide insight in viral evolution, but also to understand transmission patterns during outbreaks. However, sequence analysis is not yet routinely performed by diagnostic laboratories, limit

  18. Genotyping of Histomonas meleagridis isolates based on Internal Transcribed Spacer-1 sequences

    NARCIS (Netherlands)

    H.M.J.F. van der Heijden; W.J.M. Landman; S. Greve; R. Peek

    2006-01-01

    C-profiling is a novel genotyping method for protozoan pathogens, based on polymerase chain reaction and sequencing of AT-rich Internal Transcribed Spacer-1 sequences. It was applied to various Histomonas meleagridis isolates originating from outbreaks of histomoniasis in six Dutch turkey and chicke

  19. The Effects of Common Knowledge Construction Model Sequence of Lessons on Science Achievement and Relational Conceptual Change

    Science.gov (United States)

    Ebenezer, Jazlin; Chacko, Sheela; Kaya, Osman Nafiz; Koya, Satya Kiran; Ebenezer, Devairakkam Luke

    2010-01-01

    The purpose of this study was to investigate the effects of the Common Knowledge Construction Model (CKCM) lesson sequence, an intervention based both in conceptual change theory and in Phenomenography, a subset of conceptual change theory. A mixed approach was used to investigate whether this model had a significant effect on 7th grade students'…

  20. Identification of the novel HLA-DPB1*5801 allele detected by sequenced based typing

    Energy Technology Data Exchange (ETDEWEB)

    Versluis, L.F.; Zwan, A.W. van der; Tilanus, M.G.J. [Academic Hospital Utrecht (Netherlands)]; Daly, L.N.; Degli-Esposti, M.A.; Dawkins, R.L. [Royal Perth Hospital, Perth (Australia)]

    1995-01-11

    Within the framework of HLA-DPB typing of the Fourth Asia-Oceanic Histocompatibility workshop (4AOH) we have typed the A, B, and E panels representing 101 samples. Sequenced based typing (SBT) was used as the method for typing described by Versluis and co-workers, but the sequencing chemistry was modified; in this study we have used the Sequenase enzyme instead of the thermostable Taq enzyme. Sequences obtained were compared to a database containing all 55 known HLA-DPB1 alleles. One sample showed a new heterozygous sequence, indicating the presence of a new allele. 4 refs., 1 fig.

  1. THE CONSTRUCTIONS OF ALMOST BINARY SEQUENCE PAIRS WITH THREE-LEVEL CORRELATION BASED ON CYCLOTOMY

    Institute of Scientific and Technical Information of China (English)

    Peng Xiuping; Xu Chengqian

    2012-01-01

    In this paper, a new class of almost binary sequence pairs with a single zero element is presented. The almost binary sequence pairs with three-level correlation are constructed based on cyclotomic numbers of order 2, 4, and 6. Most of them have good correlation and balance properties: their maximum nontrivial correlation magnitude is 2, and the difference between the numbers of occurrences of +1's and -1's is 0 or 1. In addition, the corresponding binary sequence pairs are investigated as well, yielding several kinds of binary sequence pairs with optimum balance and good correlation.
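
    The correlation and balance properties named in this abstract can be checked numerically. The following is a minimal sketch, not the authors' cyclotomic construction: the function names and the short example sequence are assumptions chosen only to illustrate how periodic correlation levels and balance are computed for a sequence pair.

```python
def periodic_correlation(x, y, shift):
    """Periodic cross-correlation of x and y at a cyclic shift."""
    n = len(x)
    return sum(x[i] * y[(i + shift) % n] for i in range(n))


def out_of_phase_levels(x, y):
    """Distinct correlation values over all nonzero shifts."""
    return {periodic_correlation(x, y, s) for s in range(1, len(x))}


def balance(seq):
    """Number of +1 entries minus number of -1 entries (zeros are ignored)."""
    return sum(1 for v in seq if v == 1) - sum(1 for v in seq if v == -1)


# Illustrative pair (hypothetical, not taken from the paper).
x = [1, 1, -1]
y = [1, 1, -1]
print(periodic_correlation(x, y, 0), out_of_phase_levels(x, y), balance(x))
```

    A pair has three-level correlation when the in-phase value together with the set returned by out_of_phase_levels contains exactly three distinct values.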

  2. An Approach to Assembly Sequence Planning Based on Hierarchical Strategy and Genetic Algorithm

    Institute of Scientific and Technical Information of China (English)

    Niu Xinwen; Ding Han; Xiong Youlun

    2001-01-01

    Using group and subassembly cluster methods, the hierarchical structure of a product is generated automatically, which largely reduces the complexity of planning. Based on a genetic algorithm, the optimal assembly sequence of each structure level can be obtained by sequence-by-sequence search. As a result, a better assembly sequence of the product can be generated by combining the assembly sequences of all hierarchical structures, which provides more parallelism and flexibility for assembly operations. An industrial example is solved by this new approach.

  3. A method for amplification of unknown flanking sequences based on touchdown PCR and suppression-PCR.

    Science.gov (United States)

    Gao, Song; He, Dan; Li, Guangquan; Zhang, Yanhua; Lv, Huiying; Wang, Li

    2016-09-15

    Thermal asymmetric staggered PCR is the most widely used technique to obtain flanking sequences. However, it has some limitations, including a low rate of positivity and complex operation. In this study, an improved method was developed based on suppression-PCR and touchdown PCR. The PCR fragment obtained by the amplification was used directly for sequencing after gel purification. Using this improved method, the positive rate of amplified flanking sequences of the ATMT mutants reached 99%. In addition, the time from DNA extraction to flanking sequence analysis was shortened to 2 days, at a cost of about 6 dollars per sample.

  4. Protein Sequence Comparison Based on Physicochemical Properties and the Position-Feature Energy Matrix

    Science.gov (United States)

    Yu, Lulu; Zhang, Yusen; Gutman, Ivan; Shi, Yongtang; Dehmer, Matthias

    2017-01-01

    We develop a novel position-feature-based model for protein sequences by employing physicochemical properties of 20 amino acids and the measure of graph energy. The method puts the emphasis on sequence order information and describes local dynamic distributions of sequences, from which one can get a characteristic B-vector. Afterwards, we apply the relative entropy to the sequences representing B-vectors to measure their similarity/dissimilarity. The numerical results obtained in this study show that the proposed method leads to meaningful results compared with competitors such as Clustal W. PMID:28393857

  5. An efficient binomial model-based measure for sequence comparison and its application.

    Science.gov (United States)

    Liu, Xiaoqing; Dai, Qi; Li, Lihua; He, Zerong

    2011-04-01

    Sequence comparison is one of the major tasks in bioinformatics, which could serve as evidence of structural and functional conservation, as well as of evolutionary relations. There are several similarity/dissimilarity measures for sequence comparison, but challenges remain. This paper presented a binomial model-based measure to analyze biological sequences. With the help of a random indicator, the occurrence of a word at any position of a sequence can be regarded as a random Bernoulli variable, and the distribution of the sum of the word occurrences is well known to be a binomial one. By using a recursive formula, we computed the binomial probability of the word count and proposed a binomial model-based measure based on the relative entropy. The proposed measure was tested by extensive experiments including classification of HEV genotypes and phylogenetic analysis, and further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed measure based on the binomial model is more efficient.
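
    The binomial view of word counts described above can be illustrated with a short sketch. The abstract does not give the authors' recursive formula or their estimate of the word probability, so the code below makes two labeled assumptions: the word probability is taken as the product of single-base frequencies (an independence assumption), and the recursion is the standard one, P(j+1) = P(j) * (N - j) / (j + 1) * p / (1 - p). All function names are hypothetical.

```python
import math
from collections import Counter


def count_word(seq, word):
    """Count overlapping occurrences of word in seq."""
    k = len(word)
    return sum(seq[i:i + k] == word for i in range(len(seq) - k + 1))


def word_probability(seq, word):
    """Assumed model: bases are independent with their observed frequencies."""
    freqs = Counter(seq)
    p = 1.0
    for base in word:
        p *= freqs[base] / len(seq)
    return p


def binomial_pmf_recursive(trials, p, successes):
    """P(X = successes) for X ~ Binomial(trials, p), via a standard recursion."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    prob = (1.0 - p) ** trials          # P(X = 0)
    ratio = p / (1.0 - p)
    for j in range(successes):
        prob *= (trials - j) / (j + 1) * ratio
    return prob


seq, word = "ACGTACGTAC", "AC"
trials = len(seq) - len(word) + 1       # number of candidate positions
p = word_probability(seq, word)
print(count_word(seq, word), binomial_pmf_recursive(trials, p, count_word(seq, word)))
```

    The recursion avoids computing large binomial coefficients directly, which is one plausible reason to prefer it for long sequences.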

  6. Molecular phylogenetic relationship of Epinephelus based on sequences of mtDNA Cyt b

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The mtDNA Cyt b gene was partially sequenced for Variola louti of Serranidae, Epinephelinae and seven endemic species of groupers in China: Epinephelus awoara, E. brunneus, E. coioides, E. longispinis, E. sexfasciatus, E. spilotoceps and E. tauvina. The seven endemic species and seven foreign species of groupers from GenBank (E. aeneus, E. caninus, E. drummondhayi, E. haifensis, E. labriformis, E. marginatus and E. multinotatus) were combined and analysed as the ingroup, while Variola louti was used as the outgroup. We compared the 420 bp sequences of Cyt b among the 15 species and constructed two types of molecular phylogenetic trees with the maximum parsimony (MP) method and the neighbor-joining (NJ) method, respectively. The results were as follows: (1) As to the base composition of the mtDNA Cyt b sequence (402 bp) of 14 species of Epinephelus, the content of (A + T) was 53.6%, higher than that of (G + C) (46.4%). The transition/transversion ratio was 4.78 with no mutation saturation. (2) The cluster relationships between E. awoara and E. sexfasciatus, E. coioides and E. tauvina, and E. longispinis and E. spilotoceps were consistent with phenotypes in taxonomy. (3) In the phylogenetic tree, the species in the Atlantic Ocean were associated closely with those in the Pacific Ocean, which suggested that the Cyt b sequences of Epinephelus were highly conserved. This may be attributed to coordinate evolution. (4) In breeding or heredity management, mating Epinephelus of the same branch should be avoided. Mating the species of the Atlantic Ocean with those of the Pacific Ocean is likely to be an effective way to improve the inheritance of the species.
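
    The two summary statistics reported in result (1), A + T content and the transition/transversion ratio, can be computed from aligned sequences as sketched below. This is an illustrative pairwise calculation with hypothetical function names and toy sequences; the study's own figures would have come from its full multi-species alignment and software that the abstract does not name.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def at_content(seq):
    """Fraction of A and T among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "AT" for b in bases) / len(bases)


def transition_transversion_counts(a, b):
    """Count transitions and transversions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue  # skip matches, gaps and ambiguous symbols
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return transitions, transversions


ts, tv = transition_transversion_counts("ACGTAC", "GCGTAA")
print(at_content("AATTGC"), ts, tv)
```

    The ratio itself is transitions divided by transversions; it is left to the caller so that a zero transversion count can be handled explicitly.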

  7. Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads

    Directory of Open Access Journals (Sweden)

    Chengxi Ye

    2016-06-01

    Motivation. The third generation sequencing (3GS) technology generates long sequences of thousands of bases. However, its current error rates are estimated in the range of 15–40%, significantly higher than those of the prevalent next generation sequencing (NGS) technologies (less than 1%). Fundamental bioinformatics tasks such as de novo genome assembly and variant calling require high-quality sequences that need to be extracted from these long but erroneous 3GS sequences. Results. We describe a versatile and efficient linear complexity consensus algorithm Sparc to facilitate de novo genome assembly. Sparc builds a sparse k-mer graph using a collection of sequences from a targeted genomic region. The heaviest path which approximates the most likely genome sequence is searched through a sparsity-induced reweighted graph as the consensus sequence. Sparc supports using NGS and 3GS data together, which leads to significant improvements in both cost efficiency and computational efficiency. Experiments with Sparc show that our algorithm can efficiently provide high-quality consensus sequences using both PacBio and Oxford Nanopore sequencing technologies. With only 30× PacBio data, Sparc can reach a consensus with error rate <0.5%. With the more challenging Oxford Nanopore data, Sparc can also achieve similar error rate when combined with NGS data. Compared with the existing approaches, Sparc calculates the consensus with higher accuracy, and uses approximately 80% less memory and time. Availability. The source code is available for download at https://github.com/yechengxi/Sparc.

  9. CAPS satellite spread spectrum communication blind multi-user detecting system based on chaotic sequences

    Institute of Scientific and Technical Information of China (English)

    LEI LiHua; SHI HuLi; MA GuanYi

    2009-01-01

    Multiple Path Interference (MPI) and Multiple Access Interference (MAI) are important factors that affect the performance of Chinese Area Positioning System (CAPS). These problems can be solved by using spreading sequences with ideal properties and multi-user detectors. Chaotic sequences based on Chebyshev map are studied and the satellite communication system model is set up to investigate the application of chaotic sequences for CAPS in this paper. Simulation results show that chaotic sequences have desirable correlation properties and it is easy to generate a large number of chaotic sequences with good security. It has great practical value to apply chaotic sequences to CAPS together with multi-user detecting technology and the system performance can be improved greatly.
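
    A Chebyshev-map spreading sequence of the kind studied here can be generated in a few lines. The sketch below is an assumption-laden illustration rather than the system described in the paper: the map order (4), the initial values, the sequence length (127) and the sign-based binarization are all hypothetical choices, since the abstract specifies none of them.

```python
import math


def chebyshev_sequence(x0, length, order=4):
    """Iterate the Chebyshev map x_{n+1} = cos(order * arccos(x_n)) on [-1, 1]."""
    if not -1.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between -1 and 1")
    values, x = [], x0
    for _ in range(length):
        x = math.cos(order * math.acos(x))
        x = max(-1.0, min(1.0, x))  # guard against rounding outside the domain
        values.append(x)
    return values


def to_chips(values):
    """Binarize real-valued samples into +1/-1 spreading chips by sign."""
    return [1 if v >= 0.0 else -1 for v in values]


def periodic_correlation(a, b, shift):
    """Normalized periodic correlation of two equal-length chip sequences."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n] for i in range(n)) / n


# Two users get different codes simply by using different initial values.
user_a = to_chips(chebyshev_sequence(0.30, 127))
user_b = to_chips(chebyshev_sequence(0.31, 127))
print(periodic_correlation(user_a, user_a, 0), periodic_correlation(user_a, user_b, 0))
```

    Because nearby initial values diverge quickly under the map, many distinct codes can be produced by varying x0 alone, which is the property the abstract refers to when it says large numbers of sequences are easy to generate.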

  10. Study on multiple-hops performance of MOOC sequences-based optical labels for OPS networks

    Science.gov (United States)

    Zhang, Chongfu; Qiu, Kun; Ma, Chunli

    2009-11-01

    In this paper, we utilize a new study method that is under independent case of multiple optical orthogonal codes to derive the probability function of MOOCS-OPS networks, discuss the performance characteristics for a variety of parameters, and compare some characteristics of the system employed by single optical orthogonal code or multiple optical orthogonal codes sequences-based optical labels. The performance of the system is also calculated, and our results verify that the method is effective. Additionally, it is found that the performance of MOOCS-OPS networks is worse than that of single optical orthogonal code-based optical labels for optical packet switching (SOOC-OPS); however, MOOCS-OPS networks can greatly enlarge the scalability of optical packet switching networks.

  11. Effects of the antimicrobial tylosin on the microbial community structure of an anaerobic sequencing batch reactor.

    Science.gov (United States)

    Shimada, Toshio; Li, Xu; Zilles, Julie L; Morgenroth, Eberhard; Raskin, Lutgarde

    2011-02-01

    The effects of the antimicrobial tylosin on a methanogenic microbial community were studied in a glucose-fed laboratory-scale anaerobic sequencing batch reactor (ASBR) exposed to stepwise increases of tylosin (0, 1.67, and 167 mg/L). The microbial community structure was determined using quantitative fluorescence in situ hybridization (FISH) and phylogenetic analyses of bacterial 16S ribosomal RNA (rRNA) gene clone libraries of biomass samples. During the periods without tylosin addition and with an influent tylosin concentration of 1.67 mg/L, 16S rRNA gene sequences related to Syntrophobacter were detected and the relative abundance of Methanosaeta species was high. During the highest tylosin dose of 167 mg/L, 16S rRNA gene sequences related to Syntrophobacter species were not detected and the relative abundance of Methanosaeta decreased considerably. Throughout the experimental period, Propionibacteriaceae and high GC Gram-positive bacteria were present, based on 16S rRNA gene sequences and FISH analyses, respectively. The accumulation of propionate and subsequent reactor failure after long-term exposure to tylosin are attributed to the direct inhibition of propionate-oxidizing syntrophic bacteria closely related to Syntrophobacter and the indirect inhibition of Methanosaeta by high propionate concentrations and low pH.

  12. Graph-based sequence annotation using a data integration approach.

    Science.gov (United States)

    Pesch, Robert; Lysenko, Artem; Hindle, Matthew; Hassani-Pak, Keywan; Thiele, Ralf; Rawlings, Christopher; Köhler, Jacob; Taubert, Jan

    2008-08-25

    The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional assignment. Such approaches generally ignore the links between the information in the reference datasets. These links, however, are valuable for assessing the plausibility of a function assignment and can be used to evaluate the confidence in a prediction. We are working towards a novel annotation system that uses the network of information supporting the function assignment to enrich the annotation process for use by expert curators and predicting the function of previously unannotated genes. In this paper we describe our success in the first stages of this development. We present the data integration steps that are needed to create the core database of integrated reference databases (UniProt, PFAM, PDB, GO and the pathway database Ara-Cyc) which has been established in the ONDEX data integration system. We also present a comparison between different methods for integration of GO terms as part of the function assignment pipeline and discuss the consequences of this analysis for improving the accuracy of gene function annotation. The methods and algorithms presented in this publication are an integral part of the ONDEX system which is freely available from http://ondex.sf.net/.

  13. Extended pseudorandom sequences and two-dimensional coding collimators based on them

    NARCIS (Netherlands)

    Fedorov, G. A.; Tereshchenko, S. A.

    2007-01-01

    A new extensive class of one-dimensional binary sequences, called extended pseudorandom sequences, is proposed which enables a radiation-physics experiment to be optimized more completely and enables problems of planar emission tomography to be solved effectively using integral-code measuring system

  14. VLSI Floorplanning with Boundary Constraints Based on Single-Sequence Representation

    Science.gov (United States)

    Li, Kang; Yu, Juebang; Li, Jian

    In modern VLSI physical design, huge integration scale necessitates hierarchical design and IP reuse to cope with design complexity. Besides, interconnect delay becomes dominant to overall circuit performance. These critical factors require some modules to be placed along designated boundaries to effectively facilitate hierarchical design and interconnection optimization related problems. In this paper, boundary constraints of general floorplan are solved smoothly based on the novel representation Single-Sequence (SS). Necessary and sufficient conditions of rooms along specified boundaries of a floorplan are proposed and proved. By assigning constrained modules to proper boundary rooms, our proposed algorithm always guarantees a feasible SS code with appropriate boundary constraints in each perturbation. Time complexity of the proposed algorithm is O(n). Experimental results on MCNC benchmarks show effectiveness and efficiency of the proposed method.

  15. STRait Razor: a length-based forensic STR allele-calling tool for use with second generation sequencing data.

    Science.gov (United States)

    Warshauer, David H; Lin, David; Hari, Kumar; Jain, Ravi; Davis, Carey; Larue, Bobby; King, Jonathan L; Budowle, Bruce

    2013-07-01

    Recent studies have demonstrated the capability of second generation sequencing (SGS) to provide coverage of short tandem repeats (STRs) found within the human genome. However, there are relatively few bioinformatic software packages capable of detecting these markers in the raw sequence data. The extant STR-calling tools are sophisticated, but are not always applicable to the analysis of the STR loci commonly used in forensic analyses. STRait Razor is a newly developed Perl-based software tool that runs on the Linux/Unix operating system and is designed to detect forensically-relevant STR alleles in FASTQ sequence data, based on allelic length. It is capable of analyzing STR loci with repeat motifs ranging from simple to complex without the need for extensive allelic sequence data. STRait Razor is designed to interpret both single-end and paired-end data and relies on intelligent parallel processing to reduce analysis time. Users are presented with a number of customization options, including variable mismatch detection parameters, as well as the ability to easily allow for the detection of alleles at new loci. In its current state, the software detects alleles for 44 autosomal and Y-chromosome STR loci. The study described herein demonstrates that STRait Razor is capable of detecting STR alleles in data generated by multiple library preparation methods and two Illumina® sequencing instruments, with 100% concordance. The data also reveal noteworthy concepts related to the effect of different preparation chemistries and sequencing parameters on the bioinformatic detection of STR alleles.
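
    The idea of calling an STR allele from its length alone can be sketched as follows. This is not STRait Razor's code (the tool is Perl-based and its internals are not described in the abstract); the flanking sequences, the exact-match search and the function name are assumptions used only to show how a repeat-region length maps to an allele designation.

```python
def call_str_allele(read, left_flank, right_flank, motif_length):
    """Return an allele label from the length between two flanking anchors.

    The label follows the common forensic convention: whole repeats, then
    ".n" for n leftover bases of a partial repeat. Returns None when either
    flank is missing from the read.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    whole, partial = divmod(end - start, motif_length)
    return str(whole) if partial == 0 else f"{whole}.{partial}"


# Hypothetical flanks and a tetranucleotide motif, for illustration only.
read = "TTGACC" + "GATA" * 5 + "GAT" + "CCTGAA"
print(call_str_allele(read, "TTGACC", "CCTGAA", 4))
```

    Because only the distance between anchors matters, the repeat region itself never has to be matched against a catalogue of known allele sequences, which is consistent with the abstract's point about not needing extensive allelic sequence data.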

  16. Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design

    Science.gov (United States)

    Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R.; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S.; Williams, Steven A.

    2016-01-01

    Background The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world’s most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Methodology/Principal Findings Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. Conclusions/Significance The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other

  17. Control allocation and management of redundant control effectors based on bases sequenced optimal method

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    For an advanced aircraft, the number of effectors is much greater than for a traditional one, the functions of the effectors are more complex, and the coupling between them is more severe. Based on the current control allocation research, this paper puts forward the concept and framework of the control allocation and management system for aircraft with redundant control effectors. A new optimal control allocation method, the bases sequenced optimal (BSO) method, is then presented. By analyzing the physical meaning of the allocation process of the BSO method, four types of management strategies are adopted by the system, which act on the control allocation process under different flight conditions, mission requirements and effector working conditions. Simulation results show that the functions of the control allocation system are extended and that the system's adaptability to flight status, mission requirements and effector failure conditions is improved.

  18. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    Directory of Open Access Journals (Sweden)

    Claros M Gonzalo

    2010-06-01

    Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms; the choice of algorithm does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly".
Conclusions AlignMiner can be used
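
    An entropy-based scan for divergent regions, in the spirit of the Entropy scoring mentioned above, might look like the sketch below. It is a generic Shannon-entropy calculation under stated assumptions, not AlignMiner's actual scoring: gaps are treated as ordinary symbols, and the threshold, minimum length and function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    total = len(column)
    return abs(-sum((c / total) * math.log2(c / total)
                    for c in Counter(column).values()))


def entropy_profile(alignment):
    """Per-column entropy for equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


def divergent_regions(profile, threshold, min_length=1):
    """Half-open (start, end) runs of columns whose entropy exceeds threshold."""
    regions, start = [], None
    for i, h in enumerate(profile):
        if h > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_length:
        regions.append((start, len(profile)))
    return regions


alignment = ["ACGTAC", "ACGAAC", "ACCTAC", "ACTCAC"]
profile = entropy_profile(alignment)
print(profile, divergent_regions(profile, threshold=1.0))
```

    Fully conserved columns score zero, so raising the threshold narrows the output to the most variable stretches, which are the candidates for specific primers or epitopes.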

  19. Base J glucosyltransferase does not regulate the sequence specificity of J synthesis in trypanosomatid telomeric DNA.

    Science.gov (United States)

    Bullard, Whitney; Cliffe, Laura; Wang, Pengcheng; Wang, Yinsheng; Sabatini, Robert

    2015-12-01

    Telomeric DNA of trypanosomatids possesses a modified thymine base, called base J, that is synthesized in a two-step process; the base is hydroxylated by a thymidine hydroxylase forming hydroxymethyluracil (hmU) and a glucose moiety is then attached by the J-associated glucosyltransferase (JGT). To examine the importance of JGT in modifying specific thymine in DNA, we used a Leishmania episome system to demonstrate that the telomeric repeat (GGGTTA) stimulates J synthesis in vivo while mutant telomeric sequences (GGGTTT, GGGATT, and GGGAAA) do not. Utilizing an in vitro GT assay we find that JGT can glycosylate hmU within any sequence with no significant change in Km or kcat, even mutant telomeric sequences that are unable to be J-modified in vivo. The data suggest that JGT possesses no DNA sequence specificity in vitro, lending support to the hypothesis that the specificity of base J synthesis is not at the level of the JGT reaction.

  20. Group Graded Associated Ideals with Flat Base Change of Rings and Short Exact Sequences

    Indian Academy of Sciences (India)

    Srinivas Behara; Shiv Datt Kumar

    2011-05-01

    This paper deals with the study of the behaviour of $G$-associated ideals and strong Krull $G$-associated ideals with flat base change of rings, and the behaviour of $G$-associated ideals with short exact sequences over rings graded by a finitely generated abelian group $G$.

  1. Adaptation of a Fault-Tolerant Fpga-Based Launch Sequencer as a Cubesat Payload Processor

    Science.gov (United States)

    2014-06-01

    Master's thesis by Jordan K. Goff, June 2014 (Thesis Co-Advisors: Herschel H...). ...set. This processor is implemented on a field programmable gate array (FPGA) and will be used as the foundation for a payload processor on a cube

  2. Effects of KLK Peptide on Adjuvanticity of Different ODN Sequences

    Directory of Open Access Journals (Sweden)

    Ghania Chikh

    2016-05-01

    Endosomal Toll-like receptors (TLR) such as TLR3, 7, 8 and 9 recognize pathogen associated nucleic acids. While DNA sequence does influence degree of binding to and activation of TLR9, it also appears to influence the ability of the ligand to reach the intracellular endosomal compartment. The KLK (KLKL5KLK) antimicrobial peptide, which is immunostimulatory itself, can translocate into cells without cell membrane permeabilization and thus can be used for endosomal delivery of TLR agonists, as has been shown with the IC31 formulation that contains an oligodeoxynucleotide (ODN) TLR9 agonist. We evaluated the adjuvant activity of KLK combined with CpG or non-CpG (GpC) ODN synthesized with nuclease resistant phosphorothioate (S) or native phosphodiester (O) backbones with ovalbumin (OVA) antigen in mice. As a single adjuvant, CpG(S) gave the strongest enhancement of OVA-specific immunity, and the addition of KLK provided no benefit and was actually detrimental for some readouts. In contrast, KLK enhanced the adjuvant effects of CpG(O) and, to a lesser extent, of GpC(S), which on their own had little or no activity. Indeed, while CD8 T cells, IFN-γ secretion and humoral response to vaccine antigen were enhanced when CpG(O) was combined with KLK, only IFN-γ secretion was enhanced when GpC(S) was combined with KLK. The synergistic adjuvant effects with KLK/ODN combinations were TLR9-mediated since they did not occur in TLR9 knock-out mice. We hypothesize that a nuclease resistant ODN with CpG motifs has its own mechanism for entering cells to reach the endosome. For ODN without CpG motifs, KLK appears to provide an alternate mechanism for accessing the endosome, where it can activate TLR9, albeit with lower potency than a CpG ODN. For nuclease sensitive (O) backbone ODN, KLK may also provide protection from nucleases in the tissues.

  3. Bayesian prediction of bacterial growth temperature range based on genome sequences

    Directory of Open Access Journals (Sweden)

    Jensen Dan B

    2012-12-01

    Background The preferred habitat of a given bacterium can provide a hint of which types of enzymes of potential industrial interest it might produce. These might include enzymes that are stable and active at very high or very low temperatures. Being able to accurately predict this based on a genomic sequence would thus allow for an efficient and targeted search for production organisms, reducing the need for culturing experiments. Results This study found a total of 40 protein families useful for distinction between three thermophilicity classes (thermophiles, mesophiles and psychrophiles). The predictive performance of these protein families was compared to that of 87 basic sequence features (relative use of amino acids and codons, genomic and 16S rDNA AT content and genome size). When using naïve Bayesian inference, it was possible to correctly predict the optimal temperature range with a Matthews correlation coefficient of up to 0.68. The best predictive performance was always achieved by including protein families as well as structural features, compared to either of these alone. A dedicated computer program was created to perform these predictions. Conclusions This study shows that protein families associated with specific thermophilicity classes can provide effective input data for thermophilicity prediction, and that the naïve Bayesian approach is effective for such a task. The program created for this study is able to efficiently distinguish between thermophilic, mesophilic and psychrophilic adapted bacterial genomes.

  4. A new approach based on PSO algorithm to find good computational encoding sequences

    Institute of Scientific and Technical Information of China (English)

    Cui Guangzhao; Niu Yunyun; Wang Yanfeng; Zhang Xuncai; Pan Linqiang

    2007-01-01

    Computational encoding DNA sequence design is one of the most important steps in molecular computation. Much research has been done on designing reliable sequence libraries. A revised method based on the support system developed by Tanaka et al. is proposed here with different criteria to construct the fitness function. Then we adapt the particle swarm optimization (PSO) algorithm to our encoding problem. By using the new algorithm, a set of sequences with good quality is generated. The result also shows that our PSO-based approach could rapidly converge at the minimum level for an output of the simulation model. The celerity of the algorithm fits our requirements.

  5. STUDY OF BLOCKING EFFECT ELIMINATION METHODS BY MEANS OF INTRAFRAME VIDEO SEQUENCE INTERPOLATION

    Directory of Open Access Journals (Sweden)

    I. S. Rubina

    2015-01-01

    The paper deals with image interpolation methods and their applicability to eliminate some of the artifacts related to both the dynamic properties of objects in video sequences and the algorithms used in the sequence of encoding steps. The main drawback of existing methods is their high computational complexity, which is unacceptable in video processing. Interpolation of signal samples for blocking-effect elimination at the output of the conversion encoding is proposed as a part of the study. It was necessary to develop methods for improving the compression ratio and the quality of the reconstructed video data by eliminating the blocking effect on the borders of the segments through intraframe interpolation of video sequence segments. The main point of the developed methods is the application of an adaptive recursive algorithm with an adaptive-sized interpolation kernel, both with and without consideration of the brightness gradient at the boundaries of objects and video sequence blocks. Within the theoretical part of the research, methods of information theory (RD-theory) and data redundancy elimination, methods of pattern recognition and digital signal processing, as well as methods of probability theory are used. Within the experimental part of the research, software implementation of compression algorithms with subsequent comparison of the implemented algorithms with the existing ones was carried out. The proposed methods were compared with the simple averaging algorithm and the adaptive algorithm of central counting interpolation. The algorithm based on adaptive interpolation kernel size selection increases the compression ratio by 30%, and the modified algorithm based on adaptive interpolation kernel size selection increases it by 35% in comparison with existing algorithms, while the quality of the reconstructed video sequence improves by 3% compared to one compressed without interpolation. The findings will be

  6. A sequencing-based linkage map of cucumber

    Science.gov (United States)

    Genetic maps are important tools for molecular breeding, gene cloning, and study of meiotic recombination. In cucumber (Cucumis sativus L.), the marker density, resolution and genome coverage of previously developed genetic maps using PCR-based molecular markers are relatively low. In this study we ...

  7. Conformation and Stability of Intramolecular Telomeric G-Quadruplexes: Sequence Effects in the Loops

    Science.gov (United States)

    Sattin, Giovanna; Artese, Anna; Nadai, Matteo; Costa, Giosuè; Parrotta, Lucia; Alcaro, Stefano; Palumbo, Manlio; Richter, Sara N.

    2013-01-01

    Telomeres are guanine-rich sequences that protect the ends of chromosomes. These regions can fold into G-quadruplex structures and their stabilization by G-quadruplex ligands has been employed as an anticancer strategy. Genetic analysis in human telomeres revealed extensive allelic variation restricted to loop bases, indicating that the variant telomeric sequences maintain the ability to fold into G-quadruplex. To assess the effect of mutations in loop bases on G-quadruplex folding and stability, we performed a comprehensive analysis of mutant telomeric sequences by spectroscopic techniques, molecular dynamics simulations and gel electrophoresis. We found that when the first position in the loop was mutated from T to C or A the resulting structure adopted a less stable antiparallel topology; when the second position was mutated to C or A, lower thermal stability and no evident conformational change were observed; in contrast, substitution of the third position from A to C induced a more stable and original hybrid conformation, while mutation to T did not significantly affect G-quadruplex topology and stability. Our results indicate that allelic variations generate G-quadruplex telomeric structures with variable conformation and stability. This aspect needs to be taken into account when designing new potential anticancer molecules. PMID:24367632

  8. Conformation and stability of intramolecular telomeric G-quadruplexes: sequence effects in the loops.

    Directory of Open Access Journals (Sweden)

    Giovanna Sattin

    Full Text Available Telomeres are guanine-rich sequences that protect the ends of chromosomes. These regions can fold into G-quadruplex structures and their stabilization by G-quadruplex ligands has been employed as an anticancer strategy. Genetic analysis in human telomeres revealed extensive allelic variation restricted to loop bases, indicating that the variant telomeric sequences maintain the ability to fold into G-quadruplex. To assess the effect of mutations in loop bases on G-quadruplex folding and stability, we performed a comprehensive analysis of mutant telomeric sequences by spectroscopic techniques, molecular dynamics simulations and gel electrophoresis. We found that when the first position in the loop was mutated from T to C or A the resulting structure adopted a less stable antiparallel topology; when the second position was mutated to C or A, lower thermal stability and no evident conformational change were observed; in contrast, substitution of the third position from A to C induced a more stable and original hybrid conformation, while mutation to T did not significantly affect G-quadruplex topology and stability. Our results indicate that allelic variations generate G-quadruplex telomeric structures with variable conformation and stability. This aspect needs to be taken into account when designing new potential anticancer molecules.

  9. Case-Based Plan Recognition Using Action Sequence Graphs

    Science.gov (United States)

    2014-10-01

    while probabilistic algorithms include those that use stochastic grammars and probabilistic relational models. Both these approaches are sensitive to ... Proceedings of the Fifth Game-On International Conference (pp. 36-40). Reading, UK: University of Wolverhampton Press. Cox, M. T., & Kerkez, B ... algorithm based on plan tree grammars. Artificial Intelligence, 173(11), 1101-1132. Ghallab, M., Nau, D., & Traverso, P. (2004). Automated planning: Theory

  10. Rapid Conversion of Traditional Introductory Physics Sequences to an Activity-Based Format

    Science.gov (United States)

    Yoder, Garett; Cook, Jerry

    2014-01-01

    The Department of Physics at EKU [Eastern Kentucky University] with support from the National Science Foundation's Course Curriculum and Laboratory Improvement Program has successfully converted our entire introductory physics sequence, both algebra-based and calculus-based courses, to an activity-based format where laboratory activities,…

  11. Interactome-wide prediction of protein-protein binding sites reveals effects of protein sequence variation in Arabidopsis thaliana.

    Directory of Open Access Journals (Sweden)

    Felipe Leal Valentim

    Full Text Available The specificity of protein-protein interactions is encoded in those parts of the sequence that compose the binding interface. Therefore, understanding how changes in protein sequence influence interaction specificity, and possibly the phenotype, requires knowing the location of binding sites in those sequences. However, large-scale detection of protein interfaces remains a challenge. Here, we present a sequence- and interactome-based approach to mine interaction motifs from the recently published Arabidopsis thaliana interactome. The resultant proteome-wide predictions are available via www.ab.wur.nl/sliderbio and set the stage for further investigations of protein-protein binding sites. To assess our method, we first show that, by using a priori information calculated from protein sequences, such as evolutionary conservation and residue surface accessibility, we improve the performance of interface prediction compared to using only interactome data. Next, we present evidence for the functional importance of the predicted sites, which are under stronger selective pressure than the rest of protein sequence. We also observe a tendency for compensatory mutations in the binding sites of interacting proteins. Subsequently, we interrogated the interactome data to formulate testable hypotheses for the molecular mechanisms underlying effects of protein sequence mutations. Examples include proteins relevant for various developmental processes. Finally, we observed, by analysing pairs of paralogs, a correlation between functional divergence and sequence divergence in interaction sites. This analysis suggests that large-scale prediction of binding sites can cast light on evolutionary processes that shape protein-protein interaction networks.

  12. Phylogenetic Analysis of Cynoglossidae in the Yangtze Estuary Based on Partial Sequence of Mitochondrial COⅠ

    Institute of Scientific and Technical Information of China (English)

    Song Chao; Yu Yanan; Zhang Tao; Yang Gang; Zhang Longzhen

    2015-01-01

    To determine the role of the mitochondrial COⅠ gene in the classification and identification of species, a total of 39 individuals from nine species belonging to two genera of Cynoglossidae in the Yangtze Estuary were barcoded with COⅠ, sequenced, and compared with other Cynoglossidae species recorded in GenBank. Total genomic DNA was extracted from each scale sample using the classic phenol/chloroform extraction method. COⅠ fragments of 650 base pairs (bp) were amplified using the primers F1: 5'-TCA ACC AAC CAC AAA GAC ATT GGC AC-3' and R1: 5'-TAG ACT TCT GGG TGG CCA AAG AAT CA-3'. Every PCR amplification was performed in a total volume of 50 μL of PCR mixture. PCR products were purified and then sequenced in both forward and reverse directions using an ABI PRISM 3730 XL Automated Sequencer. DNA sequences were aligned with ClustalW using default parameters. Base composition and variable and parsimony-informative sites were determined using MEGA 5.0. Neighbor-joining (NJ) and maximum parsimony (MP) phylogenetic trees were constructed for COⅠ haplotypes (Kimura 2-parameter substitution model, K2P; 1,000 bootstrap pseudoreplications) using MEGA 5.0. In the statistical analysis with MEGA 5.0, the average AT content was greater than the GC content (Tab. 2). The GC content of codon position 1 averaged 53.8% (51.8%-57.3%), that of position 2 averaged 42.0%, and that of position 3 ranged from 28.1% to 37.8%, averaging 32.4% (Tab. 4). Transitional pairs (si) slightly outnumbered transversional pairs (sv), and the ratio (R = si/sv) was 1.45 (Tab. 3). Analysis of amino acid frequencies in the protein encoded by the COⅠ gene showed that leucine was the most frequent amino acid and tryptophan the least frequent (Tab. 5). The average K2P distances between species and within species were 0.191 and 0.003, respectively (Tab. 6). The K2P distance between species was 63.7
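
    The Kimura 2-parameter (K2P) distance and the GC content mentioned in this record can be illustrated with a short sketch. It is not the MEGA implementation; the function names `k2p_distance` and `gc_content` and the toy sequences are assumptions made for illustration. The distance uses the standard formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "GC" for b in bases) / len(bases)


# Toy 10-bp sequences (hypothetical, not data from the study).
reference = "ACGTACGTAC"
one_transition = "GCGTACGTAC"    # A -> G at the first site
one_transversion = "CCGTACGTAC"  # A -> C at the first site
print(k2p_distance(reference, one_transition))
print(k2p_distance(reference, one_transversion))
print(gc_content(reference))
```

    Because transitions and transversions enter the formula separately, a single transition and a single transversion over the same number of sites give slightly different distances.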

  13. GrabCut-based human segmentation in video sequences.

    Science.gov (United States)

    Hernández-Vela, Antonio; Reyes, Miguel; Ponce, Víctor; Escalera, Sergio

    2012-11-09

    In this paper, we present a fully automatic Spatio-Temporal GrabCut human segmentation methodology that combines tracking and segmentation. GrabCut initialization is performed by a HOG-based subject detection, face detection, and skin color model. Spatial information is included by Mean Shift clustering, whereas temporal coherence is considered through a history of Gaussian Mixture Models. Moreover, full face and pose recovery is obtained by combining human segmentation with Active Appearance Models and Conditional Random Fields. Results over public datasets and in a new Human Limb dataset show a robust segmentation and recovery of both face and pose using the presented methodology.

  14. GrabCut-Based Human Segmentation in Video Sequences

    Directory of Open Access Journals (Sweden)

    Sergio Escalera

    2012-11-01

    Full Text Available In this paper, we present a fully automatic Spatio-Temporal GrabCut human segmentation methodology that combines tracking and segmentation. GrabCut initialization is performed by a HOG-based subject detection, face detection, and skin color model. Spatial information is included by Mean Shift clustering, whereas temporal coherence is considered through a history of Gaussian Mixture Models. Moreover, full face and pose recovery is obtained by combining human segmentation with Active Appearance Models and Conditional Random Fields. Results over public datasets and in a new Human Limb dataset show a robust segmentation and recovery of both face and pose using the presented methodology.

  15. A New Semi-Empirical Technique For Computing Effective Temperatures For Main Sequence Stars From Their Mass And Radii

    Science.gov (United States)

    Aslan, Gürkan; Soydugan, Faruk; Eker, Zeki; Bilir, Selçuk; Bakış, Volkan

    2016-07-01

    A semi-empirical technique for improving the effective temperatures of main-sequence stars from their observed masses and radii, based on the Stefan-Boltzmann law, was introduced and applied to 450 main-sequence stars with accurate parameters. The method requires a mass-luminosity relation (MLR) and theoretical predictions of radius and effective temperature for stars at the zero-age main sequence and at the terminal-age main sequence. The MLRs, which act like a catalyst, are necessary but have no effect on the final result. The present sample of main-sequence stars, which are members of detached double-lined eclipsing binaries in the solar neighborhood chosen from Eker et al. (2014), has an error histogram for the observed effective temperatures with a peak at 2-3%. Errors of the effective temperatures refined by the present method are the propagated errors of the observed masses and radii; that is, the refined temperatures and associated errors are independent of the observational temperatures and their associated errors. The histogram of the refined temperature errors shows a peak at less than 1%. A refined sample of stars (270 out of 450) with masses and radii accurate to within 3%, together with their refined effective temperatures, has been used in this study to improve the classical MLRs. One may prefer, however, to use the improved classical MLRs, which allow one to compute effective temperatures to an accuracy of 3.5%.
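
    The Stefan-Boltzmann relation that this record builds on can be sketched as follows. This is only the textbook relation L = 4πR²σT⁴ in solar units, not the authors' full semi-empirical technique, which also uses zero-age and terminal-age main-sequence predictions. The function names, the nominal solar temperature of 5772 K, and the exponent passed to `classical_mlr` are assumptions made for illustration.

```python
T_SUN_K = 5772.0  # assumed nominal solar effective temperature in kelvin


def classical_mlr(mass_msun, alpha):
    """Classical mass-luminosity relation L = M**alpha in solar units.

    alpha is a hypothetical exponent supplied by the caller; the record
    does not quote fitted values.
    """
    return mass_msun ** alpha


def effective_temperature(luminosity_lsun, radius_rsun):
    """Effective temperature from the Stefan-Boltzmann law in solar units.

    L/Lsun = (R/Rsun)**2 * (T/Tsun)**4, solved for T.
    """
    return T_SUN_K * luminosity_lsun ** 0.25 / radius_rsun ** 0.5


# Illustrative values only, not stars from the sample.
luminosity = classical_mlr(2.0, 4.0)
print(effective_temperature(luminosity, 1.0))
print(effective_temperature(1.0, 4.0))
```

    The quarter-power dependence on luminosity and the square-root dependence on radius explain why modest errors in those quantities translate into small errors in temperature.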

  16. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    OpenAIRE

    Chunsheng Gao; Pengfei Xin; Chaohua Cheng; Qing Tang; Ping Chen; Changbiao Wang; Gonggu Zang; Lining Zhao

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SS...

  17. MRI-Based Thermometry for Tumor Thermal Ablation: A Comparison of Different MR Sequences

    Directory of Open Access Journals (Sweden)

    T. J. Vogl

    2010-05-01

    Full Text Available Background/Objective: To evaluate T1 and PRF thermometry methods utilizing fast MR sequences and a fluoroptic thermometer. Materials and Methods: MR-guided LITT (Laser-Induced Interstitial Thermotherapy) with a laser wavelength/power of 1064 nm/30 W was applied to pig liver and a gel phantom. During the ablation process, the temperature was measured using a fluoroptic thermometer, and MR imaging was performed on a 1.5-Tesla tomograph with an EPI (Echo Planar Imaging) sequence for the PRF (Proton Resonance Frequency) method and FLASH, IRTF, SRTF and TRUFI sequences for the T1 method. Plotting MR signal intensity against measured temperature determined the temperature constant for each of the T1 sequences. To determine the PRF temperature constant, phase values were recorded from phase images and then plotted against temperature. The PRF temperature constant was verified by comparing the MR temperature with the measured one obtained from a second LITT experiment on the gel phantom. Results: The experiments determining the temperature constant for the T1 method showed that the IRTF and FLASH sequences have the highest temperature sensitivity and the most linear relationship between MR signal intensity and measured temperature. The SRTF sequence presented relatively good linearity but inferior temperature sensitivity compared to the IRTF and FLASH sequences. Conversely, the TRUFI sequence exhibited the lowest temperature sensitivity and linearity of data points. Concerning the PRF method, the measured and the MR-based temperatures agreed up to approximately 70 °C. Conclusion: To demonstrate and control temperature in target tissue during the LITT process, the PRF method with an EPI sequence is preferred for temperatures below 70 °C due to its acceptable accuracy. Among the T1 sequences, FLASH is preferable as the most robust, though not the most accurate, T1 sequence.
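
    The calibration step described in this record, plotting MR signal (or phase) against measured temperature to obtain a temperature constant, amounts to fitting a straight line. The sketch below shows an ordinary least-squares fit and its inversion; the function names and the calibration numbers are synthetic assumptions, not measurements from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def temperature_from_signal(signal, slope, intercept):
    """Invert the calibration line to estimate temperature from a signal."""
    return (signal - intercept) / slope


# Synthetic calibration points: thermometer readings in degrees Celsius and
# MR signal intensities in arbitrary units (made up for illustration).
temperatures = [20.0, 30.0, 40.0, 50.0, 60.0]
signals = [900.0, 850.0, 800.0, 750.0, 700.0]

slope, intercept = fit_line(temperatures, signals)
print(slope, intercept)
print(temperature_from_signal(775.0, slope, intercept))
```

    Here the slope plays the role of the temperature constant: the steeper it is, the more sensitive the sequence is to temperature, and the scatter of points around the line reflects how linear the relationship is.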

  18. PCR-based VNTR core sequence analysis for inferring genetic diversity in the shrimp Litopenaeus vannamei

    Directory of Open Access Journals (Sweden)

    Freitas Patrícia Domingues de

    2002-01-01

    Full Text Available The genetic variation in two farmed strains (F3-Panama and F17-Venezuela) of the shrimp Litopenaeus vannamei was examined based on multilocus DNA analyses. Eighteen adults of each strain were analyzed by PCR using a set of VNTR core sequence primers. Genetic similarity, mean allele frequency, mean heterozygosity and the frequency of polymorphic loci were determined for both strains. A dendrogram of genetic similarity was produced by UPGMA clustering. The results for three primers (INS, M13, YN73) revealed different levels of genetic variation within the strains. The higher genetic similarity seen within strain F17 was apparently related to inbreeding, although a bottleneck effect could not be ruled out. The low level of genetic variability of this strain could account for the reduced adaptive advantage of these animals and their inability to adjust to breeding conditions in Brazil.

  19. Image Sequence Fusion and Denoising Based on 3D Shearlet Transform

    Directory of Open Access Journals (Sweden)

    Liang Xu

    2014-01-01

    Full Text Available We propose a novel algorithm for simultaneous image sequence fusion and denoising in the 3D shearlet transform domain. In general, most existing image fusion methods only consider combining the important information of the source images and do not deal with artifacts. If the source images contain noise, the noise may also be transferred into the fused image together with useful pixels. In the 3D shearlet transform domain, we propose that a recursive filter first be applied to the high-pass subbands to obtain denoised high-pass coefficients. The high-pass subbands are then combined using a maximum-selection fusion rule based on a 3D pulse coupled neural network (PCNN), and the low-pass subband is fused using a weighted-sum fusion rule. Experimental results demonstrate that the proposed algorithm yields encouraging results.
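
    The two fusion rules named in this record can be sketched on plain coefficient lists. This is a simplification under stated assumptions: the maximum selection here compares absolute coefficient values directly rather than the firing of a 3D PCNN, no shearlet transform is performed, and the function names, weight, and coefficients are hypothetical.

```python
def fuse_lowpass(coeffs_a, coeffs_b, weight_a=0.5):
    """Weighted-sum rule for low-pass coefficients of two sources."""
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(coeffs_a, coeffs_b)]


def fuse_highpass_max_abs(coeffs_a, coeffs_b):
    """Maximum-selection rule: keep the coefficient with larger magnitude.

    A plain absolute-value comparison stands in for the PCNN-based
    selection described in the abstract.
    """
    return [a if abs(a) >= abs(b) else b
            for a, b in zip(coeffs_a, coeffs_b)]


# Made-up subband coefficients from two source images.
low_a, low_b = [2.0, 4.0], [4.0, 8.0]
high_a, high_b = [0.1, -0.9, 0.3], [-0.5, 0.2, 0.3]

print(fuse_lowpass(low_a, low_b))
print(fuse_highpass_max_abs(high_a, high_b))
```

    The weighted sum preserves the overall brightness of both sources, while the maximum selection keeps whichever source has the stronger detail at each position.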

  20. Effective noninvasive zygosity determination by maternal plasma target region sequencing.

    Directory of Open Access Journals (Sweden)

    Jing Zheng

    Full Text Available BACKGROUND: Currently very few noninvasive molecular genetic approaches are available to determine zygosity for twin pregnancies in clinical laboratories. This study aimed to develop a novel method to determine zygosity by using maternal plasma target region sequencing. METHODS: We constructed a statistical model to calculate the possibility of each zygosity type using likelihood ratios (Li) and empirical dynamic thresholds, targeting 4,524 single nucleotide polymorphism (SNP) loci on 22 autosomes. Then two dizygotic (DZ) twin pregnancies, two monozygotic (MZ) twin pregnancies and two singletons were recruited to evaluate the performance of our novel method. Finally we estimated the sensitivity and specificity of the model in silico under different cell-free fetal DNA (cff-DNA) concentrations and sequence depths. RESULTS/CONCLUSIONS: We obtained 8.90 Gbp of sequencing data on average for six clinical samples. Two samples were classified as DZ with L values of 1.891 and 1.554, higher than the dynamic DZ cut-off values of 1.162 and 1.172, respectively. Another two samples were judged as MZ with L values of 0.763 and 0.784, lower than the MZ cut-off values of 0.903 and 0.918. The remaining two singleton samples were regarded as MZ twins, with L values of 0.639 and 0.757, lower than the MZ cut-off values of 0.921 and 0.799. In silico, the estimated sensitivity of our noninvasive zygosity determination was 99.90% under 10% total cff-DNA concentration with 2 Gbp of sequence data. As the cff-DNA concentration increased to 15%, the specificity was as high as 97% with 3.50 Gbp of sequence data, much higher than the 80% obtained with 10% cff-DNA concentration. SIGNIFICANCE: This study demonstrates the feasibility of noninvasively determining the zygosity of twin pregnancies using target region sequencing, and illustrates the sensitivity and specificity under various detection conditions. Our method can act as an alternative approach for zygosity determination of twin pregnancies in clinical

  1. Novel computational methods for increasing PCR primer design effectiveness in directed sequencing

    Directory of Open Access Journals (Sweden)

    Busam Dana

    2008-04-01

    Full Text Available Background: Polymerase chain reaction (PCR) is used in directed sequencing for the discovery of novel polymorphisms. As the first step in PCR directed sequencing, effective PCR primer design is crucial for obtaining high-quality sequence data for target regions. Since current computational primer design tools are not fully tuned with stable underlying laboratory protocols, researchers may still be forced to iteratively optimize protocols for failed amplifications after the primers have been ordered. Furthermore, potentially identifiable factors which contribute to PCR failures have yet to be elucidated. This inefficient approach to primer design is further intensified in a high-throughput laboratory, where hundreds of genes may be targeted in one experiment. Results: We have developed a fully integrated computational PCR primer design pipeline that plays a key role in our high-throughput directed sequencing pipeline. Investigators may specify target regions defined through a rich set of descriptors, such as Ensembl accessions and arbitrary genomic coordinates. Primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the specified target regions. As part of the tiling process, primer pairs are computationally screened to meet the criteria for success with one of two PCR amplification protocols. In the process of improving our sequencing success rate, which currently exceeds 95% for exons, we have discovered novel and accurate computational methods capable of identifying primers that may lead to PCR failures. We reveal the laboratory protocols and their associated, empirically determined computational parameters, as well as describe the novel computational methods which may benefit others in future primer design research. Conclusion: The high-throughput PCR primer design pipeline has been very successful in providing the basis for high-quality directed sequencing results and for minimizing
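
    The idea of a minimal amplicon set that tiles a target region can be illustrated with a generic greedy interval cover. The record does not describe the pipeline's actual selection algorithm, so this is a stand-in technique; the function name `minimal_tiling`, the coordinate convention, and the candidate amplicons are all hypothetical, and real primer screening (melting temperature, specificity, required overlap) is omitted.

```python
def minimal_tiling(candidates, target_start, target_end):
    """Fewest candidate amplicons covering [target_start, target_end].

    candidates is a list of (start, end) tuples. Returns the chosen
    amplicons in order, or None if the region cannot be fully covered.
    """
    remaining = sorted(candidates)
    chosen = []
    covered_to = target_start
    i = 0
    while covered_to < target_end:
        best = None
        # Among amplicons starting at or before the covered position,
        # take the one that reaches furthest to the right.
        while i < len(remaining) and remaining[i][0] <= covered_to:
            if best is None or remaining[i][1] > best[1]:
                best = remaining[i]
            i += 1
        if best is None or best[1] <= covered_to:
            return None  # gap that no candidate can bridge
        chosen.append(best)
        covered_to = best[1]
    return chosen


# Hypothetical candidate amplicons over a 900-bp target.
amplicons = [(0, 300), (100, 450), (250, 600), (400, 700), (550, 900)]
print(minimal_tiling(amplicons, 0, 900))
print(minimal_tiling([(0, 100), (200, 300)], 0, 300))
```

    Always extending coverage as far as possible from the current position is known to give a minimum-size cover for intervals on a line, which is why the greedy choice is sufficient in this simplified setting.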

  2. The Effects of Explicit Instruction of Formulaic Sequences on Second-Language Writers

    Science.gov (United States)

    Colovic-Markovic, Jelena

    2012-01-01

    The present study investigated the effects of the explicit teaching of formulaic sequences (i.e., academic and topic-induced) on L2 writing. The research examined separately the effects of the treatment on the students' abilities to produce the target formulaic sequences in controlled (i.e., C-tests) and uncontrolled situations (i.e.,…

  3. Combined sequence-based and genetic mapping analysis of complex traits in outbred rats

    Science.gov (United States)

    Baud, Amelie; Hermsen, Roel; Guryev, Victor; Stridh, Pernilla; Graham, Delyth; McBride, Martin W.; Foroud, Tatiana; Calderari, Sophie; Diez, Margarita; Ockinger, Johan; Beyeen, Amennai D.; Gillett, Alan; Abdelmagid, Nada; Guerreiro-Cacais, Andre Ortlieb; Jagodic, Maja; Tuncel, Jonatan; Norin, Ulrika; Beattie, Elisabeth; Huynh, Ngan; Miller, William H.; Koller, Daniel L.; Alam, Imranul; Falak, Samreen; Osborne-Pellegrin, Mary; Martinez-Membrives, Esther; Canete, Toni; Blazquez, Gloria; Vicens-Costa, Elia; Mont-Cardona, Carme; Diaz-Moran, Sira; Tobena, Adolf; Hummel, Oliver; Zelenika, Diana; Saar, Kathrin; Patone, Giannino; Bauerfeind, Anja; Bihoreau, Marie-Therese; Heinig, Matthias; Lee, Young-Ae; Rintisch, Carola; Schulz, Herbert; Wheeler, David A.; Worley, Kim C.; Muzny, Donna M.; Gibbs, Richard A.; Lathrop, Mark; Lansu, Nico; Toonen, Pim; Ruzius, Frans Paul; de Bruijn, Ewart; Hauser, Heidi; Adams, David J.; Keane, Thomas; Atanur, Santosh S.; Aitman, Tim J.; Flicek, Paul; Malinauskas, Tomas; Jones, E. Yvonne; Ekman, Diana; Lopez-Aumatell, Regina; Dominiczak, Anna F; Johannesson, Martina; Holmdahl, Rikard; Olsson, Tomas; Gauguier, Dominique; Hubner, Norbert; Fernandez-Teruel, Alberto; Cuppen, Edwin; Mott, Richard; Flint, Jonathan

    2013-01-01

    Genetic mapping on fully sequenced individuals is transforming our understanding of the relationship between molecular variation and variation in complex traits. Here we report a combined sequence and genetic mapping analysis in outbred rats that maps 355 quantitative trait loci for 122 phenotypes. We identify 35 causal genes involved in 31 phenotypes, implicating novel genes in models of anxiety, heart disease and multiple sclerosis. The relation between sequence and genetic variation is unexpectedly complex: at approximately 40% of quantitative trait loci a single sequence variant cannot account for the phenotypic effect. Using comparable sequence and mapping data from mice, we show the extent and spatial pattern of variation in inbred rats differ significantly from those of inbred mice, and that the genetic variants in orthologous genes rarely contribute to the same phenotype in both species. PMID:23708188

  4. Combined sequence-based and genetic mapping analysis of complex traits in outbred rats.

    Science.gov (United States)

    Baud, Amelie; Hermsen, Roel; Guryev, Victor; Stridh, Pernilla; Graham, Delyth; McBride, Martin W; Foroud, Tatiana; Calderari, Sophie; Diez, Margarita; Ockinger, Johan; Beyeen, Amennai D; Gillett, Alan; Abdelmagid, Nada; Guerreiro-Cacais, Andre Ortlieb; Jagodic, Maja; Tuncel, Jonatan; Norin, Ulrika; Beattie, Elisabeth; Huynh, Ngan; Miller, William H; Koller, Daniel L; Alam, Imranul; Falak, Samreen; Osborne-Pellegrin, Mary; Martinez-Membrives, Esther; Canete, Toni; Blazquez, Gloria; Vicens-Costa, Elia; Mont-Cardona, Carme; Diaz-Moran, Sira; Tobena, Adolf; Hummel, Oliver; Zelenika, Diana; Saar, Kathrin; Patone, Giannino; Bauerfeind, Anja; Bihoreau, Marie-Therese; Heinig, Matthias; Lee, Young-Ae; Rintisch, Carola; Schulz, Herbert; Wheeler, David A; Worley, Kim C; Muzny, Donna M; Gibbs, Richard A; Lathrop, Mark; Lansu, Nico; Toonen, Pim; Ruzius, Frans Paul; de Bruijn, Ewart; Hauser, Heidi; Adams, David J; Keane, Thomas; Atanur, Santosh S; Aitman, Tim J; Flicek, Paul; Malinauskas, Tomas; Jones, E Yvonne; Ekman, Diana; Lopez-Aumatell, Regina; Dominiczak, Anna F; Johannesson, Martina; Holmdahl, Rikard; Olsson, Tomas; Gauguier, Dominique; Hubner, Norbert; Fernandez-Teruel, Alberto; Cuppen, Edwin; Mott, Richard; Flint, Jonathan

    2013-07-01

    Genetic mapping on fully sequenced individuals is transforming understanding of the relationship between molecular variation and variation in complex traits. Here we report a combined sequence and genetic mapping analysis in outbred rats that maps 355 quantitative trait loci for 122 phenotypes. We identify 35 causal genes involved in 31 phenotypes, implicating new genes in models of anxiety, heart disease and multiple sclerosis. The relationship between sequence and genetic variation is unexpectedly complex: at approximately 40% of quantitative trait loci, a single sequence variant cannot account for the phenotypic effect. Using comparable sequence and mapping data from mice, we show that the extent and spatial pattern of variation in inbred rats differ substantially from those of inbred mice and that the genetic variants in orthologous genes rarely contribute to the same phenotype in both species.

  5. Integrated and sequence-ordered BAC- and YAC-based physical maps for the rat genome.

    Science.gov (United States)

    Krzywinski, Martin; Wallis, John; Gösele, Claudia; Bosdet, Ian; Chiu, Readman; Graves, Tina; Hummel, Oliver; Layman, Dan; Mathewson, Carrie; Wye, Natasja; Zhu, Baoli; Albracht, Derek; Asano, Jennifer; Barber, Sarah; Brown-John, Mabel; Chan, Susanna; Chand, Steve; Cloutier, Alison; Davito, Jonathon; Fjell, Chris; Gaige, Tony; Ganten, Detlev; Girn, Noreen; Guggenheimer, Kurtis; Himmelbauer, Heinz; Kreitler, Thomas; Leach, Stephen; Lee, Darlene; Lehrach, Hans; Mayo, Michael; Mead, Kelly; Olson, Teika; Pandoh, Pawan; Prabhu, Anna-Liisa; Shin, Heesun; Tänzer, Simone; Thompson, Jason; Tsai, Miranda; Walker, Jason; Yang, George; Sekhon, Mandeep; Hillier, LaDeana; Zimdahl, Heike; Marziali, Andre; Osoegawa, Kazutoyo; Zhao, Shaying; Siddiqui, Asim; de Jong, Pieter J; Warren, Wes; Mardis, Elaine; McPherson, John D; Wilson, Richard; Hübner, Norbert; Jones, Steven; Marra, Marco; Schein, Jacqueline

    2004-04-01

    As part of the effort to sequence the genome of Rattus norvegicus, we constructed a physical map comprised of fingerprinted bacterial artificial chromosome (BAC) clones from the CHORI-230 BAC library. These BAC clones provide approximately 13-fold redundant coverage of the genome and have been assembled into 376 fingerprint contigs. A yeast artificial chromosome (YAC) map was also constructed and aligned with the BAC map via fingerprinted BAC and P1 artificial chromosome clones (PACs) sharing interspersed repetitive sequence markers with the YAC-based physical map. We have annotated 95% of the fingerprint map clones in contigs with coordinates on the version 3.1 rat genome sequence assembly, using BAC-end sequences and in silico mapping methods. These coordinates have allowed anchoring 358 of the 376 fingerprint map contigs onto the sequence assembly. Of these, 324 contigs are anchored to rat genome sequences localized to chromosomes, and 34 contigs are anchored to unlocalized portions of the rat sequence assembly. The remaining 18 contigs, containing 54 clones, still require placement. The fingerprint map is a high-resolution integrative data resource that provides genome-ordered associations among BAC, YAC, and PAC clones and the assembled sequence of the rat genome.

  6. Micro-motion Recognition of Spatial Cone Target Based on ISAR Image Sequences

    Directory of Open Access Journals (Sweden)

    Changyong Shu

    2016-04-01

    Full Text Available Accurate micro-motion recognition of a spatial cone target is the foundation of characteristic parameter acquisition. For this reason, a micro-motion recognition method based on distinguishing characteristics extracted from Inverse Synthetic Aperture Radar (ISAR) sequences is proposed in this paper. The projection trajectory formulas of the cone-node strong scattering source and the cone-bottom slip-type strong scattering sources, which are located on the spatial cone target, are derived under three micro-motion types (nutation, precession, and spinning), and their correctness is verified by electromagnetic simulation. Comparison reveals differences among the projections of the scattering sources under different micro-motions. The coordinate information of the scattering sources in the ISAR sequences is extracted by the CLEAN algorithm, and spinning is recognized by setting a Doppler threshold. A double-observation-point Interacting Multiple Model Kalman Filter is used to separate the scattering source projections of the nutation or precession target, and the number of cross points of each scattering source's projection track is used to distinguish nutation from precession. Finally, electromagnetic simulation data are used to verify the effectiveness of the micro-motion recognition method.

  7. Wind Farm Dynamic Equivalence Based on the Wind Turbine Output Active Power Sequence Clustering

    Directory of Open Access Journals (Sweden)

    Zhang Ge

    2016-01-01

    Full Text Available In order to reduce the complexity of simulation models containing wind farms while maintaining accuracy, this paper puts forward a dynamic equivalence method that aims to keep the output characteristic at the connecting point of the wind farm consistent. Based on the output power sequences of the wind turbines, a geometric template matching algorithm is used to obtain the characteristics of each power sequence, and an Attribute Threshold Clustering Algorithm is then used to classify the wind turbines. Within each cluster, the wind turbine parameters are made equal according to the principle of constant power output characteristics and are then distinguished according to AMPSO. Finally, the paper takes a practical wind farm as an example and simulates a system-side fault and a variation of wind speed in order to compare the output characteristics of the detailed model and the equivalent model. The results show that the output characteristic at the connecting point of the wind farm remains consistent after equivalence and that the clustering algorithm reflects the operating characteristics of the wind turbines at every moment of any time period. They also show that the equivalence method is reasonable and effective and has value in engineering applications.

  8. Approach for moving small target detection in infrared image sequence based on reinforcement learning

    Science.gov (United States)

    Wang, Chuanyun; Qin, Shiyin

    2016-09-01

    Addressing the problems of moving small target detection in infrared image sequences caused by background clutter and target size variation with time, an approach for moving small target detection is proposed under a pipeline framework with an optimization strategy based on reinforcement learning. The pipeline framework is composed of pipeline establishment, target-background image separation, and target confirmation: the pipeline is established by designating several successive images with a temporal sliding window, target-background image separation is handled by low-rank and sparse matrix decomposition via robust principal component analysis, and target confirmation is achieved by employing a voting mechanism over more than one separated target image of the same input image. For continual optimization of target-background image separation, the weighting parameter of the low-rank and sparse matrix decomposition is dynamically regulated by reinforcement learning in consecutive detection, in which the complexity evaluation of sequential infrared images and the results assessment of moving small target detection are integrated. The experimental results over four infrared small target image sequences with different cloudy sky backgrounds demonstrate the effectiveness and advantages of the proposed approach in both background clutter suppression and small target detection.
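
    The voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact voting rule, so the threshold `min_votes`, the function name `confirm_targets`, and the representation of each separated target image as a set of candidate pixel locations are all assumptions made for illustration.

```python
from collections import Counter

def confirm_targets(detections_per_image, min_votes):
    """Keep locations flagged as target in at least `min_votes` separated images.

    `detections_per_image` holds one set of (row, col) candidates for each
    separated target image of the same input frame (hypothetical format).
    """
    votes = Counter()
    for detections in detections_per_image:
        votes.update(set(detections))  # one vote per image per location
    return {loc for loc, n in votes.items() if n >= min_votes}

# Hypothetical candidates from three sliding-window pipelines covering one frame.
separated = [
    {(10, 12), (40, 7)},
    {(10, 12)},
    {(10, 12), (3, 3)},
]
confirmed = confirm_targets(separated, min_votes=2)
```

    Here only the location reported by at least two of the three separated images survives; isolated clutter responses are discarded.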

  9. Phylogeny of the Zygomycota based on nuclear ribosomal sequence data.

    Science.gov (United States)

    White, Merlin M; James, Timothy Y; O'Donnell, Kerry; Cafaro, Matías J; Tanabe, Yuuhiko; Sugiyama, Junta

    2006-01-01

    The Zygomycota is an ecologically heterogenous assemblage of nonzoosporic fungi comprising two classes, Zygomycetes and Trichomycetes. Phylogenetic analyses have suggested that the phylum is polyphyletic; two of four orders of Trichomycetes are related to the Mesomycetozoa (protists) that diverged near the fungal/animal split. Current circumscription of the Zygomycota includes only orders with representatives that produce zygospores. We present a molecular-based phylogeny including recognized representatives of the Zygomycetes and Trichomycetes with a combined dataset for nuclear rRNA 18S (SSU), 5.8S and 28S (LSU) genes. Tree reconstruction by Bayesian analyses suggests the Zygomycota is paraphyletic. Although 12 clades were identified only some of these correspond to the nine orders of Zygomycota currently recognized. A large superordinal clade, comprising the Dimargaritales, Harpellales, Kickxellales and Zoopagales, grouping together many symbiotic fungi, also is identified in part by a unique septal structure. Although Harpellales and Kickxellales are not monophyletic, these lineages are distinct from the Mucorales, Endogonales and Mortierellales, which appear more closely related to the Ascomycota + Basidiomycota + Glomeromycota. The final major group, the insect-associated Entomophthorales, appears to be polyphyletic. In the present analyses Basidiobolus and Neozygites group within Zygomycota but not with the Entomophthorales. Clades are discussed with special reference to traditional classifications, mapping morphological characters and ecology, where possible, as a snapshot of our current phylogenetic perspective of the Zygomycota.

  10. Sequence context of indel mutations and their effect on protein evolution in a bacterial endosymbiont.

    Science.gov (United States)

    Williams, Laura E; Wernegreen, Jennifer J

    2013-01-01

    Indel mutations play key roles in genome and protein evolution, yet we lack a comprehensive understanding of how indels impact evolutionary processes. Genome-wide analyses enabled by next-generation sequencing can clarify the context and effect of indels, thereby integrating a more detailed consideration of indels with our knowledge of nucleotide substitutions. To this end, we sequenced Blochmannia chromaiodes, an obligate bacterial endosymbiont of carpenter ants, and compared it with the close relative, B. pennsylvanicus. The genetic distance between these species is small enough for accurate whole genome alignment but large enough to provide a meaningful spectrum of indel mutations. We found that indels are subjected to purifying selection in coding regions and even intergenic regions, which show a reduced rate of indel base pairs per kilobase compared with nonfunctional pseudogenes. Indels occur almost exclusively in repeat regions composed of homopolymers and multimeric simple sequence repeats, demonstrating the importance of sequence context for indel mutations. Despite purifying selection, some indels occur in protein-coding genes. Most are multiples of three, indicating selective pressure to maintain the reading frame. The deleterious effect of frameshift-inducing indels is minimized by either compensation from a nearby indel to restore reading frame or the indel's location near the 3'-end of the gene. We observed amino acid divergence exceeding nucleotide divergence in regions affected by frameshift-inducing indels, suggesting that these indels may either drive adaptive protein evolution or initiate gene degradation. Our results shed light on how indel mutations impact processes of molecular evolution underlying endosymbiont genome evolution.
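
    The reading-frame argument in this abstract reduces to arithmetic modulo three. The following is a minimal sketch, not the authors' pipeline; the function names and the signed-length convention (insertions positive, deletions negative) are assumptions chosen for illustration.

```python
def is_frame_preserving(indel_length):
    """An indel whose length is a multiple of three keeps the reading frame."""
    return indel_length % 3 == 0

def net_frame_shift(signed_indel_lengths):
    """Net frame offset (0, 1 or 2) after a series of indels in one gene.

    Insertions are positive and deletions negative (assumed convention).
    A result of 0 means nearby indels compensate and the frame is restored.
    """
    return sum(signed_indel_lengths) % 3

# A 1-bp insertion followed by a 1-bp deletion restores the frame.
compensated = net_frame_shift([+1, -1])
# A lone 1-bp deletion leaves the frame shifted.
uncompensated = net_frame_shift([-1])
```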

  11. Efficient Simulation of Quantum States Based on Classical Fields Modulated with Pseudorandom Phase Sequences

    CERN Document Server

    Fu, Jian

    2010-01-01

    We demonstrate that a tensor product structure could be obtained by introducing pseudorandom phase sequences into classical fields with two orthogonal modes. Using classical fields modulated with pseudorandom phase sequences, we discuss efficient simulation of several typical quantum states, including product state, Bell states, GHZ state, and W state. By performing quadrature demodulation scheme, we could obtain the mode status matrix of the simulating classical fields, based on which we propose a sequence permutation mechanism to reconstruct the simulated quantum states. The research on classical simulation of quantum states is important, for it not only enables potential practical applications in quantum computation, but also provides useful insights into fundamental concepts of quantum mechanics.

  12. PRIMAL: Fast and accurate pedigree-based imputation from sequence data in a founder population.

    Directory of Open Access Journals (Sweden)

    Oren E Livne

    2015-03-01

    Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.

  13. Performance of Correspondence Algorithms in Vision-Based Driver Assistance Using an Online Image Sequence Database

    DEFF Research Database (Denmark)

    Klette, Reinhard; Krüger, Norbert; Vaudrey, Tobi;

    2011-01-01

    This paper discusses options for testing correspondence algorithms in stereo or motion analysis that are designed or considered for vision-based driver assistance. It introduces a globally available database, with a main focus on testing on video sequences of real-world data. We suggest the class...... rankings of correspondence techniques on sets of basic sequences that show different situations. It is suggested that correspondence techniques should adaptively be chosen in real time using some type of statistical situation classifiers....

  14. Is the P3 amplitude reduction seen in externalizing psychopathology attributable to stimulus sequence effects?

    Science.gov (United States)

    Gilmore, Casey S; Malone, Stephen M; Iacono, William G

    2012-02-01

    P3 amplitude reduction (P3-AR) is associated with biological vulnerability to a spectrum of externalizing (EXT) disorders, such as conduct disorder, antisocial behavior, and substance use disorders. P3 amplitude, however, can be affected by the context within which it is measured, for example, by the position of the target in the sequence of stimuli during an oddball task. We hypothesized that EXT-related P3-AR may be due to attention or working memory deficits in EXT that would weaken these stimulus sequence effects. Using a community-based sample of adolescent males, we examined the relationship between P3 and EXT as a function of the number of standards preceding the target. Higher EXT was associated with significantly smaller P3 amplitude, regardless of the number of standards preceding the target. These results suggest that P3-AR in EXT does not vary as a function of stimulus sequence, further supporting P3-AR as an endophenotype for EXT disorders.

  15. Sequence complexity effects on speech production in healthy speakers and speakers with hypokinetic or ataxic dysarthria.

    Directory of Open Access Journals (Sweden)

    Kevin J Reilly

    The present study investigated the effects of sequence complexity, defined in terms of phonemic similarity and phonotactic probability, on the timing and accuracy of serial ordering for speech production in healthy speakers and speakers with either hypokinetic or ataxic dysarthria. Sequences were comprised of strings of consonant-vowel (CV) syllables with each syllable containing the same vowel, /a/, paired with a different consonant. High complexity sequences contained phonemically similar consonants, and sounds and syllables that had low phonotactic probabilities; low complexity sequences contained phonemically dissimilar consonants and high probability sounds and syllables. Sequence complexity effects were evaluated by analyzing speech error rates and within-syllable vowel and pause durations. This analysis revealed that speech error rates were significantly higher and speech duration measures were significantly longer during production of high complexity sequences than during production of low complexity sequences. Although speakers with dysarthria produced longer overall speech durations than healthy speakers, the effects of sequence complexity on error rates and speech durations were comparable across all groups. These findings indicate that the duration and accuracy of processes for selecting items in a speech sequence are influenced by their phonemic similarity and/or phonotactic probability. Moreover, this robust complexity effect is present even in speakers with damage to subcortical circuits involved in serial control for speech.

  16. Sequence complexity effects on speech production in healthy speakers and speakers with hypokinetic or ataxic dysarthria.

    Science.gov (United States)

    Reilly, Kevin J; Spencer, Kristie A

    2013-01-01

    The present study investigated the effects of sequence complexity, defined in terms of phonemic similarity and phonotactic probability, on the timing and accuracy of serial ordering for speech production in healthy speakers and speakers with either hypokinetic or ataxic dysarthria. Sequences were comprised of strings of consonant-vowel (CV) syllables with each syllable containing the same vowel, /a/, paired with a different consonant. High complexity sequences contained phonemically similar consonants, and sounds and syllables that had low phonotactic probabilities; low complexity sequences contained phonemically dissimilar consonants and high probability sounds and syllables. Sequence complexity effects were evaluated by analyzing speech error rates and within-syllable vowel and pause durations. This analysis revealed that speech error rates were significantly higher and speech duration measures were significantly longer during production of high complexity sequences than during production of low complexity sequences. Although speakers with dysarthria produced longer overall speech durations than healthy speakers, the effects of sequence complexity on error rates and speech durations were comparable across all groups. These findings indicate that the duration and accuracy of processes for selecting items in a speech sequence are influenced by their phonemic similarity and/or phonotactic probability. Moreover, this robust complexity effect is present even in speakers with damage to subcortical circuits involved in serial control for speech.

  17. Amplicon-based metagenomic analysis of mixed fungal samples using proton release amplicon sequencing.

    Directory of Open Access Journals (Sweden)

    Daniel P Tonge

    Next generation sequencing technology has revolutionised microbiology by allowing concurrent analysis of whole microbial communities. Here we developed and verified similar methods for the analysis of fungal communities using a proton release sequencing platform with the ability to sequence reads of up to 400 bp in length at significant depth. This read length permits the sequencing of amplicons from commonly used fungal identification regions and thereby taxonomic classification. Using the 400 bp sequencing capability, we have sequenced amplicons from the ITS1, ITS2 and LSU fungal regions to a depth of approximately 700,000 raw reads per sample. Representative operational taxonomic units (OTUs) were chosen by the USEARCH algorithm, and identified taxonomically through nucleotide BLAST (BLASTn). Combination of this sequencing technology with the bioinformatics pipeline allowed species recognition in two controlled fungal spore populations containing members of known identity and concentration. Each species included within the two controlled populations was found to correspond to a representative OTU, and these OTUs were found to be highly accurate representations of true biological sequences. However, the absolute number of reads attributed to each OTU differed among species. The majority of species were represented by an OTU derived from all three genomic regions although in some cases, species were only represented in two of the regions due to the absence of conserved primer binding sites or due to sequence composition. It is apparent from our data that proton release sequencing technologies can deliver a qualitative assessment of the fungal members comprising a sample. The fact that some fungi cannot be amplified by specific "conserved" primer pairs confirms our recommendation that a multi-region approach be taken for other amplicon-based metagenomic studies.

  18. [Clinical Application of Extraction and Analysis of the Key Frames Based on IVUS Sequences].

    Science.gov (United States)

    Mao, Haiqun; Yang, Feng; Huang, Zheng; Cui, Kai; Wang, Xinxin

    2015-08-01

    In this paper, we propose an image-based key frame gating method to reduce motion artifacts in intravascular ultrasound (IVUS) longitudinal cuts. The artifacts are mainly caused by the periodic relative displacement between blood vessels and the IVUS catheter due to cardiac motion. The method proceeds in four steps. First, we convert the IVUS image sequences to polar coordinates to reduce the amount of calculation. Second, we extract a one-dimensional signal cluster reflecting cardiac motion by spectral analysis and filtering techniques. Third, we design a Butterworth band-pass filter to filter the one-dimensional signal clusters. Fourth, we retrieve the extrema of the filtered signal clusters to find the key frames that compose the key-frame gated sequences. Experimental results showed that our algorithm was fast, with an average frame processing time of 17 ms. In the longitudinal views, the gated sequences showed a similar trend to the original ones, with less saw-tooth shape and good continuity. We selected 12 groups of clinical IVUS sequences [images (876 ± 65 frames), coronary segment length (14.61 ± 1.08 mm)] to calculate the vessel volume, lumen volume, and mean plaque burden of the original and gated sequences. Statistical results showed that, on the one hand, both the vessel volume and the lumen volume measured from the gated sequences were significantly smaller than those of the original ones, and there was no significant difference in mean plaque burden between the original and gated sequences, which meets the needs of clinical diagnosis and treatment. On the other hand, the variances of vessel area and lumen area of the gated sequences were significantly smaller than those of the original sequences, indicating that the gated sequences are more stable than the original ones.
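
    The last of the four steps, picking key frames at the extrema of the filtered cardiac-motion signal, can be sketched as follows. This is only an illustration: it assumes the band-pass filtering has already been done, treats local maxima as the extrema of interest, and uses a made-up signal; the function name `local_maxima` is hypothetical.

```python
def local_maxima(signal):
    """Return indices whose value is strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] > signal[i + 1]
    ]

# Hypothetical filtered motion signal, one value per IVUS frame.
filtered = [0, 1, 3, 1, 0, 2, 5, 2, 0, 1, 4, 1]

# Frame indices selected as key frames, one per assumed cardiac cycle.
key_frames = local_maxima(filtered)
```

    Selecting one frame per cycle at the same cardiac phase is what removes the saw-tooth pattern from the longitudinal view; whether maxima or minima are used is a design choice the abstract does not specify.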

  19. SDT: a virus classification tool based on pairwise sequence alignment and identity calculation.

    Directory of Open Access Journals (Sweden)

    Brejnev Muhizi Muhire

    The perpetually increasing rate at which viral full-genome sequences are being determined is creating a pressing demand for computational tools that will aid the objective classification of these genome sequences. Taxonomic classification approaches that are based on pairwise genetic identity measures are potentially highly automatable and are progressively gaining favour with the International Committee on Taxonomy of Viruses (ICTV). There are, however, various issues with the calculation of such measures that could potentially undermine the accuracy and consistency with which they can be applied to virus classification. Firstly, pairwise sequence identities computed based on multiple sequence alignments rather than on multiple independent pairwise alignments can lead to the deflation of identity scores with increasing dataset sizes. Also, when gap-characters need to be introduced during sequence alignments to account for insertions and deletions, methodological variations in the way that these characters are introduced and handled during pairwise genetic identity calculations can cause high degrees of inconsistency in the way that different methods classify the same sets of sequences. Here we present Sequence Demarcation Tool (SDT), a free user-friendly computer program that aims to provide a robust and highly reproducible means of objectively using pairwise genetic identity calculations to classify any set of nucleotide or amino acid sequences. SDT can produce publication quality pairwise identity plots and colour-coded distance matrices to further aid the classification of sequences according to ICTV approved taxonomic demarcation criteria. Besides a graphical interface version of the program for Windows computers, command-line versions of the program are available for a variety of different operating systems (including a parallel version for cluster computing platforms).
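
    The sensitivity of pairwise identity to gap handling, which this abstract highlights, can be seen in a small sketch. This is not SDT's implementation; the function `pairwise_identity`, its `gaps_as_mismatch` flag, and the example sequences are assumptions used only to show how two common conventions give different scores for the same aligned pair.

```python
def pairwise_identity(aligned_a, aligned_b, gaps_as_mismatch=False, gap="-"):
    """Fraction of identical positions in two already-aligned sequences.

    By default, columns containing a gap are ignored; with
    `gaps_as_mismatch=True` they count as compared, non-matching columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        has_gap = x == gap or y == gap
        if has_gap and not gaps_as_mismatch:
            continue
        compared += 1
        if not has_gap and x == y:
            matches += 1
    return matches / compared if compared else 0.0

# Hypothetical aligned pair with one gap column.
ignoring_gaps = pairwise_identity("ACGT-A", "ACCTTA")
counting_gaps = pairwise_identity("ACGT-A", "ACCTTA", gaps_as_mismatch=True)
```

    The same alignment scores 4 matches out of 5 compared columns under one convention and 4 out of 6 under the other, which is the kind of inconsistency that can move a pair across a taxonomic demarcation threshold.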

  20. The Effect of Pre-Main Sequence Stars on Star Cluster Dynamics

    CERN Document Server

    Wiersma, R; Zwart, S P

    2006-01-01

    We investigate the effects of the addition of pre-main sequence evolution to star cluster simulations. We allowed stars to follow pre-main sequence tracks that begin at the deuterium burning birthline and end at the zero age main sequence. We compared our simulations to ones in which the stars began their lives at the zero age main sequence, and also investigated the effects of particular choices for initial binary orbital parameters. We find that the inclusion of the pre-main sequence phase results in a slightly higher core concentration, lower binary fraction, and fewer hard binary systems. In general, the global properties of star clusters remain almost unchanged, but the properties of the binary star population in the cluster can be dramatically modified by the correct treatment of the pre-main sequence stage.

  1. High Interlaboratory Reproducibility of DNA Sequence-based Typing of Bacteria in a Multicenter Study

    DEFF Research Database (Denmark)

    Sousa, MA de; Boye, Kit; Lencastre, H de;

    2006-01-01

    Current DNA amplification-based typing methods for bacterial pathogens often lack interlaboratory reproducibility. In this international study, DNA sequence-based typing of the Staphylococcus aureus protein A gene (spa, 110 to 422 bp) showed 100% intra- and interlaboratory reproducibility without...

  2. OrchidBase: a collection of sequences of the transcriptome derived from orchids.

    Science.gov (United States)

    Fu, Chih-Hsiung; Chen, Yun-Wen; Hsiao, Yu-Yun; Pan, Zhao-Jun; Liu, Zhong-Jian; Huang, Yueh-Min; Tsai, Wen-Chieh; Chen, Hong-Hwa

    2011-02-01

    Orchids are among the most ecologically and evolutionarily significant plants, and the Orchidaceae is one of the most abundant families of the angiosperms. Genetic databases will be useful not only for gene discovery but also for future genomic annotation. For this purpose, OrchidBase was established from 37,979,342 sequence reads collected from 11 in-house Phalaenopsis orchid cDNA libraries. Among them, 41,310 expressed sequence tags (ESTs) were obtained by using Sanger sequencing, whereas 37,908,032 reads were obtained by using next-generation sequencing (NGS) including both Roche 454 and Solexa Illumina sequencers. These reads were assembled into 8,501 contigs and 76,116 singletons, resulting in 84,617 non-redundant transcribed sequences with an average length of 459 bp. The analysis pipeline of the database is an automated system written in Perl and C#, and consists of the following components: automatic pre-processing of EST reads, assembly of raw sequences, annotation of the assembled sequences and storage of the analyzed information in SQL databases. A web application was implemented with HTML and a Microsoft .NET Framework C# program for browsing and querying the database, creating dynamic web pages on the client side, analyzing gene ontology (GO) and mapping annotated enzymes to KEGG pathways. The online resources for putative annotation can be searched either by text or by using BLAST, and the results can be explored on the website and downloaded. Consequently, the establishment of OrchidBase will provide researchers with a high-quality genetic resource for data mining and facilitate efficient experimental studies on orchid biology and biotechnology. The OrchidBase database is freely available at http://lab.fhes.tn.edu.tw/est.

  3. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  4. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Science.gov (United States)

    Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different couples of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides of which 29,109,688 with Q20 (87.43%) in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation about the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low level of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.
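
    The Q20 percentage quoted in this abstract follows directly from the two nucleotide counts it reports, as the short check below shows. The variable names are illustrative; the mean read length is a derived figure that the abstract itself does not state.

```python
# Counts reported in the abstract.
total_called_nucleotides = 33_294_511
q20_nucleotides = 29_109_688
total_reads = 215_944

# Share of called nucleotides at or above Q20, as a percentage.
q20_percent = round(100 * q20_nucleotides / total_called_nucleotides, 2)

# Derived quantity (not stated in the abstract): mean called bases per read.
mean_read_length = total_called_nucleotides / total_reads
```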

  5. Sequence-based identification of microbial contaminants in non-parenteral products

    Directory of Open Access Journals (Sweden)

    Rajapandi Senthilraj

    Phenotypic profiles for microbial identification are unusual for rare, slow-growing and fastidious microorganisms. In the last decade, as a result of the widespread use of PCR and DNA sequencing, 16S rRNA sequencing has played a pivotal role in the accurate identification of microorganisms and the discovery of novel isolates in microbiology laboratories. The 16S rRNA region is universally distributed among microorganisms and is species-specific. Accordingly, the aim of our study was the genotypic identification of microorganisms isolated from non-parenteral pharmaceutical formulations. DNA was separated from five isolates obtained from the formulations. The target regions of the rRNA genes were amplified by PCR and sequenced using suitable primers. The sequence data were analyzed and aligned in the order of increasing genetic distance to relevant sequences against a library database to achieve an identity match. The DNA sequences of the phylogenetic tree results confirmed the identity of the isolates as Bacillus tequilensis, B. subtilis, Staphylococcus haemolyticus and B. amyloliquefaciens. It can be concluded that 16S rRNA sequence-based identification reduces the time by circumventing biochemical tests and also increases specificity and accuracy.

  6. Quasi-Coherent Noise Jamming to LFM Radar Based on Pseudo-random Sequence Phase-modulation

    OpenAIRE

    2015-01-01

    A novel quasi-coherent noise jamming method is proposed against linear frequency modulation (LFM) signal and pulse compression radar. Based on the structure of digital radio frequency memory (DRFM), the jamming signal is acquired by the pseudo-random sequence phase-modulation of sampled radar signal. The characteristic of jamming signal in time domain and frequency domain is analyzed in detail. Results of ambiguity function indicate that the blanket jamming effect along the range direction wi...
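
    The core operation named in this abstract, phase-modulating a sampled LFM signal with a pseudo-random sequence, can be sketched in a few lines. The abstract is truncated and gives no parameters, so everything concrete here is an assumption: binary 0/π phase coding, one chip per sample, a seeded software generator standing in for the pseudo-random sequence, and the function names `lfm_samples` and `phase_modulate`.

```python
import cmath
import math
import random

def lfm_samples(n, chirp_rate, sample_rate):
    """Complex baseband LFM pulse: exp(j * pi * k * t^2), t = sample / rate."""
    return [
        cmath.exp(1j * math.pi * chirp_rate * (k / sample_rate) ** 2)
        for k in range(n)
    ]

def phase_modulate(samples, chips):
    """Apply binary phase coding: chip 1 keeps the phase, chip 0 adds pi."""
    return [s if c else -s for s, c in zip(samples, chips)]

# Illustrative parameters only (not taken from the paper).
radar = lfm_samples(n=64, chirp_rate=1.0e12, sample_rate=1.0e8)
rng = random.Random(2015)  # seeded so the pseudo-random sequence is repeatable
chips = [rng.getrandbits(1) for _ in radar]
jamming = phase_modulate(radar, chips)
```

    Because each jamming sample is the stored radar sample with at most a sign flip, the output keeps the envelope of the intercepted pulse while its phase history is scrambled by the sequence.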

  7. Sequence-non-specific effects generated by various types of RNA interference triggers.

    Science.gov (United States)

    Olejniczak, Marta; Urbanek, Martyna O; Jaworska, Edyta; Witucki, Lukasz; Szczesniak, Michal W; Makalowska, Izabela; Krzyzosiak, Wlodzimierz J

    2016-02-01

    RNA interference triggers such as short interfering RNA (siRNA) or genetically encoded short hairpin RNA (shRNA) and artificial miRNA (sh-miR) are widely used to silence the expression of specific genes. In addition to silencing selected targets, RNAi reagents may induce various side effects, including immune responses. To determine the molecular markers of immune response activation when using RNAi reagents, we analyzed the results of experiments gathered in the RNAimmuno (v 2.0) and GEO Profiles databases. To better characterize and compare cellular responses to various RNAi reagents in one experimental system, we designed a reagent series in corresponding siRNA, D-siRNA, shRNA and sh-miR forms. To exclude sequence-specific effects the reagents targeted 3 different transcripts (Luc, ATXN3 and HTT). We demonstrate that RNAi reagents induce a broad variety of sequence-non-specific effects, including the deregulation of cellular miRNA levels. Typical siRNAs are weak stimulators of interferon response but may saturate the miRNA biogenesis pathway, leading to the downregulation of highly expressed miRNAs, whereas plasmid-based reagents induce known markers of immune response and may alter miRNA levels and their isomiR composition.

  8. SiRNA sequence model: redesign algorithm based on available genome-wide libraries.

    Science.gov (United States)

    Kozak, Karol

    2013-12-01

    The evolution of RNA interference (RNAi) and the development of technologies exploiting its biology have enabled scientists to rapidly examine the consequences of depleting a particular gene product in cells. Design tools have been developed based on experimental data to increase the knockdown efficiency of siRNAs. Not all siRNAs that are developed to a given target mRNA are equally effective. Currently available design algorithms take an accession, identify conserved regions among their transcript space, find accessible regions within the mRNA, design all possible siRNAs for these regions, filter them based on multi-scores thresholds, and then perform off-target filtration. These different criteria are used by commercial suppliers to produce siRNA genome-wide libraries for different organisms. In this article, we analyze existing siRNA design algorithms and evaluate weight of design parameters for libraries produced in the last decade. We proved that not all essential parameters are currently applied by siRNA vendors. Based on our evaluation results, we were able to suggest an siRNA sequence pattern. The findings in our study can be useful for commercial vendors improving the design of RNAi constructs, by addressing both the issue of potency and the issue of specificity.

  9. High Throughput Sample Preparation and Analysis for DNA Sequencing, PCR and Combinatorial Screening of Catalysis Based on Capillary Array Technique

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Yonghua [Iowa State Univ., Ames, IA (United States)]

    2000-01-01

    Sample preparation has been one of the major bottlenecks for many high throughput analyses. The purpose of this research was to develop a new sample preparation and integration approach for DNA sequencing, PCR based DNA analysis and combinatorial screening of homogeneous catalysis based on multiplexed capillary electrophoresis with laser induced fluorescence or imaging UV absorption detection. The author first introduced a method to integrate the front-end tasks to DNA capillary-array sequencers. Protocols for directly sequencing the plasmids from a single bacterial colony in fused-silica capillaries were developed. After the colony was picked, lysis was accomplished in situ in the plastic sample tube using either a thermocycler or heating block. Upon heating, the plasmids were released while chromosomal DNA and membrane proteins were denatured and precipitated to the bottom of the tube. After adding enzyme and Sanger reagents, the resulting solution was aspirated into the reaction capillaries by a syringe pump, and cycle sequencing was initiated. No deleterious effect upon the reaction efficiency, the on-line purification system, or the capillary electrophoresis separation was observed, even though the crude lysate was used as the template. Multiplexed on-line DNA sequencing data from 8 parallel channels allowed base calling up to 620 bp with an accuracy of 98%. The entire system can be automatically regenerated for repeated operation. For PCR based DNA analysis, they demonstrated that capillary electrophoresis with UV detection can be used for DNA analysis starting from clinical samples without purification. After PCR reaction using cheek cell, blood or HIV-1 gag DNA, the reaction mixture was injected into the capillary either on-line or off-line by base stacking. The protocol was also applied to capillary array electrophoresis. The use of cheaper detection, and the elimination of purification of DNA sample before or after PCR reaction, will make this approach an

  10. High Throughput Sample Preparation and Analysis for DNA Sequencing, PCR and Combinatorial Screening of Catalysis Based on Capillary Array Technique

    Energy Technology Data Exchange (ETDEWEB)

    Yonghua Zhang

    2002-05-27

    Sample preparation has been one of the major bottlenecks for many high throughput analyses. The purpose of this research was to develop a new sample preparation and integration approach for DNA sequencing, PCR based DNA analysis and combinatorial screening of homogeneous catalysis based on multiplexed capillary electrophoresis with laser induced fluorescence or imaging UV absorption detection. The author first introduced a method to integrate the front-end tasks to DNA capillary-array sequencers. Protocols for directly sequencing the plasmids from a single bacterial colony in fused-silica capillaries were developed. After the colony was picked, lysis was accomplished in situ in the plastic sample tube using either a thermocycler or heating block. Upon heating, the plasmids were released while chromosomal DNA and membrane proteins were denatured and precipitated to the bottom of the tube. After adding enzyme and Sanger reagents, the resulting solution was aspirated into the reaction capillaries by a syringe pump, and cycle sequencing was initiated. No deleterious effect upon the reaction efficiency, the on-line purification system, or the capillary electrophoresis separation was observed, even though the crude lysate was used as the template. Multiplexed on-line DNA sequencing data from 8 parallel channels allowed base calling up to 620 bp with an accuracy of 98%. The entire system can be automatically regenerated for repeated operation. For PCR based DNA analysis, they demonstrated that capillary electrophoresis with UV detection can be used for DNA analysis starting from clinical samples without purification. After PCR reaction using cheek cell, blood or HIV-1 gag DNA, the reaction mixture was injected into the capillary either on-line or off-line by base stacking. The protocol was also applied to capillary array electrophoresis. The use of cheaper detection, and the elimination of purification of DNA sample before or after PCR reaction, will make this approach an

  11. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  12. JiffyNet: a web-based instant protein network modeler for newly sequenced species.

    Science.gov (United States)

    Kim, Eiru; Kim, Hanhae; Lee, Insuk

    2013-07-01

    Revolutionary DNA sequencing technology has enabled affordable genome sequencing for numerous species. Thousands of species already have completely decoded genomes, and tens of thousands more are in progress. Naturally, parallel expansion of the functional parts list library is anticipated, yet genome-level understanding of function also requires maps of functional relationships, such as functional protein networks. Such networks have been constructed for many sequenced species including common model organisms. Nevertheless, the majority of species with sequenced genomes still have no protein network models available. Moreover, biologists might want to obtain protein networks for their species of interest on completion of the genome projects. Therefore, there is high demand for accessible means to automatically construct genome-scale protein networks based on sequence information from genome projects only. Here, we present a public web server, JiffyNet, specifically designed to instantly construct genome-scale protein networks based on associalogs (functional associations transferred from a template network by orthology) for a query species with only protein sequences provided. Assessment of the networks by JiffyNet demonstrated generally high predictive ability for pathway annotations. Furthermore, JiffyNet provides network visualization and analysis pages for wide variety of molecular concepts to facilitate network-guided hypothesis generation. JiffyNet is freely accessible at http://www.jiffynet.org.

  13. Weather data analysis based on typical weather sequence analysis. Application: energy building simulation

    CERN Document Server

    David, Mathieu; Garde, Francois; Boyer, Harry

    2014-01-01

    In building studies dealing with energy efficiency and comfort, simulation software needs relevant weather files with optimal time steps. Few tools generate extreme and mean values of simultaneous hourly data including correlation between the climatic parameters. This paper presents the C++ Runeole software based on typical weather sequences analysis. It runs an analysis process of a stochastic continuous multivariable phenomenon with frequency properties applied to a climatic database. The database analysis associates basic statistics, PCA (Principal Component Analysis) and automatic classifications. Different ways of applying these methods will be presented. All the results are stored in the Runeole internal database that allows an easy selection of weather sequences. The extreme sequences are used for system and building sizing and the mean sequences are used for the determination of the annual cooling loads as proposed by Audrier-Cros (Audrier-Cros, 1984). This weather analysis was tested with the datab...

  14. An Optimal Sorting of Pulse Amplitude Sequence Based on the Phased Array Radar Beam Tasks

    Institute of Scientific and Technical Information of China (English)

    Chuan Sheng; Yongshun Zhang; Wenlong Lu

    2016-01-01

    The study of phased array radar (PAR) pulse amplitude sequence characteristics is key to understanding the radar’s working state and its beam’s scanning manner. According to the principle of antenna pattern formation and the searching and tracking modes of beams, this paper analyzes the characteristics and differences of the pulse amplitude sequence when the radar beams work in searching and tracking modes respectively. Then an optimal sorting model of the pulse amplitude sequence is established based on least-squares and curve-fitting methods. This method is helpful for acquiring the current working state of the radar and recognizing its instantaneous beam pointing by sorting the pulse amplitude sequence without the necessity to estimate the antenna pattern.

  15. Fast interactive segmentation algorithm of image sequences based on relative fuzzy connectedness

    Institute of Scientific and Technical Information of China (English)

    Tian Chunna; Gao Xinbo

    2005-01-01

    A fast interactive segmentation algorithm for image sequences based on relative fuzzy connectedness is presented. In comparison with the original algorithm, the proposed one, with the same accuracy, accelerates the segmentation speed by three times for a single image. Meanwhile, this fast segmentation algorithm is extended from a single object to multiple objects and from a single image to image sequences. Thus the segmentation of multiple objects from a complex background and batch segmentation of image sequences can be achieved. In addition, a post-processing scheme is incorporated in this algorithm, which extracts a smooth, one-pixel-wide edge for each segmented object. The experimental results illustrate that the proposed algorithm can obtain the object regions of interest from medical images or image sequences as well as man-made images quickly and reliably with only a little interaction.

  16. BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark.

    Science.gov (United States)

    Thompson, Julie D; Koehl, Patrice; Ripp, Raymond; Poch, Olivier

    2005-10-01

    Multiple sequence alignment is one of the cornerstones of modern molecular biology. It is used to identify conserved motifs, to determine protein domains, in 2D/3D structure prediction by homology and in evolutionary studies. Recently, high-throughput technologies such as genome sequencing and structural proteomics have led to an explosion in the amount of sequence and structure information available. In response, several new multiple alignment methods have been developed that improve both the efficiency and the quality of protein alignments. Consequently, the benchmarks used to evaluate and compare these methods must also evolve. We present here the latest release of the most widely used multiple alignment benchmark, BAliBASE, which provides high quality, manually refined, reference alignments based on 3D structural superpositions. Version 3.0 of BAliBASE includes new, more challenging test cases, representing the real problems encountered when aligning large sets of complex sequences. Using a novel, semiautomatic update protocol, the number of protein families in the benchmark has been increased and representative test cases are now available that cover most of the protein fold space. The total number of proteins in BAliBASE has also been significantly increased from 1444 to 6255 sequences. In addition, full-length sequences are now provided for all test cases, which represent difficult cases for both global and local alignment programs. Finally, the BAliBASE Web site (http://www-bio3d-igbmc.u-strasbg.fr/balibase) has been completely redesigned to provide a more user-friendly, interactive interface for the visualization of the BAliBASE reference alignments and the associated annotations.

  17. Quasi-Coherent Noise Jamming to LFM Radar Based on Pseudo-random Sequence Phase-modulation

    Directory of Open Access Journals (Sweden)

    N. Tai

    2015-12-01

    A novel quasi-coherent noise jamming method is proposed against linear frequency modulation (LFM) signal and pulse compression radar. Based on the structure of digital radio frequency memory (DRFM), the jamming signal is acquired by the pseudo-random sequence phase-modulation of the sampled radar signal. The characteristics of the jamming signal in the time domain and frequency domain are analyzed in detail. Results of the ambiguity function indicate that a blanket jamming effect along the range direction will be formed when the jamming signal passes through the matched filter. By flexibly controlling the parameters of the interrupted-sampling pulse and the pseudo-random sequence, different covering distances and jamming effects can be achieved. When the jamming power is equivalent, this jamming obtains higher processing gain compared with non-coherent jamming. The jamming signal raises the detection threshold so that the real target avoids being detected. Simulation results and circuit engineering implementation validate that the jamming signal covers the real target effectively.

  18. Genetic characterization of three novel chicken parvovirus strains based on analysis of their coding sequences.

    Science.gov (United States)

    Koo, Bon-Sang; Lee, Hae-Rim; Jeon, Eun-Ok; Han, Moo-Sung; Min, Kyeong-Cheol; Lee, Seung-Baek; Bae, Yeon-Ji; Cho, Sun-Hyung; Mo, Jong-Suk; Kwon, Hyuk Moo; Sung, Haan Woo; Kim, Jong-Nyeo; Mo, In-Pil

    2015-01-01

    Chicken parvovirus (ChPV) is one of the causative agents of viral enteritis. Recently, the genome of the ABU-P1 strain of ChPV was fully sequenced and determined to have a distinct genomic composition compared with that of vertebrate parvoviruses. However, no comparative sequence analysis of coding regions of ChPVs was possible because of the lack of other sequence information. In this study, we obtained the nucleotide sequences of all genomic coding regions of three ChPVs by polymerase chain reaction using 13 primer sets, and deduced the amino acid sequences from the nucleotide sequences. The non-structural protein 1 (NS1) gene of the three ChPVs showed 95.0 to 95.5% nucleotide sequence identity and 96.5 to 98.1% amino acid sequence identity to those of NS1 from the ABU-P1 strain, respectively, and even higher nucleotide and amino acid similarities to one another. The viral proteins (VP) gene was more divergent between the three ChPV Korean strains and ABU-P1, with 88.1 to 88.3% nucleotide identity and 93.0% amino acid identity. Analysis of the putative tertiary structure of the ChPV VP2 protein showed that variable regions with less than 80% nucleotide similarity between the three Korean strains and ABU-P1 occurred in large loops of the VP2 protein believed to be involved in antigenicity, pathogenicity, and tissue tropism in other parvoviruses. Based on our analysis of full-length coding sequences, we discovered greater variation in ChPV strains than reported previously, especially in partial regions of the VP2 protein.

  19. A 2-D graphical representation of protein sequences based on nucleotide triplet codons

    Science.gov (United States)

    Bai, Fenglan; Wang, Tianming

    2005-09-01

    Graphical representation of DNA provides a simple way of viewing, sorting and comparing various gene structures. A 2-D graphical representation of protein sequences based on nucleotide triplet codons has been derived for similarity analysis of protein sequences. This approach is based on a graphical representation of triplets of DNA in which the interior of the left half plane of the complex plane is used to accommodate 64 sites for the 64 codons. We associate a directed curve, numerical value, or matrix with a protein as a descriptor. The approach is illustrated on the Homo sapiens X-linked nuclear protein (ATRX) gene.

  20. State of the art and challenges in sequence based T-cell epitope prediction

    DEFF Research Database (Denmark)

    Lundegaard, Claus; Hoof, Ilka; Lund, Ole

    2010-01-01

    Sequence based T-cell epitope predictions have improved immensely in the last decade. From predictions of peptide binding to major histocompatibility complex molecules with moderate accuracy, limited allele coverage, and no good estimates of the other events in the antigen-processing pathway...... to MHC alleles characterized by limited or no peptide binding data. Most of the developed methods are publicly available, and have proven to be very useful as a shortcut in epitope discovery. Here, we will go through some of the history of sequence-based predictions of helper as well as cytotoxic T cell...

  1. Benefit-of-doubt (BOD) scoring: a sequencing-based method for SNP candidate assessment from high to medium read number data sets.

    Science.gov (United States)

    Sedlazeck, Fritz Joachim; Talloji, Prabhavathi; von Haeseler, Arndt; Bachmair, Andreas

    2013-03-01

    Identification of single nucleotide polymorphisms (SNPs) is a key element in sequence-based genetic analysis. Next generation sequencing offers a cost-effective basis to generate the necessary, large sequence data sets, and bioinformatic methods are being developed to process sequencing machine readouts. We were interested in detection of SNPs in a 350 kb region of an EMS-mutagenized Arabidopsis chromosome 3. The region was selectively analyzed using PCR-generated, overlapping fragments for Solexa sequencing. The ensuing reads provided a high coverage and were processed bioinformatically. In order to assess the SNP candidates obtained with a frequently used alignment program and SNP caller, we developed an additional method that allows the identification of high confidence SNP loci. The method can easily be applied to complete genome sequence data of sufficient coverage.

  2. Prediction of Antimicrobial Peptides Based on Sequence Alignment and Support Vector Machine-Pairwise Algorithm Utilizing LZ-Complexity

    Directory of Open Access Journals (Sweden)

    Xin Yi Ng

    2015-01-01

    This study concerns an attempt to establish a new method for predicting antimicrobial peptides (AMPs), which are important to the immune system. Recently, researchers have become interested in designing alternative drugs based on AMPs because they have found that a large number of bacterial strains have become resistant to available antibiotics. However, researchers have encountered obstacles in the AMP design process, as experiments to extract AMPs from protein sequences are costly and require a long set-up time. Therefore, a computational tool for AMP prediction is needed to resolve this problem. In this study, an integrated algorithm is newly introduced to predict AMPs by integrating sequence alignment and a support vector machine (SVM)-LZ complexity pairwise algorithm. It was observed that, when all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in the jackknife test and 87.59% in the independent test, while the sensitivity obtained for the jackknife test and independent test is 88.74% and 78.70%, respectively, when only the sequences that have less than 70% similarity are used. Applying the proposed algorithm may allow researchers to effectively predict AMPs from unknown protein peptide sequences with higher sensitivity.

  3. Pigs in sequence space: A 0.66X coverage pig genome survey based on shotgun sequencing

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Schierup, M.H.; Jorgensen, F.G.;

    2005-01-01

    sequences (0.66X coverage) from the pig genome. The data are hereby released (NCBI Trace repository with center name "SDJVP", and project name "Sino-Danish Pig Genome Project") together with an initial evolutionary analysis. The non-repetitive fraction of the sequences was aligned to the UCSC human...

  4. DNA Sequence Optimization Based on Continuous Particle Swarm Optimization for Reliable DNA Computing and DNA Nanotechnology

    Directory of Open Access Journals (Sweden)

    N. K. Khalid

    2008-01-01

    Problem statement: In DNA-based computation and DNA nanotechnology, the design of good DNA sequences has turned out to be an essential problem and one of the most practical and important research topics. Basically, the DNA sequence design problem is a multi-objective problem and it can be evaluated using four objective functions, namely, H-measure, similarity, continuity and hairpin. Approach: There are several ways to solve a multi-objective problem; however, in order to evaluate the correctness of the PSO algorithm in DNA sequence design, this problem is converted into a single-objective problem. Particle Swarm Optimization (PSO) is proposed to minimize the objective in the problem, subject to two constraints: melting temperature and GC content. A model is developed to present the DNA sequence design based on PSO computation. Results: Based on the experiments performed, 20 particles are used in the implementation of the optimization process, where the average values and the standard deviation for 100 runs are shown along with a comparison to other existing methods. Conclusion: The results obtained verify that PSO can suitably solve the DNA sequence design problem using the proposed method and model, comparatively better than other approaches.
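The two constraints named above, GC content and melting temperature, can be checked per candidate sequence roughly as sketched below. The abstract does not say which melting-temperature model the authors used; this sketch substitutes the simple Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only a rough estimate for short oligonucleotides. The target values and tolerances are hypothetical.

```python
def gc_content(seq: str) -> float:
    """GC content of a DNA sequence, in percent."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Wallace-rule melting temperature estimate in degrees Celsius.

    Assumption: used here as a stand-in; the source does not state
    which Tm model the PSO constraint relies on.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def satisfies_constraints(seq: str,
                          gc_target: float = 50.0, gc_tol: float = 5.0,
                          tm_range: tuple = (50.0, 70.0)) -> bool:
    """True if the sequence meets both (hypothetical) constraint windows."""
    gc_ok = abs(gc_content(seq) - gc_target) <= gc_tol
    tm_ok = tm_range[0] <= wallace_tm(seq) <= tm_range[1]
    return gc_ok and tm_ok


if __name__ == "__main__":
    for s in ("ATGC" * 5, "A" * 20):
        print(s, gc_content(s), wallace_tm(s), satisfies_constraints(s))
```

In an optimizer, a check like this would typically reject or penalize particles that decode to sequences outside the constraint windows before the objective is evaluated.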

  5. A Novel Abundance-Based Algorithm for Binning Metagenomic Sequences Using l-Tuples

    Science.gov (United States)

    Wu, Yu-Wei; Ye, Yuzhen

    Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. Among the computational tools recently developed for metagenomic sequence analysis, binning tools attempt to classify all (or most) of the sequences in a metagenomic dataset into different bins (i.e., species), based on various DNA composition patterns (e.g., the tetramer frequencies) of various genomes. Composition-based binning methods, however, cannot be used to classify very short fragments, because of the substantial variation of DNA composition patterns within a single genome. We developed a novel approach (AbundanceBin) for metagenomics binning by utilizing the different abundances of species living in the same environment. AbundanceBin is an application of the Lander-Waterman model to metagenomics, which is based on the l-tuple content of the reads. AbundanceBin achieved accurate, unsupervised, clustering of metagenomic sequences into different bins, such that the reads classified in a bin belong to species of identical or very similar abundances in the sample. In addition, AbundanceBin gave accurate estimations of species abundances, as well as their genome sizes - two important parameters for characterizing a microbial community. We also show that AbundanceBin performed well when the sequence lengths are very short (e.g. 75 bp) or have sequencing errors.
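As a rough illustration of the "l-tuple content of the reads" that the approach builds on, the sketch below counts every length-l substring across a set of reads. It shows only the counting step under assumed names (`ltuple_counts` is hypothetical); the abundance model and the binning procedure of AbundanceBin itself are not reproduced here.

```python
from collections import Counter
from typing import Iterable


def ltuple_counts(reads: Iterable[str], l: int) -> Counter:
    """Count all l-tuples (substrings of length l) over a set of reads.

    Tuples containing characters other than A, C, G, T (for example N)
    are skipped. Reads shorter than l contribute nothing.
    """
    valid = set("ACGT")
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - l + 1):
            tup = read[i:i + l]
            if set(tup) <= valid:
                counts[tup] += 1
    return counts


if __name__ == "__main__":
    reads = ["ACGTAC", "CGTA", "ACNGT"]
    for tup, n in sorted(ltuple_counts(reads, 3).items()):
        print(tup, n)
```

The intuition is that l-tuples originating from a more abundant species tend to be observed more often across the reads, so the count distribution carries an abundance signal.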

  6. Sequencing-based typing reveals new insight in HLA-DPA1 polymorphism.

    Science.gov (United States)

    Rozemuller, E H; Bouwens, A G; van Oort, E; Versluis, L F; Marsh, S G; Bodmer, J G; Tilanus, M G

    1995-01-01

    An HLA-DPA1 sequencing-based typing (SBT) system has been developed to identify DPA1 alleles. Up to now eight DPA1 alleles have been defined. Six can be discriminated based upon exon 2 polymorphism. The three subtypes of DPA1*01: DPA1*0101, DPA1*0102 and DPA1*0103, have identical exon 2 sequences but show differences in exon 4. Exon 4 sequences were known for only the three DPA1*01 subtypes and for DPA1*0201. We now present additional sequence information for exon 4 and the unknown segments at the 3' end of exon 2. Additionally with the use of this sequencing technique it is also possible to identify previously unidentified polymorphism. We have studied the exon 2 and exon 4 polymorphism of DPA1 in 40 samples which include all known DPA1 alleles. A new allele, DPA1*01 new, was identified which differs by one nucleotide in exon 2 from DPA1*0103, resulting in an aspartic acid at codon 28. The DPA1*01 subtypes DPA1*0101 and DPA1*0102 could not be confirmed in samples which previously were used to define these subtypes, and consequently they do not exist. The exon 4 sequence of DPA1*0201 is corrected based on sequence data of DAUDI, the cell line in which DPA1*0202 was originally defined. The exon 4 regions of the remaining four alleles were resolved: the exon 4 regions of the alleles DPA1*02021 and DPA1*02022 were found to be identical to the--corrected--DPA1*0201 whereas the exon 4 region of DPA1*0301 differs by one nucleotide compared to DPA1*0103. The DPA1*0401 exon 4 region differs by one nucleotide compared to the corrected DPA1*0201.(ABSTRACT TRUNCATED AT 250 WORDS)

  7. Quality evaluation of methyl binding domain based kits for enrichment DNA-methylation sequencing.

    Directory of Open Access Journals (Sweden)

    Tim De Meyer

    DNA-methylation is an important epigenetic feature in health and disease. Methylated sequence capturing by Methyl Binding Domain (MBD) based enrichment followed by second-generation sequencing provides the best combination of sensitivity and cost-efficiency for genome-wide DNA-methylation profiling. However, existing implementations are numerous, and quality control and optimization require expensive external validation. Therefore, this study has two aims: (1) to identify a best performing kit for MBD-based enrichment using independent validation data, and (2) to evaluate whether quality evaluation can also be performed solely based on the characteristics of the generated sequences. Five commercially available kits for MBD enrichment were combined with Illumina GAIIx sequencing for three cell lines (HCT15, DU145, PC3). Reduced representation bisulfite sequencing data (all three cell lines) and publicly available Illumina Infinium BeadChip data (DU145 and PC3) were used for benchmarking. Consistent large-scale differences in yield, sensitivity and specificity between the different kits could be identified, with Diagenode's MethylCap kit as the overall best performing kit under the tested conditions. This kit could also be identified with the Fragment CpG-plot, which summarizes the CpG content of the captured fragments, implying that the latter can be used as a tool to monitor data quality. In conclusion, there are major quality differences between kits for MBD-based capturing of methylated DNA, with the MethylCap kit performing best under the used settings. The Fragment CpG-plot is able to monitor data quality based on inherent sequence data characteristics, and is therefore a cost-efficient tool for experimental optimization, but also to monitor quality throughout routine applications.

  8. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.

  9. Dissecting the roles of local packing density and longer-range effects in protein sequence evolution.

    Science.gov (United States)

    Shahmoradi, Amir; Wilke, Claus O

    2016-06-01

    What are the structural determinants of protein sequence evolution? A number of site-specific structural characteristics have been proposed, most of which are broadly related to either the density of contacts or the solvent accessibility of individual residues. Most importantly, there has been disagreement in the literature over the relative importance of solvent accessibility and local packing density for explaining site-specific sequence variability in proteins. We show that this discussion has been confounded by the definition of local packing density. The most commonly used measures of local packing, such as contact number and the weighted contact number, represent the combined effects of local packing density and longer-range effects. As an alternative, we propose a truly local measure of packing density around a single residue, based on the Voronoi cell volume. We show that the Voronoi cell volume, when calculated relative to the geometric center of amino-acid side chains, behaves nearly identically to the relative solvent accessibility, and each individually can explain, on average, approximately 34% of the site-specific variation in evolutionary rate in a data set of 209 enzymes. An additional 10% of variation can be explained by nonlocal effects that are captured in the weighted contact number. Consequently, evolutionary variation at a site is determined by the combined effects of the immediate amino-acid neighbors of that site and effects mediated by more distant amino acids. We conclude that instead of contrasting solvent accessibility and local packing density, future research should emphasize the relative importance of immediate contacts and longer-range effects on evolutionary variation. Proteins 2016; 84:841-854. © 2016 Wiley Periodicals, Inc.

  10. An adaptive, object oriented strategy for base calling in DNA sequence analysis.

    Science.gov (United States)

    Giddings, M C; Brumley, R L; Haker, M; Smith, L M

    1993-01-01

    An algorithm has been developed for the determination of nucleotide sequence from data produced in fluorescence-based automated DNA sequencing instruments employing the four-color strategy. This algorithm takes advantage of object oriented programming techniques for modularity and extensibility. The algorithm is adaptive in that data sets from a wide variety of instruments and sequencing conditions can be used with good results. Confidence values are provided on the base calls as an estimate of accuracy. The algorithm iteratively employs confidence determinations from several different modules, each of which examines a different feature of the data for accurate peak identification. Modules within this system can be added or removed for increased performance or for application to a different task. In comparisons with commercial software, the algorithm performed well. PMID:8233787

  11. Mitochondrial DNA sequence-based phylogenetic relationship among flesh flies of the genus Sarcophaga (Sarcophagidae: Diptera)

    Indian Academy of Sciences (India)

    Neelam Bajpai; Raghav Ram Tewari

    2010-04-01

    The phylogenetic relationships among flesh flies of the family Sarcophagidae have been based mainly on the morphology of male genitalia. However, the male genitalic character-based relationships are far from satisfactory. Therefore, in the present study mitochondrial DNA has been used as a marker to unravel genetic relatedness and to construct a phylogeny among five sympatric species of the genus Sarcophaga. Two mitochondrial genes, viz. cytochrome oxidase subunit 1 (COI) and NAD dehydrogenase subunit 5 (ND5), were sequenced and genetic distance values were calculated on the basis of sequence differences in both mitochondrial genes. The data revealed very few genetic differences among the five species for the COI and ND5 gene sequences.

  12. Sequence Comparison Alignment-Free Approach Based on Suffix Tree and L-Words Frequency

    Directory of Open Access Journals (Sweden)

    Inês Soares

    2012-01-01

    The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by computing a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed, producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures with the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python and is freely available on the web.
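
    The distance step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it counts words with a plain dictionary instead of a generalized suffix tree, and it assumes a DNA alphabet and relative (not absolute) frequencies. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def lword_profile(seq, L, alphabet="ACGT"):
    """Relative frequency of every possible L-word over `alphabet` in `seq`."""
    seq = seq.upper()
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    words = ["".join(p) for p in product(alphabet, repeat=L)]
    # Windows containing symbols outside the alphabet (e.g. N) are ignored.
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def euclidean(p, q):
    """Standard Euclidean distance between two equal-length profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(seqs, L):
    """Symmetric matrix of pairwise distances between L-word profiles."""
    profiles = [lword_profile(s, L) for s in seqs]
    n = len(profiles)
    return [[euclidean(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
```

    The resulting matrix could then be passed to any neighbor joining or multidimensional scaling routine; the choice of L is left to the caller here, whereas the abstract describes determining a single optimal word length.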

  13. Comparison of serological and sequence-based methods for typing feline calicivirus isolates from vaccine failures.

    Science.gov (United States)

    Radford, A D; Dawson, S; Wharmby, C; Ryvar, R; Gaskell, R M

    2000-01-29

    Feline calicivirus (FCV) can be typed by exploiting antigenic differences between isolates or, more recently, by the sequence analysis of a hypervariable region of the virus's capsid gene. These two methods were used to characterise FCV isolates from 20 vaccine failures which occurred after the use of a commercial, live-attenuated vaccine. Using virus neutralisation, the isolates showed a spectrum of relatedness to the vaccine; depending on the criterion adopted for identity, 10 to 40 per cent of them appeared to be similar to the vaccine virus. Using sequence analysis, the isolates fell into one of two categories; 20 per cent had a similar sequence to the vaccine (0.67 to 2.67 per cent distant), and the remainder had a dissimilar sequence (21.3 to 36.0 per cent distant). Sequence analysis identified one cat that appeared to be infected with two distinct FCVs. The serological and sequence-based typing methods gave the same result in 80 to 95 per cent of individual cases, depending on the criterion adopted for serological identity. It is suggested that molecular typing is a more definitive method for characterising the relatedness of FCV isolates.

  14. Mapping Protein-DNA Interactions Using ChIP-exo and Illumina-Based Sequencing.

    Science.gov (United States)

    Barfeld, Stefan J; Mills, Ian G

    2016-01-01

    Chromatin immunoprecipitation (ChIP) provides a means of enriching DNA associated with transcription factors, histone modifications, and indeed any other proteins for which suitably characterized antibodies are available. Over the years, sequence detection has progressed from quantitative real-time PCR and Southern blotting to microarrays (ChIP-chip) and now high-throughput sequencing (ChIP-seq). This progression has vastly increased the sequence coverage and data volumes generated. This in turn has enabled informaticians to predict the identity of multi-protein complexes on DNA based on the overrepresentation of sequence motifs in DNA enriched by ChIP with a single antibody against a single protein. In the course of the development of high-throughput sequencing, little has changed in the ChIP methodology until recently. In the last three years, a number of modifications have been made to the ChIP protocol with the goal of enhancing the sensitivity of the method and further reducing the levels of nonspecific background sequences in ChIPped samples. In this chapter, we provide a brief commentary on these methodological changes and describe a detailed ChIP-exo method able to generate narrower peaks and greater peak coverage from ChIPped material.

  15. Structure-based identification of new high-affinity nucleosome binding sequences.

    Science.gov (United States)

    Battistini, Federica; Hunter, Christopher A; Moore, Irene K; Widom, Jonathan

    2012-06-29

    The substrate for the proteins that express genetic information in the cell is not naked DNA but an assembly of nucleosomes, where the DNA is wrapped around histone proteins. The organization of these nucleosomes on genomic DNA is influenced by the DNA sequence. Here, we present a structure-based computational approach that translates sequence information into the energy required to bend DNA into a nucleosome-bound conformation. The calculations establish the relationship between DNA sequence and histone octamer binding affinity. In silico selection using this model identified several new DNA sequences, which were experimentally found to have histone octamer affinities comparable to the highest-affinity sequences known. The results provide insights into the molecular mechanism through which DNA sequence information encodes its organization. A quantitative appreciation of the thermodynamics of nucleosome positioning and rearrangement will be one of the key factors in understanding the regulation of transcription and in the design of new promoter architectures for the purposes of tuning gene expression dynamics.

  16. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang

    2010-11-08

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution, and a large number of publicly available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge. Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences. Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms. Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.
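
    The two compositional parameters named in this abstract, GC content and purine content at each of the three codon positions, are simple to measure on an observed coding sequence. The sketch below shows only that measurement, not the two theoretical models themselves; the function name and the returned dictionary layout are assumptions made for illustration.

```python
def codon_position_contents(cds):
    """GC and purine (A+G) fractions at codon positions 1, 2 and 3.

    `cds` is assumed to be an in-frame DNA coding sequence; trailing bases
    that do not complete a codon are ignored, as are non-ACGT symbols.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    result = []
    for pos in range(3):
        bases = [b for b in cds[pos:usable:3] if b in "ACGT"]
        n = len(bases)
        gc = sum(b in "GC" for b in bases) / n if n else 0.0
        purine = sum(b in "AG" for b in bases) / n if n else 0.0
        result.append({"gc": gc, "purine": purine})
    return result
```

    For the toy sequence "ATGGCC" (codons ATG and GCC), the first codon position holds A and G, so its GC fraction is 0.5 and its purine fraction is 1.0.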

  17. DIALIGN-T: An improved algorithm for segment-based multiple sequence alignment

    Directory of Open Access Journals (Sweden)

    Kaufmann Michael

    2005-03-01

    Background: We present a complete re-implementation of the segment-based approach to multiple protein alignment that contains a number of improvements compared to the previous version 2.2 of DIALIGN. This previous version is superior to Needleman-Wunsch-based multi-alignment programs on locally related sequence sets. However, it is often outperformed by these methods on data sets with global but weak similarity at the primary-sequence level. Results: In the present paper, we discuss strengths and weaknesses of DIALIGN in view of the underlying objective function. Based on these results, we propose several heuristics to improve the segment-based alignment approach. For pairwise alignment, we implemented a fragment-chaining algorithm that favours chains of low-scoring local alignments over isolated high-scoring fragments. For multiple alignment, we use an improved greedy procedure that is less sensitive to spurious local sequence similarities. To evaluate our method on globally related protein families, we used the well-known database BAliBASE. For benchmarking tests on locally related sequences, we created a new reference database called IRMBASE which consists of simulated conserved motifs implanted into non-related random sequences. Conclusion: On BAliBASE, our new program performs significantly better than the previous version of DIALIGN and is comparable to the standard global aligner CLUSTAL W, though it is outperformed by some newly developed programs that focus on global alignment. On the locally related test sets in IRMBASE, our method outperforms all other programs that we evaluated.

  18. Multipacket Reception in Wireless Ad Hoc Networks Based on CDMA and Polynomial Phase-modulating Sequences

    Institute of Scientific and Technical Information of China (English)

    ZHANG Ji-dong; Wang Bao-yun; ZHENG Bao-yu

    2004-01-01

    Based on the polynomial phase-modulating sequences (PPS) algorithm, this paper presents two schemes that apply CDMA with polynomial phase signals to improve signal separation performance. Simulation results show that, compared with the PPS algorithm, the proposed approach improves the signal-to-interference-and-noise ratio by 1 to 3 dB in most environments.

  19. Teaching Research Methodology Using a Project-Based Three Course Sequence Critical Reflections on Practice

    Science.gov (United States)

    Braguglia, Kay H.; Jackson, Kanata A.

    2012-01-01

    This article presents a reflective analysis of teaching research methodology through a three course sequence using a project-based approach. The authors reflect critically on their experiences in teaching research methods courses in an undergraduate business management program. The introduction of a range of specific techniques including student…

  20. Phylogeny of Anophelinae (Diptera: Culicidae) Based on Nuclear Ribosomal and Mitochondrial DNA Sequences

    Science.gov (United States)

    2002-01-01

    combining data from nuclear protein-encoding genes for phylogenetic analyses of Noctuoidea (Insecta: Lepidoptera). Systematic Biology, 49, 202-224. Nixon... Systematic Entomology (2002) 27, 361-382 Phylogeny of Anophelinae (Diptera: Culicidae) based on nuclear ribosomal and mitochondrial DNA sequences...Entomologia Médica, Departamento de Epidemiologia, Faculdade de Saúde Pública, Universidade de São Paulo, Brazil, Department of Systematic

  1. Magnetism Teaching Sequences Based on an Inductive Approach for First-Year Thai University Science Students

    Science.gov (United States)

    Narjaikaew, Pattawan; Emarat, Narumon; Arayathanitkul, Kwan; Cowie, Bronwen

    2010-01-01

    The study investigated the impact on student motivation and understanding of magnetism of teaching sequences based on an inductive approach. The study was conducted in large lecture classes. A pre- and post-Conceptual Survey of Electricity and Magnetism was conducted with just fewer than 700 Thai undergraduate science students, before and after…

  2. Reproducible analysis of sequencing-based RNA structure probing data with user-friendly tools

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz Jan; Sidiropoulos, Nikos; Vinther, Jeppe

    2015-01-01

    coordinates and vice versa. The collection is implemented as functions in the R statistical environment and as tools in the Galaxy platform, making them easily accessible for the scientific community. We demonstrate the usefulness of the collection by applying it to the analysis of sequencing-based hydroxyl...

  3. Nucleic acid sequence-based amplification with oligochromatography for detection of Trypanosoma brucei in clinical samples

    NARCIS (Netherlands)

    C.M. Mugasa; T. Laurent; G.J. Schoone; P.A. Kager; G.W. Lubega; H.D.F.H. Schallig

    2009-01-01

    Molecular tools, such as real-time nucleic acid sequence-based amplification (NASBA) and PCR, have been developed to detect Trypanosoma brucei parasites in blood for the diagnosis of human African trypanosomiasis (HAT). Despite good sensitivity, these techniques are not implemented in HAT control pr

  4. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete event, DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective the laboratory grow.

  5. CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences

    Directory of Open Access Journals (Sweden)

    Spraggins Thomas A

    2007-04-01

    Background: Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important food and forage legumes in the semi-arid tropics because of its ability to tolerate drought and grow on poor soils. It is cultivated mostly by poor farmers in developing countries, with 80% of production taking place in the dry savannah of tropical West and Central Africa. Cowpea is largely an underexploited crop with relatively little genomic information available for use in applied plant breeding. The goal of the Cowpea Genomics Initiative (CGI), funded by the Kirkhouse Trust, a UK-based charitable organization, is to leverage modern molecular genetic tools for gene discovery and cowpea improvement. One aspect of the initiative is the sequencing of the gene-rich region of the cowpea genome (termed the genespace) recovered using methylation filtration technology and providing annotation and analysis of the sequence data. Description: CGKB, Cowpea Genespace/Genomics Knowledge Base, is an annotation knowledge base developed under the CGI. The database is based on information derived from 298,848 cowpea genespace sequences (GSS) isolated by methylation filtering of genomic DNA. The CGKB consists of three knowledge bases: GSS annotation and comparative genomics knowledge base, GSS enzyme and metabolic pathway knowledge base, and GSS simple sequence repeats (SSRs) knowledge base for molecular marker discovery. A homology-based approach was applied for annotations of the GSS, mainly using BLASTX against four public FASTA formatted protein databases (NCBI GenBank Proteins, UniProtKB-Swiss-Prot, UniProtKB-PIR (Protein Information Resource), and UniProtKB-TrEMBL). Comparative genome analysis was done by BLASTX searches of the cowpea GSS against four plant proteomes from Arabidopsis thaliana, Oryza sativa, Medicago truncatula, and Populus trichocarpa. The possible exons and introns on each cowpea GSS were predicted using the HMM-based Genscan gene prediction program and the

  6. Multivariate Statistical Process Control and Case-Based Reasoning for situation assessment of Sequencing Batch Reactors

    OpenAIRE

    Ruiz Ordóñez, Magda Liliana

    2008-01-01

    Abstract: This thesis focuses on the monitoring, fault detection and diagnosis of Wastewater Treatment Plants (WWTP), which are important fields of research for a wide range of engineering disciplines. The main objective is to evaluate and apply a novel artificial intelligence methodology based on situation assessment for monitoring and diagnosis of Sequencing Batch Reactor (SBR) operation. To this end, Multivariate Statistical Process Control (MSPC) in combination with Case-Based Reasoning (CBR)...

  7. Analysis Of Segmental Duplications In The Pig Genome Based On Next-Generation Sequencing

    DEFF Research Database (Denmark)

    Fadista, João; Bendixen, Christian

    extensively studied in other organisms, its analysis in pig has been hampered by the lack of a complete pig genome assembly. By measuring the depth of coverage of Illumina whole-genome shotgun sequencing reads of the Tabasco animal aligned to the latest pig genome assembly (Sus scrofa 10 – based also...... on Tabasco), led us to the detection of a high-resolution map of segmental duplications in the pig genome. Comparing these segments with four other Duroc animals sequenced at our institute, supplied the resources needed to describe the first genome-wide and systematic analysis of segmental duplications...

  8. Face recognition based on matching of local features on 3D dynamic range sequences

    Science.gov (United States)

    Echeagaray-Patrón, B. A.; Kober, Vitaly

    2016-09-01

    3D face recognition has attracted attention in the last decade due to improvement of technology of 3D image acquisition and its wide range of applications such as access control, surveillance, human-computer interaction and biometric identification systems. Most research on 3D face recognition has focused on analysis of 3D still data. In this work, a new method for face recognition using dynamic 3D range sequences is proposed. Experimental results are presented and discussed using 3D sequences in the presence of pose variation. The performance of the proposed method is compared with that of conventional face recognition algorithms based on descriptors.

  9. Protection algorithm for a wind turbine generator based on positive- and negative-sequence fault components

    DEFF Research Database (Denmark)

    Zheng, Tai-Ying; Cha, Seung-Tae; Crossley, Peter A.;

    2011-01-01

    A protection relay for a wind turbine generator (WTG) based on positive- and negative-sequence fault components is proposed in the paper. The relay uses the magnitude of the positive-sequence component in the fault current to detect a fault on a parallel WTG, connected to the same power collection...... feeder, or a fault on an adjacent feeder; but for these faults, the relay remains stable and inoperative. A fault on the power collection feeder or a fault on the collection bus, both of which require an instantaneous tripping response, are distinguished from an inter-tie fault or a grid fault, which...

  10. Phylogenetic analysis of the pathogenic bacteria Spiroplasma penaei based on multilocus sequence analysis.

    Science.gov (United States)

    Heres, Allan; Lightner, Donald V

    2010-01-01

    A pathogenic Spiroplasma penaei strain was isolated from the hemolymph of moribund Pacific white shrimp, Penaeus vannamei. The shrimp sample originated from a shrimp farm near Cartagena, Colombia, that was suffering from high mortalities in ponds with very low salinity and high temperatures. This new emerging disease in a marine crustacean in the Americas is described as a systemic infection. The multilocus phylogenetic analysis suggests that the S. penaei strain has a terrestrial origin. Evolutionary relationship trees, based on five partial DNA sequences of the 16S rDNA, 23S rDNA, 5S rDNA, gyrB, and rpoB genes and two complete DNA sequences of the 16S-23S rDNA and 23S-5S rDNA intergenic spacer regions, were reconstructed using the distance-based Neighbor-Joining (NJ) method with the Kimura 2-parameter substitution model. The NJ trees based on all DNA sequences investigated in this study positioned S. penaei in the Citri-Poulsonii clade and corroborate the observations by other investigators using the 16S rDNA gene. Pairwise genetic distance calculation between sequences of spiroplasmas showed S. penaei to be closely related to Spiroplasma insolitum and distantly related to Spiroplasma sp. SHRIMP from China.
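
    For readers unfamiliar with the Kimura 2-parameter model mentioned above, the pairwise distance it yields is d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The sketch below computes it for two pre-aligned sequences; it is an illustration of the standard formula, not the software used in the study, and the choice to skip gap or ambiguous columns is an assumption.

```python
from math import log, sqrt

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    x, y = 1 - 2 * p - q, 1 - 2 * q
    if x <= 0 or y <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * log(x * sqrt(y))
```

    A matrix of such distances over all sequence pairs is the usual input to a neighbor-joining tree builder.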

  11. DNAskew: Statistical Analysis of Base Compositional Asymmetry and Prediction of Replication Boundaries in the Genome Sequences

    Institute of Scientific and Technical Information of China (English)

    Xiang-RuMA; Shao-BoXIAO; Ai-ZhenGUO; Jian-QiangLUE; Huan-ChunCHEN

    2004-01-01

    Sueoka and Lobry each stated that, in the absence of bias between the two DNA strands for mutation and selection, the base composition within each strand should be A=T and C=G (this state is called Parity Rule type 2, PR2). However, the genome sequences of many bacteria, vertebrates and viruses show asymmetries in base composition and gene direction. To determine the relationship of base composition skews with replication orientation, gene function, codon usage biases and phylogenetic evolution, in this paper a program called DNAskew was developed for the statistical analysis of strand asymmetry and codon composition bias in DNA sequences. In addition, the program can also be used to predict the replication boundaries of genome sequences. The method builds on the fact that there are compositional asymmetries between the leading and the lagging strand for replication. DNAskew was written in Perl and implemented on the Linux operating system. It works quickly with annotated or unannotated sequences in GBFF (GenBank flatfile) or FASTA format. The source code is freely available for academic use at http://www.epizooty.com/pub/stat/DNAskew.
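
    One common way to quantify the strand asymmetry discussed above is the GC skew, (G - C)/(G + C), computed in windows along the sequence; the extrema of its running sum are often taken as candidate replication boundaries. The sketch below illustrates that general idea in Python. It is not the DNAskew program, which is written in Perl and whose exact statistics are not given in this abstract; the function names and window handling are assumptions.

```python
def gc_skew_windows(seq, window):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def candidate_boundaries(seq, window):
    """Positions where the cumulative GC skew reaches its minimum and maximum.

    Returns (min_position, max_position), each measured as the end
    coordinate of the extremal window.
    """
    skews = gc_skew_windows(seq, window)
    if not skews:
        raise ValueError("sequence shorter than one window")
    cumulative, total = [], 0.0
    for s in skews:
        total += s
        cumulative.append(total)
    i_min = min(range(len(cumulative)), key=cumulative.__getitem__)
    i_max = max(range(len(cumulative)), key=cumulative.__getitem__)
    return (i_min + 1) * window, (i_max + 1) * window
```

    On a toy sequence whose first half is C-rich and second half G-rich, the cumulative skew falls and then rises, so the minimum marks the switch point between the two halves.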

  12. A pedigree-based study of mitochondrial D-loop DNA sequence variation among Arabian horses.

    Science.gov (United States)

    Bowling, A T; Del Valle, A; Bowling, M

    2000-02-01

    Through DNA sequence comparisons of a mitochondrial D-loop hypervariable region, we investigated matrilineal diversity for Arabian horses in the United States. Sixty-two horses were tested. From published pedigrees they traced in the maternal line to 34 mares acquired primarily in the mid to late 19th century from nomadic Bedouin tribes. Compared with the reference sequence (GenBank X79547), these samples showed 27 haplotypes with altogether 31 base substitution sites within 397 bp of sequence. Based on examination of pedigrees from a random sampling of 200 horses in current studbooks of the Arabian Horse Registry of America, we estimated that this study defined the expected mtDNA haplotypes for at least 89% of Arabian horses registered in the US. The reliability of the studbook recorded maternal lineages of Arabian pedigrees was demonstrated by haplotype concordance among multiple samplings in 14 lines. Single base differences observed within two maternal lines were interpreted as representing alternative fixations of past heteroplasmy. The study also demonstrated the utility of mtDNA sequence studies to resolve historical maternity questions without access to biological material from the horses whose relationship was in question, provided that representatives of the relevant female lines were available for comparison. The data call into question the traditional assumption that Arabian horses of the same strain necessarily share a common maternal ancestry.

  13. Effects of Aftershock Declustering in Risk Modeling: Case Study of a Subduction Sequence in Mexico

    Science.gov (United States)

    Kane, D. L.; Nyst, M.

    2014-12-01

    Earthquake hazard and risk models often assume that earthquake rates can be represented by a stationary Poisson process, and that aftershocks observed in historical seismicity catalogs represent a deviation from stationarity that must be corrected before earthquake rates are estimated. Algorithms for classifying individual earthquakes as independent mainshocks or as aftershocks vary widely, and analysis of a single catalog can produce considerably different earthquake rates depending on the declustering method implemented. As these rates are propagated through hazard and risk models, the modeled results will vary due to the assumptions implied by these choices. In particular, the removal of large aftershocks following a mainshock may lead to an underestimation of the rate of damaging earthquakes and potential damage due to a large aftershock may be excluded from the model. We present a case study based on the 1907 - 1911 sequence of nine 6.9 Mexico in order to illustrate the variability in risk under various declustering approaches. Previous studies have suggested that subduction zone earthquakes in Mexico tend to occur in clusters, and this particular sequence includes events that would be labeled as aftershocks in some declustering approaches yet are large enough to produce significant damage. We model the ground motion for each event, determine damage ratios using modern exposure data, and then compare the variability in the modeled damage from using the full catalog or one of several declustered catalogs containing only "independent" events. We also consider the effects of progressive damage caused by each subsequent event and how this might increase or decrease the total losses expected from this sequence.

  14. A mapping of an ensemble of mitochondrial sequences for various organisms into 3D space based on the word composition.

    Science.gov (United States)

    Aita, Takuyo; Nishigaki, Koichi

    2012-11-01

    To visualize a bird's-eye view of an ensemble of mitochondrial genome sequences for various species, we recently developed a novel method of mapping a biological sequence ensemble into Three-Dimensional (3D) vector space. First, we represented a biological sequence of a species s by a word-composition vector x(s), where its length |x(s)| represents the sequence length, and its unit vector x(s)/|x(s)| represents the relative composition of the K-tuple words through the sequence and the size of the dimension, N = 4^K, is the number of all possible words with the length of K. Second, we mapped the vector x(s) to the 3D position vector y(s), based on the two following simple principles: (1) |y(s)| = |x(s)| and (2) the angle between y(s) and y(t) maximally correlates with the angle between x(s) and x(t). The mitochondrial genome sequences for 311 species, including 177 Animalia, 85 Fungi and 49 Green plants, were mapped into 3D space by using K=7. The mapping was successful because the angles between vectors before and after the mapping highly correlated with each other (correlation coefficients were 0.92-0.97). Interestingly, the Animalia kingdom is distributed along a single arc belt (just like the Milky Way on a Celestial Globe), and the Fungi and Green plant kingdoms are distributed in a similar arc belt. These two arc belts intersect at their respective middle regions and form a cross structure just like a jet aircraft fuselage and its wings. This new mapping method will allow researchers to intuitively interpret the visual information presented in the maps in a highly effective manner.

  15. Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions and MapReduce.

    Science.gov (United States)

    Yehdego, Daniel T; Zhang, Boyu; Kodimala, Vikram K R; Johnson, Kyle L; Taufer, Michela; Leung, Ming-Ying

    2013-05-01

    Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized methods. Each step of searching for inversions, chunking, and predictions can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structures.
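
    The notion of an inversion used in this abstract, a stretch of nucleotides followed closely by its inverse complementary sequence, can be made concrete with a brute-force search. The sketch below is only an illustration of that definition: it is not the authors' excursion-based scoring or their centered and optimized cutting methods, it runs serially rather than under MapReduce, and it assumes strict Watson-Crick pairing (no G-U wobble). The function names and the default minimum gap of 3 are hypothetical choices.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Inverse complementary sequence under A-U and G-C pairing."""
    return rna.translate(COMPLEMENT)[::-1]


def find_inversions(rna, stem_len, max_gap, min_gap=3):
    """Return (start, gap) pairs where rna[start:start+stem_len] is followed,
    after `gap` unpaired nucleotides, by its reverse complement."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    hits = []
    for i in range(len(rna) - 2 * stem_len - min_gap + 1):
        target = reverse_complement(rna[i:i + stem_len])
        for gap in range(min_gap, max_gap + 1):
            j = i + stem_len + gap
            if j + stem_len > len(rna):
                break  # right arm would run past the end of the sequence
            if rna[j:j + stem_len] == target:
                hits.append((i, gap))
    return hits
```

    Because each starting position is examined independently, a search of this kind splits naturally across workers, which is consistent with the abstract's remark that the inversion search can be performed in parallel.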

  16. Phylogenetic analysis of Demodex caprae based on mitochondrial 16S rDNA sequence.

    Science.gov (United States)

    Zhao, Ya-E; Hu, Li; Ma, Jun-Xian

    2013-11-01

    Demodex caprae infests the hair follicles and sebaceous glands of goats worldwide, which not only seriously impairs goat farming but also causes substantial economic loss. However, there are few reports on D. caprae at the DNA level. To reveal the taxonomic position of D. caprae within the genus Demodex, the present study conducted phylogenetic analysis of D. caprae based on mt16S rDNA sequence data. D. caprae adults and eggs were obtained from a skin nodule of a goat suffering from demodicidosis. The mt16S rDNA sequences of individual mites were amplified using specific primers, and then cloned, sequenced, and aligned. The sequence divergence, genetic distance, and transition/transversion rate were computed, and the phylogenetic trees in Demodex were reconstructed. Partial sequences of 339 bp were obtained from six D. caprae isolates, and the sequence identity was 100% among isolates. The pairwise divergences between D. caprae and Demodex canis, Demodex folliculorum, and Demodex brevis were 22.2-24.0%, 24.0-24.9%, and 22.9-23.2%, respectively. The corresponding average genetic distances were 2.840, 2.926, and 2.665, and the average transition/transversion rates were 0.70, 0.55, and 0.54, respectively. The divergences, genetic distances, and transition/transversion rates of D. caprae versus the other three species all reached interspecies level. All five phylogenetic trees showed that D. caprae clustered with D. brevis first, and then with D. canis, D. folliculorum, and Demodex injai in sequence. In conclusion, D. caprae is an independent species, and it is closer to D. brevis than to D. canis, D. folliculorum, or D. injai.
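
    As an illustration of the quantities reported above, the following sketch computes an uncorrected pairwise divergence (p-distance) and a transition/transversion ratio for two aligned DNA sequences. It assumes a gap-aware pairwise alignment is already available; the function name is hypothetical, and the model-based genetic distances quoted in the abstract are not reproduced by this simple count.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compare_aligned(seq_a, seq_b):
    """Return (divergence, transitions, transversions, ts/tv ratio)
    for two aligned DNA sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A substitution within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    divergence = (transitions + transversions) / compared if compared else 0.0
    ratio = transitions / transversions if transversions else None
    return divergence, transitions, transversions, ratio
```

    For example, `compare_aligned("ACGTACGT", "GCGTTCAT")` counts two transitions and one transversion over eight compared sites.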

  17. MethylC-seq library preparation for base-resolution whole-genome bisulfite sequencing.

    Science.gov (United States)

    Urich, Mark A; Nery, Joseph R; Lister, Ryan; Schmitz, Robert J; Ecker, Joseph R

    2015-03-01

    Current high-throughput DNA sequencing technologies enable acquisition of billions of data points through which myriad biological processes can be interrogated, including genetic variation, chromatin structure, gene expression patterns, small RNAs and protein-DNA interactions. Here we describe the MethylC-sequencing (MethylC-seq) library preparation method, a 2-d protocol that enables the genome-wide identification of cytosine DNA methylation states at single-base resolution. The technique involves fragmentation of genomic DNA followed by adapter ligation, bisulfite conversion and limited amplification using adapter-specific PCR primers in preparation for sequencing. To date, this protocol has been successfully applied to genomic DNA isolated from primary cell culture, sorted cells and fresh tissue from over a thousand plant and animal samples.
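
    The single-base readout that bisulfite conversion makes possible can be sketched as follows: unmethylated cytosines are converted and read as T, while methylated cytosines are protected and still read as C. The sketch assumes a forward-strand read already aligned, without gaps, to a reference of the same length; the function name and labels are hypothetical, and real pipelines also handle the reverse strand, coverage and conversion efficiency.

```python
def call_methylation(reference, read):
    """Map each reference cytosine position to a methylation call
    based on the base observed in a bisulfite-converted read."""
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue  # only cytosines carry the methylation signal here
        if read_base == "C":
            calls[pos] = "methylated"      # protected from conversion
        elif read_base == "T":
            calls[pos] = "unmethylated"    # converted, read as T
        else:
            calls[pos] = "uninformative"   # sequencing error or variant
    return calls
```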

  18. Two-Phase and Family-Based Designs for Next-Generation Sequencing Studies

    Directory of Open Access Journals (Sweden)

    Duncan C Thomas

    2013-12-01

    The cost of next-generation sequencing is now approaching that of early GWAS panels, but is still out of reach for large epidemiologic studies, and the millions of rare variants expected pose challenges for distinguishing causal from non-causal variants. We review two types of designs for sequencing studies: two-phase designs for targeted follow-up of genomewide association studies using unrelated individuals; and family-based designs exploiting co-segregation for prioritizing variants and genes. Two-phase designs subsample subjects for sequencing from a larger case-control study jointly on the basis of their disease and carrier status; the discovered variants are then tested for association in the parent study. The analysis combines the full sequence data from the substudy with the more limited SNP data from the main study. We discuss various methods for selecting this subset of variants and describe the expected yield of true positive associations in the context of an on-going study of second breast cancers following radiotherapy. While the sharing of variants within families means that family-based designs are less efficient for discovery than sequencing unrelated individuals, the ability to exploit co-segregation of variants with disease within families helps distinguish causal from non-causal ones. Furthermore, by enriching for family history, the yield of causal variants can be improved and use of identity-by-descent information improves imputation of genotypes for other family members. We compare the relative efficiency of these designs with those using unrelated individuals for discovering and prioritizing variants or genes for testing association in larger studies. While associations can be tested with single variants, power is low for rare ones. Recent generalizations of burden or kernel tests for gene-level associations to family-based data are appealing. These approaches are illustrated in the context of a family-based study of

  19. Characterization of Black and Dichothrix Cyanobacteria Based on the 16S Ribosomal RNA Gene Sequence

    Science.gov (United States)

    Ortega, Maya

    2010-01-01

    My project focuses on characterizing different cyanobacteria in thrombolitic mats found on the island of Highborn Cay, Bahamas. Thrombolites are interesting ecosystems because of the ability of bacteria in these mats to remove carbon dioxide from the atmosphere and mineralize it as calcium carbonate. In the future they may be used as models to develop carbon sequestration technologies, which could be used as part of regenerative life systems in space. These thrombolitic communities are also significant because of their similarities to early communities of life on Earth. I targeted two cyanobacteria in my research, Dichothrix spp. and whatever black is, since they are believed to be important to carbon sequestration in these thrombolitic mats. The goal of my summer research project was to molecularly identify these two cyanobacteria. DNA was isolated from each organism through mat dissections and DNA extractions. I ran Polymerase Chain Reactions (PCR) to amplify the 16S ribosomal RNA (rRNA) gene in each cyanobacteria. This specific gene is found in almost all bacteria and is highly conserved, meaning any changes in the sequence are most likely due to evolution. As a result, the 16S rRNA gene can be used for bacterial identification of different species based on the sequence of their 16S rRNA gene. Since the exact sequence of the Dichothrix gene was unknown, I designed different primers that flanked the gene based on the known sequences from other taxonomically similar cyanobacteria. Once the 16S rRNA gene was amplified, I cloned the gene into specialized Escherichia coli cells and sent the gene products for sequencing. Once the sequence is obtained, it will be added to a genetic database for future reference to and classification of other Dichothrix sp.

  20. Experimental demonstration of tunable multiple optical orthogonal codes sequences-based optical label for optical packets switching

    Science.gov (United States)

    Zhang, Chongfu; Qiu, Kun; Zhou, Heng; Ling, Yun; Wang, Yawei; Xu, Bo

    2010-03-01

    In this paper, the tunable multiple optical orthogonal codes sequences (MOOCS)-based optical label for optical packet switching (OPS) (MOOCS-OPS) is experimentally demonstrated for the first time. The tunable MOOCS-based optical label is implemented using a fiber Bragg grating (FBG)-based optical en/decoders group and optical switches configured by a Field Programmable Gate Array (FPGA), and the optical label is erased using a Semiconductor Optical Amplifier (SOA). Some waveforms of the MOOCS-based optical label and of the optical packet including the MOOCS-based optical label and the payloads are obtained, the switching control mechanism and the switching matrix are discussed, and the bit error rate (BER) performance of this system is also studied. These experimental results show that the tunable MOOCS-OPS scheme is effective.

  1. A Cost-Effective Approach to Sequence Hundreds of Complete Mitochondrial Genomes.

    Science.gov (United States)

    Nunez, Joaquin C B; Oleksiak, Marjorie F

    2016-01-01

    We present a cost-effective approach to sequence whole mitochondrial genomes for hundreds of individuals. Our approach uses small reaction volumes and unmodified (non-phosphorylated) barcoded adaptors to minimize reagent costs. We demonstrate our approach by sequencing 383 Fundulus sp. mitochondrial genomes (192 F. heteroclitus and 191 F. majalis). Prior to sequencing, we amplified the mitochondrial genomes using 4-5 custom-made, overlapping primer pairs, and sequencing was performed on an Illumina HiSeq 2500 platform. After removing low quality and short sequences, 2.9 million and 2.8 million reads were generated for F. heteroclitus and F. majalis respectively. Individual genomes were assembled for each species by mapping barcoded reads to a reference genome. For F. majalis, the reference genome was built de novo. On average, individual consensus sequences had high coverage: 61-fold for F. heteroclitus and 57-fold for F. majalis. The approach discussed in this paper is optimized for sequencing mitochondrial genomes on an Illumina platform. However, with the proper modifications, this approach could be easily applied to other small genomes and sequencing platforms.

  2. A Real-Time de novo DNA Sequencing Assembly Platform Based on an FPGA Implementation.

    Science.gov (United States)

    Hu, Yuanqi; Georgiou, Pantelis

    2016-01-01

    This paper presents an FPGA based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set which gradually increases during sequencing time, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA with successful DNA comparison when using data sets from real genomes.

  3. Aquifer Vulnerability Assessment Based on Sequence Stratigraphic and ³⁹Ar Transport Modeling.

    Science.gov (United States)

    Sonnenborg, Torben O; Scharling, Peter B; Hinsby, Klaus; Rasmussen, Erik S; Engesgaard, Peter

    2016-03-01

    A large-scale groundwater flow and transport model is developed for a deep-seated (100 to 300 m below ground surface) sedimentary aquifer system. The model is based on a three-dimensional (3D) hydrostratigraphic model, building on a sequence stratigraphic approach. The flow model is calibrated against observations of hydraulic head and stream discharge while the credibility of the transport model is evaluated against measurements of ³⁹Ar from deep wells using alternative parameterizations of dispersivity and effective porosity. The directly simulated 3D mean age distributions and vertical fluxes are used to visualize the two-dimensional (2D)/3D age and flux distribution along transects and at the top plane of individual aquifers. The simulation results are used to assess the vulnerability of the aquifer system that generally has been assumed to be protected by thick overlaying clayey units and therefore proposed as future reservoirs for drinking water supply. The results indicate that on a regional scale these deep-seated aquifers are not as protected from modern surface water contamination as expected because significant leakage to the deeper aquifers occurs. The complex distribution of local and intermediate groundwater flow systems controlled by the distribution of the river network as well as the topographical variation (Tóth 1963) provides the possibility for modern water to be found in even the deepest aquifers.

  4. Pigs in Sequence Space: A 0.66X Coverage Pig Genome Survey based on Shotgun Sequencing

    DEFF Research Database (Denmark)

    Wernersson, R; Schierup, Mikkel Heide; Jørgensen, Frank Grønlund;

    2005-01-01

    Background Comparative whole genome analysis of Mammalia can benefit from the addition of more species. The pig is an obvious choice due to its economic and medical importance as well as its evolutionary position in the artiodactyls. Results We have generated ~ 3.84 million shotgun sequences (0.6...... as the human branch, and the joint alignment of the shot-gun sequences to the human-mouse alignment offers a rapid way for the investigator to define specific regions for analysis and resequencing....

  5. Effects of priming goal pursuit on implicit sequence learning

    OpenAIRE

    Gamble, Katherine R.; Lee, Joanna M.; Howard, James H.; Howard, Darlene V.

    2014-01-01

    Implicit learning, the type of learning that occurs without intent to learn or awareness of what has been learned, has been thought to be insensitive to the effects of priming, but recent studies suggest this is not the case. One study found that learning in the Serial Reaction Time (SRT) task was improved by nonconscious goal pursuit, primed via a word search task (Eitam et al., 2008). In two studies, we used the goal priming word search task from Eitam et al., but with a different version o...

  6. Two valuation questions in one survey: Is it a recipe for sequencing and instrument context effects?

    Science.gov (United States)

    Giraud, K.L.; Loomis, J.B.; Johnson, R.L.

    1999-01-01

    Economic theory suggests that willingness to pay for two goods independently offered should remain unchanged when the survey instrument changes slightly. Four survey treatments consisting of a comprehensive good and a subset of that good were used. The surveys alternated in the question ordering and in the embedded good which accompanied the comprehensive good. We tested for sequencing and instrument context effects using both combined and split sample designs. In the combined sample case we found some evidence of sequencing effects in the data containing the first subset good. Likelihood ratio tests indicated that sequencing did not affect scale or location of parameters. In the test for instrument context effects, evidence was found indicating context does affect willingness to pay estimates.

  7. PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods

    Directory of Open Access Journals (Sweden)

    Francisco Alexandre P

    2012-05-01

    Background With the decrease of DNA sequencing costs, sequence-based typing methods are rapidly becoming the gold standard for epidemiological surveillance. These methods provide reproducible and comparable results needed for a global scale bacterial population analysis, while retaining their usefulness for local epidemiological surveys. Online databases that collect the generated allelic profiles and associated epidemiological data are available, but this wealth of data remains underused and is frequently poorly annotated since no user-friendly tool exists to analyze and explore it. Results PHYLOViZ is platform-independent Java software that allows the integrated analysis of sequence-based typing methods, including SNP data generated from whole genome sequence approaches, and associated epidemiological data. goeBURST and its Minimum Spanning Tree expansion are used for visualizing the possible evolutionary relationships between isolates. The results can be displayed as an annotated graph overlaying the query results of any other epidemiological data available. Conclusions PHYLOViZ is user-friendly software that allows the combined analysis of multiple data sources for microbial epidemiological and population studies. It is freely available at http://www.phyloviz.net.

  8. BAC-based sequencing of behaviorally-relevant genes in the prairie vole.

    Directory of Open Access Journals (Sweden)

    Lisa A McGraw

    The prairie vole (Microtus ochrogaster) is an important model organism for the study of social behavior, yet our ability to correlate genes and behavior in this species has been limited due to a lack of genetic and genomic resources. Here we report the BAC-based targeted sequencing of behaviorally-relevant genes and flanking regions in the prairie vole. A total of 6.4 Mb of non-redundant or haplotype-specific sequence assemblies were generated that span the partial or complete sequence of 21 behaviorally-relevant genes as well as an additional 55 flanking genes. Estimates of nucleotide diversity from 13 loci based on alignments of 1.7 Mb of haplotype-specific assemblies revealed an average pair-wise heterozygosity (8.4×10⁻³). Comparative analyses of the prairie vole proteins encoded by the behaviorally-relevant genes identified >100 substitutions specific to the prairie vole lineage. Finally, our sequencing data indicate that a duplication of the prairie vole AVPR1A locus likely originated from a recent segmental duplication spanning a minimum of 105 kb. In summary, the results of our study provide the genomic resources necessary for the molecular and genetic characterization of a high-priority set of candidate genes for regulating social behavior in the prairie vole.

  9. Shotgun metagenomic sequencing based microbial diversity assessment of Lasundra hot spring, India

    Directory of Open Access Journals (Sweden)

    Amit V. Mangrola

    2015-06-01

    This is the first report on the metagenomic approach for unveiling the microbial diversity of Lasundra hot spring, Gujarat State, India. High-throughput sequencing of community DNA was performed on an Ion Torrent PGM platform. The metagenome consisted of 606,867 sequences totaling 98,567,305 bp, with an average length of 162 bp and 46% G + C content. Metagenome sequence information is available at EBI under the EBI Metagenomic database with accession no. ERP009313. MG-RAST assisted community analysis revealed that 99.21% of sequences were of bacterial origin, 0.43% were assigned to eukaryotes and 0.11% to archaea. A total of 29 bacterial, 20 eukaryotic and 4 archaeal phyla were detected. Abundant genera were Bacillus (86.7%), Geobacillus (2.4%), Paenibacillus (1.0%), Clostridium (0.7%) and Listeria (0.5%), which together represent 91.52% of the metagenome. In functional analysis, Cluster of Orthologous Group (COG) based annotation revealed that 45.4% of sequences were metabolism-related and 19.6% fell into the poorly characterized group. The subsystem based annotation approach suggests that 14.0% of sequences were associated with carbohydrates, 7.0% with protein metabolism and 3.0% with various stress responses, together with the versatile presence of commercially useful traits.

  10. incaRNAfbinv: a web server for the fragment-based design of RNA sequences.

    Science.gov (United States)

    Drory Retwitzer, Matan; Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme; Barash, Danny

    2016-07-08

    In recent years, new methods for computational RNA design have been developed and applied to various problems in synthetic biology and nanotechnology. Lately, there is considerable interest in incorporating essential biological information when solving the inverse RNA folding problem. Correspondingly, RNAfbinv aims at including biologically meaningful constraints and is the only program to date that performs a fragment-based design of RNA sequences. In doing so it allows the design of sequences that do not necessarily exactly fold into the target, as long as the overall coarse-grained tree graph shape is preserved. Augmented by the weighted sampling algorithm of incaRNAtion, our web server called incaRNAfbinv implements the method devised in RNAfbinv and offers an interactive environment for the inverse folding of RNA using a fragment-based design approach. It takes as input: a target RNA secondary structure; optional sequence and motif constraints; optional target minimum free energy, neutrality and GC content. In addition to the design of synthetic regulatory sequences, it can be used as a pre-processing step for the detection of novel naturally occurring RNAs. The two complementary methodologies RNAfbinv and incaRNAtion are merged together and fully implemented in our web server incaRNAfbinv, available at http://www.cs.bgu.ac.il/incaRNAfbinv.

  11. Detection of methylation in promoter sequences by melting curve analysis-based semiquantitative real time PCR

    Directory of Open Access Journals (Sweden)

    Lázcoz Paula

    2008-02-01

    Background We present two melting curve analysis (MCA)-based semiquantitative real time PCR techniques to detect the promoter methylation status of genes. The first, MCA-MSP, follows the same principle as standard MSP but it is performed in a real time thermal cycler with results being visualized in a melting curve. The second, MCA-Meth, uses a single pair of primers designed with no CpGs in its sequence. These primers amplify both unmethylated and methylated sequences. In clinical applications the MSP technique has revolutionized methylation detection by simplifying the analysis to a PCR-based protocol. MCA-analysis based techniques may be able to further improve and simplify methylation analyses by reducing starting DNA amounts, by introducing an all-in-one tube reaction and by eliminating a final gel stage for visualization of the result. The current study aimed at investigating the feasibility of both MCA-MSP and MCA-Meth in the analysis of promoter methylation, and at defining potential advantages and shortcomings in comparison to currently implemented techniques, i.e. bisulfite sequencing and standard MSP. Methods The promoters of the RASSF1A (3p21.3), BLU (3p21.3) and MGMT (10q26) genes were analyzed by MCA-MSP and MCA-Meth in 13 astrocytoma samples, 6 high grade glioma cell lines and 4 neuroblastoma cell lines. The data were compared with standard MSP and validated by bisulfite sequencing. Results Both MCA-MSP and MCA-Meth successfully determined promoter methylation. MCA-MSP provided information similar to standard MSP analyses. However, the analysis was possible in a single tube and avoided the gel stage. MCA-Meth proved to be useful in samples with intermediate methylation status, reflected by a melting curve position shift in dependence on methylation extent. Conclusion We propose MCA-MSP and MCA-Meth as alternative or supplementary techniques to MSP or bisulfite sequencing.

  12. Sequence effect in Parkinson’s disease is related to motor energetic cost

    Directory of Open Access Journals (Sweden)

    Sule Tinaz

    2016-05-01

    Bradykinesia is the most disabling motor symptom of Parkinson’s disease (PD). The sequence effect, a feature of bradykinesia, refers to the rapid decrement in amplitude and speed of repetitive movements (e.g., gait, handwriting) and is a major cause of morbidity in PD. Previous research has revealed mixed results regarding the role of dopaminergic treatment in the sequence effect. However, external cueing has been shown to improve it. In this study, we aimed to characterize the sequence effect systematically and relate this phenomenon to the energetic cost of movement within the context of the cost-benefit framework of motor control. We used a dynamic isometric motor task with auditory pacing to assess the sequence effect in motor output during a 15 s task segment in PD patients and matched controls. All participants performed the task with both hands, and without and with visual feedback. Patients were also tested in on- and off-dopaminergic states. Patients in the off state did not show a higher sequence effect compared to controls, partly due to large variance in their performance. However, patients in the on state and in the absence of visual feedback showed a significantly higher sequence effect compared to controls. Patients expended higher total motor energy compared to controls in all conditions and regardless of their medication status. In this experimental situation, the sequence effect in PD is associated with the cumulative energetic cost of movement. Dopaminergic treatment, critical for internal triggering of movement, fails to maintain the motor vigor across responses. The high motor cost may be related to failure to incorporate limbic/motivational cues into the motor plan. Visual feedback may facilitate performance by shifting the driving of movement from internal to external, or, alternatively, by functioning as a motivational cue.

  13. Enhancing learning through optimal sequencing of web-based and manikin simulators to teach shock physiology in the medical curriculum.

    Science.gov (United States)

    Cendan, Juan C; Johnson, Teresa R

    2011-12-01

    The Association of American Medical Colleges has encouraged educators to investigate proper linkage of simulation experiences with medical curricula. The authors aimed to determine if student knowledge and satisfaction differ between participation in web-based and manikin simulations for learning shock physiology and treatment and to determine if a specific training sequencing had a differential effect on learning. All 40 second-year medical students participated in a randomized, counterbalanced study with two interventions: group 1 (n = 20) participated in a web-based simulation followed by a manikin simulation and group 2 (n = 20) participated in reverse order. Knowledge and attitudes were documented. Mixed-model ANOVA indicated a significant main effect of time (F(1,38) = 18.6, P learning when web-based simulation precedes manikin use. This finding warrants further study.

  14. Pseudorandom Bit Sequence Generator for Stream Cipher Based on Elliptic Curves

    Directory of Open Access Journals (Sweden)

    Jilna Payingat

    2015-01-01

    This paper proposes a pseudorandom sequence generator for stream ciphers based on elliptic curves (EC). A detailed analysis of various EC-based random number generators available in the literature is done and a new method is proposed such that it addresses the drawbacks of these schemes. Statistical analysis of the proposed method is carried out using the NIST (National Institute of Standards and Technology) test suite and it is seen that the sequence exhibits good randomness properties. The linear complexity analysis shows that the system has a linear complexity equal to the period of the sequence, which is highly desirable. The statistical complexity and security against known-plaintext attack are also analysed. A comparison of the proposed method with other EC-based schemes is done in terms of throughput, periodicity, and security, and the proposed method outperforms the methods in the literature. For resource-constrained applications where a highly secure key exchange is essential, the proposed method provides a good option for encryption by time sharing the point multiplication unit for EC-based key exchange. The algorithm and architecture for implementation are developed in such a way that the hardware consumed in addition to the point multiplication unit is much less.
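
    The linear complexity mentioned above is the length of the shortest linear feedback shift register (LFSR) that reproduces a bit sequence, and it is commonly computed with the Berlekamp-Massey algorithm. The sketch below is a generic textbook version over GF(2), not the authors' implementation or their EC-based generator; the function name is hypothetical.

```python
def linear_complexity(bits):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR
    that generates the given sequence of 0/1 integers."""
    n = len(bits)
    if n == 0:
        return 0
    c = [0] * (n + 1)  # current connection polynomial
    b = [0] * (n + 1)  # connection polynomial before the last length change
    c[0] = b[0] = 1
    length, m = 0, -1
    for i in range(n):
        # Discrepancy between the observed bit and the LFSR prediction.
        d = bits[i]
        for j in range(1, length + 1):
            d ^= c[j] & bits[i - j]
        if d == 0:
            continue
        previous = c[:]
        shift = i - m
        for j in range(len(c) - shift):
            c[j + shift] ^= b[j]
        if 2 * length <= i:
            length = i + 1 - length
            m = i
            b = previous
    return length
```

    For a keystream with good properties the value returned for a full period should be large; a short LFSR-generated sequence, by contrast, yields a small value.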

  15. Sequence-specific DNA interactions with calixarene-based Langmuir monolayers.

    Science.gov (United States)

    Rullaud, Vanessa; Moridi, Negar; Shahgaldian, Patrick

    2014-07-29

    The interactions of an amphiphilic calixarene, namely p-guanidino-dodecyloxy-calix[4]arene, 1, self-assembled as Langmuir monolayers, with short double stranded DNA, were investigated by surface pressure-area (π-A) isotherms, surface ellipsometry and Brewster angle microscopy (BAM). Three DNA 30mers were used as models: poly(AT), poly(GC) and a random DNA sequence with 50% of G:C base pairs. The interactions of these model DNA duplexes with 1-based Langmuir monolayers were studied by measuring compression isotherms using increasing DNA concentrations (10⁻⁶, 10⁻⁵, 10⁻⁴, and 5 × 10⁻⁴ g L⁻¹) in the aqueous subphase. The isotherms of 1 showed an expansion of the monolayer with, interestingly, significant differences depending on the duplex DNA sequence studied. Indeed, the interactions of 1-based monolayers with poly(AT) led to an expansion of the monolayer that was significantly more pronounced than for monolayers on subphases of poly(GC) and the random DNA sequence. The structure and thickness of 1-based Langmuir monolayers were investigated by BAM and surface ellipsometry, which showed differences in thickness and structure between a monolayer formed on pure water or on a DNA subphase, with here again relevant dissimilarities depending on the DNA composition.

  16. Base-level Change and Sequence Stratigraphy of Lishu Fault Lacustrine Basin

    Institute of Scientific and Technical Information of China (English)

    Wang Simin; Liu Zhaojun; Liu Kui

    2000-01-01

    Base-level is a kind of surface which controls sedimentation and erosion. It can therefore be concluded that base-level change controls the formation and internal structure of a sequence. A single cycle of base-level change can generate four sets of different stacking patterns: two of aggradation, one of progradation and one of retrogradation, which affect the features of the internal structure of a sequence. The Lishu fault subsidence of the Songliao basin is a typical half-graben lacustrine basin. Comprehensive base-level change analysis indicates that six base-level cycles and their six related sequences can be recognized between the T4 and T5 seismic reflection surfaces. The contemporaneous fault is the main controlling factor of the fault lacustrine basin. There are obvious differences in the composition of sedimentary systems and of all systems tracts between its steep slope (the side where the basin-controlling fault exists) and its flat slope. On the steep-slope side, the highstand systems tract is composed of a fan delta-lacustrine system, whereas the lowstand, transgressive and regressive systems tracts are all made up of fan delta-underwater fan-lacustrine sedimentary systems.

  17. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  18. Genome survey sequencing and genetic background characterization of Gracilariopsis lemaneiformis (Rhodophyta) based on next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Wei Zhou

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat types. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon.
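
    The N50 statistic quoted above is the contig length at which contigs of that length or longer contain at least half of the total assembled bases. A minimal sketch of the usual definition follows; the function name is hypothetical and the input is simply a list of contig lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

    With contig lengths of 2, 3, 4, 5 and 6 kb (20 kb in total), the two longest contigs already cover 11 kb, so the N50 is 5 kb.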

  19. The study of genetic diversity within Carassius genera, based on sequencing some mitochondrial markers

    Directory of Open Access Journals (Sweden)

    Mihaela-Liliana IONESCU

    2013-05-01

    In this study we investigated the genetic diversity within the genus Carassius, studying individuals from isolated aquatic populations in Romania, by analysing the sequences of three mitochondrial DNA genes: cytochrome b (Cyt b), mitochondrial control region (D-loop) and cytochrome c oxidase I (COX I). The nucleotide sequence variation of the three genes was used to study the mtDNA divergence among Carassius individuals and to examine the phylogenetic relationships within the analyzed populations. Based on the alignment of cytochrome b gene sequences from Carassius individuals from the analyzed populations, 21 haplotypes were identified: two of them were found in four of the six analyzed populations and one in two of the studied populations. For the D-loop sequences, 20 haplotypes were identified: four of them were found in two or more populations. Following COX I sequence alignment, 22 haplotypes were identified in the six populations, but only one was found in four of the analyzed populations. Phylogeographic aspects of the D-loop showed that there are common haplotypes between the Buzău (Buzău River, Buzău County, Romania), Sofroneşti (Sofroneşti Lake, Vaslui County, Romania), Delta (Fortuna Lake, Danube Delta, Romania) and Băile Felix (Bihor County, Romania) populations, and for COX I between the Buzău (Buzău River, Buzău County, Romania), Tăuteşti (Tăuteşti Lake, Iaşi County, Romania), Delta (Fortuna Lake, Danube Delta, Romania) and Băile Felix (Bihor County, Romania) populations. From the analysis of all sequences, it was found that transitions occur at a higher rate than transversions.
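
    The closing observation about transitions and transversions can be illustrated with a short sketch that classifies the differences between two aligned sequences. The function name `count_ts_tv` and the toy sequences are assumptions made for illustration; they are not taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # skip identical sites, gaps and ambiguity codes
        # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy aligned sequences (hypothetical).
ts, tv = count_ts_tv("ACGTACGT", "GCATTCGA")
print(ts, tv)
```

    A transition/transversion ratio above 1, as the abstract reports qualitatively, would correspond to `ts > tv` when summed over all pairwise comparisons.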

  20. Heterogeneous Suppression of Sequential Effects in Random Sequence Generation, but Not in Operant Learning

    Science.gov (United States)

    Shteingart, Hanan; Loewenstein, Yonatan

    2016-01-01

    There is a long history of experiments in which participants are instructed to generate a long sequence of binary random numbers. The scope of this line of research has shifted over the years from identifying the basic psychological principles and/or the heuristics that lead to deviations from randomness, to one of predicting future choices. In this paper, we used generalized linear regression and the framework of Reinforcement Learning in order to address both points. In particular, we used logistic regression analysis in order to characterize the temporal sequence of participants’ choices. Surprisingly, a population analysis indicated that the contribution of the most recent trial has only a weak effect on behavior, compared to more preceding trials, a result that seems irreconcilable with standard sequential effects that decay monotonically with the delay. However, when considering each participant separately, we found that the magnitudes of the sequential effect are a monotonically decreasing function of the delay, yet these individual sequential effects are largely averaged out in a population analysis because of heterogeneity. The substantial behavioral heterogeneity in this task is further demonstrated quantitatively by considering the predictive power of the model. We show that a heterogeneous model of sequential dependencies captures the structure available in random sequence generation. Finally, we show that the results of the logistic regression analysis can be interpreted in the framework of reinforcement learning, allowing us to compare the sequential effects in the random sequence generation task to those in an operant learning task. We show that in contrast to the random sequence generation task, sequential effects in operant learning are far more homogeneous across the population. These results suggest that in the random sequence generation task, different participants adopt different cognitive strategies to suppress sequential dependencies when

  1. Accuracy Analysis for 6-DOF PKM with Sobol Sequence Based Quasi Monte Carlo Method

    Institute of Scientific and Technical Information of China (English)

    Jianguang Li; Jian Ding; Lijie Guo; Yingxue Yao; Zhaohong Yi; Huaijing Jing; Honggen Fang

    2015-01-01

    To improve the precision of pose error analysis for a 6-DOF parallel kinematic mechanism (PKM) during assembly quality control, a Sobol sequence based Quasi Monte Carlo (QMC) method is introduced and implemented in pose accuracy analysis for the PKM in this paper. With the regularity and uniformity of its samples in high dimensions, the Sobol sequence based QMC method can outperform the traditional Monte Carlo method, with up to 98.59% and 98.25% enhancement in the computational precision of pose error statistics. A PKM tolerance design system integrating this method is then developed, and with it the pose error distributions of the PKM within a prescribed workspace are obtained and analyzed.

  2. Phylogenetic relationships of South China Sea snappers (genus Lutjanus; family Lutjanidae) based on mitochondrial DNA sequences.

    Science.gov (United States)

    Guo, Yusong; Wang, Zhongduo; Liu, Chuwu; Liu, Li; Liu, Yun

    2007-01-01

    Intra- and interspecific phylogenetic relationships were elucidated based on complete cytochrome b (cyt b) and cytochrome c oxidase subunit II (COII) gene sequences from 12 recognized species of genus Lutjanus Bloch in the South China Sea (SCS). Using the combined data set of consensus cyt b and COII gene sequences, interspecific relationships for all 12 recognized species in SCS were consistent with Allen's morphology-based identifications, with strong correlation between the molecular and morphological characteristics. Monophyly of eight species (L. malabaricus, L. russellii, L. stellatus, L. bohar, L. johnii, L. sebae, L. fulvus, and L. fulviflamma) was strongly supported; however, the pairs L. vitta/L. ophuysenii and L. erythropterus/L. argentimaculatus were more similar than expected. We inferred that L. malabaricus exists in SCS, and the introgression caused by hybridization is the reason for the unexpectedly high homogeneity.

  3. Laser positioning of four-quadrant detector based on pseudo-random sequence

    Science.gov (United States)

    Tang, Yanqin; Cao, Ercong; Hu, Xiaobo; Gu, Guohua; Qian, Weixian

    2016-10-01

    Laser positioning based on four-quadrant detectors is widely studied and applied. Its main principle is to capture the projection of the laser spot on the photosensitive surface of the detector and compute the spot coordinates on that surface from the detector's output signals; from these, the coordinates of the laser spot in space relative to the detector system, which reflect the spatial position of the target object, are calculated. FPGA technology is widely used, and a pseudo-random sequence has correlation properties similar to those of white noise, so interference and noise in the measurement process have little effect on the correlation peak. To improve the anti-jamming capability of a guided missile during tracking, the laser pulse period is pseudo-randomly encoded at emission and kept within the range of 40 ms to 65 ms, so that a jammer cannot identify the real laser pulse. Because the receiver knows how to decode the pseudo-random code, it can decode the laser pulse period once it has received two consecutive laser pulses. In the FPGA hardware implementation, the receiver opens a gate around each expected laser pulse arrival time to obtain the location information carried by the true signal. To allow for the possibility that the first two consecutive pulses received have been disturbed, after receiving the first laser pulse the receiver accepts all laser pulses in the next 40 ms to 65 ms to obtain the corresponding pseudo-random code.
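
    The step of computing the spot coordinates from the detector output can be sketched with the textbook sum-and-difference estimator for a four-quadrant detector. This is a generic illustration, not the authors' implementation; the quadrant numbering (counterclockwise from the upper right) and the function name `spot_offset` are assumptions.

```python
def spot_offset(q1, q2, q3, q4):
    """Estimate the normalized spot offset on a four-quadrant detector.

    Assumed layout: q1 upper right, q2 upper left, q3 lower left, q4 lower right.
    Returns (x, y), each in [-1, 1]; (0, 0) means the spot is centred.
    """
    total = q1 + q2 + q3 + q4
    if total <= 0:
        raise ValueError("no signal on the detector")
    x = ((q1 + q4) - (q2 + q3)) / total  # right half minus left half
    y = ((q1 + q2) - (q3 + q4)) / total  # upper half minus lower half
    return x, y


# Hypothetical quadrant signals: the spot sits to the right of centre.
print(spot_offset(2.0, 1.0, 1.0, 2.0))
```

    The result is a dimensionless offset; converting it to a physical displacement depends on the spot size and profile, which the abstract does not specify.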

  4. SAM: String-based sequence search algorithm for mitochondrial DNA database queries

    Science.gov (United States)

    Röck, Alexander; Irwin, Jodi; Dür, Arne; Parsons, Thomas; Parson, Walther

    2011-01-01

    The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as a nucleotide string. Traditionally they are presented in a difference-coded position-based format relative to the corrected version of the first sequenced mtDNA. This convention requires recommendations for standardized sequence alignment that is known to vary between scientific disciplines, even between laboratories. As a consequence, database searches that are vital for the interpretation of mtDNA data can suffer from biased results when query and database haplotypes are annotated differently. In the forensic context that would usually lead to underestimation of the absolute and relative frequencies. To address this issue we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. The mere application of a BLAST algorithm would not be a sufficient remedy as it uses a heuristic approach and does not address properties specific to mtDNA, such as phylogenetically stable but also rapidly evolving insertion and deletion events. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org). PMID:21056022
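
    The core idea of converting difference-coded haplotypes into position-free nucleotide strings can be sketched as follows. This is not the SAM software itself; the simplified variant notation, the function name `apply_variants` and the toy reference are assumptions made for illustration.

```python
def apply_variants(reference, variants):
    """Build a plain nucleotide string from a reference and difference-coded variants.

    Simplified, assumed notation (1-based positions):
      "3G"    substitution: position 3 becomes G
      "5del"  deletion of position 5
      "4.1C"  insertion of C after position 4
    """
    # Map each reference position to a list: [base, inserted bases...].
    bases = {i + 1: [b] for i, b in enumerate(reference)}
    for v in variants:
        if v.endswith("del"):
            bases[int(v[:-3])][0] = ""
        elif "." in v:
            head, tail = v.split(".")
            # Drop the insertion index ("1" in "4.1C") and keep the base.
            bases[int(head)].append(tail.lstrip("0123456789"))
        else:
            bases[int(v[:-1])][0] = v[-1]
    return "".join("".join(bases[p]) for p in sorted(bases))


# Toy reference (hypothetical, not the real mtDNA reference sequence).
reference = "ACCCT"
# Two different annotations of the same extra C in a homopolymer run.
print(apply_variants(reference, ["4.1C"]))
print(apply_variants(reference, ["1.1C"]))
```

    Both annotations yield the same string, so a string comparison finds the match even though a comparison of the difference-coded annotations would not.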

  5. The effect of attentional load on implicit sequence learning in children and young adults

    Directory of Open Access Journals (Sweden)

    Daphné Coomans

    2014-05-01

    We investigated the effect of a secondary task on implicit sequence learning in children and young adults. A serial reaction time task was administered to 8-to-10 year old children and 18-to-22 year old adults. Participants reacted to the location of a target presented in one of four locations on the screen with a spatially corresponding response key. Unknown to participants, the location at which the target appeared was structured according to a deterministic sequence. Occasionally, the black target dot was replaced by a red target dot. To assess the effect of attentional load on implicit sequence learning, half of the participants of each age group were assigned to the single task condition, while the other half executed the task under dual task conditions. Whereas participants in the single task condition could ignore the change in target identity, dual task participants additionally had to count the number of times the black dot was replaced by a red dot to increase the attentional load. Sequence learning was tested under single task conditions in both conditions. Z-transformed results indicate that young adults generally showed more sequence learning than children. Importantly, the secondary task had no effect on sequence learning in children, since children learned as much under dual task conditions as under single task conditions. Adults, on the other hand, showed a different result pattern, as they displayed more sequence learning under single task than under dual task conditions. We surmise that this result is due to the vain attempt of adults, but not children, to integrate both sequences.

  6. Sequence-based characterization of five SLA loci in Asian wild boars.

    Science.gov (United States)

    Jung, W Y; Choi, N R; Seo, D W; Lim, H T; Ho, C S; Lee, J H

    2014-10-01

    Two swine leucocyte antigen (SLA) class I (SLA-1 and SLA-2) and three class II (DRB1, DQB1 and DQA) genes were investigated for their diversity in Asian wild boars using a sequence-based typing method. A total of 15 alleles were detected at these loci, with eleven being novel. The findings provide one of the first glimpses of the SLA allelic diversity and architecture in the wild boar populations.

  7. Sequence-based characterization of the eight SLA loci in Korean native pigs.

    Science.gov (United States)

    Lee, Y J; Cho, K H; Kim, M J; Smith, D M; Ho, C S; Jung, K C; Jin, D I; Park, C S; Jeon, J T; Lee, J H

    2008-08-01

    Eight swine leucocyte antigen (SLA) gene (SLA-1, SLA-2, SLA-3, SLA-6, DRA, DRB1, DQA, DQB1) alleles were identified using a sequence-based typing method in three Korean native pigs used for breeding at the National Institute of Animal Science in Korea. Six new alleles in class I genes and three new alleles in class II genes were identified in this breed and can give valuable information for xenotransplantation and disease resistance.

  8. PSF : Introduction to R Package for Pattern Sequence Based Forecasting Algorithm

    OpenAIRE

    Bokde, Neeraj; Asencio-Cortés, Gualberto; Martínez-Álvarez, Francisco; Kulat, Kishore

    2016-01-01

    This paper discusses an R package that implements the Pattern Sequence based Forecasting (PSF) algorithm, which was developed for univariate time series forecasting. This algorithm has been successfully applied to many different fields. The PSF algorithm consists of two major parts: clustering and prediction. The clustering part includes selection of the optimum number of clusters. It labels time series data with reference to such clusters. The prediction part includes functions like op...

  9. Capacitated vehicle routing problem with sequence-based pallet loading and axle weight constraints

    OpenAIRE

    2016-01-01

    In this paper, we introduce and study the capacitated vehicle routing problem with sequence-based pallet loading and axle weight constraints. To the best of our knowledge, it is the first time that axle weight restrictions are incorporated in a vehicle routing model. The aim of this paper is to demonstrate that incorporating axle weight restrictions in a vehicle routing model is possible and necessary for a feasible route planning. Axle weight limits impose a great challenge for transportatio...

  10. Diffusion measurements for molecular capsules: pulse sequences effect on water signal decay.

    Science.gov (United States)

    Avram, Liat; Cohen, Yoram

    2005-04-20

    Diffusion NMR and, more recently, diffusion ordered spectroscopy (DOSY) are gaining popularity as efficient tools for the characterization of supramolecular systems in solution. Here, using diffusion NMR of hydrogen-bond molecular capsules, we demonstrate that the use of different diffusion sequences may have a dramatic effect on exchanging peaks. In fact, we found that the signal decay of the water peak in [(1a)₆(H₂O)₈] is monoexponential in the pulsed gradient spin-echo (PGSE) and stimulated echo (PGSTE) sequences and biexponential in the longitudinal eddy current delay (LED) and the bipolar longitudinal eddy current delay (BPLED) sequences, routinely used in modern DOSY experiments. By performing these diffusion measurements on molecular capsules, in which water is not part of the molecular capsules, we demonstrate that this phenomenon is observed only for water molecules that exchange between two sites that differ considerably in their diffusion coefficients. Degeneration of the LED or the BPLED sequences into PGSTE-type sequences by shortening the te period resulted in the disappearance of the extra slow diffusing component. The origin, as well as the implications of the different results obtained from conventional diffusion sequences, such as the PGSE and PGSTE as compared with the LED and BPLED sequences generally used in DOSY experiments, are briefly discussed.

  11. Detecting Protein-Protein Interactions with a Novel Matrix-Based Protein Sequence Representation and Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Zhu-Hong You

    2015-01-01

    Proteins and their interactions lie at the heart of most underlying biological processes. Consequently, correct detection of protein-protein interactions (PPIs) is of fundamental importance to understand the molecular mechanisms in biological systems. Although high-throughput experimental techniques make it possible to detect a large number of PPIs, the data generated through these methods are unreliable and may not be completely inclusive of all possible PPIs. To address this problem, this study develops a novel computational approach to effectively detect protein interactions. The approach is based on a novel matrix-based representation of protein sequence combined with the support vector machine (SVM) algorithm, and fully considers the sequence order and dipeptide information of the protein primary sequence. When applied to yeast PPI datasets, the proposed method reaches 90.06% prediction accuracy with 94.37% specificity at a sensitivity of 85.74%, indicating that this predictor is a useful tool to predict PPIs. The results also demonstrate that our approach can be a helpful supplement to the interactions that have been detected experimentally.
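
    The three performance figures in the abstract are related through the confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical, chosen only so that sensitivity and specificity match the reported percentages under the assumption (not stated in the abstract) of equal numbers of interacting and non-interacting pairs.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true interactions recovered
    specificity = tn / (tn + fp)  # fraction of non-interactions rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 10,000 positive and 10,000 negative pairs.
sens, spec, acc = classification_metrics(tp=8574, fn=1426, tn=9437, fp=563)
print(sens, spec, acc)
```

    With balanced classes the accuracy is the mean of sensitivity and specificity, here (85.74% + 94.37%) / 2, or about 90.06%, which is consistent with the accuracy reported above.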

  12. Phylogeographic analysis of African swine fever virus based on the p72 gene sequence.

    Science.gov (United States)

    Muangkram, Y; Sukmak, M; Wajjwalku, W

    2015-05-04

    African swine fever virus (ASFV) outbreak has been considered as an emerging and re-emerging disease for almost a century. Diagnostically, simple polymerase chain reaction and sequencing-based molecular detection could be employed for both viral identification and genotyping. This study established a novel phylogenetic analysis and epidemiology comparison based on 205 bp of p72 gene sequences. Based on this partial p72 fragment, an updated list of 44 different genotypes from a total of 516 ASFV sequences compiled from GenBank was generated. Nucleotide diversity was 0.04325 ± 0.00231. The analysis of spatial genetic variation divided the ASFV populations of the African continent into four clades (clade A: central and upper eastern Africa; clade B: eastern Africa; clade C: eastern and southern Africa; and clade D: southern Africa). These results and the developed protocol could serve as useful molecular tools for ASFV diagnosis from degraded DNA or putrefied samples, and also provide the phylogeographic perspective to identify the origin of viral outbreaks, facilitating the decision planning to limit their spread.
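
    Nucleotide diversity, reported above as 0.04325 ± 0.00231, is commonly defined as the average number of pairwise differences per site. A minimal sketch of that definition follows; the function name and the toy alignment are illustrative assumptions, and the sketch ignores gaps and does not estimate the standard error.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pair_count = 0
    diff_count = 0
    for s, t in combinations(seqs, 2):
        pair_count += 1
        diff_count += sum(a != b for a, b in zip(s, t))
    return diff_count / (pair_count * length)


# Toy alignment of three 10 bp sequences (hypothetical).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(nucleotide_diversity(toy))
```

    For the toy alignment the three pairs differ at 1, 1 and 2 sites, giving 4 / (3 × 10), or roughly 0.133.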

  13. Compensation of negative sequence stator flux of doubly-fed induction generator using polar voltage control-based direct torque control under unbalanced grid voltage condition

    Directory of Open Access Journals (Sweden)

    Badrinarayan Bansilal Pimple

    2015-02-01

    This study proposes a polar voltage control-based direct torque control method to reduce the effects of unbalanced grid voltage on a doubly-fed induction generator (DFIG)-based wind turbine system. Under unbalanced grid voltage, the stator flux has a negative sequence component, which leads to second harmonic pulsation in torque, stator active power, stator reactive power, stator current and rotor current. In the control scheme, the negative sequence rotor voltage vector is controlled to compensate the negative sequence stator flux by negative sequence rotor flux. A simulation study is carried out on a 2 MW DFIG system using MATLAB/SIMULINK. Feasibility of the proposed control strategy is experimentally verified on a 1.5 kW DFIG system.
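
    The negative sequence component mentioned above comes from the standard symmetrical-component decomposition of three-phase quantities. The sketch below applies that textbook transform to phasors; it illustrates where a negative sequence term comes from and is not the control scheme proposed in the study. The function name and the per-unit voltages are assumptions.

```python
import cmath
import math

# 120-degree rotation operator used in the symmetrical-component transform.
A = cmath.exp(2j * math.pi / 3)


def sequence_components(va, vb, vc):
    """Return (zero, positive, negative) sequence phasors for phases a, b, c."""
    zero = (va + vb + vc) / 3
    positive = (va + A * vb + A ** 2 * vc) / 3
    negative = (va + A ** 2 * vb + A * vc) / 3
    return zero, positive, negative


# Balanced set (per unit): phase b lags a by 120 degrees, c leads by 120 degrees.
balanced = (1.0 + 0j, A ** 2, A)
# Hypothetical unbalance: a 20% sag on phase a only.
unbalanced = (0.8 + 0j, A ** 2, A)

for name, phasors in (("balanced", balanced), ("unbalanced", unbalanced)):
    zero, pos, neg = sequence_components(*phasors)
    print(name, round(abs(pos), 4), round(abs(neg), 4))
```

    The balanced set has no negative sequence component, while the sag on one phase produces one. In the machine this counter-rotating component is what gives rise to the second harmonic pulsations the abstract describes.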

  14. A web-based search engine for triplex-forming oligonucleotide target sequences.

    Science.gov (United States)

    Gaddis, Sara S; Wu, Qi; Thames, Howard D; DiGiovanni, John; Walborg, Earl F; MacLeod, Michael C; Vasquez, Karen M

    2006-01-01

    Triplex technology offers a useful approach for site-specific modification of gene structure and function both in vitro and in vivo. Triplex-forming oligonucleotides (TFOs) bind to their target sites in duplex DNA, thereby forming triple-helical DNA structures via Hoogsteen hydrogen bonding. TFO binding has been demonstrated to site-specifically inhibit gene expression, enhance homologous recombination, induce mutation, inhibit protein binding, and direct DNA damage, thus providing a tool for gene-specific manipulation of DNA. We have developed a flexible web-based search engine to find and annotate TFO target sequences within the human and mouse genomes. Descriptive information about each site, including sequence context and gene region (intron, exon, or promoter), is provided. The engine assists the user in finding highly specific TFO target sequences by eliminating or flagging known repeat sequences and flagging overlapping genes. A convenient way to check for the uniqueness of a potential TFO binding site is provided via NCBI BLAST. The search engine may be accessed at spi.mdanderson.org/tfo.

  15. A protein block based fold recognition method for the annotation of twilight zone sequences.

    Science.gov (United States)

    Suresh, V; Ganesan, K; Parthasarathy, S

    2013-03-01

    The description of protein backbone was recently improved with a group of structural fragments called Structural Alphabets instead of the regular three states (Helix, Sheet and Coil) secondary structure description. Protein Blocks is one of the Structural Alphabets used to describe each and every region of protein backbone including the coil. According to de Brevern (2000) the Protein Blocks has 16 structural fragments and each one has 5 residues in length. Protein Blocks fragments are highly informative among the available Structural Alphabets and it has been used for many applications. Here, we present a protein fold recognition method based on Protein Blocks for the annotation of twilight zone sequences. In our method, we align the predicted Protein Blocks of a query amino acid sequence with a library of assigned Protein Blocks of 953 known folds using the local pair-wise alignment. The alignment results with z-value ≥ 2.5 and P-value ≤ 0.08 are predicted as possible folds. Our method is able to recognize the possible folds for nearly 35.5% of the twilight zone sequences with their predicted Protein Block sequence obtained by pb_prediction, which is available at Protein Block Export server.

  16. CT image sequence restoration based on sparse and low-rank decomposition.

    Directory of Open Access Journals (Sweden)

    Shuiping Gou

    Blurry organ boundaries and soft tissue structures present a major challenge in biomedical image restoration. In this paper, we propose a low-rank decomposition-based method for computed tomography (CT) image sequence restoration, where the CT image sequence is decomposed into a sparse component and a low-rank component. A new point spread function (PSF) for the Wiener filter is employed to efficiently remove blur in the sparse component; Wiener filtering with the Gaussian PSF is used to recover the average image of the low-rank component. The recovered CT image sequence is then obtained by combining the recovered low-rank image with the recovered sparse image sequence. Our method achieves restoration results with higher contrast, sharper organ boundaries and richer soft tissue structure information, compared with existing CT image restoration methods. The robustness of our method was assessed with numerical experiments using three different low-rank models: Robust Principal Component Analysis (RPCA), Linearized Alternating Direction Method with Adaptive Penalty (LADMAP) and Go Decomposition (GoDec). Experimental results demonstrated that the RPCA model was the most suitable for CT images with small noise, whereas the GoDec model was the best for CT images with large noise.

  17. A scheme for multiple sequence alignment optimization--an improvement based on family representative mechanics features.

    Science.gov (United States)

    Liu, Xin; Zhao, Ya-Pu

    2009-12-21

    As a basic tool of modern biology, sequence alignment can provide useful information on the fold, function, and active site of a protein. In many cases, higher-quality sequence alignment means better performance. The motivation of the present work is to improve existing scoring schemes/algorithms by better accounting for residue-residue correlations. Based on a coarse-grained approach, the hydrophobic force between each pair of residues is derived from the protein sequence. This yields an intramolecular hydrophobic force network that describes all residue-residue interactions of each protein molecule and characterizes the protein's biological properties in the hydrophobic aspect. Earlier work has suggested that such a network can characterize the top weighted feature regarding hydrophobicity. Moreover, for each homologous protein of a family, the corresponding network shares some common and representative family characters that eventually govern the conservation of biological properties during protein evolution. In the present work, we score such family representative characters of a protein by the deviation of its intramolecular hydrophobic force network from that of the background. Such a score can assist the existing scoring schemes/algorithms and boost the performance of multiple sequence alignment, e.g. achieving a prominent increase (approximately 50%) in finding structurally alike residue segments at a low identity level. As its theoretical basis is different, the present scheme can assist most existing algorithms and improve their efficiency remarkably.

  18. Implementing amplicon-based next generation sequencing in the diagnosis of small cell lung carcinoma metastases.

    Science.gov (United States)

    Meder, Lydia; König, Katharina; Fassunke, Jana; Ozretić, Luka; Wolf, Jürgen; Merkelbach-Bruse, Sabine; Heukamp, Lukas C; Buettner, Reinhard

    2015-12-01

    Small cell lung carcinoma (SCLC) is the most aggressive entity of lung cancer. Rapid cancer progression and early formation of systemic metastases drive the deadly outcome of SCLC. Recent advances in identifying oncogenes by cancer whole genome sequencing improved the understanding of SCLC carcinogenesis. However, tumor material is often limited in the clinic. Thus, it is a compulsive issue to improve SCLC diagnostics by combining established immunohistochemistry and next generation sequencing. We implemented amplicon-based next generation deep sequencing in our routine diagnostics pipeline to analyze RB1, TP53, EP300 and CREBBP, frequently mutated in SCLC. Thereby, our pipeline combined routine SCLC histology and identification of somatic mutations. We comprehensively analyzed fifty randomly collected SCLC metastases isolated from trachea and lymph nodes in comparison to specimens derived from primary SCLC. SCLC lymph node metastases showed enhanced proliferation and frequently a collapsed keratin cytoskeleton compared to SCLC metastases isolated from trachea. We identified characteristic synchronous mutations in RB1 and TP53 and non-synchronous CREBBP and EP300 mutations. Our data showed the benefit of implementing deep sequencing into routine diagnostics. We here identify oncogenic drivers and simultaneously gain further insights into SCLC tumor biology.

  19. A framework for the detection of de novo mutations in family-based sequencing data

    Science.gov (United States)

    Francioli, Laurent C; Cretu-Stancu, Mircea; Garimella, Kiran V; Fromer, Menachem; Kloosterman, Wigard P; Wijmenga, Cisca; Swertz, Morris A; van Duijn, Cornelia M; Boomsma, Dorret I; Slagboom, P Eline; van Ommen, Gertjan B; de Bakker, Paul IW; Swertz, Morris A; Francioli, Laurent C; van Dijk, Freerk; Menelaou, Androniki; Neerincx, Pieter BT; Pulit, Sara L; Deelen, Patrick; Elbers, Clara C; Francesco Palamara, Pier; Pe'er, Itsik; Abdellaoui, Abdel; Kloosterman, Wigard P; van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen FJ; Stoneking, Mark; de Knijff, Peter; Kayser, Manfred; Veldink, Jan H; van den Berg, Leonard H; Byelas, Heorhiy; den Dunnen, Johan T; Dijkstra, Martijn; Amin, Najaf; van der Velde, K Joeri; Hottenga, Jouke Jan; van Setten, Jessica; van Leeuwen, Elisabeth M; Kanterakis, Alexandros; Kattenberg, Mathijs; Karssen, Lennart C; van Schaik, Barbera DC; Bot, Jan; Nijman, Isaäc J; Renkens, Ivo; van Enckevort, David; Mei, Hailiang; Koval, Vyacheslav; Estrada, Karol; Medina-Gomez, Carolina; Ye, Kai; Lameijer, Eric-Wubbo; Moed, Matthijs H; Hehir-Kwa, Jayne Y; Handsaker, Robert E; McCarroll, Steven A; Sunyaev, Shamil R; Polak, Paz; Vuzman, Dana; Sohail, Mashaal; Hormozdiari, Fereydoun; Marschall, Tobias; Schönhuth, Alexander; Guryev, Victor; de Bakker, Paul IW; Slagboom, P Eline; Beekman, Marian B; de Craen, Anton JM; Suchiman, H Eka D; Hofman, Albert; van Duijn, Cornelia M; Oostra, Ben; Isaacs, Aaron; Amin, Najaf; Rivadeneira, Fernando; Uitterlinden, André G; Boomsma, Dorret I; Willemsen, Gonneke; Platteel, Mathieu; Pitts, Steven J; Potluri, Shobha; Sundar, Purnima; Cox, David R; Li, Qibin; Li, Yingrui; Du, Yuanping; Chen, Ruoyan; Cao, Hongzhi; Li, Ning; Cao, Sujie; Wang, Jun; Bovenberg, Jasper A; Brandsma, Margreet; Samocha, Kaitlin E; Neale, Benjamin M; Daly, Mark J; Banks, Eric; DePristo, Mark A; de Bakker, Paul IW

    2017-01-01

    Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports. PMID:27876817
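
    The idea of combining per-individual genotype likelihoods, the familial relationship and a mutation-rate prior into a posterior probability can be sketched with a toy biallelic trio model. This is a deliberately simplified illustration, not the PhaseByTransmission implementation: the flat prior on parental genotypes, the per-allele mutation model, the default rate and all function names are assumptions.

```python
from itertools import product


def transmit_alt_prob(genotype, mu):
    """Probability that a parent carrying `genotype` alt alleles (0, 1 or 2)
    passes on an alt allele, allowing a mutation with probability `mu`."""
    p = genotype / 2.0
    return p * (1.0 - mu) + (1.0 - p) * mu


def child_given_parents(child, mother, father, mu):
    """P(child genotype | parental genotypes) under the toy model."""
    pm = transmit_alt_prob(mother, mu)
    pf = transmit_alt_prob(father, mu)
    probs = [(1 - pm) * (1 - pf), pm * (1 - pf) + (1 - pm) * pf, pm * pf]
    return probs[child]


def de_novo_posterior(lik_mother, lik_father, lik_child, mu=1e-8):
    """Posterior probability that the trio shows a Mendelian violation.

    Each `lik_*` is a list of three genotype likelihoods, indexed by the
    number of alt alleles. A flat prior on parental genotypes is assumed.
    """
    total = 0.0
    de_novo = 0.0
    for m, f, c in product(range(3), repeat=3):
        weight = (lik_mother[m] * lik_father[f] * lik_child[c]
                  * child_given_parents(c, m, f, mu))
        total += weight
        # Configurations impossible without a mutation count as de novo.
        if child_given_parents(c, m, f, 0.0) == 0.0:
            de_novo += weight
    return de_novo / total


# Hypothetical likelihoods: parents look homozygous reference, child heterozygous.
strong = de_novo_posterior([1.0, 1e-12, 1e-12], [1.0, 1e-12, 1e-12], [1e-12, 1.0, 1e-12])
weak = de_novo_posterior([1.0, 1e-6, 1e-6], [1.0, 1e-6, 1e-6], [1e-6, 1.0, 1e-6])
print(strong, weak)
```

    Because the prior mutation rate is so small, the same apparent violation receives a high posterior only when the genotype likelihoods are far more decisive than the mutation rate, which reflects the challenge described in the first sentence of the abstract.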

  20. Pigs in sequence space: A 0.66X coverage pig genome survey based on shotgun sequencing

    Directory of Open Access Journals (Sweden)

    Li Wei

    2005-05-01

    Background: Comparative whole genome analysis of Mammalia can benefit from the addition of more species. The pig is an obvious choice due to its economic and medical importance as well as its evolutionary position in the artiodactyls. Results: We have generated ~3.84 million shotgun sequences (0.66X coverage) from the pig genome. The data are hereby released (NCBI Trace repository with center name "SDJVP", and project name "Sino-Danish Pig Genome Project") together with an initial evolutionary analysis. The non-repetitive fraction of the sequences was aligned to the UCSC human-mouse alignment and the resulting three-species alignments were annotated using the human genome annotation. Ultra-conserved elements and miRNAs were identified. The results show that for each of these types of orthologous data, pig is much closer to human than mouse is. Purifying selection has been more efficient in pig compared to human, but not as efficient as in mouse, and pig seems to have an isochore structure most similar to the structure in human. Conclusion: The addition of the pig to the set of species sequenced at low coverage adds to the understanding of selective pressures that have acted on the human genome by bisecting the evolutionary branch between human and mouse, with the mouse branch being approximately 3 times as long as the human branch. Additionally, the joint alignment of the shotgun sequences to the human-mouse alignment offers the investigator a rapid way to define specific regions for analysis and resequencing.

  1. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  2. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    Science.gov (United States)

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and its Illumina reads from which it was originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than those aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionally important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million of single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  3. Polypeptoids: A model system for exploring sequence and shape effects on block copolymer self-assembly

    Science.gov (United States)

    Segalman, Rachel

    2015-03-01

    While our ability to understand the detailed relationship between block copolymer chemistry and mesoscopic self-assembly has made remarkable progress over many years, structure-property understanding is still limited to a relatively small number of blocks. Thus, there is a need to explore self-assembly phase space with sequence using a model system. Polypeptoids are non-natural, sequence-specific polymers that offer the opportunity to probe the effect of sequence on self-assembly with much simpler molecular interactions and more scalable synthesis than traditional polypeptides. In this talk, I will discuss the use of this model system to understand the role of sequence in chain collapse and globule formation in solution, polymer crystallization, and block copolymer self-assembly. I will then discuss potential applications as surface-active agents for anti-fouling.

  4. Comparison of two multilocus sequence based genotyping schemes for Leptospira species.

    Directory of Open Access Journals (Sweden)

    Ahmed Ahmed

    2011-11-01

    Full Text Available BACKGROUND: Several sequence based genotyping schemes have been developed for Leptospira spp. The objective of this study was to genotype a collection of clinical and reference isolates using the two most commonly used schemes and compare and contrast the results. METHODS AND FINDINGS: A total of 48 isolates consisting of L. interrogans (n = 40) and L. kirschneri (n = 8) were typed by the 7 locus MLST scheme described by Thaipadungpanit et al., and the 6 locus genotyping scheme described by Ahmed et al. (termed 7L and 6L, respectively). Two L. interrogans isolates were not typed using 6L because of a deletion of three nucleotides in lipL32. The remaining 46 isolates were resolved into 21 sequence types (STs) by 7L, and 30 genotypes by 6L. Overall nucleotide diversity (based on concatenated sequence) was 3.6% and 2.3% for 7L and 6L, respectively. The D values (discriminatory ability) of 7L and 6L were comparable, i.e. 92.0 (95% CI 87.5-96.5) vs. 93.5 (95% CI 88.6-98.4). The dN/dS ratios calculated for each locus indicated that none were under positive selection. Neighbor joining trees were reconstructed based on the concatenated sequences for each scheme. Both trees showed two distinct groups corresponding to L. interrogans and L. kirschneri, and both identified two clones containing 10 and 7 clinical isolates, respectively. There were six instances in which 6L split single STs as defined by 7L into closely related clusters. We noted two discrepancies between the trees in which two pairs of strains were more closely related by 7L than by 6L. CONCLUSIONS: This genetic analysis indicates that the two schemes are comparable. We discuss their practical advantages and disadvantages.
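The D value quoted for the two typing schemes can be illustrated with a short sketch. It assumes that D is the Hunter-Gaston discriminatory index (Simpson's index of diversity), which is the usual choice for typing schemes but is not named in the abstract; the function name and the toy labels are hypothetical, and the result is a fraction rather than the percentage scale used above.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Hunter-Gaston index: probability that two isolates drawn at random
    (without replacement) are assigned to different types."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Toy data (not from the study): six isolates resolved into three types.
labels = ["ST1", "ST1", "ST2", "ST3", "ST3", "ST3"]
print(round(discriminatory_index(labels), 3))
```

A scheme that assigns every isolate its own type scores 1.0, and a scheme that lumps all isolates into one type scores 0.0.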

  5. Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

    Science.gov (United States)

    Rani, R Ranjani; Ramyachitra, D

    2016-12-01

    Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how sequences of nucleotides and amino acids are arranged into a possible alignment with a minimum number of gaps between them, which points to the functional, evolutionary and structural relationships among the sequences. Still, computing an MSA with good accuracy and statistically significant alignments remains a challenging task. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences, which resulted in a non-dominated optimal solution. It optimizes multiple objectives: maximization of similarity, non-gap percentage and conserved blocks, and minimization of gap penalty. The BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods. In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms, but the conserved blocks were still not obtained using GA-ABC. BFO was then used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with the widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignments than most widely used methods.

  6. Sequence and single-base polymorphisms of the bovine alpha-lactalbumin 5'-flanking region.

    Science.gov (United States)

    Bleck, G T; Bremel, R D

    1993-04-30

    The alpha-lactalbumin (alpha LA)-encoding gene is a potential quantitative trait locus in dairy animals. In cattle, the production of alpha LA is tightly coupled to the onset of lactation and it serves as a regulatory subunit of the enzyme responsible for lactose synthesis. Lactose is the major osmole controlling water movement in the mammary gland. To better understand the control of bovine alpha LA expression, the 5'-flanking region of a Holstein alpha LA gene was cloned and sequenced. The sequenced clone contains 1952 bp of 5'-flanking region and 66-bp of the protein-coding region. Three single-bp polymorphisms were identified within this region. These polymorphisms occur at positions +15, +21 and +54 relative to the mRNA transcription start point (tsp). The +15 and +21 variations occur in the region encoding the 5'-untranslated region of the mRNA-coding sequence. The +54 polymorphism is a silent mutation in the SP-coding region of the gene. A polymerase chain reaction (PCR, Cetus)-based screening method has been employed to analyze the genotype of cattle at the +15 position. A total of 501 randomly selected cattle from seven breeds were screened for this allele. Of these animals, only the Holstein breed of cattle was found to contain the +15 variation and it occurs at a gene frequency of 32%. Sequence comparisons were conducted between the 5'-flanking regions of the bovine-milk-protein encoding genes, alpha LA, beta-casein and alpha S1-casein, which are coordinately expressed. Regions of similarity extending to 350 bp in length were observed between these sequences.

  7. SEQMINER: An R-Package to Facilitate the Functional Interpretation of Sequence-Based Associations.

    Science.gov (United States)

    Zhan, Xiaowei; Liu, Dajiang J

    2015-12-01

    Next-generation sequencing has enabled the study of a comprehensive catalogue of genetic variants for their impact on various complex diseases. Numerous consortia studies of complex traits have publicly released their summary association statistics, which have become an invaluable resource for learning the underlying biology, understanding the genetic architecture, and guiding clinical translations. There is great interest in the field in developing novel statistical methods for analyzing and interpreting results from these genotype-phenotype association studies. One popular platform for method development and data analysis is R. In order to enable these analyses in R, it is necessary to develop packages that can efficiently query files of summary association statistics, explore the linkage disequilibrium structure between variants, and integrate various bioinformatics databases. The complexity and scale of sequence datasets and databases pose significant computational challenges for method developers. To address these challenges and facilitate method development, we developed the R package SEQMINER for annotating and querying files of sequence variants (e.g., VCF/BCF files) and summary association statistics (e.g., METAL/RAREMETAL files), and for integrating bioinformatics databases. SEQMINER provides an infrastructure where novel methods can be distributed and applied to analyzing sequence datasets in practice. We illustrate the performance of SEQMINER using datasets from the 1000 Genomes Project. We show that SEQMINER is highly efficient and easy to use. It will greatly accelerate the process of applying statistical innovations to analyze and interpret sequence-based associations. The R package, its source code and documentation are available from http://cran.r-project.org/web/packages/seqminer and http://seqminer.genomic.codes/.

  8. Molecular Characterization of Five Potyviruses Infecting Korean Sweet Potatoes Based on Analyses of Complete Genome Sequences

    Directory of Open Access Journals (Sweden)

    Hae-Ryun Kwak

    2015-12-01

    Full Text Available Sweet potatoes (Ipomoea batatas L.) are grown extensively in tropical and temperate regions, and are important food crops worldwide. In Korea, potyviruses, including Sweet potato feathery mottle virus (SPFMV), Sweet potato virus C (SPVC), Sweet potato virus G (SPVG), Sweet potato virus 2 (SPV2), and Sweet potato latent virus (SPLV), have been detected in sweet potato fields at a high (~95%) incidence. In the present work, complete genome sequences of 18 isolates, representing the five potyviruses mentioned above, were compared with previously reported genome sequences. The complete genomes consisted of 10,081 to 10,830 nucleotides, excluding the poly-A tails. Their genomic organizations were typical of the Potyvirus genus, including one target open reading frame coding for a putative polyprotein. Based on phylogenetic analyses and sequence comparisons, the Korean SPFMV isolates belonged to the strains RC and O with >98% nucleotide sequence identity. Korean SPVC isolates had 99% identity to the Japanese isolate SPVC-Bungo and 70% identity to the SPFMV isolates. The Korean SPVG isolates showed 99% identity to the three previously reported SPVG isolates. Korean SPV2 isolates had 97% identity to the SPV2 GWB-2 isolate from the USA. Korean SPLV isolates had a relatively low (88%) nucleotide sequence identity with the Taiwanese SPLV-TW isolates, and they were phylogenetically distantly related to SPFMV isolates. Recombination analysis revealed that possible recombination events occurred in the P1, HC-Pro and NIa-NIb regions of SPFMV and SPLV isolates and these regions were identified as hotspots for recombination in the sweet potato potyviruses.

  9. Molecular genotyping of human Ureaplasma species based on multiple-banded antigen (MBA) gene sequences.

    Science.gov (United States)

    Kong, F; Ma, Z; James, G; Gordon, S; Gilbert, G L

    2000-09-01

    Ureaplasma urealyticum has been divided into 14 serovars. Recently, subdivision of U. urealyticum into two species has been proposed: U. parvum (previously U. urealyticum parvo biovar), comprising four serovars (1, 3, 6, 14) and U. urealyticum (previously U. urealyticum T-960 biovar), 10 serovars (2, 4, 5, 7-13). The multiple-banded antigen (MBA) genes of these species contain both species and serovar/subtype specific sequences. Based on whole sequences of the 5'-ends of MBA genes of U. parvum serovars and partial sequences of the 5'-ends of MBA genes of U. urealyticum serovars, we previously divided each of these species into three MBA genotypes. To further elucidate the relationships between serovars, we sequenced the whole 5'-ends of MBA genes of all 10 U. urealyticum serovars and partial repetitive regions of these genes from all serovars of U. parvum and U. urealyticum. For the first time, all four serovars of U. parvum were clearly differentiated from each other. In addition, the 10 serovars of U. urealyticum were divided into five MBA genotypes, as follows: MBA genotype A comprises serovars 2, 5, 8; MBA genotype B, serovar 10 only; MBA genotype C, serovars 4, 12, 13; MBA genotype D, serovar 9 only; and MBA genotype E comprises serovars 7 and 11. There were no sequence differences between members within each MBA genotype. Further work is required to identify other genes or other regions of the MBA genes that may be used to differentiate U. urealyticum serovars within MBA genotypes A, C and E. A better understanding of the molecular basis of serotype differentiation will help to improve subtyping methods for use in studies of the pathogenesis and epidemiology of these organisms.

  10. Priority-sequence of mineral resources’ development and utilization based on grey relational analysis method

    Institute of Scientific and Technical Information of China (English)

    Wang Ying; Zhang Chang; Jiang Gaopeng

    2016-01-01

    Generally, the sequence decision for the development and utilization of Chinese mineral resources is based on national and provincial overall plans for mineral resources. Such plans usually cannot reflect the relative suitability of different mineral resources for development and utilization. To solve this problem, the paper selects the gift condition, the market condition, the technological condition, the socio-economic condition and the environmental condition as starting points to analyze the factors influencing the priority sequence of mineral resources’ development and utilization. These 5 conditions are further specified into 9 evaluative indicators to establish an evaluation indicator system. We then propose a decision model of the priority sequence based on the grey relational analysis method, and evaluate the observation objects by the suitability index of development. Finally, the mineral resources of a certain province in China were analyzed as an example. The calculation results indicate that silver (2.0057), coal (1.9955), zinc (1.9442), cement limestone (1.9077), solvent limestone (1.5624) and other minerals in the province are suitable for development and utilization.
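The grey relational step can be sketched in its textbook form. This is only an illustration under stated assumptions: the abstract does not give the authors' indicator weights, normalisation, or how the suitability index is built from the grades, so the function name, the distinguishing coefficient `rho = 0.5`, the equal weighting and the toy scores below are all hypothetical.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Return one grey relational grade per candidate.

    `reference` is the ideal indicator profile and each row of `candidates`
    holds the same indicators for one alternative, all already normalised
    to a common scale (assumed here to be [0, 1]).
    """
    # Absolute deviation of every candidate from the reference, per indicator.
    deltas = [[abs(r - x) for r, x in zip(reference, row)] for row in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate matches the reference exactly.
        return [1.0 for _ in candidates]
    grades = []
    for row in deltas:
        # Grey relational coefficient per indicator, then an unweighted mean.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


# Toy example: three hypothetical minerals scored on three indicators.
ideal = [1.0, 1.0, 1.0]
scores = [[0.9, 0.8, 1.0], [0.4, 0.6, 0.5], [1.0, 1.0, 1.0]]
for grade in grey_relational_grades(ideal, scores):
    print(round(grade, 4))
```

Alternatives are then ranked by grade, with a higher grade meaning a profile closer to the ideal one.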

  11. A Chaos-Based Secure Direct-Sequence/Spread-Spectrum Communication System

    Directory of Open Access Journals (Sweden)

    Nguyen Xuan Quyen

    2013-01-01

    Full Text Available This paper proposes a chaos-based secure direct-sequence/spread-spectrum (DS/SS) communication system which is based on a novel combination of the conventional DS/SS and chaos techniques. In the proposed system, bit duration is varied according to a chaotic behavior but is always equal to a multiple of the fixed chip duration in the communication process. Data bits with variable duration are spectrum-spread by multiplying directly with a pseudonoise (PN) sequence and then modulated onto a sinusoidal carrier by means of binary phase-shift keying (BPSK). To recover exactly the data bits, the receiver needs an identical regeneration of not only the PN sequence but also the chaotic behavior, and hence data security is improved significantly. Structure and operation of the proposed system are analyzed in detail. Theoretical evaluation of bit-error rate (BER) performance in presence of additive white Gaussian noise (AWGN) is provided. Parameter choice for different cases of simulation is also considered. Simulation and theoretical results are shown to verify the reliability and feasibility of the proposed system. Security of the proposed system is also discussed.
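The idea of a chaotically varying bit duration can be sketched at baseband. Everything specific here is an assumption made for illustration and is not taken from the paper: the logistic map as the chaotic generator, the range of 4 to 8 chips per bit, a seeded pseudo-random generator standing in for the PN sequence, and the omission of the BPSK carrier and of channel noise.

```python
import random


def bit_durations(n_bits, x0, r=3.99, min_chips=4, max_chips=8):
    """Map successive logistic-map states to a whole number of chips per bit."""
    x, durations = x0, []
    span = max_chips - min_chips + 1
    for _ in range(n_bits):
        x = r * x * (1.0 - x)  # chaotic state stays inside (0, 1)
        durations.append(min(max_chips, min_chips + int(x * span)))
    return durations


def spread(bits, x0, pn_seed):
    """Spread each bit over its own chaotic number of +/-1 PN chips."""
    pn = random.Random(pn_seed)  # stand-in for a PN sequence generator
    chips = []
    for bit, n in zip(bits, bit_durations(len(bits), x0)):
        symbol = 1 if bit else -1
        chips.extend(symbol * pn.choice((-1, 1)) for _ in range(n))
    return chips


def despread(chips, n_bits, x0, pn_seed):
    """Recover bits by correlating with the regenerated PN chips and durations."""
    pn = random.Random(pn_seed)
    bits, pos = [], 0
    for n in bit_durations(n_bits, x0):
        local = [pn.choice((-1, 1)) for _ in range(n)]
        corr = sum(c * p for c, p in zip(chips[pos:pos + n], local))
        bits.append(1 if corr > 0 else 0)
        pos += n
    return bits


data = [1, 0, 1, 1, 0, 0, 1]
tx = spread(data, x0=0.37, pn_seed=42)
print(despread(tx, len(data), x0=0.37, pn_seed=42) == data)
```

A receiver that regenerates the PN chips but not the chaotic initial state places the bit boundaries wrongly, which is the security argument made in the abstract.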

  12. Effect of multimedia information sequencing on educational outcome in orthodontic training.

    Science.gov (United States)

    Aly, Medhat; Willems, Guy; Van Den Noortgate, Wim; Elen, Jan

    2012-08-01

    The aim of this research was to compare the effectiveness of hierarchical sequencing (HS) versus elaboration sequencing (ES) models in improving educational outcome of clinical knowledge when using instructional multimedia programs in postgraduate orthodontic training. Twenty-four postgraduate and 24 undergraduate dental students participated in this study. The postgraduates were following an orthodontic speciality training programme. The undergraduates were fourth- and fifth-year dental students. Twelve instructional multimedia modules were developed, six logically sequenced (LS) discussing six different orthodontic topics. Another six modules on identical topics were sequenced according to one macro-sequencing (MS) model. The implemented MS model was either HS or ES. The only difference between LS and MS modules was the adopted sequencing model. All participants were assigned into consistent pairs of students and were randomly divided into a test and a control group. In each pair, one student studied the LS module (control group) while the other studied the MS version (test group). Pre- and post-evaluation tests of each pair of participants were performed to measure knowledge, understanding and application of each participant with regard to the discussed topic. A multilevel analysis was conducted to assess the estimated effect of the different sequencing models. The level of significance was set at 0.05. At baseline, no significant differences (P > 0.05) were found in pre-test scores between groups. The HS model showed a significant effect on the scores achieved (P = 0.05). The test group showed a significantly higher estimated probability of correct answers to the questions (P = 0.003) when applying the HS model. The HS model may improve educational outcome when using instructional multimedia programs in postgraduate orthodontic training.

  13. ON THE POWER AND LIMITS OF SEQUENCE SIMILARITY BASED CLUSTERING OF PROTEINS INTO FAMILIES

    DEFF Research Database (Denmark)

    Wiwie, Christian; Röttger, Richard

    2017-01-01

    ... important to also unravel the proteomic repertoire of an organism. A classical computational approach for detecting protein families is a sequence-based similarity calculation coupled with a subsequent cluster analysis. In this work we have intensively analyzed various clustering tools on a large scale. We ... used the data to investigate the behavior of the tools' parameters underlining the diversity of the protein families. Furthermore, we trained regression models for predicting the expected performance of a clustering tool for an unknown data set and aimed to also suggest optimal parameters ... in an automated fashion. Our analysis demonstrates the benefits and limitations of the clustering of proteins with low sequence similarity indicating that each protein family requires its own distinct set of tools and parameters. All results, a tool prediction service, and additional supporting material is also...

  14. Systematic position of Myrtama Ovcz. & Kinz. based on morphological and nrDNA ITS sequence evidence

    Institute of Scientific and Technical Information of China (English)

    ZHANG Daoyuan; ZHANG Yuan; GASKIN J. F.; CHEN Zhiduan

    2006-01-01

    Myrtama is a genus established for Myricaria elegans Royle in the 1970s on the basis of its morphological peculiarities. The establishment of this genus and its systematic position have been disputed since its inception. ITS sequences from 10 species of Tamaricaceae are reported, and analyzed by PAUP 4.0b8 and Bayesian Inference to reconstruct the phylogenies. A single ITS tree is generated from maximum parsimony and MrBayes analyses, respectively. The molecular data set shows strong support for Tamarix and Myricaria as monophyletic genera, and Myrtama as a sister group to the genus Myricaria. Based on morphological differences, a single morphological tree is also generated, in which two major lineages exist but Myrtama is a sister group to Tamarix, rather than Myricaria. The evidence from DNA sequences and morphological characters supports the view that Myricaria elegans should be placed in neither Myricaria nor Tamarix, but kept in its own monotypic genus.

  15. Discussion on validity of Rana maoershanensis based on partial sequence of 16S rRNA gene

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Rana maoershanensis, found on Mt. Maoershan in Guangxi, China, was reported as a new species in 2007, but there were no molecular data for this frog. The partial sequences (543 bp) of the 16S rRNA gene from 12 specimens of 3 brown frog species (Rana hanluica, R. maoershanensis and R. chensinensis) were analyzed together with 17 specimens of 9 species from GenBank. The nucleotide sequence divergence between R. maoershanensis and the other brown frog species was 4.5%-6.5%, with 22-30 nucleotide substitutions at this locus. The phylogenetic relationships based on MP, ML, and Bayesian inference indicate that the brown frogs from southern China diverged into three groups (clades A, B and C). R. maoershanensis clustered together in a well-supported subclade (B-l). It is suggested that R. maoershanensis is a valid species.
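Pairwise sequence divergence of the kind reported here can be illustrated with the simplest estimator, the uncorrected p-distance. The abstract does not say which distance measure or gap treatment was used, so this is only a sketch: the function name, the decision to skip gapped or ambiguous columns, and the toy sequences are assumptions.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected divergence between two aligned sequences.

    Returns (substitutions, compared_sites, proportion). Columns where either
    sequence has a gap or an ambiguity code are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    substitutions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                substitutions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return substitutions, compared, substitutions / compared


# Toy alignment (not real 16S data): one substitution over nine comparable sites.
subs, sites, p = p_distance("ACGTACGT-A", "ACGAACGTTA")
print(subs, sites, round(100 * p, 2))
```

Model-corrected distances (for example Kimura two-parameter) are larger than the p-distance for the same alignment, which is one reason published percentages may differ from a raw substitution count divided by alignment length.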

  16. ITS-2 sequences-based identification of Trichogramma species in South America

    Directory of Open Access Journals (Sweden)

    R. P. Almeida

    Full Text Available Abstract ITS2 (Internal transcribed spacer 2) sequences have been used in systematic studies and proved to be useful in providing a reliable identification of Trichogramma species. rDNA sequences ranged in size from 379 to 632 bp. In eleven T. pretiosum lines Wolbachia-induced parthenogenesis was found for the first time. These thelytokous lines were collected in Peru (9), Colombia (1) and USA (1). A dichotomous key for species identification was built based on the size of the ITS2 PCR product and restriction analysis using three endonucleases (EcoRI, MseI and MaeI). This molecular technique was successfully used to distinguish among seventeen native/introduced Trichogramma species collected in South America.
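A key built on PCR product size and restriction patterns amounts to comparing fragment lengths from an in-silico digest. The sketch below assumes the commonly cited recognition sites and cut positions for the three enzymes (EcoRI G^AATTC, MseI T^TAA, MaeI C^TAG) and a linear amplicon; the function name and the toy sequence are hypothetical and no Trichogramma data are used.

```python
# Recognition site and cut offset within the site on the top strand
# (assumed from commonly cited enzyme specifications).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "MseI": ("TTAA", 1),
    "MaeI": ("CTAG", 1),
}


def fragment_lengths(sequence, enzyme):
    """Lengths of the fragments produced by digesting a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy 18 bp amplicon with one EcoRI site and one MseI site.
amplicon = "AAAGAATTCCCCTTAAGG"
for name in ENZYMES:
    print(name, fragment_lengths(amplicon, name))
```

An uncut product returns a single fragment equal to the amplicon length, so the same function covers the "size of the PCR product" branch of such a key.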

  17. Welding sequences optimization of box structure based on genetic algorithm method

    Institute of Scientific and Technical Information of China (English)

    CUI Xiao-fang; MA Jun; MENG Kai; ZHAO Wen-zhong; ZHAO Hai-yan

    2006-01-01

    In this article, a genetic algorithm method is proposed: a nonlinear three-dimensional numerical optimization model of the box structure is established based on a thermo-mechanical coupling algorithm, and welding distortion is used as the objective function to determine an optimum welding sequence by optimization simulation. The validity of the genetic algorithm method combined with the thermo-mechanical nonlinear finite element model is verified by comparison with experimental data where available. By choosing an appropriate objective function for the case considered, an optimum welding sequence is determined by the genetic algorithm. The results of this study indicate that the new method presented in this article will have important practical application in designing welding technical parameters in the future.

  18. An EST-based genome scan using 454 sequencing in the marine snail Littorina saxatilis.

    Science.gov (United States)

    Galindo, J; Grahame, J W; Butlin, R K

    2010-09-01

    Genome scans have been used in the studies of ecological speciation to find genomic regions ('outlier loci') showing reduced gene flow between divergent populations/species. High-throughput sequencing ('454') offers new opportunities in this field via transcriptome sequencing. Divergent ecotypes of the marine gastropod Littorina saxatilis represent a good example of incipient ecological speciation. We performed a 454-based genome scan between H and M ecotypes of L. saxatilis from the British Isles using cDNA of pooled individuals. Allele frequencies were calculated for 2454 single nucleotide polymorphisms (SNPs), within 572 contigs, and 7% of loci were detected as outliers. Functional annotation of the contigs containing outlier SNPs showed that they included shell matrix and muscle proteins (lithostathine, mucin, titin), proteins involved in energetic metabolism (arginine kinase, NADH dehydrogenase) and reverse transcriptases. Follow-up investigations into these proteins and unannotated outliers will be a promising route in the study of ecological speciation in L. saxatilis.

  19. High-throughput Sequencing Based Immune Repertoire Study during Infectious Disease

    Directory of Open Access Journals (Sweden)

    Dongni Hou

    2016-08-01

    Full Text Available The selectivity of the adaptive immune response is based on the enormous diversity of T and B cell antigen-specific receptors. The immune repertoire, the collection of T and B cells with functional diversity in the circulatory system at any given time, is dynamic and reflects the essence of immune selectivity. In this article, we review the recent advances in immune repertoire studies of infectious diseases achieved by traditional techniques and high-throughput sequencing techniques. High-throughput sequencing techniques enable the determination of complementary regions of lymphocyte receptors with unprecedented efficiency and scale. This progress in methodology enhances the understanding of immunologic changes during pathogen challenge, and also provides a basis for further development of novel diagnostic markers, immunotherapies and vaccines.

  20. Phylogenetic analysis of Silphium and subtribe Engelmanniinae (Asteraceae: Heliantheae) based on ITS and ETS sequence data.

    Science.gov (United States)

    Clevinger, J A; Panero, J L

    2000-04-01

    The phylogenetic relationships of Silphium and subtribe Engelmanniinae were examined using DNA sequence data. The internal transcribed spacer (ITS) region and the external transcribed spacer (ETS) region were sequenced for 39 specimens representing the six genera of subtribe Engelmanniinae (Berlandiera, Chrysogonum, Dugesia, Engelmannia, Lindheimera, and Silphium), plus five additional genera identified as closely related to the Engelmanniinae by chloroplast DNA restriction site analysis, and three outgroups. Phylogenetic analysis supported the monophyly of Silphium with Lindheimera as sister. Silphium can be divided into two sections based upon two well-supported clades that correspond to root type and growth form. These results also supported the expansion of subtribe Engelmanniinae to include Balsamorhiza, Borrichia, Rojasianthe, Vigethia, and Wyethia. We hypothesize that subtribe Engelmanniinae originated in Mesoamerica and later radiated to the United States. We suggest that the cypsela complex, which is present in Berlandiera, Chrysogonum, Engelmannia, and Lindheimera, arose only once and was subsequently lost in Silphium.

  1. Statistical framework for detection of genetically modified organisms based on Next Generation Sequencing.

    Science.gov (United States)

    Willems, Sander; Fraiture, Marie-Alice; Deforce, Dieter; De Keersmaecker, Sigrid C J; De Loose, Marc; Ruttink, Tom; Herman, Philippe; Van Nieuwerburgh, Filip; Roosens, Nancy

    2016-02-01

    Because the number and diversity of genetically modified (GM) crops has significantly increased, their analysis based on real-time PCR (qPCR) methods is becoming increasingly complex and laborious. While several pioneers already investigated Next Generation Sequencing (NGS) as an alternative to qPCR, its practical use has not been assessed for routine analysis. In this study a statistical framework was developed to predict the number of NGS reads needed to detect transgene sequences, to prove their integration into the host genome and to identify the specific transgene event in a sample with known composition. This framework was validated by applying it to experimental data from food matrices composed of pure GM rice, processed GM rice (noodles) or a 10% GM/non-GM rice mixture, revealing some influential factors. Finally, feasibility of NGS for routine analysis of GM crops was investigated by applying the framework to samples commonly encountered in routine analysis of GM crops.
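The question of how many reads are needed to see a transgene can be illustrated with a deliberately simple sampling model. This is not the authors' statistical framework: it assumes reads are drawn uniformly at random from the genome, that a read counts as a hit if it overlaps the target by at least one base, and that one hit is enough; the function name and all parameter values are hypothetical.

```python
import math


def reads_needed(target_bp, genome_bp, gm_fraction, read_bp, confidence=0.95):
    """Smallest read count N such that P(at least one read overlaps the
    target) >= confidence, under uniform random sampling of read starts."""
    # A read of length read_bp overlaps the target if it starts in a window
    # of target_bp + read_bp - 1 positions; only the GM share carries it.
    p_hit = gm_fraction * (target_bp + read_bp - 1) / genome_bp
    if not 0.0 < p_hit < 1.0:
        raise ValueError("per-read hit probability must lie in (0, 1)")
    # Solve 1 - (1 - p_hit) ** N >= confidence for N.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_hit))


# Hypothetical numbers: a 3 kb insert in a 400 Mb genome, 100 bp reads.
pure = reads_needed(3000, 400_000_000, gm_fraction=1.0, read_bp=100)
mixed = reads_needed(3000, 400_000_000, gm_fraction=0.1, read_bp=100)
print(pure, mixed)
```

Under this model a 10% GM mixture needs roughly ten times as many reads as pure GM material; proving integration or identifying a specific event requires reads spanning a junction, which is a much smaller target.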

  2. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Directory of Open Access Journals (Sweden)

    Luo Ming-Cheng

    2011-01-01

    Background: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results: An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA

  3. The effect of disc inclination on the main sequence of star-forming galaxies

    Science.gov (United States)

    Morselli, L.; Renzini, A.; Popesso, P.; Erfanianfar, G.

    2016-11-01

    We use the Sloan Digital Sky Survey (York et al.) database to explore the effect of the disc inclination angle on the derived star formation rate (SFR), hence on the slope and width of the main-sequence (MS) relation for star-forming galaxies. We find that SFRs for nearly edge-on discs are underestimated by factors ranging from ~0.2 dex for low-mass galaxies up to ~0.4 dex for high-mass galaxies. This results in a substantially flatter MS relation for high-inclination discs compared to that for less inclined ones, though the global effect over the whole sample of star-forming galaxies is relatively minor, given the small fraction of high-inclination discs. However, we also find that galaxies with high-inclination discs represent a non-negligible fraction of galaxies populating the so-called green valley, with derived SFRs intermediate between the MS and those of quenched, passively evolving galaxies.
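    For readers less familiar with the dex unit used above: an offset of x dex corresponds to a multiplicative factor of 10^x. The short sketch below converts the quoted offsets into linear factors (roughly 1.6 and 2.5); the function name is illustrative only.

```python
def dex_to_factor(dex):
    """Convert a logarithmic offset in dex to a linear multiplicative factor."""
    return 10.0 ** dex


if __name__ == "__main__":
    for offset in (0.2, 0.4):
        factor = dex_to_factor(offset)
        print(f"{offset} dex underestimate -> true SFR about {factor:.2f}x the derived value")
```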

  4. Implicit Structured Sequence Learning: An FMRI Study of the Structural Mere-Exposure Effect

    Directory of Open Access Journals (Sweden)

    Vasiliki Folia

    2014-02-01

    In this event-related FMRI study we investigated the effect of five days of implicit acquisition on preference classification by means of an artificial grammar learning (AGL) paradigm based on the structural mere-exposure effect and preference classification using a simple right-linear unification grammar. This allowed us to investigate implicit AGL in a proper learning design by including baseline measurements prior to grammar exposure. After 5 days of implicit acquisition, the FMRI results showed activations in a network of brain regions including the inferior frontal (centered on BA 44/45) and the medial prefrontal regions (centered on BA 8/32). Importantly, and central to this study, the inclusion of a naive preference FMRI baseline measurement allowed us to conclude that these FMRI findings were the intrinsic outcomes of the learning process itself and not a reflection of a preexisting functionality recruited during classification, independent of acquisition. Support for the implicit nature of the knowledge utilized during preference classification on day 5 comes from the fact that the basal ganglia, associated with implicit procedural learning, were activated during classification, while the medial temporal lobe system, associated with explicit declarative memory, was consistently deactivated. Thus, preference classification in combination with structural mere-exposure can be used to investigate structural sequence processing (syntax) in unsupervised AGL paradigms with proper learning designs.

  5. TFpredict and SABINE: sequence-based prediction of structural and functional characteristics of transcription factors.

    Directory of Open Access Journals (Sweden)

    Johannes Eichner

    Among the key mechanisms of transcriptional control are the specific connections between transcription factors (TFs) and cis-regulatory elements in gene promoters. The elucidation of these specific protein-DNA interactions is crucial to gain insights into the complex regulatory mechanisms and networks underlying the adaptation of organisms to dynamically changing environmental conditions. As experimental techniques for determining TF binding sites are expensive and mostly performed for selected TFs only, accurate computational approaches are needed to analyze transcriptional regulation in eukaryotes on a genome-wide level. We implemented a four-step classification workflow which for a given protein sequence (1) discriminates TFs from other proteins, (2) determines the structural superclass of TFs, (3) identifies the DNA-binding domains of TFs and (4) predicts their cis-acting DNA motif. While existing tools were extended and adapted for performing the latter two prediction steps, the first two steps are based on a novel numeric sequence representation which allows for combining existing knowledge from a BLAST scan with robust machine learning-based classification. By evaluation on a set of experimentally confirmed TFs and non-TFs, we demonstrate that our new protein sequence representation facilitates more reliable identification and structural classification of TFs than previously proposed sequence-derived features. The algorithms underlying our proposed methodology are implemented in the two complementary tools TFpredict and SABINE. The online and stand-alone versions of TFpredict and SABINE are freely available to academics at http://www.cogsys.cs.uni-tuebingen.de/software/TFpredict/ and http://www.cogsys.cs.uni-tuebingen.de/software/SABINE/.

  6. Consolidating the effects of waking and sleep on motor-sequence learning.

    Science.gov (United States)

    Brawn, Timothy P; Fenn, Kimberly M; Nusbaum, Howard C; Margoliash, Daniel

    2010-10-20

    Sleep is widely believed to play a critical role in memory consolidation. Sleep-dependent consolidation has been studied extensively in humans using an explicit motor-sequence learning paradigm. In this task, performance has been reported to remain stable across wakefulness and improve significantly after sleep, making motor-sequence learning the definitive example of sleep-dependent enhancement. Recent work, however, has shown that enhancement disappears when the task is modified to reduce task-related inhibition that develops over a training session, thus questioning whether sleep actively consolidates motor learning. Here we use the same motor-sequence task to demonstrate sleep-dependent consolidation for motor-sequence learning and explain the discrepancies in results across studies. We show that when training begins in the morning, motor-sequence performance deteriorates across wakefulness and recovers after sleep, whereas performance remains stable across both sleep and subsequent waking with evening training. This pattern of results challenges an influential model of memory consolidation defined by a time-dependent stabilization phase and a sleep-dependent enhancement phase. Moreover, the present results support a new account of the behavioral effects of waking and sleep on explicit motor-sequence learning that is consistent across a wide range of tasks. These observations indicate that current theories of memory consolidation that have been formulated to explain sleep-dependent performance enhancements are insufficient to explain the range of behavioral changes associated with sleep.

  7. Fast clinical molecular diagnosis of hyperphenylalaninemia using next-generation sequencing-based on a custom AmpliSeq™ panel and Ion Torrent PGM sequencing.

    Science.gov (United States)

    Cao, Yan-yan; Qu, Yu-jin; Song, Fang; Zhang, Ting; Bai, Jin-li; Jin, Yu-wei; Wang, Hong

    2014-12-01

    Hyperphenylalaninemia (HPA) can be classified into phenylketonuria (PKU) and tetrahydrobiopterin deficiency (BH4D), according to the defect of enzyme activity, both of which vary substantially in severity, treatment, and prognosis of the disease. To set up a fast and comprehensive assay in order to achieve early etiological diagnosis and differential diagnosis for children with HPA, we designed a custom AmpliSeq™ panel for the sequencing of coding DNA sequence (CDS), flanking introns, 5' untranslated region (UTR) and 3' UTR from five HPA-causing genes (PAH, PTS, QDPR, GCH1, and PCBD1) using the Ion Torrent Personal Genome Machine (PGM) Sequencer. A standard group of 15 samples with previously known DNA sequences and a test group of 37 HPA patients with unknown mutations were used for assay validation and application, respectively. All variations were confirmed by Sanger sequencing. In the standard group, all the known mutations were detected and were consistent with the results of previous Sanger sequencing. In the test group, we identified mutations in 71 of 74 alleles, with a mutation detection rate of 95.9%. We also found a frame shift deletion p.Ile25Metfs*13 in PAH that was previously unreported. In addition, 1 of 37 in the test group was inconsistent with either the molecular diagnosis or clinical diagnosis by traditional differential methods. In conclusion, our comprehensive assay based on a custom AmpliSeq™ panel and Ion Torrent PGM sequencing has wider coverage, higher throughput, is much faster, and more efficient when compared with the traditional molecular detection method for HPA patients, which could meet the medical need for individualized diagnosis and treatment.

  8. Development of Expressed Sequence Tag (EST)-based Cleaved Amplified Polymorphic Sequence (CAPS) markers of tea plant and their application to cultivar identification.

    Science.gov (United States)

    Ujihara, Tomomi; Taniguchi, Fumiya; Tanaka, Jun-Ichi; Hayashi, Nobuyuki

    2011-03-09

    To develop cleaved amplified polymorphic sequence (CAPS) markers for cultivar identification of the tea leaf, 5 primer pairs designed on the basis of genes that encode proteins related to nitrogen assimilation and 26 primer pairs based on expressed sequence tag (EST) sequences of the root of tea plant were screened. From combinations of primer pair and restriction enzyme that showed polymorphism among tea plants, 16 markers were selected and applied to DNA fingerprinting of Japanese tea cultivars. Sixty-three cultivars, except for a bud sport (Kiraka) and its original cultivar (Yabukita) and a pair that was the progeny of the same crossing parent (Harumoegi and Sakimidori), were distinguished from one another. By combining the 16 markers with previously developed CAPS markers and observing the physical appearance, 67 cultivars were distinguishable. The cultivars involve approximately 95% of total tea cultivating area in Japan; therefore, about 95% of tea leaves produced in Japan can be authenticated by labeling their cultivars.

  9. Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray).

    Science.gov (United States)

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-02-01

    This report is among the first using sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed by protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed, including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase-negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  10. Genomic clones of bovine parvovirus: Construction and effect of deletions and terminal sequence inversions on infectivity

    Energy Technology Data Exchange (ETDEWEB)

    Shull, B.C.; Chen, K.C.; Lederman, M.; Stout, E.R.; Bates, R.C. (Virginia Polytechnic Institute and State Univ., Blacksburg (USA))

    1988-02-01

    Genomic clones of the autonomous parvovirus bovine parvovirus (BPV) were constructed by blunt-end ligation of reannealed virion plus and minus DNA strands into the plasmid pUC8. These clones were stable during propagation in Escherichia coli JM107. All clones tested were found to be infectious by the criteria of plaque titer and progressive cytopathic effect after transfection into bovine fetal lung cells. Sequencing of the recombinant plasmids demonstrated that all of the BPV inserts had left-end (3′)-terminal deletions of up to 34 bases. Defective genomes could also be detected in the progeny DNA even though the infection was initiated with homogeneous, cloned DNA. Full-length genomic clones with 3′ flip and 3′ flop conformations were constructed and were found to have equal infectivity. Expression of capsid proteins from transfected genomes was demonstrated by hemagglutination, indirect immunofluorescence, and immunoprecipitation of (³⁵S)methionine-labeled cell lysates. Use of appropriate antiserum for immunoprecipitation showed the synthesis of BPV capsid and noncapsid proteins after transfection. Independently, a series of genomic clones with increasingly larger 3′-terminal deletions was prepared from separately subcloned 3′-terminal fragments. Transfection of these clones into bovine fetal lung cells revealed that deletions of up to 34 bases at the 3′ end lowered but did not abolish infectivity, while deletions of greater than 52 bases were lethal. End-label analysis showed that the 34-base deletion was repaired to wild-type length in the progeny virus.

  11. A Study on the Effect of Welding Sequence in Fabrication of Large Stiffened Plate Panels

    Institute of Scientific and Technical Information of China (English)

    Pankaj Biswas; D.Anil Kumar; N.R.Mandal; M.M.Mahapatra

    2011-01-01

    Welding sequence has a significant effect on the distortion pattern of large orthogonally stiffened panels normally used in ships and offshore structures. These deformations adversely affect the subsequent fit-up and alignment of the adjacent panels. They may also result in loss of structural integrity. These panels primarily suffer from angular and buckling distortions. The extent of distortion depends on several parameters such as welding speed, plate thickness, welding current, voltage, restraints applied to the job while welding, thermal history, as well as sequence of welding. Numerical modeling of welding and experimental validation of the FE model have been carried out for estimation of thermal history and resulting distortions. In the present work an FE model has been developed for studying the effect of welding sequence on the distortion pattern and its magnitude in fabrication of orthogonally stiffened plate panels.

  12. Influence of Single Base Change in Shine-Dalgarno Sequence on the Stability of B. subtilis Plasmid PSM604

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    B. subtilis expression plasmids generally require a stringent Shine-Dalgarno sequence (SDS). Site-directed mutagenesis was used to change the Shine-Dalgarno sequence from AAAAATGGGG (mutant type) to AAAAAGGGGG (wild type) in recombinant plasmid PSM604. The single base substitution made the plasmid with the wild-type SDS unstable in structure and segregation. The interaction of the SDS with the subtilisin leader sequence of PSM604 might be responsible for the instability of the plasmid.

  13. Robust design of an optical router based on a tapered side-coupled integrated spaced sequence of optical resonators.

    Science.gov (United States)

    Bettotti, P; Mancinelli, M; Guider, R; Masi, M; Vanacharla, M Rao; Pavesi, L

    2011-04-15

    A novel (to our knowledge) scheme of an optical router/switch element, composed of a tapered side-coupled integrated spaced sequence of optical resonators, is proposed. It is based on a modified design of the ring sequence in which the resonance conditions are set by the single ring resonance and by the coherent feedback of the sequence of rings. This double condition yields robustness against fabrication defects, dense routing capability, and high switching efficiency.

  14. A novel predictor for protein structural class based on integrated information of the secondary structure sequence.

    Science.gov (United States)

    Zhang, Lichao; Zhao, Xiqiang; Kong, Liang; Liu, Shuxia

    2014-08-01

    The structural class has become one of the most important features for characterizing the overall folding type of a protein and played important roles in many aspects of protein research. At present, it is still a challenging problem to accurately predict protein structural class for low-similarity sequences. In this study, an 18-dimensional integrated feature vector is proposed by fusing the information about content and position of the predicted secondary structure elements. The consistently high accuracies of jackknife and 10-fold cross-validation tests on different low-similarity benchmark datasets show that the proposed method is reliable and stable. Comparison of our results with other methods demonstrates that our method is an effective computational tool for protein structural class prediction, especially for low-similarity sequences.

  15. Identifying and calling insertions, deletions, and single-base mutations efficiently from sequence data

    Science.gov (United States)

    Whole genome sequencing studies can directly identify causative mutations for subsequent use in genomic evaluations, but sequence variant identification is a lengthy and sometimes inaccurate process. The speed and accuracy of identifying small insertions and deletions of sequence, collectively terme...

  16. Thermoelectric effect and its dependence on molecular length and sequence in single DNA molecules.

    Science.gov (United States)

    Li, Yueqi; Xiang, Limin; Palma, Julio L; Asai, Yoshihiro; Tao, Nongjian

    2016-01-01

    Studying the thermoelectric effect in DNA is important for unravelling charge transport mechanisms and for developing relevant applications of DNA molecules. Here we report a study of the thermoelectric effect in single DNA molecules. By varying the molecular length and sequence, we tune the charge transport in DNA to either a hopping- or tunnelling-dominated regime. The thermoelectric effect is small and insensitive to the molecular length in the hopping regime. In contrast, the thermoelectric effect is large and sensitive to the length in the tunnelling regime. These findings indicate that one may control the thermoelectric effect in DNA by varying its sequence and length. We describe the experimental results in terms of hopping and tunnelling charge transport models.

  17. Comparative study of the validity of three regions of the 18S-rRNA gene for massively parallel sequencing-based monitoring of the planktonic eukaryote community.

    Science.gov (United States)

    Tanabe, Akifumi S; Nagai, Satoshi; Hida, Kohsuke; Yasuike, Motoshige; Fujiwara, Atushi; Nakamura, Yoji; Takano, Yoshihito; Katakura, Seiji

    2016-03-01

    The nuclear 18S-rRNA gene has been used as a metabarcoding marker in massively parallel sequencing (MPS)-based environmental surveys for plankton biodiversity research. However, different hypervariable regions have been used in different studies, and their utility has been debated among researchers. In this study, detailed investigations into 18S-rRNA were carried out; we investigated the effective number of sequences deposited in international nucleotide sequence databases (INSDs), the amplification bias, and the amplicon sequence variability among the three variable regions, V1-3, V4-5 and V7-9, using in silico polymerase chain reaction (PCR) amplification based on INSDs. We also examined the primer universality and the taxonomic identification power, using MPS-based environmental surveys in the Sea of Okhotsk, to determine which region is more useful for MPS-based monitoring. The primer universality was not significantly different among the three regions, but the number of sequences deposited in INSDs was markedly larger for the V4-5 region than for the other two regions. The sequence variability was significantly different, with the highest variability in the V1-3 region, followed by the V7-9 region, and the lowest variability in the V4-5 region. The results of the MPS-based environmental surveys showed significantly higher identification power in the V1-3 and V7-9 regions than in the V4-5 region, but no significant difference was detected between the V1-3 and V7-9 regions. We therefore conclude that the V1-3 region will be the most suitable for future MPS-based monitoring of natural eukaryote communities, as the number of sequences deposited in INSDs increases.

  18. Automatic Tracing and Segmentation of Rat Mammary Fat Pads in MRI Image Sequences Based on Cartoon-Texture Model

    Institute of Scientific and Technical Information of China (English)

    TU Shengxian; ZHANG Su; CHEN Yazhu; Freedman Matthew T; WANG Bin; XUAN Jason; WANG Yue

    2009-01-01

    The growth patterns of mammary fat pads and glandular tissues inside the fat pads may be related to the risk factors of breast cancer. Quantitative measurements of this relationship are available after segmentation of mammary pads and glandular tissues. Rat fat pads may lose continuity along image sequences or adjoin similar intensity areas like epidermis and subcutaneous regions. A new approach for automatic tracing and segmentation of fat pads in magnetic resonance imaging (MRI) image sequences is presented, which does not require that the number of pads be constant or the spatial location of pads be adjacent among image slices. First, each image is decomposed into a cartoon image and a texture image based on the cartoon-texture model. They will be used as smooth image and feature image for segmentation and for targeting pad seeds, respectively. Then, two-phase direct energy segmentation based on the Chan-Vese active contour model is applied to partition the cartoon image into a set of regions, from which the pad boundary is traced iteratively from the pad seed. A tracing algorithm based on scanning order is proposed to accurately trace the pad boundary, which effectively removes the epidermis attached to the pad without any post-processing and also solves the problem of over-segmentation of some small holes inside the pad. The experimental results demonstrate the utility of this approach in accurate delineation of various numbers of mammary pads from several sets of MRI images.

  19. Similarity Estimation Between DNA Sequences Based on Local Pattern Histograms of Binary Images

    Institute of Scientific and Technical Information of China (English)

    Yusei Kobori; Satoshi Mizuta

    2016-01-01

    Graphical representation of DNA sequences is one of the most popular techniques for alignment-free sequence comparison. Here, we propose a new method for the feature extraction of DNA sequences represented by binary images, by estimating the similarity between DNA sequences using the frequency histograms of local bitmap patterns of images. Our method shows linear time complexity for the length of DNA sequences, which is practical even when long sequences, such as whole genome sequences, are compared. We tested five distance measures for the estimation of sequence similarities, and found that the histogram intersection and Manhattan distance are the most appropriate ones for phylogenetic analyses.
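    The two distance measures singled out above can be sketched on any pair of normalized histograms. The example below is a minimal illustration only: it does not reproduce the binary-image representation or the local bitmap patterns of the paper, and instead uses k-mer frequency histograms as a stand-in feature. All function names are hypothetical.

```python
from collections import Counter


def normalized_histogram(seq, k=2):
    """Relative frequencies of overlapping k-mers (a stand-in feature,
    not the local bitmap patterns used in the paper)."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def manhattan_distance(h1, h2):
    """Sum of absolute bin differences; 0 for identical histograms."""
    bins = set(h1) | set(h2)
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in bins)


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1 for identical normalized histograms."""
    bins = set(h1) & set(h2)
    return sum(min(h1[b], h2[b]) for b in bins)


if __name__ == "__main__":
    a = normalized_histogram("ACGTACGTAC")
    b = normalized_histogram("ACGTTTGTAC")
    print(manhattan_distance(a, b), histogram_intersection(a, b))
```

    For histograms that each sum to one, the two measures are directly related (intersection = 1 - Manhattan distance / 2), so they rank sequence pairs identically in this normalized setting.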

  20. The phylogenetic status of Paxillosida (Asteroidea) based on complete mitochondrial DNA sequences.

    Science.gov (United States)

    Matsubara, Mioko; Komatsu, Miéko; Araki, Takeyoshi; Asakawa, Shuichi; Yokobori, Shin-ichi; Watanabe, Kimitsuna; Wada, Hiroshi

    2005-09-01

    One of the most important issues in asteroid phylogeny is the phylogenetic status of Paxillosida. This group lacks an anus and suckers on the tube feet in adults and does not develop the brachiolaria stage in early development. Two controversial hypotheses have been proposed for the phylogenetic status of Paxillosida, i.e., Paxillosida is primitive or rather specialized in asteroids. In this study, we determined the complete mitochondrial DNA nucleotide sequences from two paxillosidans (Astropecten polyacanthus and Luidia quinaria) and one forcipulatidan (Asterias amurensis). The mitochondrial genomes of the three asteroids were identical with respect to gene order and transcription direction, and were identical to the previously reported mitochondrial genomes of Asterina pectinifera (Valvatida) and Pisaster ochraceus (Forcipulatida) in this respect. Therefore, the comparison of genome structures was uninformative for the purposes of asteroid phylogeny. However, molecular phylogenetic analyses based on the amino acid sequences and the nucleotide sequences from the five asteroids supported the monophyly of the clade that included the two paxillosidans and Asterina. This suggests that the paxillosidan characters are secondarily derived ones.

  1. A FRET Biosensor for ROCK Based on a Consensus Substrate Sequence Identified by KISS Technology.

    Science.gov (United States)

    Li, Chunjie; Imanishi, Ayako; Komatsu, Naoki; Terai, Kenta; Amano, Mutsuki; Kaibuchi, Kozo; Matsuda, Michiyuki

    2017-01-11

    Genetically encoded biosensors based on Förster/fluorescence resonance energy transfer (FRET) are versatile tools for studying the spatio-temporal regulation of signaling molecules not only within cells but also within tissues. Perhaps the hardest task in the development of a FRET biosensor for protein kinases is to identify the kinase-specific substrate peptide to be used in the FRET biosensor. To solve this problem, we took advantage of kinase-interacting substrate screening (KISS) technology, which deduces a consensus substrate sequence for the protein kinase of interest. Here, we show that a consensus substrate sequence for ROCK identified by KISS yielded a FRET biosensor for ROCK, named Eevee-ROCK, with high sensitivity and specificity. By treating HeLa cells with inhibitors or siRNAs against ROCK, we show that a substantial part of the basal FRET signal of Eevee-ROCK was derived from the activities of ROCK1 and ROCK2. Eevee-ROCK readily detected ROCK activation by epidermal growth factor, lysophosphatidic acid, and serum. When cells stably expressing Eevee-ROCK were time-lapse imaged for three days, ROCK activity was found to increase after the completion of cytokinesis, concomitant with the spreading of cells. Eevee-ROCK also revealed a gradual increase in ROCK activity during apoptosis. Thus, Eevee-ROCK, which was developed from a substrate sequence predicted by the KISS technology, will pave the way to a better understanding of the function of ROCK in a physiological context.

  2. Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for illumina genome sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Angelova, Angelina [University of Arizona; Park, Sang-Hycuk [University of Arizona; Kyndt, John [Bellevue University; Fitzsimmons, Kevin [University of Arizona; Brown, Judith K [University of Arizona

    2013-09-01

    With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel, utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint to chloroplast transformation and to studying gene expression in TAG pathways. In this study, the intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (84.58 kb) in size, with a GC content of 30.8%. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.

  3. Refinement of Bos taurus sequence assembly based on BAC-FISH experiments

    Directory of Open Access Journals (Sweden)

    Partipilo Giulia

    2011-12-01

    Background: The sequencing of the cow genome was recently published (Btau_4.0 assembly). A second, alternate cow genome assembly (UMD2), based on the same raw sequence data, was also published. The two assemblies have been subsequently updated to Btau_4.2 and UMD3.1, respectively. Results: We compared the Btau_4.2 and UMD3.1 alternate assemblies. Inconsistencies were grouped into three main categories: (i) DNA segments showing almost coincidental chromosomal mapping but discordant orientation (inversions); (ii) DNA segments showing a discordant map position along the same chromosome; and (iii) sequences present in one chromosomal assembly but absent in the corresponding chromosome of the other assembly. The latter category mainly consisted of large amounts of scaffolds that were unassigned in Btau_4.2 but successfully mapped in UMD3.1. We sampled 70 inconsistencies and identified appropriate cow BACs for each of them. These clones were then utilized in FISH experiments on cow metaphase or interphase nuclei in order to disambiguate the discrepancies. In almost all instances the FISH results agreed with the UMD3.1 assembly. Occasionally, however, the mapping data of both assemblies were discordant with the FISH results. Conclusions: Our work demonstrates how FISH, which is assembly independent, can be efficiently used to solve assembly problems frequently encountered using the shotgun approach.

  4. Brain bases of working memory for time intervals in rhythmic sequences

    Directory of Open Access Journals (Sweden)

    Sundeep eTeki

    2016-06-01

    Full Text Available Perception of auditory time intervals is critical for accurate comprehension of natural sounds like speech and music. However, the neural substrates and mechanisms underlying the representation of time intervals in working memory are poorly understood. In this study, we investigate the brain bases of working memory for time intervals in rhythmic sequences using functional magnetic resonance imaging. We used a novel behavioral paradigm to investigate time-interval representation in working memory as a function of the temporal jitter and memory load of the sequences containing those time intervals. Human participants were presented with a sequence of intervals and required to reproduce the duration of a particular probed interval. We found that perceptual timing areas including the cerebellum and the striatum were more or less active as a function of increasing and decreasing jitter, respectively, of the intervals held in working memory, whilst the activity of the inferior parietal cortex was modulated as a function of memory load. Additionally, we analyzed structural correlations between grey and white matter density and behavior and found significant correlations in the cerebellum and the striatum, mirroring the functional results. Our data demonstrate neural substrates of working memory for time intervals and suggest that the cerebellum and the striatum represent core areas for representing temporal information in working memory.

  5. A CLIQUE algorithm using DNA computing techniques based on closed-circle DNA sequences.

    Science.gov (United States)

    Zhang, Hongyan; Liu, Xiyu

    2011-07-01

    DNA computing has been applied in broad fields such as graph theory, finite state problems, and combinatorial problems. DNA computing approaches are well suited to solving many combinatorial problems because of their vast parallelism and high-density storage. The CLIQUE algorithm is one of the grid-based clustering techniques for spatial data. It is a combinatorial problem over the dense cells. Therefore we utilize DNA computing using closed-circle DNA sequences to execute the CLIQUE algorithm for two-dimensional data. In our study, the process of clustering becomes a parallel bio-chemical reaction, and the DNA sequences representing the marked cells can be combined to form closed-circle DNA sequences. This strategy is a new application of DNA computing. Although the strategy applies only to two-dimensional data, it provides a new idea: to consider the grid cells as vertices in a graph and transform the search problem into a combinatorial problem.
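
    The grid-and-graph idea in this abstract can be illustrated conventionally, without any DNA chemistry. The sketch below is not the authors' method: it is a minimal two-dimensional, CLIQUE-style clustering in Python that treats dense grid cells as graph vertices and joins cells that share an edge. The function name `clique_2d` and the parameters `cell_size` and `density_threshold` are assumptions made for illustration.

```python
from collections import Counter, deque

def clique_2d(points, cell_size, density_threshold):
    """Cluster 2D points by joining adjacent dense grid cells (illustrative only)."""
    # Assign each point to a grid cell identified by integer (column, row) indices.
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    # A cell is "dense" when it holds at least density_threshold points.
    dense = {cell for cell, n in counts.items() if n >= density_threshold}

    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        # Breadth-first search over dense cells that share an edge:
        # each connected component of the cell graph is one cluster.
        component, queue = [], deque([start])
        seen.add(start)
        while queue:
            cx, cy = queue.popleft()
            component.append((cx, cy))
            for nxt in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nxt in dense and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        clusters.append(sorted(component))
    return clusters

points = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.5),
          (5.1, 5.1), (5.2, 5.3), (9.0, 9.0)]
clusters = clique_2d(points, cell_size=1.0, density_threshold=2)
```

    In this toy input the isolated point at (9.0, 9.0) falls in a sparse cell and is left out of every cluster, while the two neighbouring dense cells near the origin merge into one cluster.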

  6. Innovative molecular diagnosis of Trichinella species based on β-carbonic anhydrase genomic sequence.

    Science.gov (United States)

    Zolfaghari Emameh, Reza; Kuuslahti, Marianne; Näreaho, Anu; Sukura, Antti; Parkkila, Seppo

    2016-03-01

    Trichinellosis is a helminthic infection in which different species of Trichinella nematodes are the causative agents. Several molecular assays have been designed to aid diagnostics of trichinellosis. These assays are mostly complex and expensive. The genomes of Trichinella species contain certain parasite-specific genes, which can be detected by polymerase chain reaction (PCR) methods. We selected the β-carbonic anhydrase (β-CA) gene as a target, because it is present in many parasite genomes but absent in vertebrates. We developed a novel β-CA gene-based method for detection of Trichinella larvae in biological samples. We first identified a β-CA protein sequence from Trichinella spiralis by bioinformatic tools using β-CAs from Caenorhabditis elegans and Drosophila melanogaster. Thereafter, 16 sets of designed primers were tested to detect β-CA genomic sequences from three species of Trichinella: T. spiralis, Trichinella pseudospiralis and Trichinella nativa. Among the 16 sets of designed primers, primer set No. 2 efficiently amplified β-CA genomic sequences from T. spiralis, T. pseudospiralis and T. nativa without any false-positive amplicons from other parasite samples including Toxoplasma gondii, Toxocara cati and Parascaris equorum. This robust and straightforward method could be useful for meat inspection in slaughterhouses, quality control by food authorities and medical laboratories.

  7. SVM-PB-Pred: SVM based protein block prediction method using sequence profiles and secondary structures.

    Science.gov (United States)

    Suresh, V; Parthasarathy, S

    2014-01-01

    We developed a support vector machine based web server called SVM-PB-Pred to predict the Protein Block for any given amino acid sequence. The input features of SVM-PB-Pred include i) sequence profiles (PSSM) and ii) actual secondary structures (SS) from the DSSP method or predicted secondary structures from the NPS@ and GOR4 methods. Three combined input features, PSSM+SS(DSSP), PSSM+SS(NPS@) and PSSM+SS(GOR4), were used to train and test the SVM models. Similarly, four datasets, RS90, DB433, LI1264 and SP1577, were used to develop the SVM models. The four SVM models were tested using three different benchmarking tests, namely (i) a self-consistency test, (ii) a seven-fold cross-validation test and (iii) an independent case test. The maximum prediction accuracy of ~70% was observed in the self-consistency test for the SVM models of both the LI1264 and SP1577 datasets, where PSSM+SS(DSSP) input features were used for testing. The prediction accuracies were reduced to ~53% for PSSM+SS(NPS@) and ~43% for PSSM+SS(GOR4) in the independent case test, for the SVM models of the same two datasets. Using our method, it is possible to predict the protein block letters for any query protein sequence with ~53% accuracy, when the SP1577 dataset and predicted secondary structure from the NPS@ server are used. The SVM-PB-Pred server can be freely accessed through http://bioinfo.bdu.ac.in/~svmpbpred.

  8. A Flexible Approach to Modelling Adaptive Course Sequencing based on Graphs implemented using XLink

    Directory of Open Access Journals (Sweden)

    Rachid ELOUAHBI

    2012-02-01

    Full Text Available A major challenge in developing distance learning systems is the ability to adapt learning to individual users. This adaptation requires a flexible scheme for sequencing the material to teach diverse learners. Our contribution is to model the personalized learning paths that a learner follows to achieve his/her educational objective. Our approach to modelling sequencing is based on a pedagogical graph called SMARTGraph. This graph expresses the totality of the pedagogic constraints to which the learner is subject in order to achieve his/her pedagogic objective. SMARTGraph is a graph in which the nodes are the learning units and the arcs are the pedagogic constraints between learning units. We show how the learning units and the learning paths can be organized to meet expectations within the framework of individual courses, according to the learner profile, or within the framework of group courses. To implement our approach we exploit the strength of XLink (XML Linking Language) to define the sequencing graph.
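
    A graph of learning units and constraints of this kind can be sketched in a few lines. The example below is not SMARTGraph itself and does not use XLink: it assumes, purely for illustration, that every pedagogic constraint is a simple prerequisite relation, and it derives one admissible learning path with Python's standard `graphlib` module (Python 3.9+). The unit names and the `learning_path` helper are hypothetical.

```python
from graphlib import TopologicalSorter

def learning_path(prerequisites, completed=frozenset()):
    """Return one unit ordering that respects every prerequisite arc.

    prerequisites maps each learning unit to the set of units that must
    be studied before it (an assumed reading of "pedagogic constraint").
    Units the learner has already completed are skipped.
    """
    order = TopologicalSorter(prerequisites).static_order()
    return [unit for unit in order if unit not in completed]

# Hypothetical course: nodes are learning units, arcs are prerequisites.
course = {
    "intro": set(),
    "loops": {"intro"},
    "functions": {"intro"},
    "project": {"loops", "functions"},
}

full_path = learning_path(course)
remaining = learning_path(course, completed={"intro"})
```

    Passing a different `completed` set per learner gives a crude form of the profile-based adaptation the abstract describes; richer constraint types would need more than a topological sort.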

  9. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    Science.gov (United States)

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval presents significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as a performance evaluation are presented to validate the feasibility of the proposed approach.

  10. Genotyping of B. licheniformis based on a novel multi-locus sequence typing (MLST) scheme

    Directory of Open Access Journals (Sweden)

    Madslien Elisabeth H

    2012-10-01

    Full Text Available Abstract Background Bacillus licheniformis has for many years been used in the industrial production of enzymes, antibiotics and detergents. However, as a producer of dormant heat-resistant endospores B. licheniformis might contaminate semi-preserved foods. The aim of this study was to establish a robust and novel genotyping scheme for B. licheniformis in order to reveal the evolutionary history of 53 strains of this species. Furthermore, the genotyping scheme was also investigated for its use to detect food-contaminating strains. Results A multi-locus sequence typing (MLST) scheme, based on the sequence of six house-keeping genes (adk, ccpA, recF, rpoB, spo0A and sucC) of 53 B. licheniformis strains from different sources, was established. The result of the MLST analysis supported previous findings of two different subgroups (lineages) within this species, named “A” and “B”. Statistical analysis of the MLST data indicated a higher rate of recombination within group “A”. Food isolates were widely dispersed in the MLST tree and could not be distinguished from the other strains. However, the food contaminating strain B. licheniformis NVH1032, represented by a unique sequence type (ST8), was distantly related to all other strains. Conclusions In this study, a novel and robust genotyping scheme for B. licheniformis was established, separating the species into two subgroups. This scheme could be used for further studies of evolution and population genetics in B. licheniformis.
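
    For readers unfamiliar with MLST bookkeeping, the sketch below shows the general principle of how sequence types arise: each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers receives a sequence type (ST). It is a generic illustration, not the authors' pipeline; only the six locus names come from the abstract, while the function name, strain names and toy sequences are invented, and the resulting ST numbers have no relation to the published ones.

```python
LOCI = ("adk", "ccpA", "recF", "rpoB", "spo0A", "sucC")

def assign_sequence_types(strains):
    """Map each strain to (allele profile, sequence type).

    strains: dict of strain name -> dict of locus -> DNA sequence string.
    Allele and ST numbers are assigned in order of first appearance.
    """
    alleles = {locus: {} for locus in LOCI}   # per locus: sequence -> allele number
    profiles = {}                             # allele profile -> ST number
    result = {}
    for name, seqs in strains.items():
        profile = tuple(
            alleles[locus].setdefault(seqs[locus], len(alleles[locus]) + 1)
            for locus in LOCI
        )
        st = profiles.setdefault(profile, len(profiles) + 1)
        result[name] = (profile, st)
    return result

# Toy data (invented): strain_c differs from the others only at rpoB.
base = {locus: "ACGT" for locus in LOCI}
toy_strains = {
    "strain_a": dict(base),
    "strain_b": dict(base),
    "strain_c": {**base, "rpoB": "ACGA"},
}
typed = assign_sequence_types(toy_strains)
```

    Strains with identical allele profiles share an ST, which is why a strain with a unique profile, like NVH1032 in the abstract, stands out as its own sequence type.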

  11. Picture or Text First? Explaining Sequence Effects When Learning with Pictures and Text

    Science.gov (United States)

    Eitel, Alexander; Scheiter, Katharina

    2015-01-01

    The present article reviews 42 studies investigating the role of sequencing of text and pictures for learning outcomes. Whereas several of the reviewed studies revealed better learning outcomes from presenting the picture before the text rather than after it, other studies demonstrated the opposite effect. Against the backdrop of theories on…

  12. Learning German Formulaic Sequences: The Effect of Two Attention-Drawing Techniques

    Science.gov (United States)

    Peters, Elke

    2012-01-01

    This article reports a small-scale study that investigated the effect of (1) an instructional method, viz. directing learners' attention to formulaic sequences (FS) in a text, and (2) typographic salience, i.e. bold typeface and underlined, on foreign-language (FL) learners' recall of FS and single words (SW). Twenty-eight FL learners read a…

  13. Effects of Representation Sequences and Spatial Ability on Students' Scientific Understandings about the Mechanism of Breathing

    Science.gov (United States)

    Wu, Hsin-Kai; Lin, Yu-Fen; Hsu, Ying-Shao

    2013-01-01

    The purpose of this study was to investigate the effects of representation sequences and spatial ability on students' scientific understandings about the mechanism of breathing in human beings. 130 seventh graders were assigned to two groups with different sequential combinations of static and dynamic representations: SD group (i.e., viewing…

  14. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-01-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutations types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations. PMID:27999334

  15. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene.

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-12-17

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutations types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  16. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Directory of Open Access Journals (Sweden)

    Karin Soares Cunha

    2016-12-01

    Full Text Available Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutations types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  17. Conserved PCR primer set designing for closely-related species to complete mitochondrial genome sequencing using a sliding window-based PSO algorithm.

    Directory of Open Access Journals (Sweden)

    Cheng-Hong Yang

    Full Text Available BACKGROUND: Complete mitochondrial (mt) genome sequencing is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. For long template sequencing, such as the entire mtDNA, it is essential to design primers for Polymerase Chain Reaction (PCR) amplicons that partly overlap each other. The presented chromosome walking strategy provides the overlapping design to solve the problem of unreliable sequencing data at the 5' end and provides effective sequencing. However, current algorithms and tools are mostly focused on primer design for a local region in the genomic sequence. Accordingly, it is still challenging to provide the primer sets for the entire mtDNA. METHODOLOGY/PRINCIPAL FINDINGS: The purpose of this study is to develop an integrated primer design algorithm for the entire mt genome in general, and for common primer sets for closely-related species in particular. We introduce ClustalW to generate the multiple sequence alignment needed to find the conserved sequences in closely-related species. These conserved sequences are suitable for designing the common primers for the entire mtDNA. Using a heuristic algorithm, particle swarm optimization (PSO), all the designed primers were computationally validated to fit the common primer design constraints, such as the melting temperature, primer length and GC content, PCR product length, secondary structure, specificity, and terminal limitation. The overlap requirement for PCR amplicons in the entire mtDNA is satisfied by defining the overlapping region with the sliding window technology. Finally, primer sets were designed within the overlapping region. The primer sets for the entire mtDNA sequences were successfully demonstrated in the example of two closely-related fish species. The pseudo code for the primer design algorithm is provided. CONCLUSIONS/SIGNIFICANCE: In conclusion, it can be said that our proposed sliding window-based PSO

  18. A machine-learning approach for predicting palmitoylation sites from integrated sequence-based features.

    Science.gov (United States)

    Li, Liqi; Luo, Qifa; Xiao, Weidong; Li, Jinhui; Zhou, Shiwen; Li, Yongsheng; Zheng, Xiaoqi; Yang, Hua

    2017-02-01

    Palmitoylation is the covalent attachment of lipids to amino acid residues in proteins. As an important form of protein posttranslational modification, it increases the hydrophobicity of proteins, which contributes to protein transportation, organelle localization, and function, and it therefore plays an important role in a variety of cell biological processes. Identification of palmitoylation sites is necessary for understanding protein-protein interaction, protein stability, and activity. Since conventional experimental techniques to determine palmitoylation sites in proteins are both labor intensive and costly, a fast and accurate computational approach to predict palmitoylation sites from protein sequences is urgently needed. In this study, a support vector machine (SVM)-based method was proposed through integrating PSI-BLAST profile, physicochemical properties, [Formula: see text]-mer amino acid compositions (AACs), and [Formula: see text]-mer pseudo AACs into the principal feature vector. A recursive feature selection scheme was subsequently implemented to single out the most discriminative features. Finally, an SVM method was implemented to predict palmitoylation sites in proteins based on the optimal features. The proposed method achieved an accuracy of 99.41% and a Matthews Correlation Coefficient of 0.9773 for a benchmark dataset. The result indicates the efficiency and accuracy of our method in prediction of palmitoylation sites based on protein sequences.

  19. Molecular phylogeny of Toxoplasmatinae: comparison between inferences based on mitochondrial and apicoplast genetic sequences

    Directory of Open Access Journals (Sweden)

    Michelle Klein Sercundes

    2016-03-01

    Full Text Available Abstract Phylogenies within Toxoplasmatinae have been widely investigated with different molecular markers. Here, we studied molecular phylogenies of the Toxoplasmatinae subfamily based on apicoplast and mitochondrial genes. Partial sequences of apicoplast genes coding for caseinolytic protease (clpC) and beta subunit of RNA polymerase (rpoB), and mitochondrial gene coding for cytochrome B (cytB) were analyzed. Laboratory-adapted strains of the closely related parasites Sarcocystis falcatula and Sarcocystis neurona were investigated, along with Neospora caninum, Neospora hughesi, Toxoplasma gondii (strains RH, CTG and PTG), Besnoitia akodoni, Hammondia hammondi and two genetically divergent lineages of Hammondia heydorni. The molecular analysis based on organellar genes did not clearly differentiate between N. caninum and N. hughesi, but the two lineages of H. heydorni were confirmed. Slight differences between the strains of S. falcatula and S. neurona were encountered in all markers. In conclusion, congruent phylogenies were inferred from the three different genes and they might be used for screening undescribed sarcocystid parasites in order to ascertain their phylogenetic relationships with organisms of the family Sarcocystidae. The evolutionary studies based on organellar genes confirm that the genus Hammondia is paraphyletic. The primers used for amplification of clpC and rpoB were able to amplify genetic sequences of organisms of the genus Sarcocystis and organisms of the subfamily Toxoplasmatinae as well.

  20. Tracking facial features in video sequences using a deformable-model-based approach

    Science.gov (United States)

    Malciu, Marius; Preteux, Francoise J.

    2000-10-01

    This paper addresses the issue of computer vision-based face motion capture as an alternative to physical sensor-based technologies. The proposed method combines deformable template-based tracking of the mouth and eyes in arbitrary video sequences of a single speaking person with a global 3D head pose estimation procedure yielding robust initializations. Mathematical principles underlying deformable template matching, together with the definition and extraction of salient image features, are presented. Specifically, interpolating cubic B-splines between the MPEG-4 Face Animation Parameters (FAPs) associated with the mouth and eyes are used as the template parameterization. Modeling the template as a network of springs interconnecting the mouth and eyes FAPs, the internal energy is expressed as a combination of elastic and symmetry local constraints. The external energy function, which enforces interactions with image data, involves contour, texture and topography properties properly combined within robust potential functions. Template matching is achieved by applying the downhill simplex method to minimize the global energy cost. Stability and accuracy of the results are discussed on a set of 2000 frames corresponding to 5 video sequences of speaking people.

  1. Cryptanalysis of a novel image encryption scheme based on improved hyperchaotic sequences

    Science.gov (United States)

    Özkaynak, Fatih; Özer, Ahmet Bedri; Yavuz, Sırma

    2012-11-01

    Chaotic cryptography is a new field that has seen a significant amount of research activity during the last 20 years. Despite the many proposals that use various methods in the design of encryption algorithms, there is a definite need for a mathematically rigorous cryptanalysis of these designs. In this study, we analyze the security weaknesses of the "C. Zhu, A novel image encryption scheme based on improved hyperchaotic sequences, Optics Communications 285 (2012) 29-37". By applying chosen plaintext attacks, we show that all the secret parameters can be revealed.

  2. Model-Based Requirements Analysis for Reactive Systems with UML Sequence Diagrams and Coloured Petri Nets

    DEFF Research Database (Denmark)

    Tjell, Simon; Lassen, Kristian Bisgaard

    2008-01-01

    In this paper, we describe a formal foundation for a specialized approach to automatically checking traces against real-time requirements. The traces are obtained from simulation of Coloured Petri Net (CPN) models of reactive systems. The real-time requirements are expressed in terms...... of a derivative of UML 2.0 high-level Sequence Diagrams. The automated requirement checking is part of a bigger tool framework in which VDM++ is applied to automatically generate initial CPN models based on Problem Diagrams. These models are manually enhanced to provide behavioral descriptions of the environment...

  3. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng

    2015-12-03

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  4. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    Science.gov (United States)

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated with DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield was highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥1000 mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. The 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs.

  5. Molecular identification based on ITS sequences for Kappaphycus and Eucheuma cultivated in China

    Science.gov (United States)

    Zhao, Sufen; He, Peimin

    2011-11-01

    The systematic classification of the Eucheumatoideae is difficult because of their variable morphology and interpretation of reproductive structures. Kappaphycus and Eucheuma specimens cultivated on the Hainan and Fujian coast of China were introduced from Vietnam, the Philippines and Indonesia. Combined with morphological characteristics, all Kappaphycus and Eucheuma cultivated strains were identified by internal transcribed spacer (ITS) sequences. The phylogenetic tree was constructed using neighbor-joining and maximum likelihood methods. The results indicate that different ITS sequence lengths occurred in the different genera and species. An obvious difference in morphology could be found in the protuberance shape between Kappaphycus and Eucheuma. The protuberance in Eucheuma was thorn-like and in Kappaphycus was wartlike or papillate. Their ITS sequence lengths differed significantly in nucleotide variation rates up to 58.55%-63.90%. All nucleotide variations occurred in the ITS1 and ITS2 regions except for five nucleotide transversions in the 5.8S rDNA region. In addition, the difference was at the branches among congeneric species. Kappaphycus sp. had branches with small buds, while K. alvarezii did not have such a feature. The nucleotide variation rates varied from 7.02% to 7.48% among species; within the same species of the clades it was <1.20%. Eucheumatoideae algae cultivated in China consisted of three clades, K. alvarezii, Kappaphycus sp., and E. denticulatum. The results indicate that ITS sequence analysis was an effective way for identification of interspecies and intraspecies phylogenetic relationships and might provide a clue for molecular identification of algal Eucheumatoideae.

  6. Molecular identification based on ITS sequences for Kappaphycus and Eucheuma cultivated in China

    Institute of Scientific and Technical Information of China (English)

    ZHAO Sufen; HE Peimin

    2011-01-01

    The systematic classification of the Eucheumatoideae is difficult because of their variable morphology and interpretation of reproductive structures. Kappaphycus and Eucheuma specimens cultivated on the Hainan and Fujian coast of China were introduced from Vietnam, the Philippines and Indonesia. Combined with morphological characteristics, all Kappaphycus and Eucheuma cultivated strains were identified by internal transcribed spacer (ITS) sequences. The phylogenetic tree was constructed using neighbor-joining and maximum likelihood methods. The results indicate that different ITS sequence lengths occurred in the different genera and species. An obvious difference in morphology could be found in the protuberance shape between Kappaphycus and Eucheuma. The protuberance in Eucheuma was thorn-like and in Kappaphycus was wartlike or papillate. Their ITS sequence lengths differed significantly in nucleotide variation rates up to 58.55%-63.90%. All nucleotide variations occurred in the ITS1 and ITS2 regions except for five nucleotide transversions in the 5.8S rDNA region. In addition, the difference was at the branches among congeneric species. Kappaphycus sp. had branches with small buds, while K. alvarezii did not have such a feature. The nucleotide variation rates varied from 7.02% to 7.48% among species; within the same species of the clades it was <1.20%. Eucheumatoideae algae cultivated in China consisted of three clades, K. alvarezii, Kappaphycus sp., and E. denticulatum. The results indicate that ITS sequence analysis was an effective way for identification of interspecies and intraspecies phylogenetic relationships and might provide a clue for molecular identification of algal Eucheumatoideae.

  7. AFM characterization of ss-DNA probes immobilization: a sequence effect on surface organization

    Energy Technology Data Exchange (ETDEWEB)

    Lallemand, D [Laboratoire d' Electronique, Optoelectronique et Microsystemes, Ecole Centrale de Lyon, 36 avenue Guy de Collongue, 69134 Ecully (France); Rouillat, M H [Laboratoire d' Electronique, Optoelectronique et Microsystemes, Ecole Centrale de Lyon, 36 avenue Guy de Collongue, 69134 Ecully (France); Dugas, V [BioTray, Ecole Normale Superieure de Lyon, 46 allee d' Italie, 69364 Lyon Cedex 07 (France); Chevolot, Y [Laboratoire d' Electronique, Optoelectronique et Microsystemes, Ecole Centrale de Lyon, 36 avenue Guy de Collongue, 69134 Ecully (France); Souteyrand, E [Laboratoire d' Electronique, Optoelectronique et Microsystemes, Ecole Centrale de Lyon, 36 avenue Guy de Collongue, 69134 Ecully (France); Phaner-Goutorbe, M [Laboratoire d' Electronique, Optoelectronique et Microsystemes, Ecole Centrale de Lyon, 36 avenue Guy de Collongue, 69134 Ecully (France)

    2007-03-15

    The biological sensitivity of a DNA chip depends on the molecular organization of the immobilized probe molecules, single stranded DNA (ss-DNA), on the substrate in terms of accessibility and non specific interactions between probes and substrate. In this article, Amplitude Modulation - Atomic Force Microscopy (AM-AFM) was used to characterize at a molecular scale, the morphological organization of different immobilized probes. In our system, three different ss-DNA were covalently grafted on a silicon substrate with the same deposit process. We studied the influence of probe length (25 bases, 12 bases) and sequence arrangement (two different 25 base oligoprobes) on the morphological organization. We showed that immobilized probes organize themselves in different structures depending on their sequence.

  8. Changes in DNA base sequence induced by gamma-ray mutagenesis of lambda phage and prophage

    Energy Technology Data Exchange (ETDEWEB)

    Tindall, K.R.; Stein, J.; Hutchinson, F.

    1988-04-01

    Mutations in the cI (repressor) gene were induced by gamma-ray irradiation of lambda phage and of prophage, and 121 mutations were sequenced. Two-thirds of the mutations in irradiated phage assayed in recA host cells (no induction of the SOS response) were G:C to A:T transitions; it is hypothesized that these may arise during DNA replication from adenine mispairing with a cytosine product deaminated by irradiation. For irradiated phage assayed in host cells in which the SOS response had been induced, 85% of the mutations were base substitutions, and in 40 of the 41 base changes, a preexisting base pair had been replaced by an A:T pair; these might come from damaged bases acting as AP (apurinic or apyrimidinic) sites. The remaining mutations were 1 and 2 base deletions. In irradiated prophage, base change mutations involved the substitution of both A:T and of G:C pairs for the preexisting pairs; the substitution of G:C pairs shows that some base substitution mechanism acts on the cell genome but not on the phage. In the irradiated prophage, frameshifts and a significant number of gross rearrangements were also found.

  9. DEVELOPMENT OF STEP-POOL SEQUENCE AND ITS EFFECTS IN RESISTANCE AND STREAM BED STABILITY

    Institute of Scientific and Technical Information of China (English)

    Zhao-Yin WANG; Jiang XU; Changzhi LI

    2004-01-01

    Experiments were conducted and field investigations were performed to study the development of step-pool sequence and its effects on resistance to the flow and stream bed stability. Step-pool sequence develops in incised channels as a result of streambed erosion, and its role in resistance and streambed protection is compared with that of sand dunes and armor layers. The tight interlocking of particles in steps gives them an inherent stability which only extreme floods are likely to disturb. That stability suggests that step-pools are a valid equilibrium form, especially when coupled with their apparent regularity of form and their role in satisfying the extreme condition of resistance maximization. The development degree of step-pools, SP, is proportional to the streambed slope. If the incoming sediment load is equal to or more than the sediment-carrying capacity of the flow, there is no bed erosion and hence there are no step-pools. If the flow depth increases and exceeds the step height, the resistance caused by the step-pool sequence will be greatly reduced. The rate of energy dissipation by step-pools is a function of SP. The higher the SP, the larger the rate of energy dissipation. The step-pool sequence increases the resistance and flow depth, reduces the shear stress of the flow and protects the streambed from erosion. Moreover, step-pool sequence provides ecologically sound habitats for the aquatic bio-community as well.

  10. Primer effect in the detection of mitochondrial DNA point heteroplasmy by automated sequencing.

    Science.gov (United States)

    Calatayud, Marta; Ramos, Amanda; Santos, Cristina; Aluja, Maria Pilar

    2013-06-01

    The correct detection of mitochondrial DNA (mtDNA) heteroplasmy by automated sequencing presents methodological constraints. The main goals of this study are to investigate the effect of sense and distance of primers in heteroplasmy detection and to test if there are differences in the accurate determination of heteroplasmy involving transitions or transversions. A gradient of the heteroplasmy levels was generated for mtDNA positions 9477 (transition G/A) and 15,452 (transversion C/A). Amplification and subsequent sequencing with forward and reverse primers, situated at 550 and 150 bp from the heteroplasmic positions, were performed. Our data provide evidence that there is a significant difference between the use of forward and reverse primers. The forward primer seems to give a better approximation to the real proportion of the variants. No significant differences were found concerning the distance at which the sequencing primers were placed, nor between the analysis of transitions and transversions. The data collected in this study are a starting point for appreciating the importance of the sequencing primers in the accurate detection of point heteroplasmy, providing additional insight into the overall automated sequencing strategy.

  11. Effect of k-tuple length on sample-comparison with high-throughput sequencing data.

    Science.gov (United States)

    Wang, Ying; Lei, Xiaoye; Wang, Shun; Wang, Zicheng; Song, Nianfeng; Zeng, Feng; Chen, Ting

    2016-01-22

    High-throughput metagenomic sequencing offers a powerful technique to compare microbial communities. Without requiring extra reference sequences, alignment-free models with short k-tuples (k = 2-10 bp) have yielded promising results. Short k-tuples describe the overall statistical distribution, but can hardly capture the specific characteristics inside one microbial community. Longer k-tuples contain more abundant information. However, because the frequency vector of long k-tuples (k ≥ 30 bp) is sparse, the statistical measures designed for short k-tuples are not applicable. In our study, we considered each tuple as a meaningful word and each sequencing dataset as a document composed of the words. Therefore, the comparison between two sequencing datasets is processed as "topic analysis of documents" in text mining. We designed a pipeline with long k-tuple features to compare metagenomic samples, combining algorithms from text mining and pattern recognition. The pipeline is available at http://culotuple.codeplex.com/. Experiments show that our pipeline with long k-tuple features: ① separates genomes with high similarity; ② outperforms short k-tuple models in all experiments (when k ≥ 12, the short k-tuple measures are no longer applicable; when k is between 20 and 40, the long k-tuple pipeline obtains much better grouping results); ③ is free from the effect of sequencing platforms/protocols. We obtained meaningful and supported biological results on the 40-tuples selected for comparison.
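
    The sparsity argument in the abstract above can be illustrated with a minimal sketch. It counts overlapping k-tuples in a sequence and compares two count vectors. The function names and the cosine measure are assumptions chosen for illustration; they are not the authors' pipeline, which treats tuples as words for topic analysis.

```python
import math
from collections import Counter


def ktuple_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-tuple (k-mer) in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse k-tuple count vectors.

    Illustrative measure only (an assumption, not the paper's method).
    """
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def occupancy(counts: Counter, k: int) -> float:
    """Fraction of the 4**k possible k-tuples that were actually observed."""
    return len(counts) / 4 ** k
```

    For k = 2 there are only 16 possible tuples, so almost any read set fills the vector. For k = 30 there are 4**30 (about 1.15e18) possible tuples, so `occupancy` is vanishingly small for any realistic dataset, which is the sparsity the abstract refers to.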

  12. Small RNA transcriptome investigation based on next-generation sequencing technology

    Institute of Scientific and Technical Information of China (English)

    Linglin Zhou; Xueying Li; Qi Liu; Fangqing Zhao; Jinyu Wu

    2011-01-01

    Over the past decade, there has been a growing realization that studying the small RNA transcriptome is essential for understanding the complexity of transcriptional regulation. With an increased throughput and a reduced cost, next-generation sequencing technology has provided an unprecedented opportunity to measure the extent and complexity of small RNA transcriptome. Meanwhile, the large amount of obtained data and varied technology platforms have also posed multiple challenges for effective data analysis and mining. To provide some insight into the small RNA transcriptome investigation, this review describes the major small RNA classes, experimental methods to identify small RNAs, and available bioinformatics tools and databases.

  13. Amplicon-based semiconductor sequencing of human exomes: performance evaluation and optimization strategies.

    Science.gov (United States)

    Damiati, E; Borsani, G; Giacopuzzi, Edoardo

    2016-05-01

    The Ion Proton platform allows to perform whole exome sequencing (WES) at low cost, providing rapid turnaround time and great flexibility. Products for WES on Ion Proton system include the AmpliSeq Exome kit and the recently introduced HiQ sequencing chemistry. Here, we used gold standard variants from GIAB consortium to assess the performances in variants identification, characterize the erroneous calls and develop a filtering strategy to reduce false positives. The AmpliSeq Exome kit captures a large fraction of bases (>94 %) in human CDS, ClinVar genes and ACMG genes, but with 2,041 (7 %), 449 (13 %) and 11 (19 %) genes not fully represented, respectively. Overall, 515 protein coding genes contain hard-to-sequence regions, including 90 genes from ClinVar. Performance in variants detection was maximum at mean coverage >120×, while at 90× and 70× we measured a loss of variants of 3.2 and 4.5 %, respectively. WES using HiQ chemistry showed ~71/97.5 % sensitivity, ~37/2 % FDR and ~0.66/0.98 F1 score for indels and SNPs, respectively. The proposed low, medium or high-stringency filters reduced the amount of false positives by 10.2, 21.2 and 40.4 % for indels and 21.2, 41.9 and 68.2 % for SNP, respectively. Amplicon-based WES on Ion Proton platform using HiQ chemistry emerged as a competitive approach, with improved accuracy in variants identification. False-positive variants remain an issue for the Ion Torrent technology, but our filtering strategy can be applied to reduce erroneous variants.
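
    The reported F1 scores can be cross-checked from the reported sensitivity and FDR. The sketch below assumes the usual definitions (precision = 1 - FDR, and F1 as the harmonic mean of precision and sensitivity); the function name is hypothetical.

```python
def f1_from_sensitivity_fdr(sensitivity: float, fdr: float) -> float:
    """F1 score from sensitivity (recall) and false discovery rate.

    Assumes precision = 1 - FDR, i.e. FDR = FP / (TP + FP).
    """
    precision = 1.0 - fdr
    if precision + sensitivity == 0.0:
        return 0.0
    return 2.0 * precision * sensitivity / (precision + sensitivity)


# Values quoted in the abstract for HiQ chemistry (approximate figures).
f1_indels = f1_from_sensitivity_fdr(0.71, 0.37)
f1_snps = f1_from_sensitivity_fdr(0.975, 0.02)
```

    With the rounded inputs from the abstract this gives about 0.67 for indels and 0.98 for SNPs, in line with the reported ~0.66 and ~0.98; the small indel difference is consistent with rounding of the published percentages.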

  14. A new trilocus sequence-based multiplex-PCR to detect major Acinetobacter baumannii clones.

    Science.gov (United States)

    Martins, Natacha; Picão, Renata Cristina; Cerqueira-Alves, Morgana; Uehara, Aline; Barbosa, Lívia Carvalho; Riley, Lee W; Moreira, Beatriz Meurer

    2016-08-01

    A collection of 163 Acinetobacter baumannii isolates detected in a large Brazilian hospital, was potentially related with the dissemination of four clonal complexes (CC): 113/79, 103/15, 109/1 and 110/25, defined by University of Oxford/Institut Pasteur multilocus sequence typing (MLST) schemes. The urge of a simple multiplex-PCR scheme to specify these clones has motivated the present study. The established trilocus sequence-based typing (3LST, for ompA, csuE and blaOXA-51-like genes) multiplex-PCR rapidly identifies international clones I (CC109/1), II (CC118/2) and III (CC187/3). Thus, the system detects only one (CC109/1) out of four main CC in Brazil. We aimed to develop an alternative multiplex-PCR scheme to detect these clones, known to be present additionally in Africa, Asia, Europe, USA and South America. MLST, performed in the present study to complement typing our whole collection of isolates, confirmed that all isolates belonged to the same four CC detected previously. When typed by 3LST-based multiplex-PCR, only 12% of the 163 isolates were classified into groups. By comparative sequence analysis of ompA, csuE and blaOXA-51-like genes, a set of eight primers was designed for an alternative multiplex-PCR to distinguish the five CC 113/79, 103/15, 109/1, 110/25 and 118/2. Study isolates and one CC118/2 isolate were blind-tested with the new alternative PCR scheme; all were correctly clustered in groups of the corresponding CC. The new multiplex-PCR, with the advantage of fitting in a single reaction, detects five leading A. baumannii clones and could help preventing the spread in healthcare settings.

  15. iTriplet, a rule-based nucleic acid sequence motif finder

    Directory of Open Access Journals (Sweden)

    Gunderson Samuel I

    2009-10-01

    Background: With the advent of high throughput sequencing techniques, large amounts of sequencing data are readily available for analysis. Natural biological signals are intrinsically highly variable, making their complete identification a computationally challenging problem. Many attempts in using statistical or combinatorial approaches have been made with great success in the past. However, identifying highly degenerate and long (>20 nucleotides) motifs still remains an unmet challenge, as high degeneracy will diminish statistical significance of biological signals and increasing motif size will cause combinatorial explosion. In this report, we present a novel rule-based method that is focused on finding degenerate and long motifs. Our proposed method, named iTriplet, avoids costly enumeration present in existing combinatorial methods and is amenable to parallel processing. Results: We have conducted a comprehensive assessment on the performance and sensitivity-specificity of iTriplet in analyzing artificial and real biological sequences in various genomic regions. The results show that iTriplet is able to solve challenging cases. Furthermore, we have confirmed the utility of iTriplet by showing it accurately predicts polyA-site-related motifs using a dual Luciferase reporter assay. Conclusion: iTriplet is a novel rule-based combinatorial or enumerative motif finding method that is able to process highly degenerate and long motifs that have resisted analysis by other methods. In addition, iTriplet is distinguished from other methods of the same family by its parallelizability, which allows it to leverage the power of today's readily available high-performance computing systems.

  16. A novel chaos-based image encryption algorithm using DNA sequence operations

    Science.gov (United States)

    Chai, Xiuli; Chen, Yiran; Broyde, Lucie

    2017-01-01

    An image encryption algorithm based on chaotic system and deoxyribonucleic acid (DNA) sequence operations is proposed in this paper. First, the plain image is encoded into a DNA matrix, and then a new wave-based permutation scheme is performed on it. The chaotic sequences produced by 2D Logistic chaotic map are employed for row circular permutation (RCP) and column circular permutation (CCP). Initial values and parameters of the chaotic system are calculated by the SHA 256 hash of the plain image and the given values. Then, a row-by-row image diffusion method at DNA level is applied. A key matrix generated from the chaotic map is used to fuse the confused DNA matrix; also the initial values and system parameters of the chaotic system are renewed by the hamming distance of the plain image. Finally, after decoding the diffused DNA matrix, we obtain the cipher image. The DNA encoding/decoding rules of the plain image and the key matrix are determined by the plain image. Experimental results and security analyses both confirm that the proposed algorithm has not only an excellent encryption result but also resists various typical attacks.
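
    Two of the building blocks named in the abstract above, DNA encoding of pixel bytes and chaos-driven row circular permutation, can be sketched as follows. This is a simplified illustration under stated assumptions: a single fixed encoding rule (00→A, 01→C, 10→G, 11→T), a 1D logistic map substituted for the paper's 2D Logistic map, and hypothetical function names. It is not the authors' cipher and offers no security.

```python
RULE = "ACGT"  # assumed fixed rule: 00->A, 01->C, 10->G, 11->T


def dna_encode(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(RULE[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def dna_decode(dna: str) -> bytes:
    """Inverse of dna_encode; expects a length that is a multiple of four."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | RULE.index(base)
        out.append(byte)
    return bytes(out)


def logistic_shifts(x0: float, r: float, count: int, width: int) -> list:
    """Derive one shift per row from a 1D logistic map (stand-in for the 2D map)."""
    shifts, x = [], x0
    for _ in range(count):
        x = r * x * (1.0 - x)
        shifts.append(int(x * 10 ** 6) % width)
    return shifts


def rotate_rows(rows: list, shifts: list, inverse: bool = False) -> list:
    """Circularly shift each non-empty row right by its shift (left if inverse)."""
    out = []
    for row, s in zip(rows, shifts):
        s = s % len(row)
        if inverse:
            s = -s
        out.append(row[-s:] + row[:-s] if s else row)
    return out
```

    Because the receiver can regenerate the same shifts from the same initial value and parameter, calling `rotate_rows(..., inverse=True)` followed by `dna_decode` recovers the original bytes.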

  17. MuffinInfo: HTML5-Based Statistics Extractor from Next-Generation Sequencing Data.

    Science.gov (United States)

    Alic, Andy S; Blanquer, Ignacio

    2016-09-01

    Usually, the information known a priori about a newly sequenced organism is limited. Even resequencing the same organism can generate unpredictable output. We introduce MuffinInfo, a FastQ/Fasta/SAM information extractor implemented in HTML5 capable of offering insights into next-generation sequencing (NGS) data. Our new tool can run on any software or hardware environment, in command line or graphically, and in browser or standalone. It presents information such as average length, base distribution, quality scores distribution, k-mer histogram, and homopolymers analysis. MuffinInfo improves upon the existing extractors by adding the ability to save and then reload the results obtained after a run as a navigable file (also supporting saving pictures of the charts), by supporting custom statistics implemented by the user, and by offering user-adjustable parameters involved in the processing, all in one piece of software. At the moment, the extractor works with all base space technologies such as Illumina, Roche, Ion Torrent, Pacific Biosciences, and Oxford Nanopore. Owing to HTML5, our software demonstrates the readiness of web technologies for the mildly intensive tasks encountered in bioinformatics.

  18. Identification of forensic samples by using an infrared-based automatic DNA sequencer.

    Science.gov (United States)

    Ricci, Ugo; Sani, Ilaria; Klintschar, Michael; Cerri, Nicoletta; De Ferrari, Francesco; Giovannucci Uzielli, Maria Luisa

    2003-06-01

    We have recently introduced a new protocol for analyzing all core loci of the Federal Bureau of Investigation's (FBI) Combined DNA Index System (CODIS) with an infrared (IR) automatic DNA sequencer (LI-COR 4200). The amplicons were labeled with forward oligonucleotide primers, covalently linked to a new infrared fluorescent molecule (IRDye 800). The alleles were displayed as familiar autoradiogram-like images with real-time detection. This protocol was employed for paternity testing, population studies, and identification of degraded forensic samples. We extensively analyzed some simulated forensic samples and mixed stains (blood, semen, saliva, bones, and fixed archival embedded tissues), comparing the results with donor samples. Sensitivity studies were also performed for the four multiplex systems. Our results show the efficiency, reliability, and accuracy of the IR system for the analysis of forensic samples. We also compared the efficiency of the multiplex protocol with ultraviolet (UV) technology. Paternity tests, undegraded DNA samples, and real forensic samples were analyzed with this approach based on IR technology and with UV-based automatic sequencers in combination with commercially-available kits. The comparability of the results with the widespread UV methods suggests that it is possible to exchange data between laboratories using the same core group of markers but different primer sets and detection methods.

  19. Application of Sequence-based Methods in Human Microbial Ecology

    Energy Technology Data Exchange (ETDEWEB)

    Weng, Li; Rubin, Edward M.; Bristow, James

    2005-08-29

    Ecologists studying microbial life in the environment have recognized the enormous complexity of microbial diversity for many years, and the development of a variety of culture-independent methods, many of them coupled with high-throughput DNA sequencing, has allowed this diversity to be explored in ever greater detail. Despite the widespread application of these new techniques to the characterization of uncultivated microbes and microbial communities in the environment, their application to human health and disease has lagged behind. Because DNA-based techniques for defining uncultured microbes allow not only cataloging of microbial diversity, but also insight into microbial functions, investigators are beginning to apply these tools to the microbial communities that abound on and within us, in what has aptly been called the second Human Genome Project. In this review we discuss the sequence-based methods for microbial analysis that are currently available and their application to identify novel human pathogens, improve diagnosis of known infectious diseases, and advance understanding of our relationship with microbial communities that normally reside in and on the human body.

  20. An algorithm for the study of DNA sequence evolution based on the genetic code.

    Science.gov (United States)

    Sirakoulis, G Ch; Karafyllidis, I; Sandaltzopoulos, R; Tsalides, Ph; Thanailakis, A

    2004-11-01

    Recent studies of the quantum-mechanical processes in the DNA molecule have seriously challenged the principle that mutations occur randomly. The proton tunneling mechanism causes tautomeric transitions in base pairs resulting in mutations during DNA replication. The meticulous study of the quantum-mechanical phenomena in DNA may reveal that the process of mutagenesis is not completely random. We are still far away from a complete quantum-mechanical model of DNA sequence mutagenesis because of the complexity of the processes and the complex three-dimensional structure of the molecule. In this paper we have developed a quantum-mechanical description of DNA evolution and, following its outline, we have constructed a classical model for DNA evolution assuming that some aspects of the quantum-mechanical processes have influenced the determination of the genetic code. Conversely, our model assumes that the genetic code provides information about the quantum-mechanical mechanisms of mutagenesis, as the current code is the product of an evolutionary process that tries to minimize the spurious consequences of mutagenesis. Based on this model we develop an algorithm that can be used to study the accumulation of mutations in a DNA sequence. The algorithm has a user-friendly interface and the user can change key parameters in order to study relevant hypotheses.

  1. Research on lock-in thermography for aerospace materials of nondestructive test based on image sequence processing

    Science.gov (United States)

    Liu, Junyan; Dai, Jingmin; Wang, Yang

    2008-11-01

    IR lock-in thermography is an active thermography technique based on thermal-wave signal processing. It has many advantages for nondestructive testing of composite materials and compound structures and has been applied in the aerospace, automotive, mechanical and electrical fields. In lock-in thermography, given sufficient time for periodic heating, the surface temperature evolves periodically in a sinusoidal pattern from the transient state to the steady state. In this paper, the principle of lock-in thermography is introduced, and the heat transfer process under a sinusoidally varying heat flow in materials is analyzed by the finite element method (FEM). In the experiment, modulated optical stimulation is applied to the sample, and image sequences are collected by a Jade MWIR 550 FPA IR camera. A Savitzky-Golay digital smoothing filter is used to remove the effects of high-frequency noise. A phase image at the frequency of periodic heating can be calculated using a Fourier transform at the periodic heating frequency in the transient state for defect detection. The IR lock-in thermography processing software was developed in Visual C++ to process the collected image sequences. The experimental results show that the developed system reaches the level of the conventional steady-state lock-in method.
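
    The phase-image step described above amounts to evaluating a single Fourier component, at the heating frequency, of each pixel's temperature history. The following minimal sketch does this for one pixel. The function name, the cosine reference convention, and the requirement that the record span a whole number of heating periods are assumptions; the Savitzky-Golay smoothing step is omitted.

```python
import cmath
import math


def lockin_amplitude_phase(samples, frames_per_period):
    """Amplitude and phase of one pixel's signal at the heating frequency.

    samples: temperature readings of a single pixel, one per frame.
    frames_per_period: frames recorded per heating period (assumed > 2).
    Assumes the record covers a whole number of heating periods, so the
    constant offset and the mirror-frequency term cancel in the sum.
    """
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * i / frames_per_period)
              for i, s in enumerate(samples))
    acc *= 2.0 / n  # scale so that |acc| equals the sinusoid's amplitude
    return abs(acc), cmath.phase(acc)
```

    Applying the function to every pixel of the image sequence yields an amplitude image and a phase image; a defect would be expected to show up as a local phase shift relative to the sound material.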

  2. Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment

    Directory of Open Access Journals (Sweden)

    Manzini Giovanni

    2007-07-01

    Background: Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness is tested on various data sets yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available and no comparison of USM with existing methods, both based on alignments and not, seems to be available. Results: We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, that naturally complements the many theoretical and the preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. The first one, performed via ROC...
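
    One of the compression-based approximations named above, NCD, is commonly written as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. A minimal sketch follows, assuming zlib as the compressor; this is only one possible choice and is not claimed to be among the 25 compressors the study tested.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum compression level)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression dissimilarity between two byte strings.

    Values near 0 suggest similar inputs and values near 1 dissimilar ones;
    with real compressors the result is only approximate.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

    Computing `ncd` for every pair of sequences gives a dissimilarity matrix that can be passed to any standard clustering or tree-building method, which is how such measures are typically used for classification.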

  3. Environment map building and localization for robot navigation based on image sequences

    Institute of Scientific and Technical Information of China (English)

    Ye-hu SHEN; Ji-lin LIU; Xin DU

    2008-01-01

    SLAM is one of the most important components in robot navigation. A SLAM algorithm based on image sequences captured by a single digital camera is proposed in this paper. By this algorithm, SIFT feature points are selected and matched between image pairs sequentially. After three images have been captured, the environment's 3D map and the camera's positions are initialized based on matched feature points and intrinsic parameters of the camera. A robust method is applied to estimate the position and orientation of the camera in the forthcoming images. Finally, a robust adaptive bundle adjustment algorithm is adopted to optimize the environment's 3D map and the camera's positions simultaneously. Results of quantitative and qualitative experiments show that our algorithm can reconstruct the environment and localize the camera accurately and efficiently.

  4. Assessing divergence time of Spirulida and Sepiida (Cephalopoda) based on hemocyanin sequences.

    Science.gov (United States)

    Warnke, Kerstin Martina; Meyer, Achim; Ebner, Bettina; Lieb, Bernhard

    2011-02-01

    The phylogenetic position of the mesopelagic decabrachian cephalopod Spirula is still a matter of debate. Since hemocyanin has successfully been used to calibrate a molecular clock for many molluscan species, a molecular clock was calculated based on this gene with special attention to the cephalopod genera Spirula and Sepia. The obtained partial sequence, comprising ca. one third (3567 bp) of the complete gene, is similar to that of Sepia officinalis. The molecular clock was calibrated using the splits of Gastropoda-Cephalopoda (ca. 550 ± 50 mya) and Heterobranchia-Vetigastropoda (ca. 380 ± 10 mya). The resulting hemocyanin-based molecular clock is stable, and the estimated divergence time of Spirulida and Sepiida, some 150 ± 30 million years ago, can be deemed reliable.

  5. Repetitive sequence based polymerase chain reaction to differentiate close bacteria strains in acidic sites

    Institute of Scientific and Technical Information of China (English)

    XIE Ming; YIN Hua-qun; LIU Yi; LIU Jie; LIU Xue-duan

    2008-01-01

    To study the diversity of bacteria strains newly isolated from several acid mine drainage (AMD) sites in China, repetitive sequence based polymerase chain reaction (rep-PCR), a well established technology for diversity analysis of closely related bacteria strains, was conducted on 30 strains of bacteria Leptospirillum ferriphilum, 8 strains of bacteria Acidithiobacillus ferrooxidans, as well as the Acidithiobacillus ferrooxidans type strain ATCC (American Type Culture Collection) 23270. The results showed that, using ERIC and BOX primer sets, rep-PCR produced highly discriminatory banding patterns. Phylogenetic analysis based on ERIC-PCR banding types was made and the results indicated that rep-PCR could be used as a rapid and highly discriminatory screening technique in studying bacterial diversity, especially in differentiating bacteria within one species in AMD.

  6. Flag-based detection of weak gas signatures in long-wave infrared hyperspectral image sequences

    Science.gov (United States)

    Marrinan, Timothy; Beveridge, J. Ross; Draper, Bruce; Kirby, Michael; Peterson, Chris

    2016-05-01

    We present a flag manifold based method for detecting chemical plumes in long-wave infrared hyperspectral movies. The method encodes temporal and spatial information related to a hyperspectral pixel into a flag, or nested sequence of linear subspaces. The technique used to create the flags pushes information about the background clutter, ambient conditions, and potential chemical agents into the leading elements of the flags. Exploiting this temporal information allows for a detection algorithm that is sensitive to the presence of weak signals. This method is compared to existing techniques qualitatively on real data and quantitatively on synthetic data to show that the flag-based algorithm consistently performs better on data when the SINRdB is low, and beats the ACE and MF algorithms in probability of detection for low probabilities of false alarm even when the SINRdB is high.

  7. Modeling genetic imprinting effects of DNA sequences with multilocus polymorphism data

    Directory of Open Access Journals (Sweden)

    Staud Roland

    2009-08-01

    Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA sequence variation in the human genome and they have recently emerged as valuable genetic markers for revealing the genetic architecture of complex traits in terms of nucleotide combination and sequence. Here, we extend an algorithmic model for the haplotype analysis of SNPs to estimate the effects of genetic imprinting expressed at the DNA sequence level. The model provides a general procedure for identifying the number and types of optimal DNA sequence variants that are expressed differently due to their parental origin. The model is used to analyze a genetic data set collected from a pain genetics project. We find that DNA haplotype GAC from three SNPs, OPRKG36T (with two alleles G and T), OPRKA843G (with alleles A and G), and OPRKC846T (with alleles C and T), at the kappa-opioid receptor, triggers a significant effect on pain sensitivity, but with expression significantly depending on the parent from which it is inherited (p = 0.008). With a tremendous advance in SNP identification and automated screening, the model founded on haplotype discovery and statistical inference may provide a useful tool for genetic analysis of any quantitative trait with complex inheritance.

  8. Quantitative, small-scale, fluorophore-assisted carbohydrate electrophoresis implemented on a capillary electrophoresis-based DNA sequence analyzer.

    Science.gov (United States)

    Murray, Sarah; McKenzie, Marian; Butler, Ruth; Baldwin, Samantha; Sutton, Kevin; Batey, Ian; Timmerman-Vaughan, Gail M

    2011-06-15

    Fluorophore-assisted carbohydrate electrophoresis (FACE) is an analytical method for characterizing carbohydrate chain length that has been applied to neutral, charged, and N-linked oligosaccharides and that has been implemented using diverse separation platforms, including polyacrylamide gel electrophoresis and capillary electrophoresis. In this article, we describe three substantial improvements to FACE: (i) reducing the amount of starch and APTS required in labeling reactions and systematically analyzing the effect of altering the starch and 8-amino-1,3,6-pyrenetrisulfonic acid (APTS) concentrations on the reproducibility of the FACE peak area distributions; (ii) implementing FACE on a multiple capillary DNA sequencer (an ABI 3130xl), enabling higher throughput than is possible on other separation platforms; and (iii) developing a protocol for producing quantitative output of peak heights and areas using genetic marker analysis software. The results of a designed experiment to determine the effect of decreasing both the starch and fluorophore concentrations on the sensitivity and reproducibility of FACE electrophoregrams are presented. Analysis of the peak area distributions of the FACE electrophoregrams identified the labeling reaction conditions that resulted in the smallest variances in the peak area distributions while retaining strong fluorescence signals from the capillary-based DNA sequencer.

  9. Self-Triggered Model Predictive Control for Linear Systems Based on Transmission of Control Input Sequences

    Directory of Open Access Journals (Sweden)

    Koichi Kobayashi

    2016-01-01

    Full Text Available A networked control system (NCS) is a control system where components such as plants and controllers are connected through communication networks. Self-triggered control is a well-known control method for NCSs in which, for sampled-data control systems, both the control input and the aperiodic sampling interval (i.e., the transmission interval) are computed simultaneously. In this paper, a self-triggered model predictive control (MPC) method for discrete-time linear systems with disturbances is proposed. In the conventional MPC method, only the first element of the control input sequence obtained by solving the finite-time optimal control problem is sent and applied to the plant. In the proposed method, the first several elements of the control input sequence are sent to the plant, and each element is applied to the plant in turn. The number of elements is decided according to the effect of disturbances. In other words, transmission intervals can be controlled. Finally, the effectiveness of the proposed method is shown by numerical simulations.
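
    The abstract does not state how the number of transmitted elements is chosen. As a rough illustration only, the sketch below assumes a scalar plant x[k+1] = a*x[k] + b*u[k] + w[k] with |w[k]| <= w_max, and sends as many inputs as keep a worst-case open-loop error bound within a tolerance. The function name, the scalar model, and the tolerance rule are all hypothetical and are not taken from the paper.

```python
def transmission_length(a, w_max, tol, horizon):
    """Hypothetical rule: how many inputs of an MPC sequence to transmit.

    Assumes a scalar plant x[k+1] = a*x[k] + b*u[k] + w[k], |w[k]| <= w_max.
    While the plant runs open loop on the transmitted inputs, the worst-case
    deviation from the predicted state obeys e[k+1] = |a|*e[k] + w_max.
    """
    err = 0.0
    count = 0
    for _ in range(horizon):
        err = abs(a) * err + w_max  # error bound after one more open-loop step
        if err > tol:
            break
        count += 1
    # Always send at least one input, as conventional MPC does.
    return max(count, 1)


if __name__ == "__main__":
    # Small disturbances allow a longer transmission interval.
    print(transmission_length(a=1.0, w_max=0.1, tol=0.35, horizon=5))
    # Large disturbances fall back to sending a single input.
    print(transmission_length(a=1.0, w_max=0.5, tol=0.35, horizon=5))
```

    Under this assumed rule, a larger disturbance bound shortens the transmission interval, which is the qualitative behaviour the abstract describes.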

  10. MAG-PGSTE: a new STE-based PGSE NMR sequence for the determination of diffusion in magnetically inhomogeneous samples.

    Science.gov (United States)

    Zheng, Gang; Price, William S

    2008-11-01

    A new stimulated echo based pulsed gradient spin-echo sequence, MAG-PGSTE, has been developed for the determination of self-diffusion in magnetically inhomogeneous samples. The sequence was tested on two glass bead samples (i.e., 212-300 and Thesis, Universität Leipzig, 2003, P.Z. Sun, Nuclear Magnetic Resonance Microscopy and Diffusion, Ph.D. Thesis, Massachusetts Institute of Technology, 2003] sequence and Cotts 13-interval [R.M. Cotts, M.J.R. Hoch, T. Sun, J.T. Marker, Pulsed field gradient stimulated echo methods for improved NMR diffusion measurements in heterogeneous systems, J. Magn. Reson. 83 (1989) 252-266] sequence using both glass bead samples. The MAG-PGSTE and MAGSTE (or MPFG) sequences outperformed the Cotts 13-interval sequence in the measurement of diffusion coefficients; more interestingly, for the sample with higher background gradients (i.e., the sample), the MAG-PGSTE sequence provided higher signal-to-noise ratios and thus better diffusion measurements than the MAGSTE and Cotts 13-interval sequences. In addition, the MAG-PGSTE sequence provided good characterization of the surface-to-volume ratio for the glass bead samples.

  11. Performance of microarray and liquid based capture methods for target enrichment for massively parallel sequencing and SNP discovery.

    Directory of Open Access Journals (Sweden)

    Anna Kiialainen

    Full Text Available Targeted sequencing is a cost-efficient way to obtain answers to biological questions in many projects, but the choice of the enrichment method to use can be difficult. In this study we compared two hybridization methods for target enrichment for massively parallel sequencing and single nucleotide polymorphism (SNP) discovery, namely Nimblegen sequence capture arrays and the SureSelect liquid-based hybrid capture system. We prepared sequencing libraries from three HapMap samples using both methods, sequenced the libraries on the Illumina Genome Analyzer, mapped the sequencing reads back to the genome, and called variants in the sequences. 74-75% of the sequence reads originated from the targeted region in the SureSelect libraries and 41-67% in the Nimblegen libraries. We could sequence up to 99.9% and 99.5% of the regions targeted by capture probes from the SureSelect libraries and from the Nimblegen libraries, respectively. The Nimblegen probes covered 0.6 Mb more of the original 3.1 Mb target region than the SureSelect probes. In each sample, we called more SNPs and detected more novel SNPs from the libraries that were prepared using the Nimblegen method. Thus the Nimblegen method gave better results when judged by the number of SNPs called, but this came at the cost of more over-sampling.

  12. Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles.

    Science.gov (United States)

    Li, Zhixiu; Yang, Yuedong; Faraggi, Eshel; Zhan, Jian; Zhou, Yaoqi

    2014-10-01

    Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predicted by iterative mutations of a random sequence to produce energy-optimized sequences, or by combining sequences of structurally similar fragments in a template library. The latter approach is computationally more efficient but yields less accurate profiles than the former because of lacking tertiary structural information. Here we present a method called SPIN that predicts Sequence Profiles by Integrated Neural network based on fragment-derived sequence profiles and structure-derived energy profiles. SPIN improves over the fragment-derived profile by 6.7% (from 23.6 to 30.3%) in sequence identity between predicted and wild-type sequences. The method also reduces the number of residues in low complex regions by 15.7% and has a significantly better balance of hydrophilic and hydrophobic residues at protein surface. The accuracy of sequence profiles obtained is comparable to those generated from the protein design program RosettaDesign 3.5. This highly efficient method for predicting sequence profiles from structures will be useful as a single-body scoring term for improving scoring functions used in protein design and fold recognition. It also complements protein design programs in guiding experimental design of the sequence library for screening and directed evolution of designed sequences. The SPIN server is available at http://sparks-lab.org.

  13. Chemical and structural effects of base modifications in messenger RNA

    Science.gov (United States)

    Harcourt, Emily M.; Kietrys, Anna M.; Kool, Eric T.

    2017-01-01

    A growing number of nucleobase modifications in messenger RNA have been revealed through advances in detection and RNA sequencing. Although some of the biochemical pathways that involve modified bases have been identified, research into the world of RNA modification -- the epitranscriptome -- is still in an early phase. A variety of chemical tools are being used to characterize base modifications, and the structural effects of known base modifications on RNA pairing, thermodynamics and folding are being determined in relation to their putative biological roles.

  14. Adaptation of Shift Sequence Based Method for High Number in Shifts Rostering Problem for Health Care Workers

    Directory of Open Access Journals (Sweden)

    Mindaugas Liogys

    2011-08-01

    Full Text Available Purpose: to investigate the efficiency of a shift sequence-based approach when the problem consists of a high number of shifts. Research objectives: (1) solve the health care workers rostering problem using a shift sequence-based method; (2) measure its efficiency when the number of shifts increases. Design/methodology/approach: Rostering problems are usually highly constrained. Constraints are classified into soft and hard constraints. Soft and hard constraints of the problem are additionally classified into sequence constraints, schedule constraints and roster constraints. Sequence constraints are considered when constructing shift sequences. Schedule constraints are considered when constructing a schedule. Roster constraints are applied when constructing the overall solution, i.e. combining all schedules. The shift sequence-based approach consists of two stages: (1) construction of shift sequences; (2) construction of schedules. In the shift sequence construction stage, the shift sequences are constructed for each set of health care workers of different skill, considering sequence constraints. Shift sequences are ranked by their penalties for easier retrieval in the later stage. In the schedule construction stage, schedules for each health care worker are constructed iteratively, using the shift sequences produced in stage 1. The shift sequence-based method is an adaptive iterative method where health care workers who received the highest schedule penalties in the last iteration are scheduled first at the current iteration. During the roster construction, and after a schedule has been generated for the current health care worker, an improvement method based on an efficient greedy local search is carried out on the partial roster. It simply swaps any pair of shifts between two health care workers in the (partial) roster, as long as the swaps satisfy hard constraints and decrease the roster penalty. Findings: Using shift sequence method for solving health care workers rostering

  15. Origin and phylogenetic analysis of Tibetan Mastiff based on the mitochondrial DNA sequence

    Institute of Scientific and Technical Information of China (English)

    Qifa Li; Zhuang Xie; Zhenshan Liu; Yinxia Li; Xingbo Zhao; Liyan Dong; Zengxiang Pan; Yuanrong Sun; Ning Li; Yinxue Xu

    2008-01-01

    At present, the Tibetan Mastiff is the oldest and most ferocious dog in the world. However, the origin of the Tibetan Mastiff and its phylogenetic relationship with other large breed dogs such as Saint Bernard are unclear. In this study, the primers were designed according to the mitochondrial genome sequence of the domestic dog, and the 2,525 bp mitochondrial sequence, containing the whole sequence of Cytochrome b, tRNA-Thr, tRNA-Pro, and control region of the Tibetan Mastiff, was obtained. Using grey wolves and coyotes as outgroups, the Tibetan Mastiff and 12 breeds of domestic dogs were subjected to phylogenetic analysis. Tibetan Mastiff, domestic dog breeds, and grey wolves were clustered into one group and coyotes were clustered in a separate group. This indicated that the Tibetan Mastiff and the other domestic dogs originated from the grey wolf, and that the Tibetan Mastiff belongs to Carnivora, Canidae, Canis, Canis lupus, Canis lupus familiaris in animal taxonomy. Among domestic dogs, the middle and small breed dogs were clustered first; German Sheepdog, Swedish Elkhound, and Black Russian Terrier were clustered into one group, and the Tibetan Mastiff, Old English Sheepdog, Leonberger, and Saint Bernard were clustered in another group. Based on the molecular data, this confirmed the viewpoint that many of the famous large breed dogs worldwide, such as Saint Bernard, possibly had the blood lineage of the Tibetan Mastiff. According to the substitution rate, we concluded that the approximate divergence time between Tibetan Mastiff and grey wolf was 58,000 years before the present (YBP), and the approximate divergence time between other domestic dogs and grey wolf was 42,000 YBP, demonstrating that the time of origin of the Tibetan Mastiff was earlier than that of the other domestic dogs.

  16. Pseudo-random sequence generator based on the generalized Henon map

    Institute of Scientific and Technical Information of China (English)

    ZHENG Fan; TIAN Xiao-jian; SONG Jing-yi; LI Xue-yan

    2008-01-01

    By analysis and comparison of several chaotic systems that are applied to generate pseudo-random sequence, the generalized Henon map is proposed as a pseudo-random sequence generator. A new algorithm is created to solve the problem of non-uniform distribution of the sequence generated by the generalized Henon map. First, move the decimal point of elements in the sequence to the right; then, cut off the integer; and finally, quantify it into a binary sequence. Statistical test, security analysis, and the application of image encryption have strongly supported the good random statistical characteristics, high linear complexity, large key space, and great sensitivity of the binary sequence.
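
    The three quantization steps described above (shift the decimal point, discard the integer part, quantize to a bit) can be sketched as follows. This is a minimal illustration, not the authors' generator: it uses the classical two-dimensional Henon map with the textbook parameters a = 1.4, b = 0.3 as a stand-in for the generalized map, and the shift of 6 decimal places, the 0.5 threshold, and the burn-in length are assumptions chosen for the example.

```python
import math


def henon_bits(n_bits, x0=0.0, y0=0.0, a=1.4, b=0.3, shift=6, burn_in=100):
    """Generate bits from a Henon orbit (classical map used as a stand-in).

    Assumed quantization, following the three steps in the abstract:
      1. move the decimal point `shift` places to the right,
      2. cut off the integer part,
      3. map the remaining fraction to 1 if it is >= 0.5, else 0.
    """
    x, y = x0, y0
    bits = []
    for i in range(burn_in + n_bits):
        # Classical Henon map; the tuple assignment uses the old x for y.
        x, y = 1.0 - a * x * x + y, b * x
        if i < burn_in:
            continue  # skip the transient before the orbit settles
        scaled = abs(x) * 10 ** shift           # step 1
        fraction = scaled - math.floor(scaled)  # step 2
        bits.append(1 if fraction >= 0.5 else 0)  # step 3
    return bits


if __name__ == "__main__":
    print("".join(str(bit) for bit in henon_bits(64)))
```

    Discarding the leading digits is what flattens the non-uniform distribution of the raw orbit values; a real generator would still need the statistical tests mentioned in the abstract before any use in encryption.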

  17. Principles and procedures of considering item sequence effects in the development of calibrated item pools: Conceptual analysis and empirical illustration

    Directory of Open Access Journals (Sweden)

    Safir Yousfi

    2012-12-01

    Full Text Available Item responses can be context-sensitive. Consequently, composing test forms flexibly from a calibrated item pool requires considering potential context effects. This paper focuses on context effects that are related to the item sequence. It is argued that sequence effects are not necessarily a violation of item response theory but that item response theory offers a powerful tool to analyze them. If sequence effects are substantial, test forms cannot be composed flexibly on the basis of a calibrated item pool, which precludes applications like computerized adaptive testing. In contrast, minor sequence effects do not thwart applications of calibrated item pools. Strategies to minimize the detrimental impact of sequence effects on item parameters are discussed and integrated into a nomenclature that addresses the major features of item calibration designs. An example of an item calibration design demonstrates how this nomenclature can guide the process of developing a calibrated item pool.

  18. Adaptive combination of P-values for family-based association testing with sequence data.

    Science.gov (United States)

    Lin, Wan-Yu

    2014-01-01

    Family-based study design will play a key role in identifying rare causal variants, because rare causal variants can be enriched in families with multiple affected subjects. Furthermore, different from population-based studies, family studies are robust to bias induced by population substructure. It is well known that rare causal variants are difficult to detect from single-locus tests. Therefore, burden tests and non-burden tests have been developed, by combining signals of multiple variants in a chromosomal region or a functional unit. This inevitably incorporates some neutral variants into the test statistics, which can dilute the power of statistical methods. To guard against the noise caused by neutral variants, we here propose an 'adaptive combination of P-values method' (abbreviated as 'ADA'). This method combines per-site P-values of variants that are more likely to be causal. Variants with large P-values (which are more likely to be neutral variants) are discarded from the combined statistic. In addition to performing extensive simulation studies, we applied these tests to the Genetic Analysis Workshop 17 data sets, where real sequence data were generated according to the 1000 Genomes Project. Compared with some existing methods, ADA is more robust to the inclusion of neutral variants. This is a merit especially when dichotomous traits are analyzed. However, there are some limitations for ADA. First, it is more computationally intensive. Second, pedigree structures and founders' sequence data are required for the permutation procedure. Third, unrelated controls cannot be included. We here show that, for family-based studies, the application of ADA is limited to dichotomous trait analyses with full pedigree information.
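
    The core idea of discarding variants with large P-values before combining the rest can be illustrated with a truncated sum of -log(P). This is a simplified sketch only: the function name and the fixed threshold are assumptions, and the permutation procedure used to assess significance is omitted entirely.

```python
import math


def truncated_log_p_statistic(p_values, threshold):
    """Combine per-site P-values, keeping only those below `threshold`.

    Returns the sum of -log(p) over retained variants and how many were kept.
    Variants with large P-values (more likely neutral) contribute nothing.
    """
    kept = [p for p in p_values if 0.0 < p < threshold]
    statistic = sum(-math.log(p) for p in kept)
    return statistic, len(kept)


if __name__ == "__main__":
    per_site_p = [0.001, 0.04, 0.3, 0.8]  # made-up values for illustration
    for theta in (0.05, 0.5, 1.0):
        stat, n_kept = truncated_log_p_statistic(per_site_p, theta)
        print(f"threshold={theta}: kept {n_kept} variants, statistic={stat:.3f}")
```

    Because the raw statistic can only grow as the threshold is raised, comparing thresholds requires a reference distribution for each one, which is where a permutation step would come in.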

  19. Adaptive combination of P-values for family-based association testing with sequence data.

    Directory of Open Access Journals (Sweden)

    Wan-Yu Lin

    Full Text Available Family-based study design will play a key role in identifying rare causal variants, because rare causal variants can be enriched in families with multiple affected subjects. Furthermore, different from population-based studies, family studies are robust to bias induced by population substructure. It is well known that rare causal variants are difficult to detect from single-locus tests. Therefore, burden tests and non-burden tests have been developed, by combining signals of multiple variants in a chromosomal region or a functional unit. This inevitably incorporates some neutral variants into the test statistics, which can dilute the power of statistical methods. To guard against the noise caused by neutral variants, we here propose an 'adaptive combination of P-values method' (abbreviated as 'ADA'). This method combines per-site P-values of variants that are more likely to be causal. Variants with large P-values (which are more likely to be neutral variants) are discarded from the combined statistic. In addition to performing extensive simulation studies, we applied these tests to the Genetic Analysis Workshop 17 data sets, where real sequence data were generated according to the 1000 Genomes Project. Compared with some existing methods, ADA is more robust to the inclusion of neutral variants. This is a merit especially when dichotomous traits are analyzed. However, there are some limitations for ADA. First, it is more computationally intensive. Second, pedigree structures and founders' sequence data are required for the permutation procedure. Third, unrelated controls cannot be included. We here show that, for family-based studies, the application of ADA is limited to dichotomous trait analyses with full pedigree information.

  20. Digital Sequences and a Time Reversal-Based Impact Region Imaging and Localization Method

    Directory of Open Access Journals (Sweden)

    Weifeng Qian

    2013-10-01

    Full Text Available To reduce time and cost of damage inspection, on-line impact monitoring of aircraft composite structures is needed. A digital monitor based on an array of piezoelectric transducers (PZTs) is developed to record the impact region of impacts on-line. It is small in size, lightweight and has low power consumption, but there are two problems with the impact alarm region localization method of the digital monitor at the current stage. The first one is that the accuracy rate of the impact alarm region localization is low, especially on complex composite structures. The second problem is that the area of impact alarm region is large when a large scale structure is monitored and the number of PZTs is limited, which increases the time and cost of damage inspections. To solve the two problems, an impact alarm region imaging and localization method based on digital sequences and time reversal is proposed. In this method, the frequency band of impact response signals is estimated based on the digital sequences first. Then, characteristic signals of impact response signals are constructed by sinusoidal modulation signals. Finally, the phase synthesis time reversal impact imaging method is adopted to obtain the impact region image. Depending on the image, an error ellipse is generated to give the final impact alarm region. A validation experiment is implemented on a complex composite wing box of a real aircraft. The validation results show that the accuracy rate of impact alarm region localization is approximately 100%. The area of impact alarm region can be reduced and the number of PZTs needed to cover the same impact monitoring region is reduced by more than a half.

  1. Additional data for a new Theileria sp. from China based on the sequences of ribosomal RNA internal transcribed spacers.

    Science.gov (United States)

    Liu, Junlong; Guan, Guiquan; Liu, Zhijie; Liu, Aihong; Ma, Miling; Bai, Qi; Yin, Hong; Luo, Jianxun

    2013-02-01

    Theileria sinensis was recently isolated and named as an independent Theileria species that infects cattle in China. To date, this parasite has been described based on its morphology, transmission and molecular studies, indicating that it should be classified as a distinct species. To test the validity of this taxon, the two internal transcribed spacers (ITS1 and ITS2) and the 5.8S rRNA gene were cloned and sequenced from three T. sinensis isolates. The complete ITS sequences were compared with those of other Theileria sp. available in GenBank. Phylogenetic analyses based on sequence data for the complete ITS sequences indicate that T. sinensis lies in a distinct clade that is separate from that of T. buffeli/orientalis and T. annulata. Sequence comparisons indicate that different T. sinensis isolates possess unique sizes of ITS1 and ITS2 as well as species-specific nucleotide sequences. This analysis provides new molecular data to support the classification of T. sinensis as a distinct species from other known Theileria spp. based on ITS sequences.

  2. A combined sequence-based and fragment-based characterization of microbial eukaryote assemblages provides taxonomic context for the Terminal Restriction Fragment Length Polymorphism (T-RFLP) method.

    Science.gov (United States)

    Kim, Diane Y; Countway, Peter D; Yamashita, Warren; Caron, David A

    2012-12-01

    Microbial eukaryotes in seawater samples collected from two depths (5 m and 500 m) at the USC Microbial Observatory off the coast of Southern California, USA, were characterized by cloning and sequencing of 18S rRNA genes, as well as DNA fragment analysis of these genes. The sequenced genes were assigned to operational taxonomic units (OTUs), and taxonomic information for the sequence-based OTUs was obtained by comparison to public sequence databases. The sequences were then subjected to in silico digestion to predict fragment sizes, and that information was compared to the results of the T-RFLP method applied to the same samples in order to provide taxonomic context for the environmental T-RFLP fragments. A total of 663 and 678 sequences were analyzed for the 5 m and 500 m samples, respectively, which clustered into 157 OTUs and 183 OTUs. The sequences yielded substantially fewer taxonomic units as in silico fragment lengths (i.e., following in silico digestion), and the environmental T-RFLP resulted in the fewest unique OTUs (unique fragments). Bray-Curtis similarity of protistan assemblages was greater using the T-RFLP dataset than using the sequence-based OTU dataset, presumably due to the inability of the fragment method to differentiate some taxa and an inability to detect many rare taxa relative to the sequence-based approach. Nonetheless, fragments in our analysis generally represented the dominant sequence-based OTUs and putative identifications could be assigned to a majority of the fragments in the environmental T-RFLP results. Our empirical examination of the T-RFLP method identified limitations relative to sequence-based community analysis, but the relative ease and low cost of fragment analysis make this method a useful approach for characterizing the dominant taxa within complex assemblages of microbial eukaryotes in large datasets.
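
    In silico digestion can be sketched as finding the first restriction site in each sequence and reporting the length of the labeled terminal fragment. The abstract does not name the enzyme, so the example assumes HaeIII (recognition site GGCC, cut between G and C) and a label at the 5' end; the function names and the toy sequences are hypothetical.

```python
from collections import Counter


def terminal_fragment_length(sequence, site="GGCC", cut_offset=2):
    """Length of the 5' terminal fragment after cutting at the first site.

    `site` and `cut_offset` default to HaeIII (GG^CC), an assumed enzyme.
    If the site is absent, the whole sequence is the fragment.
    """
    seq = sequence.upper()
    position = seq.find(site)
    if position == -1:
        return len(seq)
    return position + cut_offset


def predicted_fragment_profile(sequences, site="GGCC", cut_offset=2):
    """Count how many sequences share each predicted fragment length."""
    return Counter(
        terminal_fragment_length(s, site, cut_offset) for s in sequences
    )


if __name__ == "__main__":
    toy_sequences = ["AAAAGGCCTTTT", "CCAAGGCCAAAA", "ACGTACGT"]
    print(predicted_fragment_profile(toy_sequences))
```

    In the toy input, two different sequences yield the same fragment length, which mirrors why fragment analysis resolves fewer taxonomic units than sequencing.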

  3. PHYLOGENY OF ANGIOSTRONGYLUS CANTONENSIS IN THAILAND BASED ON CYTOCHROME C OXIDASE SUBUNIT I GENE SEQUENCE.

    Science.gov (United States)

    Apichat, Vitta; Narongrit, Srisongcram; Jittranuch, Thiproaj; Anucha, Wongma; Wilaiwan, Polsut; Chamaiporn, Fukruksa; Thatcha, Yimthin; Bandid, Mangkit; Aunchalee, Thanwisai; Paron, Dekumyoy

    2016-05-01

    Angiostrongylus cantonensis is an emerging infectious agent causing eosinophilic meningitis or meningoencephalitis in humans with clinical manifestation of severe headache. Molecular genetic studies on classification and phylogeny of A. cantonensis in Thailand are limited. This study surveyed A. cantonensis larvae prevalence in natural intermediate hosts across Thailand and analyzed their phylogenetic relationships. A total of 14,032 freshwater and land snails were collected from 19 provinces of Thailand. None of Filopaludina sp, Pomacea sp, and Cyclophorus sp were infected with Angiostrongylus larvae, whereas Achatina fulica, Cryptozona siamensis, and Megaustenia siamensis collected from Kalasin, Kamphaeng Phet, Phetchabun, Phitsanulok, and Tak Provinces were infected, with C. siamensis being the common intermediate host. Based on morphology, larvae isolated from 11 samples of these naturally infected snails preliminarily were identified as A. cantonensis. Comparison of partial nucleotide sequences of cytochrome c oxidase subunit I gene revealed that four sequences are identical to A. cantonensis haplotype ac4 from Bangkok and the other seven to that of A. cantonensis isolate AC Thai, indicating two independent lineages of A. cantonensis in Thailand.

  4. A probabilistic coding based quantum genetic algorithm for multiple sequence alignment.

    Science.gov (United States)

    Huo, Hongwei; Xie, Qiaoluan; Shen, Xubang; Stojkovic, Vojislav

    2008-01-01

    This paper presents an original Quantum Genetic algorithm for Multiple sequence ALIGNment (QGMALIGN) that combines a genetic algorithm and a quantum algorithm. A quantum probabilistic coding is designed for representing the multiple sequence alignment. A quantum rotation gate as a mutation operator is used to guide the quantum state evolution. Six genetic operators are designed on the coding basis to improve the solution during the evolutionary process. The features of implicit parallelism and state superposition in quantum mechanics and the global search capability of the genetic algorithm are exploited to get efficient computation. A set of well known test cases from BAliBASE2.0 is used as reference to evaluate the efficiency of the QGMALIGN optimization. The QGMALIGN results have been compared with the results of the most popular methods (CLUSTALX, SAGA, DIALIGN, SB_PIMA, and QGMALIGN). The results show that QGMALIGN performs well on the presented biological data. The addition of genetic operators to the quantum algorithm lowers the cost of overall running time.

  5. Prevalence and Sequence-Based Identity of Rumen Fluke in Cattle and Deer in New Caledonia.

    Science.gov (United States)

    Cauquil, Laura; Hüe, Thomas; Hurlin, Jean-Claude; Mitchell, Gillian; Searle, Kate; Skuce, Philip; Zadoks, Ruth

    2016-01-01

    An abattoir survey was performed in the French Melanesian archipelago of New Caledonia to determine the prevalence of paramphistomes in cattle and deer and to generate material for molecular typing at species and subspecies level. Prevalence in adult cattle was high at animal level (70% of 387 adult cattle) and batch level (81%). Prevalence was lower in calves at both levels (33% of 484 calves, 51% at batch level). Animals from 2 of 7 deer farms were positive for rumen fluke, with animal-level prevalence of 41.4% (29/70) and 47.1% (33/70), respectively. Using ITS-2 sequencing, 3 species of paramphistomes were identified, i.e. Calicophoron calicophorum, Fischoederius elongatus and Orthocoelium streptocoelium. All three species were detected in cattle as well as deer, suggesting the possibility of rumen fluke transmission between the two host species. Based on heterogeneity in ITS-2 sequences, the C. calicophorum population comprises two clades, both of which occur in cattle as well as deer. The results suggest two distinct routes of rumen fluke introduction into this area. This approach has wider applicability for investigations of the origin of rumen fluke infections and for the possibility of parasite transmission at the livestock-wildlife interface.

  6. Systematic positions of Lamiophlomis and Paraphlomis (Lamiaceae) based on nuclear and chloroplast sequences

    Institute of Scientific and Technical Information of China (English)

    Yue-Zhi PAN; Li-Qin FANG; Gang HAO; Jie CAI; Xun GONG

    2009-01-01

    Genera Lamiophlomis and Paraphlomis were originally separated from genus Phlomis s.l. on the basis of particular morphological characteristics. However, their relationship was highly contentious, as evidenced by the literature. In the present paper, the systematic positions of Lamiophlomis, Paraphlomis, and their related genera were assessed based on nuclear internal transcribed spacer (ITS) and chloroplast rpl16 and trnL-F sequence data using maximum parsimony (MP) and Bayesian methods. In total, 24 species representing six genera of the ingroup and outgroup were sampled. Analyses of both separate and combined sequence data were conducted to resolve the systematic relationships of these genera. The results reveal that Lamiophlomis is nested within Phlomis sect. Phlomoides and its generic status is not supported. With the inclusion of Lamiophlomis rotata in sect. Phlomoides, sections Phlomis and Phlomoides of Phlomis were resolved as monophyletic. Paraphlomis was supported as an independent genus. However, the resolution of its monophyly conflicted between MP and Bayesian analyses, suggesting the need for expanded sampling and further evidence.

  7. Molecular phylogenetic analysis of Indonesia Solanaceae based on DNA sequences of internal transcribed spacer region

    Science.gov (United States)

    Hidayat, Topik; Priyandoko, Didik; Islami, Dina Karina; Wardiny, Putri Yunitha

    2016-02-01

    Solanaceae is one of the largest families of angiosperms and is highly diverse in morphological characters. In Indonesia, this group of plants is very popular due to its usefulness as food, ornamental and medicinal plants. However, the phylogenetic relationships among members of this family in Indonesia have received little attention. The purpose of this study was to evaluate the phylogenetic relationships of the family, especially of the species distributed in Indonesia. DNA sequences of the Internal Transcribed Spacer (ITS) region of 19 species of Solanaceae and three outgroup species, which belong to the families Convolvulaceae, Apocynaceae, and Plantaginaceae, were isolated, amplified, and sequenced. Phylogenetic tree analysis based on the parsimony method was conducted using data derived from ITS-1, 5.8S, and ITS-2 separately, and from the combination of all three. Results indicated that the phylogenetic tree derived from the combined data established a better pattern of relationships than the separate data. Three major groups were revealed. Group 1 consists of tribes Datureae, Cestreae, and Petunieae, whereas group 2 consists of members of tribe Physaleae. Group 3 belongs to tribe Solaneae. The use of the ITS region as a molecular marker, in general, supports the global Solanaceae relationships that have been previously reported.

  8. Molecular Description of Macroorchis spinulosus (Digenea: Nanophyetidae) Based on ITS1 Sequences

    Science.gov (United States)

    Won, Eun Jeong; Kim, Deok-Gyu; Cho, Jaeeun; Jung, Bong-Kwang; Kim, Min-Jae; Yun, Yong Woon; Chai, Jong-Yil; Ryang, Dong Wook

    2016-01-01

    We performed a molecular genetic study on the sequences of 18S ribosomal RNA (ITS1 region) gene in 4-day-old adult worms of Macroorchis spinulosus recovered from mice experimentally infected with metacercariae from crayfish in Jeollanam-do Province, Korea. The metacercariae were round, 180 μm in average diameter, and encysted with 2 layers of thick walls, but the stylet on the oral sucker was not clearly seen. The adult flukes were oval in shape, 760-820 μm long and 320-450 μm wide, with 2 large testes located anterolaterally. The phylogenetic tree based on ITS1 sequences of 6 M. spinulosus samples showed their distinct position from other trematode species in GenBank. The most closely resembling group was Paragonimus spp., which also take crayfish or crabs as the second intermediate host. The present study is the first molecular characterization of M. spinulosus and provides a basis for further phylogenetic studies to compare with other trematode fauna in Korea. PMID:26951989

  9. Phylogenetic Relationships of Palaearctic Formica Species (Hymenoptera, Formicidae) Based on Mitochondrial Cytochrome b Sequences

    Science.gov (United States)

    Goropashnaya, Anna V.; Fedorov, Vadim B.; Seifert, Bernhard; Pamilo, Pekka

    2012-01-01

    Ants of genus Formica demonstrate variation in social organization and represent model species for ecological, behavioral, evolutionary studies and testing theoretical implications of the kin selection theory. Subgeneric division of the Formica ants based on morphology has been questioned and remained unclear after an allozyme study on genetic differentiation between 13 species representing all subgenera was conducted. In the present study, the phylogenetic relationships within the genus were examined using mitochondrial DNA sequences of the cytochrome b and a part of the NADH dehydrogenase subunit 6. All 23 Formica species sampled in the Palaearctic clustered according to the subgeneric affiliation except F. uralensis that formed a separate phylogenetic group. Unlike Coptoformica and Formica s. str., the subgenus Serviformica did not form a tight cluster but more likely consisted of a few small clades. The genetic distances between the subgenera were around 10%, implying approximate divergence time of 5 Myr if we used the conventional insect divergence rate of 2% per Myr. Within-subgenus divergence estimates were 6.69% in Serviformica, 3.61% in Coptoformica, 1.18% in Formica s. str., which supported our previous results on relatively rapid speciation in the latter subgenus. The phylogeny inferred from DNA sequences provides a necessary framework against which the evolution of social traits can be compared. We discuss implications of inferred phylogeny for the evolution of social traits. PMID:22911845
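
    The divergence-time estimate in the abstract above follows from dividing the observed sequence divergence by an assumed rate. A minimal sketch, assuming a strict molecular clock and the 2% per Myr rate quoted in the abstract; the function name `divergence_time_myr` is hypothetical, and applying the same rate to the within-subgenus figures is an illustration, not a result reported by the authors:

```python
def divergence_time_myr(divergence_pct, rate_pct_per_myr=2.0):
    """Approximate divergence time in Myr under a strict molecular clock.

    divergence_pct: observed sequence divergence between two lineages, in percent.
    rate_pct_per_myr: assumed divergence rate, in percent per million years.
    """
    if rate_pct_per_myr <= 0:
        raise ValueError("rate must be positive")
    return divergence_pct / rate_pct_per_myr


# Between-subgenus distance of about 10% at 2% per Myr.
print(divergence_time_myr(10.0))  # 5.0

# Within-subgenus divergences quoted in the abstract (illustrative only).
for name, pct in [("Serviformica", 6.69), ("Coptoformica", 3.61), ("Formica s. str.", 1.18)]:
    print(name, round(divergence_time_myr(pct), 2))
```

    The rate is kept as a parameter because the estimate scales inversely with it: halving the assumed rate doubles every inferred age.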

  10. Molecular phylogeny and evolution of Scomber (Teleostei: Scombridae) based on mitochondrial and nuclear DNA sequences

    Institute of Scientific and Technical Information of China (English)

    CHENG Jiao; GAO Tianxiang; MIAO Zhenqing; YANAGIMOTO Takashi

    2011-01-01

    A molecular phylogenetic analysis of the genus Scomber was conducted based on mitochondrial (COI, Cyt b and control region) and nuclear (5S rDNA) DNA sequence data in multigene perspective. A variety of phylogenetic analytic methods were used to clarify the current taxonomic classification and to assess phylogenetic relationships and the evolutionary history of this genus. The present study produced a well-resolved phylogeny that strongly supported the monophyly of Scomber. We confirmed that S. japonicus and S. colias were genetically distinct. Although morphologically and ecologically similar to S. colias, the molecular data showed that S. japonicus has a greater molecular affinity with S. australasicus, which conflicts with the traditional taxonomy. This phylogenetic pattern was corroborated by the mtDNA data, but incompletely by the nuclear DNA data. Phylogenetic concordance between the mitochondrial and nuclear DNA regions for the basal nodes supports an Atlantic origin for Scomber. The present-day geographic ranges of the species were compared with the resultant molecular phylogeny derived from partition Bayesian analyses of the combined data sets to evaluate possible dispersal routes of the genus. The present-day geographic distribution of Scomber species might be best ascribed to multiple dispersal events. In addition, our results suggest that phylogenies derived from multiple genes and long sequences exhibited improved phylogenetic resolution, from which we conclude that the phylogenetic reconstruction is a reliable representation of the evolutionary history of Scomber.

  11. Trellis-Based Iterative Adaptive Blind Sequence Estimation for Uncoded/Coded Systems with Differential Precoding

    Directory of Open Access Journals (Sweden)

    Chen Xiao-Ming

    2005-01-01

    We propose iterative, adaptive trellis-based blind sequence estimators, which can be interpreted as reduced-complexity receivers derived from the joint ML data/channel estimation problem. The number of states in the trellis is considered as a design parameter, providing a trade-off between performance and complexity. For symmetrical signal constellations, differential encoding or generalizations thereof are necessary to combat the phase ambiguity. At the receiver, the structure of the super-trellis (representing differential encoding and intersymbol interference) is explicitly exploited rather than doing differential decoding just for resolving the problem of phase ambiguity. In uncoded systems, it is shown that the data sequence can only be determined up to an unknown shift index. This shift ambiguity can be resolved by taking an outer channel encoder into account. The average magnitude of the soft outputs from the corresponding channel decoder is exploited to identify the shift index. For frequency-hopping systems over fading channels, a double serially concatenated scheme is proposed, where the inner code is applied to combat the shift ambiguity and the outer code provides time diversity in conjunction with an interburst interleaver.

  12. Internal Transcribed Spacer 1 (ITS1) based sequence typing reveals phylogenetically distinct Ascaris population

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2015-01-01

    Taxonomic differentiation among morphologically identical Ascaris species is a debatable scientific issue in the context of Ascariasis epidemiology. To explain the disease epidemiology and also the taxonomic position of different Ascaris species, genome information of infecting strains from endemic areas throughout the world is certainly crucial. An Ascaris population from humans has been genetically characterized based on the widely used genetic marker, internal transcribed spacer 1 (ITS1). Along with the previously reported and prevalent genotype G1, 8 new sequence variants of ITS1 have been identified. Genotype G1 was significantly present among female patients aged between 10 and 15 years. Intragenic linkage disequilibrium (LD) analysis at the target locus within our study population has identified an incomplete LD value with potential recombination events. A separate cluster of Indian isolates with a high bootstrap value indicates their distinct phylogenetic position in comparison to the global Ascaris population. Genetic shuffling through recombination could be a possible reason for the high population diversity and frequent emergence of new sequence variants identified in the present and other previous studies. This study explores the genetic organization of the Indian Ascaris population for the first time, which certainly includes some fundamental information on the molecular epidemiology of Ascariasis.

  13. Hellbender genome sequences shed light on genomic expansion at the base of crown salamanders.

    Science.gov (United States)

    Sun, Cheng; Mueller, Rachel Lockridge

    2014-07-01

    Among animals, genome sizes range from 20 Mb to 130 Gb, with 380-fold variation across vertebrates. Most of the largest vertebrate genomes are found in salamanders, an amphibian clade of 660 species. Thus, salamanders are an important system for studying causes and consequences of genomic gigantism. Previously, we showed that plethodontid salamander genomes accumulate higher levels of long terminal repeat (LTR) retrotransposons than do other vertebrates, although the evolutionary origins of such sequences remained unexplored. We also showed that some salamanders in the family Plethodontidae have relatively slow rates of DNA loss through small insertions and deletions. Here, we present new data from Cryptobranchus alleganiensis, the hellbender. Cryptobranchus and Plethodontidae span the basal phylogenetic split within salamanders; thus, analyses incorporating these taxa can shed light on the genome of the ancestral crown salamander lineage, which underwent expansion. We show that high levels of LTR retrotransposons likely characterize all crown salamanders, suggesting that disproportionate expansion of this transposable element (TE) class contributed to genomic expansion. Phylogenetic and age distribution analyses of salamander LTR retrotransposons indicate that salamanders' high TE levels reflect persistence and diversification of ancestral TEs rather than horizontal transfer events. Finally, we show that relatively slow DNA loss rates through small indels likely characterize all crown salamanders, suggesting that a decreased DNA loss rate contributed to genomic expansion at the clade's base. Our identification of shared genomic features across phylogenetically distant salamanders is a first step toward identifying the evolutionary processes underlying accumulation and persistence of high levels of repetitive sequence in salamander genomes.

  14. Multitarget Tracking of Pedestrians in Video Sequences Based on Particle Filters

    Directory of Open Access Journals (Sweden)

    Hui Li

    2012-01-01

    Video target tracking is a critical problem in the field of computer vision. Particle filters have been proven to be very useful in target tracking for nonlinear and non-Gaussian estimation problems. Although most existing algorithms are able to track targets well in controlled environments, it is often difficult to achieve automated and robust tracking of pedestrians in video sequences if there are various changes in target appearance or surrounding illumination. To surmount these difficulties, this paper presents multitarget tracking of pedestrians in video sequences based on particle filters. In order to improve the efficiency and accuracy of the detection, the algorithm first obtains target regions in training frames by combining the methods of background subtraction and Histogram of Oriented Gradient (HOG) and then establishes a discriminative appearance model by generating patches and constructing codebooks using superpixel and Local Binary Pattern (LBP) features in those target regions. During tracking, the algorithm uses the similarity between candidates and codebooks as the observation likelihood function and handles severe occlusion to prevent the drift and target loss that occlusion causes. Experimental results demonstrate that our algorithm improves the tracking performance in complicated real scenarios.
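
    The predict, weight, and resample cycle that underlies particle-filter tracking can be sketched as follows. This is a generic one-dimensional bootstrap particle filter, not the paper's algorithm: the random-walk motion model, the Gaussian observation likelihood (standing in for the codebook similarity described above), and all names and parameter values are assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std=1.0, obs_std=2.0, rng=random):
    """One predict-weight-resample cycle of a bootstrap particle filter (1-D state)."""
    # Predict: move each particle with a random-walk motion model (assumed).
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the observation given each particle (assumed).
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Point estimate of the state: the mean of the particle set."""
    return sum(particles) / len(particles)


if __name__ == "__main__":
    rng = random.Random(0)
    # Start with no knowledge of the target position on a 0-100 axis.
    particles = [rng.uniform(0.0, 100.0) for _ in range(500)]
    for _ in range(20):
        particles = particle_filter_step(particles, observation=40.0, rng=rng)
    print(round(estimate(particles), 1))
```

    In a tracker of the kind the abstract describes, the state would be a bounding box per pedestrian and the weight would come from comparing the candidate region with the learned appearance model; only the weighting line would change.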

  15. Safety assessment of Bifidobacterium longum JDM301 based on complete genome sequences

    Institute of Scientific and Technical Information of China (English)

    Yan-Xia Wei; Zhuo-Yang Zhang; Chang Liu; Xiao-Kui Guo; Pradeep K Malakar

    2012-01-01

    AIM: To assess the safety of Bifidobacterium longum (B. longum) JDM301 based on complete genome sequences. METHODS: The complete genome sequences of JDM301 were determined using the GS 20 system. Putative virulence factors, putative antibiotic resistance genes and genes encoding enzymes responsible for harmful metabolites were identified by BLAST against a virulence factors database, an antibiotic resistance genes database and genes associated with harmful metabolites in previous reports. Minimum inhibitory concentration of 16 common antimicrobial agents was evaluated by E-test. RESULTS: JDM301 was shown to contain 36 genes associated with antibiotic resistance, 5 enzymes related to harmful metabolites and 162 nonspecific virulence factors mainly associated with transcriptional regulation, adhesion, sugar and amino acid transport. B. longum JDM301 was intrinsically resistant to ciprofloxacin, amikacin, gentamicin and streptomycin and susceptible to vancomycin, amoxicillin, cephalothin, chloramphenicol, erythromycin, ampicillin, cefotaxime, rifampicin, imipenem, trimethoprim and trimethoprim-sulphamethoxazol. JDM301 was moderately resistant to bacitracin, while an earlier study showed that bifidobacteria were susceptible to this antibiotic. A tetracycline resistance gene with the risk of transfer was found in JDM301, which needs to be experimentally validated. CONCLUSION: The safety assessment of JDM301 using information derived from the complete bacterial genome will contribute to a wider and deeper insight into the safety of probiotic bacteria.

  16. A framework phylogeny of the American oak clade based on sequenced RAD data.

    Directory of Open Access Journals (Sweden)

    Andrew L Hipp

    Previous phylogenetic studies in oaks (Quercus, Fagaceae) have failed to resolve the backbone topology of the genus with strong support. Here, we utilize next-generation sequencing of restriction-site associated DNA (RAD-Seq) to resolve a framework phylogeny of a predominantly American clade of oaks whose crown age is estimated at 23-33 million years old. Using a recently developed analytical pipeline for RAD-Seq phylogenetics, we created a concatenated matrix of 1.40 E06 aligned nucleotides, constituting 27,727 sequence clusters. RAD-Seq data were readily combined across runs, with no difference in phylogenetic placement between technical replicates, which overlapped by only 43-64% in locus coverage. 17% (4,715) of the loci we analyzed could be mapped with high confidence to one or more expressed sequence tags in NCBI Genbank. A concatenated matrix of the loci that BLAST to at least one EST sequence provides approximately half as many variable or parsimony-informative characters as equal-sized datasets from the non-EST loci. The EST-associated matrix is more complete (fewer missing loci) and has slightly lower homoplasy than non-EST subsampled matrices of the same size, but there is no difference in phylogenetic support or relative attribution of base substitutions to internal versus terminal branches of the phylogeny. We introduce a partitioned RAD visualization method (implemented in the R package RADami; http://cran.r-project.org/web/packages/RADami) to investigate the possibility that suboptimal topologies supported by large numbers of loci--due, for example, to reticulate evolution or lineage sorting--are masked by the globally optimal tree. We find no evidence for strongly-supported alternative topologies in our study, suggesting that the phylogeny we recover is a robust estimate of large-scale phylogenetic patterns in the American oak clade.
Our study is one of the first to demonstrate the utility of RAD-Seq data for inferring phylogeny in a

  17. A framework phylogeny of the American oak clade based on sequenced RAD data.

    Science.gov (United States)

    Hipp, Andrew L; Eaton, Deren A R; Cavender-Bares, Jeannine; Fitzek, Elisabeth; Nipper, Rick; Manos, Paul S

    2014-01-01

    Previous phylogenetic studies in oaks (Quercus, Fagaceae) have failed to resolve the backbone topology of the genus with strong support. Here, we utilize next-generation sequencing of restriction-site associated DNA (RAD-Seq) to resolve a framework phylogeny of a predominantly American clade of oaks whose crown age is estimated at 23-33 million years old. Using a recently developed analytical pipeline for RAD-Seq phylogenetics, we created a concatenated matrix of 1.40 E06 aligned nucleotides, constituting 27,727 sequence clusters. RAD-Seq data were readily combined across runs, with no difference in phylogenetic placement between technical replicates, which overlapped by only 43-64% in locus coverage. 17% (4,715) of the loci we analyzed could be mapped with high confidence to one or more expressed sequence tags in NCBI Genbank. A concatenated matrix of the loci that BLAST to at least one EST sequence provides approximately half as many variable or parsimony-informative characters as equal-sized datasets from the non-EST loci. The EST-associated matrix is more complete (fewer missing loci) and has slightly lower homoplasy than non-EST subsampled matrices of the same size, but there is no difference in phylogenetic support or relative attribution of base substitutions to internal versus terminal branches of the phylogeny. We introduce a partitioned RAD visualization method (implemented in the R package RADami; http://cran.r-project.org/web/packages/RADami) to investigate the possibility that suboptimal topologies supported by large numbers of loci--due, for example, to reticulate evolution or lineage sorting--are masked by the globally optimal tree. We find no evidence for strongly-supported alternative topologies in our study, suggesting that the phylogeny we recover is a robust estimate of large-scale phylogenetic patterns in the American oak clade. 
Our study is one of the first to demonstrate the utility of RAD-Seq data for inferring phylogeny in a 23-33 million

  18. Cluster based on sequence comparison of homologous proteins of 95 organism species - Gclust Server | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Gclust Server: Cluster based on sequence comparison of homologous proteins of 95 organism species. Data detail... Data name: Cluster based on sequence comparison of homologous proteins of 95 organism species. Description of...

  19. Effect of sequence features on assembly of spider silk block copolymers.

    Science.gov (United States)

    Tokareva, Olena S; Lin, Shangchao; Jacobsen, Matthew M; Huang, Wenwen; Rizzo, Daniel; Li, David; Simon, Marc; Staii, Cristian; Cebe, Peggy; Wong, Joyce Y; Buehler, Markus J; Kaplan, David L

    2014-06-01

    Bioengineered spider silk block copolymers were studied to understand the effect of protein chain length and sequence chemistry on the formation of secondary structure and materials assembly. Using a combination of in vitro protein design and assembly studies, we demonstrate that silk block copolymers possessing multiple repetitive units self-assemble into lamellar microstructures. Additionally, the study provides insights into the assembly behavior of spider silk block copolymers in concentrated salt solutions.

  20. Distribution of zero sequence currents for earth faults occurring along a transmission line and proximity effects

    Energy Technology Data Exchange (ETDEWEB)

    Nahman, J. (Belgrade Univ. (Yugoslavia). Elektrotehnicki Fakultet); Dordevic, V. (Energoprojekt, Belgrade (Yugoslavia))

    1993-09-01

    A relatively simple procedure is suggested for the evaluation of the distribution of zero sequence currents, within the earthing system of a substation, for earth faults occurring along a line coming from the substation. The earthing system model derived takes into account all relevant phenomena, including the mutual influence among earth electrodes through the soil, to cover the proximity effects, which were shown to be significant in certain cases. The procedure suggested is applied to a practical case for illustration. (author)

  1. The Utility of Specific Markers Based on ITS2 Sequences for Molecular Identification and Detection of Trichogramma spp.

    Institute of Scientific and Technical Information of China (English)

    LI Zheng-xi; SHEN Zuo-rui

    2002-01-01

    The technology based on specific PCR amplification using the internal transcribed spacer 2 (ITS2) of nuclear ribosomal DNA for molecular identification and detection of Trichogramma species was studied. First, the ITS2 regions of six Trichogramma species were cloned and sequenced, and the interspecific sequence variation was analyzed. Second, the ITS2 regions of six geographical populations of T. dendrolimi were cloned and sequenced, and the intraspecific sequence identity was analyzed. The results show that the interspecific variation and intraspecific similarity of ITS2 sequences are very suitable for the design of specific primers at the species level. Screening of specific primers for T. dendrolimi led to final sensitive and stable diagnostic primers. This system allows non-specialists to identify not only adults (males and females) but also eggs in parasitized hosts rapidly and accurately, which is impossible by conventional methods. Further development of this protocol can create a complete set of specific primers for different species of the whole genus Trichogramma.

  2. Speech serial control in healthy speakers and speakers with hypokinetic or ataxic dysarthria: Effects of sequence length and practice

    Directory of Open Access Journals (Sweden)

    Kevin J Reilly

    2013-10-01

    The current study investigated the processes responsible for selection of sounds and syllables during production of speech sequences in 10 adults with hypokinetic dysarthria from Parkinson’s disease, 5 adults with ataxic dysarthria, and 14 healthy control speakers. Speech production data from a choice reaction time task were analyzed to evaluate the effects of sequence length and practice on speech sound sequencing. Speakers produced sequences that were between one and five syllables in length over five experimental runs of 60 trials each. In contrast to the healthy speakers, speakers with hypokinetic dysarthria demonstrated exaggerated sequence length effects for both inter-syllable intervals (ISIs) and speech error rates. Conversely, speakers with ataxic dysarthria failed to demonstrate a sequence length effect on ISIs and were also the only group that did not exhibit practice-related changes in ISIs and speech error rates over the five experimental runs. The exaggerated sequence length effects in the hypokinetic speakers with Parkinson’s disease are consistent with an impairment of action selection during speech sequence production. The absent length effects observed in the speakers with ataxic dysarthria are consistent with previous findings that indicate a limited capacity to buffer speech sequences in advance of their execution. In addition, the lack of practice effects in these speakers suggests that learning-related improvements in the production rate and accuracy of speech sequences involve processing by structures of the cerebellum. Together, the current findings inform models of serial control for speech in healthy speakers and support the notion that sequencing deficits contribute to speech symptoms in speakers with hypokinetic or ataxic dysarthria. In addition, these findings indicate that speech sequencing is differentially impaired in hypokinetic and ataxic dysarthria.

  3. Interaction among apoptosis-associated sequence variants and joint effects on aggressive prostate cancer

    Directory of Open Access Journals (Sweden)

    Lavender Nicole A

    2012-04-01

    Background: Molecular and epidemiological evidence demonstrate that altered gene expression and single nucleotide polymorphisms in the apoptotic pathway are linked to many cancers. Yet, few studies emphasize the interaction of variant apoptotic genes and their joint modifying effects on prostate cancer (PCA) outcomes. An exhaustive assessment of all the possible two-, three- and four-way gene-gene interactions is computationally burdensome. This statistical conundrum stems from the prohibitive amount of data needed to account for multiple hypothesis testing. Methods: To address this issue, we systematically prioritized and evaluated individual effects and complex interactions among 172 apoptotic SNPs in relation to PCA risk and aggressive disease (i.e., Gleason score ≥ 7 and tumor stages III/IV). Single and joint modifying effects on PCA outcomes among European-American men were analyzed using statistical epistasis networks coupled with multi-factor dimensionality reduction (SEN-guided MDR). The case-control study design included 1,175 incident PCA cases and 1,111 controls from the prostate, lung, colo-rectal, and ovarian (PLCO) cancer screening trial. Moreover, a subset analysis of PCA cases consisted of 688 aggressive and 488 non-aggressive PCA cases. SNP profiles were obtained using the NCI Cancer Genetic Markers of Susceptibility (CGEMS) data portal. Main effects were assessed using logistic regression (LR) models. Prior to modeling interactions, SEN was used to pre-process our genetic data. SEN used network science to reduce our analysis from > 36 million to ... Results: Following LR modeling, eleven and thirteen sequence variants were associated with PCA risk and aggressive disease, respectively. However, none of these markers remained significant after we adjusted for multiple comparisons. Nevertheless, we detected a modest synergistic interaction between AKT3 rs2125230-PRKCQ rs571715 and disease aggressiveness using SEN-guided MDR (p = 0
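
    The scale of the exhaustive search mentioned in the abstract above can be checked by counting combinations. A minimal sketch, assuming the "> 36 million" figure refers to all two-, three- and four-way combinations of the 172 SNPs (the abstract does not state how the figure was derived); the function name `interaction_count` is hypothetical:

```python
from math import comb


def interaction_count(n_snps, orders=(2, 3, 4)):
    """Number of distinct k-way SNP combinations, summed over the given orders."""
    return sum(comb(n_snps, k) for k in orders)


# 172 apoptotic SNPs, as in the abstract.
for k in (2, 3, 4):
    print(k, comb(172, k))
print("total", interaction_count(172))
```

    With these inputs the total is 36,056,661, dominated by the 35,208,615 four-way combinations, which is consistent with the "> 36 million" quoted in the abstract.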

  4. Effect of Sequencing Strength and Endurance Training in Young Male Soccer Players.

    Science.gov (United States)

    Makhlouf, Issam; Castagna, Carlo; Manzi, Vincenzo; Laurencelle, Louis; Behm, David G; Chaouachi, Anis

    2016-03-01

    This study examined the effects of strength and endurance training sequence (strength before or after endurance) on relevant fitness variables in youth soccer players. Fifty-seven young elite-level male field soccer players (13.7 ± 0.5 years; 164 ± 8.3 cm; 53.5 ± 8.6 kg; body fat: 15.6 ± 3.9%) were randomly assigned to a control group (n = 14, CG) and 3 experimental training groups (twice a week for 12 weeks): strength before (SE, n = 15), after (ES, n = 14) or on alternate days (ASE, n = 14) with endurance training. A significant (p = 0.001) intervention main effect was detected. There were only trivial training sequence differences (ES vs. SE) for all variables (p > 0.05). The CG showed large squat 1 repetition maximum (1RM) and medium sprint, change of direction ability, and jump improvements. ASE demonstrated a trivial difference in endurance performance with ES and SE (p > 0.05). Large to medium greater improvements for SE and ES were reported compared with ASE for sprinting over 10 and 30 m (p training sequence on soccer fitness-relevant variables. However, combining strength and endurance within a single training session provided superior results vs. training on alternate days. Concurrent training may be considered as an effective and safe training method for the development of the prospective soccer player.

  5. The effect of music background on the emotional appraisal of film sequences

    Directory of Open Access Journals (Sweden)

    Pavlović Ivanka

    2011-01-01

    In this study the effects of musical background on the emotional appraisal of film sequences were investigated. Four pairs of polar emotions defined in Plutchik’s model were used as basic emotional qualities: joy-sadness, anticipation-surprise, fear-anger, and trust-disgust. In the preliminary study eight film sequences and eight music themes were selected as the best representatives of all eight of Plutchik’s emotions. In the main experiment the participants judged the emotional qualities of film-music combinations on eight seven-point scales. Half of the combinations were congruent (e.g. joyful film - joyful music), and half were incongruent (e.g. joyful film - sad music). Results have shown that visual information (film) had greater effects on the emotion appraisal than auditory information (music). The modulation effects of music background depend on emotional qualities. In some incongruent combinations (joy-sadness) the modulations in the expected directions were obtained (e.g. joyful music reduces the sadness of a sad film), in some cases (anger-fear) no modulation effects were obtained, and in some cases (trust-disgust, anticipation-surprise) the modulation effects were in an unexpected direction (e.g. trustful music increased the appraisal of disgust of a disgusting film). These results suggest that the appraisals of conjoint effects of emotions depend on the medium (film masks the music) and emotional quality (three types of modulation effects).

  6. Terabit Nyquist PDM-32QAM signal transmission with training sequence based time domain channel estimation.

    Science.gov (United States)

    Zhang, Fan; Wang, Dan; Ding, Rui; Chen, Zhangyuan

    2014-09-22

    We propose a time domain structure of channel estimation for coherent optical communication systems, which employs a training sequence based equalizer and is transparent to arbitrary quadrature amplitude modulation (QAM) formats. Enabled with this methodology, a 1.02 Tb/s polarization division multiplexed 32 QAM Nyquist pulse shaping signal with a net spectral efficiency of 7.46 b/s/Hz is transmitted over a standard single-mode fiber link with Erbium-doped fiber amplifier-only amplification. After 1190 km transmission, the average bit-error rate is lower than the 20% hard-decision forward error correction threshold of 1.5 × 10⁻². The transmission distance can be extended to 1428 km by employing intra-subchannel nonlinear compensation with the digital back-propagation method.

  7. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis.

    Science.gov (United States)

    Kato, Hirotomo; Gomez, Eduardo A; Martini-Robles, Luiggi; Muzzio, Jenny; Velez, Lenin; Calvopiña, Manuel; Romero-Alvarez, Daniel; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

    2016-07-01

    A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas.

  8. Phylogenetic relationship of Podocopida (Ostracoda: Podocopa) based on 18S ribosomal DNA sequences

    Institute of Scientific and Technical Information of China (English)

    YU Na; ZHAO Meiying; CHEN Liqiao; YANG Pin

    2006-01-01

    Nucleotide sequences from 18S rDNA of 11 ostracodes, which represent four suborders and six superfamilies of podocopidans, were determined. The phylogenetic relationships were analyzed using three methods (maximum-likelihood, maximum-parsimony, and neighbor-joining), and the three resulting topologies were basically similar. The results showed that (1) a monophyletic Podocopida was strongly supported; (2) the phylogenetic relationships of the four suborders were (Darwinulocopina plus (Bairdiocopina plus (Cytherocopina plus Cypridocopina))), which indicated a close relationship between Cytherocopina and Cypridocopina and that Darwinulocopina had separated early from the main podocopidan lineage; (3) Cypridocopina formed a monophyletic group, among which the phylogenetic relationship of three superfamilies was (Cypridoidea plus (Macrocypridoidea plus Pontocypridoidea)).

  9. Geographic Distribution of Leishmania Species in Ecuador Based on the Cytochrome B Gene Sequence Analysis

    Science.gov (United States)

    Kato, Hirotomo; Gomez, Eduardo A.; Martini-Robles, Luiggi; Muzzio, Jenny; Velez, Lenin; Calvopiña, Manuel; Romero-Alvarez, Daniel; Mimori, Tatsuyuki; Uezato, Hiroshi; Hashiguchi, Yoshihisa

    2016-01-01

    A countrywide epidemiological study was performed to elucidate the current geographic distribution of causative species of cutaneous leishmaniasis (CL) in Ecuador by using FTA card-spotted samples and smear slides as DNA sources. Putative Leishmania in 165 samples collected from patients with CL in 16 provinces of Ecuador were examined at the species level based on the cytochrome b gene sequence analysis. Of these, 125 samples were successfully identified as Leishmania (Viannia) guyanensis, L. (V.) braziliensis, L. (V.) naiffi, L. (V.) lainsoni, and L. (Leishmania) mexicana. Two dominant species, L. (V.) guyanensis and L. (V.) braziliensis, were widely distributed in Pacific coast subtropical and Amazonian tropical areas, respectively. Recently reported L. (V.) naiffi and L. (V.) lainsoni were identified in Amazonian areas, and L. (L.) mexicana was identified in an Andean highland area. Importantly, the present study demonstrated that cases of L. (V.) braziliensis infection are increasing in Pacific coast areas. PMID:27410039

  10. On the precipitation sequence in a Ni-based superalloy: A Coincidence Doppler Broadening study

    Energy Technology Data Exchange (ETDEWEB)

    Macchi, C.E. [IFIMAT, UNCentro and CONICET, Pinto 399, B7000GHG Tandil (Argentina); Somoza, A. [IFIMAT, UNCentro and CICPBA, Pinto 399, B7000GHG Tandil (Argentina); Santos, G. [NIECyT, UNCentro, Pinto 399, B7000GHG Tandil (Argentina); Petkov, M. [Jet Propulsion Lab, California Institute of Technology, Pasadena, CA 91109 (United States); Lynn, K.G. [Department of Physics, Washington State University, Pullman WA 99164-2814 (United States)

    2007-07-01

    The precipitation sequence at 700 °C of the Ni$_3$(Ti,Al)-type ordered $\gamma'$ phase in the commercial nickel-based superalloy Inconel X-750 was investigated using the Coincidence Doppler Broadening (CDB) technique. The results obtained are discussed in terms of positron annihilation in two well-defined states: one corresponding to the matrix ($\gamma$ phase) and a second related to the $\gamma'$ precipitates. Between these two aging stages, CDB distributions corresponding to selected intermediate aging treatments could be represented exactly, within the experimental scatter, as a linear combination of the $\gamma$ and $\gamma'$ signatures. (copyright 2007 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim) (orig.)
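
    The linear-combination statement in this abstract can be illustrated with a minimal sketch. It assumes that an intermediate CDB curve is modeled as (1 - f) times the matrix signature plus f times the precipitate signature, and that f is estimated by ordinary least squares; the abstract does not say how the authors fitted their data. The function name, the momentum axis, and the Gaussian-shaped curves are hypothetical placeholders, not data from the paper.

```python
import numpy as np

def fit_precipitate_fraction(mixed, sig_matrix, sig_precip):
    """Least-squares weight f with mixed ~ (1 - f) * sig_matrix + f * sig_precip."""
    mixed = np.asarray(mixed, dtype=float)
    sig_matrix = np.asarray(sig_matrix, dtype=float)
    sig_precip = np.asarray(sig_precip, dtype=float)
    diff = sig_precip - sig_matrix
    # Closed-form solution of the one-parameter least-squares problem.
    f = float(np.dot(mixed - sig_matrix, diff) / np.dot(diff, diff))
    residual = mixed - ((1.0 - f) * sig_matrix + f * sig_precip)
    return f, residual

# Synthetic, purely illustrative curves (assumed shapes, not measured spectra).
p = np.linspace(0.0, 30.0, 61)               # hypothetical momentum axis
sig_gamma = np.exp(-(p / 8.0) ** 2)          # stand-in for the matrix signature
sig_gamma_prime = np.exp(-(p / 12.0) ** 2)   # stand-in for the precipitate signature
mixed = 0.7 * sig_gamma + 0.3 * sig_gamma_prime

f, residual = fit_precipitate_fraction(mixed, sig_gamma, sig_gamma_prime)
print(f, float(np.max(np.abs(residual))))
```

    With real spectra, the size of the residual relative to the experimental scatter would indicate whether a two-state description is adequate.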

  11. All-optical repetition rate multiplication of pseudorandom bit sequences based on cascaded TOADs

    Science.gov (United States)

    Sun, Zhenchao; Wang, Zhi; Wu, Chongqing; Wang, Fu; Li, Qiang

    2016-03-01

    A scheme for all-optical repetition rate multiplication of pseudorandom bit sequences (PRBS) is demonstrated with all-optical wavelength conversion and o