WorldWideScience

Sample records for sequencing reveals patterns

  1. Kangaroo – A pattern-matching program for biological sequences

    Directory of Open Access Journals (Sweden)

    Betel Doron

    2002-07-01

    Background: Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results: Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion: A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats.
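
    The kind of query described in this record (finding mononucleotide repeats in a coding sequence with a regular expression) can be sketched in a few lines of Python. This is only an illustration of the general technique, not Kangaroo's own code or query syntax; the function name, the example sequence, and the default minimum run length of 6 are assumptions made for the sketch.

```python
import re

def find_mononucleotide_repeats(sequence, min_length=6):
    """Return (start, base, length) for every run of a single base
    that is at least min_length long (min_length is an assumed cutoff)."""
    # ([ACGT]) captures one base; \1{n,} demands at least n further copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(sequence.upper())]

# Hypothetical coding fragment: an A8 tract and a T7 tract.
example = "ATGC" + "A" * 8 + "GGC" + "T" * 7 + "CG"
print(find_mononucleotide_repeats(example))
# [(4, 'A', 8), (15, 'T', 7)]
```

    Start positions are 0-based offsets into the sequence; the two-base GG run is ignored because it is shorter than the cutoff.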

  2. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

    Science.gov (United States)

    Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

    2007-04-13

    Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.

  3. Whole exome sequencing in 342 congenital cardiac left sided lesion cases reveals extensive genetic heterogeneity and complex inheritance patterns

    Directory of Open Access Journals (Sweden)

    Alexander H. Li

    2017-10-01

    Background: Left-sided lesions (LSLs) account for an important fraction of severe congenital cardiovascular malformations (CVMs). The genetic contributions to LSLs are complex, and the mutations that cause these malformations span several diverse biological signaling pathways: TGFB, NOTCH, SHH, and more. Here, we use whole exome sequence data generated in 342 LSL cases to identify likely damaging variants in putative candidate CVM genes. Methods: Using a series of bioinformatics filters, we focused on genes harboring population-rare, putative loss-of-function (LOF), and predicted damaging variants in 1760 CVM candidate genes constructed a priori from the literature and model organism databases. Gene variants that were not observed in a comparably sequenced control dataset of 5492 samples without severe CVM were then subjected to targeted validation in cases and parents. Whole exome sequencing data from 4593 individuals referred for clinical sequencing were used to bolster evidence for the role of candidate genes in CVMs and LSLs. Results: Our analyses revealed 28 candidate variants in 27 genes, including 17 genes not previously associated with a human CVM disorder, and revealed diverse patterns of inheritance among LOF carriers, including 9 confirmed de novo variants in both novel and newly described human CVM candidate genes (ACVR1, JARID2, NR2F2, PLRG1, SMURF1) as well as established syndromic CVM genes (KMT2D, NF1, TBX20, ZEB2). We also identified two genes (DNAH5, OFD1) with evidence of recessive and hemizygous inheritance patterns, respectively. Within our clinical cohort, we also observed heterozygous LOF variants in JARID2 and SMAD1 in individuals with cardiac phenotypes, and collectively, carriers of LOF variants in our candidate genes had four times higher odds of having CVM (odds ratio = 4.0, 95% confidence interval 2.5–6.5). Conclusions: Our analytical strategy highlights the utility of bioinformatic resources, including human...

  4. Output-Sensitive Pattern Extraction in Sequences

    DEFF Research Database (Denmark)

    Grossi, Roberto; Menconi, Giulia; Pisanti, Nadia

    2014-01-01

    Genomic Analysis, Plagiarism Detection, Data Mining, Intrusion Detection, Spam Fighting and Time Series Analysis are just some examples of applications where extraction of recurring patterns in sequences of objects is one of the main computational challenges. Several notions of patterns exist...... or extend them causes a loss of significant information (where the number of occurrences changes). Output-sensitive algorithms have been proposed to enumerate and list these patterns, taking polynomial time O(n^c) per pattern for constant c > 1, which is impractical for massive sequences of very large length...

  5. Deep sequencing reveals distinct patterns of DNA methylation in prostate cancer.

    Science.gov (United States)

    Kim, Jung H; Dhanasekaran, Saravana M; Prensner, John R; Cao, Xuhong; Robinson, Daniel; Kalyana-Sundaram, Shanker; Huang, Christina; Shankar, Sunita; Jing, Xiaojun; Iyer, Matthew; Hu, Ming; Sam, Lee; Grasso, Catherine; Maher, Christopher A; Palanisamy, Nallasivam; Mehra, Rohit; Kominsky, Hal D; Siddiqui, Javed; Yu, Jindan; Qin, Zhaohui S; Chinnaiyan, Arul M

    2011-07-01

    Beginning with precursor lesions, aberrant DNA methylation marks the entire spectrum of prostate cancer progression. We mapped the global DNA methylation patterns in select prostate tissues and cell lines using MethylPlex-next-generation sequencing (M-NGS). Hidden Markov model-based next-generation sequence analysis identified ∼68,000 methylated regions per sample. While global CpG island (CGI) methylation was not differential between benign adjacent and cancer samples, overall promoter CGI methylation significantly increased from ∼12.6% in benign samples to 19.3% and 21.8% in localized and metastatic cancer tissues, respectively (P-value ... prostate tissues, 2481 differentially methylated regions (DMRs) are cancer-specific, including numerous novel DMRs. A novel cancer-specific DMR in the WFDC2 promoter showed frequent methylation in cancer (17/22 tissues, 6/6 cell lines), but not in the benign tissues (0/10) and normal PrEC cells. Integration of LNCaP DNA methylation and H3K4me3 data suggested an epigenetic mechanism for alternate transcription start site utilization, and these modifications segregated into distinct regions when present on the same promoter. Finally, we observed differences in repeat element methylation, particularly LINE-1, between ERG gene fusion-positive and -negative cancers, and we confirmed this observation using pyrosequencing on a tissue panel. This comprehensive methylome map will further our understanding of epigenetic regulation in prostate cancer progression.

  6. SMRT Sequencing Revealed Mitogenome Characteristics and Mitogenome-Wide DNA Modification Pattern in Ophiocordyceps sinensis.

    Science.gov (United States)

    Kang, Xincong; Hu, Liqin; Shen, Pengyuan; Li, Rui; Liu, Dongbo

    2017-01-01

    Single molecule, real-time (SMRT) sequencing was used to characterize the mitochondrial (mt) genome of Ophiocordyceps sinensis and to analyze the mt genome-wide pattern of epigenetic DNA modification. The complete mt genome of O. sinensis, with a size of 157,539 bp, is the fourth largest Ascomycota mt genome sequenced to date. It contained 14 conserved protein-coding genes (PCGs), 1 intronic protein rps3, 27 tRNAs and 2 rRNA subunits, which are common characteristics of the known mt genomes in Hypocreales. A phylogenetic tree inferred from 14 PCGs in Pezizomycotina fungi supports O. sinensis as most closely related to Hirsutella rhossiliensis in Ophiocordycipitaceae. A total of 36 sequence sites in rps3 were under positive selection, with dN/dS > 1 in the 20 compared fungi. Among them, 16 sites were statistically significant. In addition, the mt genome-wide base modification pattern of O. sinensis was determined in this study, especially DNA methylation. The methylations were located in coding and noncoding regions of mt PCGs in O. sinensis, and might be closely related to the expression of PCGs or the binding affinity of transcription factor A to mtDNA. Consequently, these methylations may affect the enzymatic activity of oxidative phosphorylation and then the mt respiratory rate; or they may influence mt biogenesis. Therefore, methylations in the mitogenome of O. sinensis might be a genetic feature to adapt to the cold and low PO2 environment at high altitude, where O. sinensis is endemic. This is the first report on epigenetic modifications in a fungal mt genome.

  7. Multilocus Sequence Analysis of Nectar Pseudomonads Reveals High Genetic Diversity and Contrasting Recombination Patterns

    Science.gov (United States)

    Álvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M.

    2013-01-01

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas ‘sensu stricto’ isolates associated to South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria. PMID:24116076

  8. Multilocus sequence analysis of nectar pseudomonads reveals high genetic diversity and contrasting recombination patterns.

    Science.gov (United States)

    Alvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M

    2013-01-01

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas 'sensu stricto' isolates associated to South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria.

  9. The role of consolidation in learning context-dependent phonotactic patterns in speech and digital sequence production.

    Science.gov (United States)

    Anderson, Nathaniel D; Dell, Gary S

    2018-04-03

    Speakers implicitly learn novel phonotactic patterns by producing strings of syllables. The learning is revealed in their speech errors. First-order patterns, such as "/f/ must be a syllable onset," can be distinguished from contingent, or second-order, patterns, such as "/f/ must be an onset if the vowel is /a/, but a coda if the vowel is /o/." A metaanalysis of 19 experiments clearly demonstrated that first-order patterns affect speech errors to a very great extent in a single experimental session, but second-order vowel-contingent patterns only affect errors on the second day of testing, suggesting the need for a consolidation period. Two experiments tested an analogue to these studies involving sequences of button pushes, with fingers as "consonants" and thumbs as "vowels." The button-push errors revealed two of the key speech-error findings: first-order patterns are learned quickly, but second-order thumb-contingent patterns are only strongly revealed in the errors on the second day of testing. The influence of computational complexity on the implicit learning of phonotactic patterns in speech production may be a general feature of sequence production.

  10. Measuring patterns in team interaction sequences using a discrete recurrence approach.

    Science.gov (United States)

    Gorman, Jamie C; Cooke, Nancy J; Amazeen, Polemnia G; Fouse, Shannon

    2012-08-01

    Recurrence-based measures of communication determinism and pattern information are described and validated using previously collected team interaction data. Team coordination dynamics has revealed that "mixing" team membership can lead to flexible interaction processes, but keeping a team "intact" can lead to rigid interaction processes. We hypothesized that communication of intact teams would have greater determinism and higher pattern information compared to that of mixed teams. Determinism and pattern information were measured from three-person Uninhabited Air Vehicle team communication sequences over a series of 40-minute missions. Because team members communicated using push-to-talk buttons, communication sequences were automatically generated during each mission. The Composition x Mission determinism effect was significant. Intact teams' determinism increased over missions, whereas mixed teams' determinism did not change. Intact teams had significantly higher maximum pattern information than mixed teams. Results from these new communication analysis methods converge with content-based methods and support our hypotheses. Because they are not content based, and because they are automatic and fast, these new methods may be amenable to real-time communication pattern analysis.

  11. Complex evolutionary patterns revealed by mitochondrial genomes of the domestic horse.

    Science.gov (United States)

    Ning, T; Li, J; Lin, K; Xiao, H; Wylie, S; Hua, S; Li, H; Zhang, Y-P

    2014-01-01

    The domestic horse is the most widely used and important stock and recreational animal, valued for its strength and endurance. The energy required by the domestic horse is mainly supplied by mitochondria via oxidative phosphorylation. Thus, selection may have played an essential role in the evolution of the horse mitochondria. Besides, demographic events also affect the DNA polymorphic pattern on mitochondria. To understand the evolutionary patterns of the mitochondria of the domestic horse, we used a deep sequencing approach to obtain the complete sequences of 15 mitochondrial genomes, and four mitochondrial gene sequences, ND6, ATP8, ATP6 and CYTB, collected from 509, 363, 363 and 409 domestic horses, respectively. Evidence of strong substitution rate heterogeneity was found at nonsynonymous sites across the genomes. Signatures of recent positive selection on mtDNA of domestic horse were detected. Specifically, five amino acids in the four mitochondrial genes were identified as the targets of positive selection. Coalescent-based simulations imply that recent population expansion is the most probable explanation for the matrilineal population history for domestic horse. Our findings reveal a complex pattern of non-neutral evolution of the mitochondrial genome in the domestic horses.

  12. Chromosome-wide mapping of DNA methylation patterns in normal and malignant prostate cells reveals pervasive methylation of gene-associated and conserved intergenic sequences

    Directory of Open Access Journals (Sweden)

    De Marzo Angelo M

    2011-06-01

    Background: DNA methylation has been linked to genome regulation and dysregulation in health and disease respectively, and methods for characterizing genomic DNA methylation patterns are rapidly emerging. We have developed/refined methods for enrichment of methylated genomic fragments using the methyl-binding domain of the human MBD2 protein (MBD2-MBD) followed by analysis with high-density tiling microarrays. This MBD-chip approach was used to characterize DNA methylation patterns across all non-repetitive sequences of human chromosomes 21 and 22 at high-resolution in normal and malignant prostate cells. Results: Examining this data using computational methods that were designed specifically for DNA methylation tiling array data revealed widespread methylation of both gene promoter and non-promoter regions in cancer and normal cells. In addition to identifying several novel cancer hypermethylated 5' gene upstream regions that mediated epigenetic gene silencing, we also found several hypermethylated 3' gene downstream, intragenic and intergenic regions. The hypermethylated intragenic regions were highly enriched for overlap with intron-exon boundaries, suggesting a possible role in regulation of alternative transcriptional start sites, exon usage and/or splicing. The hypermethylated intergenic regions showed significant enrichment for conservation across vertebrate species. A sampling of these newly identified promoter (ADAMTS1 and SCARF2 genes) and non-promoter (downstream or within DSCR9, C21orf57 and HLCS genes) hypermethylated regions were effective in distinguishing malignant from normal prostate tissues and/or cell lines. Conclusions: Comparison of chromosome-wide DNA methylation patterns in normal and malignant prostate cells revealed significant methylation of gene-proximal and conserved intergenic sequences. Such analyses can be easily extended for genome-wide methylation analysis in health and disease.

  13. Computational Analysis of G-Quadruplex Forming Sequences across Chromosomes Reveals High Density Patterns Near the Terminal Ends.

    Directory of Open Access Journals (Sweden)

    Julia H Chariker

    G-quadruplex structures (G4) are found throughout the human genome and are known to play a regulatory role in a variety of molecular processes. Structurally, they have many configurations and can form from one or more DNA strands. At the gene level, they regulate gene expression and protein synthesis. In this paper, chromosomal-level patterns of distribution are analyzed on the human genome to identify high-level distribution patterns potentially related to global functional processes. Here we show unique high density banding patterns on individual chromosomes that are highly correlated, appearing in a mirror pattern, across forward and reverse DNA strands. The highest density of G4 sequences occurs within four megabases of one end of most chromosomes and contains G4 motifs that bind with zinc finger proteins. These findings suggest that G4 may play a role in global chromosomal processes such as those found in meiosis.
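
    A density profile of putative G4-forming sequences along a chromosome could be sketched roughly as below. The record does not state which motif definition or window size the authors used, so both are assumptions here: the regular expression is the commonly used G3+N1-7G3+N1-7G3+N1-7G3+ form, the window length is arbitrary, and only the forward strand is scanned (the reverse strand would need the complementary C-rich pattern).

```python
import re

# Assumed motif: four runs of >= 3 guanines separated by loops of 1-7 bases.
G4_MOTIF = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")

def g4_counts_per_window(sequence, window):
    """Count non-overlapping motif matches per fixed-size window.
    A match is assigned to the window containing its start position."""
    n_windows = (len(sequence) + window - 1) // window
    counts = [0] * n_windows
    for match in G4_MOTIF.finditer(sequence.upper()):
        counts[match.start() // window] += 1
    return counts

# Toy 100-base sequence with one motif at the very start.
toy = "GGGAGGGAGGGAGGG" + "A" * 35 + "T" * 50
print(g4_counts_per_window(toy, window=50))
# [1, 0]
```

    On real chromosomes the window would be on the order of kilobases to megabases, and plotting the counts against window index gives the kind of banding profile the abstract describes.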

  14. Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

    Science.gov (United States)

    Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

    2012-01-15

    Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes. Copyright © 2011 Elsevier B.V. All rights reserved.

  15. Waiting Time Distributions for Pattern Occurrence in a Constrained Sequence

    Directory of Open Access Journals (Sweden)

    Valeri Stefanov

    2007-01-01

    A binary sequence of zeros and ones is called a (d,k)-sequence if it does not contain runs of zeros of length either less than d or greater than k, where d and k are arbitrary, but fixed, non-negative integers and d < k. Such sequences find an abundance of applications in communications, in particular for magnetic and optical recording. Occasionally, one requires that (d,k)-sequences do not contain a specific pattern w. Therefore, distribution results concerning pattern occurrence in (d,k)-sequences are of interest. In this paper we study the distribution of the waiting time until the rth occurrence of a pattern w in a random (d,k)-sequence generated by a Markov source. Numerical examples are also provided.
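
    The two notions in this record, the (d,k) run-length constraint and the waiting time until the rth occurrence of a pattern w, can be made concrete with a small sketch. It makes two assumptions that the abstract does not settle: leading and trailing runs of zeros are only checked against the upper bound k, and occurrences of w are counted with overlaps allowed. The function names are hypothetical, and w is assumed to be non-empty.

```python
def is_dk_sequence(bits, d, k):
    """Check a string of '0'/'1' characters against the (d,k) constraint.
    Assumption: runs of zeros before the first 1 and after the last 1
    only need to respect the upper bound k."""
    runs = bits.split("1")          # zero-runs between and around the ones
    if len(runs) == 1:              # no ones at all: one single run of zeros
        return len(bits) <= k
    leading, *interior, trailing = runs
    if len(leading) > k or len(trailing) > k:
        return False
    return all(d <= len(run) <= k for run in interior)

def waiting_time(bits, w, r):
    """1-based position at which the r-th occurrence of w is completed,
    counting overlapping occurrences; None if there are fewer than r."""
    count = 0
    for end in range(len(w), len(bits) + 1):
        if bits[end - len(w):end] == w:
            count += 1
            if count == r:
                return end
    return None

print(is_dk_sequence("0100100010", d=2, k=3))  # True
print(is_dk_sequence("0110", d=1, k=3))        # False: adjacent ones, run of length 0
print(waiting_time("0100100010", "010", r=2))  # 6
```

    The paper studies the distribution of this waiting time when the sequence is generated by a Markov source; the sketch only evaluates it on one fixed sequence.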

  16. Whole genome sequencing reveals complex evolution patterns of multidrug-resistant Mycobacterium tuberculosis Beijing strains in patients.

    Directory of Open Access Journals (Sweden)

    Matthias Merker

    Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains represent a major threat for tuberculosis (TB) control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR) variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS) to screen for variations in three isolates from Patient A and four isolates from Patient B and C, respectively. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A) and nine (Patient B) polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and...

  17. Peptide Pattern Recognition for high-throughput protein sequence analysis and clustering

    DEFF Research Database (Denmark)

    Busk, Peter Kamp

    2017-01-01

    Large collections of protein sequences with divergent sequences are tedious to analyze for understanding their phylogenetic or structure-function relation. Peptide Pattern Recognition is an algorithm that was developed to facilitate this task but the previous version only allows a limited...... number of sequences as input. I implemented Peptide Pattern Recognition as multithreaded software designed to handle large numbers of sequences and perform analysis in a reasonable time frame. Benchmarking showed that the new implementation of Peptide Pattern Recognition is twenty times faster than...... the previous implementation on a small protein collection with 673 MAP kinase sequences. In addition, the new implementation could analyze a large protein collection with 48,570 Glycosyl Transferase family 20 sequences without reaching its upper limit on a desktop computer. Peptide Pattern Recognition...

  18. CodonLogo: a sequence logo-based viewer for codon patterns.

    Science.gov (United States)

    Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V

    2012-07-15

    Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
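
    The stack and symbol heights described in this record can be illustrated for a single alignment column of codons. The sketch below is not CodonLogo's or WebLogo3's code: it uses the plain definition (information content = log2 of the alphabet size minus the Shannon entropy of the column, symbol height = frequency times information content) and, as a stated simplification, omits any small-sample correction. The function names are hypothetical and the column is assumed to be non-empty.

```python
import math
from collections import Counter

def split_codons(seq):
    """Cut an in-frame nucleotide sequence into codons, dropping a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def codon_column_heights(codons, alphabet_size=64):
    """Return (stack height in bits, {codon: symbol height}) for one column.
    Codons are treated as single symbols of a 64-letter alphabet."""
    total = len(codons)
    freqs = {codon: n / total for codon, n in Counter(codons).items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    information = math.log2(alphabet_size) - entropy
    return information, {codon: f * information for codon, f in freqs.items()}

# A column where two synonymous alanine codons are equally frequent.
info, heights = codon_column_heights(["GCT", "GCT", "GCC", "GCC"])
print(info, heights)
# 5.0 {'GCT': 2.5, 'GCC': 2.5}
```

    With a 64-letter alphabet a fully conserved codon column reaches 6 bits, whereas a nucleotide logo tops out at 2 bits per position, which is one way to see why codon-level conservation can look different from nucleotide-level conservation.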

  19. MicroRNA sequence motifs reveal asymmetry between the stem arms

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Havgaard, Jakob Hull; Ensterö, M.

    2006-01-01

    The processing of microRNAs (miRNAs) from their stem-loop precursor has revealed asymmetry in the processing of the mature sequence and its star sequence. Furthermore, the miRNA processing system differs between organisms. To assess this at the sequence level we have investigated mature miRNAs in their gen...

  20. Shotgun Bisulfite Sequencing of the Betula platyphylla Genome Reveals the Tree’s DNA Methylation Patterning

    Directory of Open Access Journals (Sweden)

    Chang Su

    2014-12-01

    DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch) by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees.
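
    The context proportions quoted in this record (for example, CHH at 48.86% of methylation sites) rest on classifying each methylated cytosine by the two bases that follow it: CG, CHG or CHH, where H stands for A, C or T. A minimal sketch of that bookkeeping is given below. It is not the authors' pipeline; the function names and the toy sequence are assumptions, and only forward-strand cytosines are handled.

```python
from collections import Counter

def cytosine_context(seq, i):
    """Classify the forward-strand cytosine at 0-based index i as 'CG', 'CHG' or 'CHH'.
    Returns None when the downstream bases are missing or ambiguous."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    downstream = seq[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) == 2 and downstream[0] in "ACT":
        if downstream[1] == "G":
            return "CHG"
        if downstream[1] in "ACT":
            return "CHH"
    return None

def context_proportions(seq, methylated_positions):
    """Fraction of classifiable methylated cytosines falling in each context."""
    counts = Counter(cytosine_context(seq, i) for i in methylated_positions)
    counts.pop(None, None)          # drop unclassifiable sites
    total = sum(counts.values())
    return {ctx: n / total for ctx, n in counts.items()} if total else {}

# Toy sequence with one cytosine in each context, all assumed methylated.
print(context_proportions("CGCAGCTA", [0, 2, 5]))
```

    In the toy call each of the three contexts receives one third of the sites; in a real analysis the methylated positions would come from the bisulfite read alignments.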

  1. An efficient, versatile and scalable pattern growth approach to mine frequent patterns in unaligned protein sequences.

    Science.gov (United States)

    Ye, Kai; Kosters, Walter A; Ijzerman, Adriaan P

    2007-03-15

    Pattern discovery in protein sequences is often based on multiple sequence alignments (MSA). The procedure can be computationally intensive and often requires manual adjustment, which may be particularly difficult for a set of deviating sequences. In contrast, two algorithms, PRATT2 (http://www.ebi.ac.uk/pratt/) and TEIRESIAS (http://cbcsrv.watson.ibm.com/), are used to directly identify frequent patterns from unaligned biological sequences without an attempt to align them. Here we propose a new algorithm with more efficiency and more functionality than both PRATT2 and TEIRESIAS, and discuss some of its applications to G protein-coupled receptors, a protein family of important drug targets. In this study, we designed and implemented six algorithms to mine three different pattern types from either one or two datasets using a pattern growth approach. We compared our approach to PRATT2 and TEIRESIAS in efficiency, completeness and the diversity of pattern types. Compared to PRATT2, our approach is faster, capable of processing large datasets and able to identify the so-called type III patterns. Our approach is comparable to TEIRESIAS in the discovery of the so-called type I patterns but has additional functionality such as mining the so-called type II and type III patterns and finding discriminating patterns between two datasets. The source code for pattern growth algorithms and their pseudo-code are available at http://www.liacs.nl/home/kosters/pg/.

  2. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition.

    Directory of Open Access Journals (Sweden)

    Jonathan Cannon

    2015-11-01

    Full Text Available Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation.

  3. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition.

    Science.gov (United States)

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-11-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation.

  4. 454 sequencing reveals extreme complexity of the class II Major Histocompatibility Complex in the collared flycatcher

    Directory of Open Access Journals (Sweden)

    Gustafsson Lars

    2010-12-01

    Full Text Available Abstract Background Because of their functional significance, the Major Histocompatibility Complex (MHC) class I and II genes have been the subject of continuous interest in the fields of ecology, evolution and conservation. In some vertebrate groups MHC consists of multiple loci with similar alleles; therefore, the multiple loci must be genotyped simultaneously. In such complex systems, understanding of the evolutionary patterns and their causes has been limited due to challenges posed by genotyping. Results Here we used 454 amplicon sequencing to characterize MHC class IIB exon 2 variation in the collared flycatcher, an important organism in evolutionary and immuno-ecological studies. On the basis of over 152,000 sequencing reads we identified 194 putative alleles in 237 individuals. We found an extreme complexity of the MHC class IIB in the collared flycatchers, with our estimates pointing to the presence of at least nine expressed loci and a large, though difficult to estimate precisely, number of pseudogene loci. Many similar alleles occurred in the pseudogenes indicating either a series of recent duplications or extensive concerted evolution. The expressed alleles showed unambiguous signals of historical selection and the occurrence of apparent interlocus exchange of alleles. Placing the collared flycatcher's MHC sequences in the context of passerine diversity revealed transspecific MHC class II evolution within the Muscicapidae family. Conclusions 454 amplicon sequencing is an effective tool for advancing our understanding of the MHC class II structure and evolutionary patterns in Passeriformes. We found a highly dynamic pattern of evolution of MHC class IIB genes with strong signals of selection and pronounced sequence divergence in expressed genes, in contrast to the apparent sequence homogenization in pseudogenes. We show that next generation sequencing offers a universal, affordable method for the characterization and, in perspective

  5. Whole-exome sequencing reveals the spectrum of gene mutations and the clonal evolution patterns in paediatric acute myeloid leukaemia.

    Science.gov (United States)

    Shiba, Norio; Yoshida, Kenichi; Shiraishi, Yuichi; Okuno, Yusuke; Yamato, Genki; Hara, Yusuke; Nagata, Yasunobu; Chiba, Kenichi; Tanaka, Hiroko; Terui, Kiminori; Kato, Motohiro; Park, Myoung-Ja; Ohki, Kentaro; Shimada, Akira; Takita, Junko; Tomizawa, Daisuke; Kudo, Kazuko; Arakawa, Hirokazu; Adachi, Souichi; Taga, Takashi; Tawa, Akio; Ito, Etsuro; Horibe, Keizo; Sanada, Masashi; Miyano, Satoru; Ogawa, Seishi; Hayashi, Yasuhide

    2016-11-01

    Acute myeloid leukaemia (AML) is a molecularly and clinically heterogeneous disease. Targeted sequencing efforts have identified several mutations with diagnostic and prognostic values in KIT, NPM1, CEBPA and FLT3 in both adult and paediatric AML. In addition, massively parallel sequencing enabled the discovery of recurrent mutations (i.e. IDH1/2 and DNMT3A) in adult AML. In this study, whole-exome sequencing (WES) of 22 paediatric AML patients revealed mutations in components of the cohesin complex (RAD21 and SMC3), BCORL1 and ASXL2 in addition to previously known gene mutations. We also revealed intratumoural heterogeneities in many patients, implicating multiple clonal evolution events in the development of AML. Furthermore, targeted deep sequencing in 182 paediatric AML patients identified three major categories of recurrently mutated genes: cohesin complex genes [STAG2, RAD21 and SMC3 in 17 patients (8·3%)], epigenetic regulators [ASXL1/ASXL2 in 17 patients (8·3%), BCOR/BCORL1 in 7 patients (3·4%)] and signalling molecules. We also performed WES in four patients with relapsed AML. Relapsed AML evolved from one of the subclones at the initial phase and was accompanied by many additional mutations, including common driver mutations that were absent or existed only with lower allele frequency in the diagnostic samples, indicating a multistep process causing leukaemia recurrence. © 2016 John Wiley & Sons Ltd.

  6. Heteroassociative storage of hippocampal pattern sequences in the CA3 subregion

    Directory of Open Access Journals (Sweden)

    Raphael Y. de Camargo

    2018-01-01

    Full Text Available Background Recent research suggests that the CA3 subregion of the hippocampus has properties of both an autoassociative network, due to its ability to complete partial cues, tolerate noise, and store associations between memories, and a heteroassociative one, due to its ability to store and retrieve sequences of patterns. Although there are several computational models of the CA3 as an autoassociative network, more detailed evaluations of its heteroassociative properties are missing. Methods We developed a model of the CA3 subregion containing 10,000 integrate-and-fire neurons with both recurrent excitatory and inhibitory connections, and which exhibits coupled oscillations in the gamma and theta ranges. We stored thousands of pattern sequences using a heteroassociative learning rule with competitive synaptic scaling. Results We showed that a purely heteroassociative network model can (i) retrieve pattern sequences from partial cues with external noise and incomplete connectivity, (ii) achieve homeostasis regarding the number of connections per neuron when many patterns are stored when using synaptic scaling, and (iii) continuously update the set of retrievable patterns, guaranteeing that the last stored patterns can be retrieved and older ones can be forgotten. Discussion Heteroassociative networks with synaptic scaling rules seem sufficient to achieve many desirable features regarding connectivity homeostasis, pattern sequence retrieval, noise tolerance and updating of the set of retrievable patterns.

  7. An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

    Directory of Open Access Journals (Sweden)

    Md. Rezaul Karim

    2012-03-01

    Full Text Available Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
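
    To make the notion of a maximal contiguous frequent pattern concrete, here is a deliberately naive sketch. It assumes that support means the number of sequences containing the pattern as a substring, grows patterns one base at a time to the right, and then discards any pattern contained in a longer frequent one. The function names are hypothetical, and this brute-force version is not the memory-efficient algorithm the paper proposes.

```python
from typing import Dict, List


def frequent_contiguous(seqs: List[str], min_support: int) -> Dict[str, int]:
    """Return every contiguous pattern found in at least min_support sequences.

    Patterns are grown by right-extension. This is complete because any
    prefix of a frequent pattern is itself frequent.
    """

    def support(pattern: str) -> int:
        return sum(pattern in s for s in seqs)

    alphabet = sorted(set("".join(seqs)))
    frequent: Dict[str, int] = {}
    current = [a for a in alphabet if support(a) >= min_support]
    while current:
        longer: List[str] = []
        for pattern in current:
            frequent[pattern] = support(pattern)
            for base in alphabet:
                candidate = pattern + base
                if support(candidate) >= min_support:
                    longer.append(candidate)
        current = longer
    return frequent


def maximal_patterns(frequent: Dict[str, int]) -> Dict[str, int]:
    """Keep only patterns that are not substrings of a longer frequent pattern."""
    return {
        p: n
        for p, n in frequent.items()
        if not any(p != q and p in q for q in frequent)
    }
```

    Note that no pattern length is specified in advance: growth simply stops when no extension meets the support threshold.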

  8. Conservation patterns in different functional sequence categories of divergent Drosophila species

    Energy Technology Data Exchange (ETDEWEB)

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed species of Drosophila: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. Based on these distributions we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs and non-functional sequences. At the same time, the blocks in the coding regions carried 3N+2 signature characteristic to synonymic substitutions in the 3rd codon positions. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We also have shown that the longest ungapped blocks, or 'ultraconserved' sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discussed how restrained conservation patterns may help in mapping functional sequence categories and improving genome annotation.
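
    The block-length distributions described in this abstract can be illustrated with a short sketch. It assumes a pairwise alignment supplied as two equal-length strings with "-" as the gap character, and treats a fully conserved ungapped block as a maximal run of identical, non-gap columns. The function names are hypothetical and the authors' actual procedure may differ.

```python
from collections import Counter
from typing import List


def conserved_block_lengths(aln_a: str, aln_b: str, gap: str = "-") -> List[int]:
    """Lengths of maximal runs of identical, ungapped alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    lengths: List[int] = []
    run = 0
    for x, y in zip(aln_a, aln_b):
        if x == y and x != gap:
            run += 1
        else:
            # A mismatch or a gap closes the current block, if any.
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def length_mod3_histogram(lengths: List[int]) -> Counter:
    """Count block lengths by remainder modulo 3.

    An excess of remainder 2 would correspond to the 3N+2 signature.
    """
    return Counter(n % 3 for n in lengths)
```

    A block flanked by substitutions at two third codon positions spans 3N+2 columns, which is why the remainder modulo 3 is the quantity of interest in coding regions.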

  9. Next generation sequencing reveals the hidden diversity of zooplankton assemblages.

    Directory of Open Access Journals (Sweden)

    Penelope K Lindeque

    Full Text Available BACKGROUND: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. METHODOLOGY/PRINCIPAL FINDINGS: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. CONCLUSIONS: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may

  10. Sequence alignment reveals possible MAPK docking motifs on HIV proteins.

    Directory of Open Access Journals (Sweden)

    Perry Evans

    Full Text Available Over the course of HIV infection, virus replication is facilitated by the phosphorylation of HIV proteins by human ERK1 and ERK2 mitogen-activated protein kinases (MAPKs). MAPKs are known to phosphorylate their substrates by first binding with them at a docking site. Docking site interactions could be viable drug targets because the sequences guiding them are more specific than phosphorylation consensus sites. In this study we use multiple bioinformatics tools to discover candidate MAPK docking site motifs on HIV proteins known to be phosphorylated by MAPKs, and we discuss the possibility of targeting docking sites with drugs. Using sequence alignments of HIV proteins of different subtypes, we show that MAPK docking patterns previously described for human proteins appear on the HIV matrix, Tat, and Vif proteins in a strain-dependent manner, but are absent from HIV Rev and appear on all HIV Nef strains. We revise the regular expressions of previously annotated MAPK docking patterns in order to provide a subtype-independent motif that annotates all HIV proteins. One revision is based on a documented human variant of one of the substrate docking motifs, and the other reduces the number of required basic amino acids in the standard docking motifs from two to one. The proposed patterns are shown to be consistent with in silico docking between ERK1 and the HIV matrix protein. The motif usage on HIV proteins is sufficiently different from human proteins in amino acid sequence similarity to allow for HIV-specific targeting using small-molecule drugs.
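
    The effect of relaxing a docking motif from two required basic residues to one can be illustrated with regular expressions. The two patterns below are hypothetical simplifications invented for this example (basic residues, a short spacer, then hydrophobic-X-hydrophobic); they are not the expressions published in the paper, and the test peptides are made up.

```python
import re
from typing import List, Tuple

# Hypothetical, simplified patterns for illustration only.
STRICT = r"[KR]{2}.{1,6}[LIV].[LIV]"  # two consecutive basic residues required
RELAXED = r"[KR].{1,6}[LIV].[LIV]"    # a single basic residue suffices


def find_motifs(protein: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (start, matched_text) for every match, including overlapping ones.

    The pattern is wrapped in a lookahead so that matches starting at
    adjacent positions are all reported.
    """
    scanner = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(protein.upper())]
```

    With these definitions, a peptide carrying only one basic residue upstream of the hydrophobic pair is missed by STRICT but found by RELAXED, which is the kind of gain in coverage the abstract describes.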

  11. Chromosomal structures and repetitive sequences divergence in Cucumis species revealed by comparative cytogenetic mapping.

    Science.gov (United States)

    Zhang, Yunxia; Cheng, Chunyan; Li, Ji; Yang, Shuqiong; Wang, Yunzhu; Li, Ziang; Chen, Jinfeng; Lou, Qunfeng

    2015-09-25

    Differentiation and copy number of repetitive sequences directly affect chromosome structure, which contributes to reproductive isolation and speciation. Comparative cytogenetic mapping has been verified as an efficient tool to elucidate the differentiation and distribution of repetitive sequences in the genome. In the present study, the distinct chromosomal structures of five Cucumis species were revealed through the genomic in situ hybridization (GISH) technique and comparative cytogenetic mapping of major satellite repeats. Chromosome structures of five Cucumis species were investigated using GISH and comparative mapping of specific satellites. Southern hybridization was employed to study the proliferation of satellites, whose structural characteristics were helpful for analyzing chromosome evolution. Preferential distribution of repetitive DNAs at the subtelomeric regions was found in C. sativus, C. hystrix and C. metuliferus, while the majority was positioned at the pericentromeric heterochromatin regions in C. melo and C. anguria. Further, comparative GISH (cGISH) using genomic DNA of other species as probes revealed high homology of repeats between C. sativus and C. hystrix. Specific satellites including 45S rDNA, Type I/II, Type III, Type IV, CentM and telomeric repeat were then comparatively mapped in these species. Type I/II and Type IV produced bright signals at the subtelomeric regions of C. sativus and C. hystrix simultaneously, which might explain the significance of their amplification in the divergence of Cucumis subgenus from the ancient ancestor. Unique positioning of Type III and CentM only at the centromeric domains of C. sativus and C. melo, respectively, combined with unique Southern bands, revealed rapid evolutionary patterns of centromeric DNA in Cucumis. Obvious interstitial telomeric repeats were observed in chromosomes 1 and 2 of C. sativus, which might provide evidence of the fusion hypothesis of chromosome evolution from x = 12 to x = 7 in

  12. Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts

    Directory of Open Access Journals (Sweden)

    Ouyang Shu

    2005-09-01

    Full Text Available Abstract Background The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs) for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. Results All available ESTs and Expressed Transcripts (ETs), 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana), were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. Conclusion Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.

  13. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    Science.gov (United States)

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up of three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar to higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  14. King Lear Reveals the Tragic Pattern of Shakespeare

    Directory of Open Access Journals (Sweden)

    Salim Eflih Al-Ibia

    2017-04-01

    Full Text Available Rather than focusing on the obvious traditions of evaluating Shakespearean tragic heroes, this paper presents a groundbreaking approach to unfold the pattern William Shakespeare follows as he designed his unique characters. This pattern applies to most, if not all, Shakespearean tragic heroes. I argue that Shakespeare himself reveals a great portion of this pattern on the tongue of Lear as the latter disowns Goneril and Regan promising to have “such revenges on [them] both” in King Lear. Lear’s threats bestow four unique aspects that apply not only to his character but they also apply to Shakespearean tragic heroes. Lear’s speech tells us that he is determined to have an awful type of revenge on his daughters. However, the very same speech tells us that he seems uncertain about the method through which he should carry out this revenge. Lear does not express any type of remorse as he pursues his vengeful plans nor should he aim at amnesty. He also admits his own madness as he closes his revealing speech. This research develops these facts about Lear to unfold the unique pattern Shakespeare follows as he portrayed his major tragic figures. This pattern is examined, described and analyzed in King Lear, Othello, and Hamlet. We will find out that the pattern suggested in this study helps us better understand Shakespeare’s tragedies and enables us to provide better explanations for some controversial scenes in the tragedies discussed.

  15. Pattern recognition in complex activity travel patterns : comparison of Euclidean distance, signal-processing theoretical, and multidimensional sequence alignment methods

    NARCIS (Netherlands)

    Joh, C.H.; Arentze, T.A.; Timmermans, H.J.P.

    2001-01-01

    The application of a multidimensional sequence alignment method for classifying activity travel patterns is reported. The method was developed as an alternative to the existing classification methods suggested in the transportation literature. The relevance of the multidimensional sequence alignment

  16. Phylogeny and patterns of diversity of goat mtDNA haplogroup A revealed by resequencing complete mitogenomes.

    Directory of Open Access Journals (Sweden)

    Maria Grazia Doro

    Full Text Available We sequenced to near completion the entire mtDNA of 28 Sardinian goats, selected to represent the widest possible diversity of the most widespread mitochondrial evolutionary lineage, haplogroup (Hg) A. These specimens were reporters of the diversity in the island but also elsewhere, as inferred from their affiliation to each of 11 clades defined by D-loop variation. Two reference sequences completed the dataset. Overall, 206 variations were found in the full set of 30 sequences, of which 23 were protein-coding non-synonymous single nucleotide substitutions. Many polymorphic sites within Hg A were informative for the reconstruction of its internal phylogeny. Bayesian and network clustering revealed a general similarity over the entire molecule of sequences previously assigned to the same D-loop clade, indicating evolutionarily meaningful lineages. Two major sister groupings emerged within Hg A, which parallel distinct geographical distributions of D-loop clades in extant stocks. The pattern of variation in protein-coding genes revealed an overwhelming role of purifying selection, with the quota of surviving variants approaching neutrality. However, a simple model of relaxation of selection for the bulk of variants here reported should be rejected. Non-synonymous diversity of Hgs A, B and C denoted that a proportion of variants not greater than that allowed in the wild was given the opportunity to spread into domesticated stocks. Our results also confirmed that a remarkable proportion of pre-existing Hg A diversity became incorporated into domestic stocks. Our results confirm clade A11 as a well differentiated and ancient lineage peculiar to Sardinia.

  17. VPA: an R tool for analyzing sequencing variants with user-specified frequency pattern

    Directory of Open Access Journals (Sweden)

    Hu Qiang

    2012-01-01

    Full Text Available Abstract Background The massive amounts of genetic variants generated by next generation sequencing systems demand the development of effective computational tools for variant prioritization. Findings VPA (Variant Pattern Analyzer) is an R tool for prioritizing variants with a specified frequency pattern from multiple study subjects in next-generation sequencing studies. The tool starts from individual files of variant and sequence calls and extracts variants with a user-specified frequency pattern across the study subjects of interest. Several position level quality criteria can be incorporated into the variant extraction. It can be used in studies with matched pair design as well as studies with multiple groups of subjects. Conclusions VPA can be used as an automatic pipeline to prioritize variants for further functional exploration and hypothesis generation. The package is implemented in the R language and is freely available from http://vpa.r-forge.r-project.org.
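
    The idea of extracting variants that follow a user-specified frequency pattern can be sketched as follows. This is an illustration in Python rather than R and does not use or reproduce the VPA package interface. The function name, the two-group layout and the variant identifier format are all assumptions made for the example.

```python
from typing import Dict, List, Set


def select_variants(
    calls: Dict[str, Set[str]],
    cases: List[str],
    controls: List[str],
    min_cases: int,
    max_controls: int,
) -> List[str]:
    """Select variants seen in >= min_cases cases and <= max_controls controls.

    calls maps each subject ID to the set of variant IDs called in that subject.
    """
    all_variants: Set[str] = set().union(*calls.values())
    selected: List[str] = []
    for variant in sorted(all_variants):
        n_case = sum(variant in calls[s] for s in cases)
        n_ctrl = sum(variant in calls[s] for s in controls)
        if n_case >= min_cases and n_ctrl <= max_controls:
            selected.append(variant)
    return selected
```

    Position-level quality criteria, which the abstract says can be incorporated, would be applied before building the per-subject sets and are omitted here.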

  18. Accurate and High-Coverage Immune Repertoire Sequencing Reveals Characteristics of Antibody Repertoire Diversification in Young Children with Malaria

    Science.gov (United States)

    Jiang, Ning

    Accurately measuring the immune repertoire sequence composition, diversity, and abundance is important in studying repertoire response in infections, vaccinations, and cancer immunology. Using molecular identifiers (MIDs) to tag mRNA molecules is an effective method in improving the accuracy of immune repertoire sequencing (IR-seq). However, it is still difficult to use IR-seq on small amount of clinical samples to achieve a high coverage of the repertoire diversities. This is especially challenging in studying infections and vaccinations where B cell subpopulations with fewer cells, such as memory B cells or plasmablasts, are often of great interest to study somatic mutation patterns and diversity changes. Here, we describe an approach of IR-seq based on the use of MIDs in combination with a clustering method that can reveal more than 80% of the antibody diversity in a sample and can be applied to as few as 1,000 B cells. We applied this to study the antibody repertoires of young children before and during an acute malaria infection. We discovered unexpectedly high levels of somatic hypermutation (SHM) in infants and revealed characteristics of antibody repertoire development in young children that would have a profound impact on immunization in children.

  19. Pattern analysis approach reveals restriction enzyme cutting abnormalities and other cDNA library construction artifacts using raw EST data

    Directory of Open Access Journals (Sweden)

    Zhou Sun

    2012-05-01

    or filtered by AFST. Conclusions cDNA terminal pattern analysis, as implemented in the AFST software tool, can be utilized to reveal wet-lab errors such as restriction enzyme cutting abnormities and chimeric EST sequences, detect various data abnormalities embedded in existing Sanger EST datasets, improve the accuracy of identifying and extracting bona fide cDNA inserts from raw ESTs, and therefore greatly benefit downstream EST-based applications.

  20. Salmon louse (Lepeophtheirus salmonis) transcriptomes during post molting maturation and egg production, revealed using EST-sequencing and microarray analysis

    Directory of Open Access Journals (Sweden)

    Jonassen Inge

    2008-03-01

    Full Text Available Abstract Background Lepeophtheirus salmonis is an ectoparasitic copepod feeding on skin, mucus and blood from salmonid hosts. Initial analysis of EST sequences from pre adult and adult stages of L. salmonis revealed a large proportion of novel transcripts. In order to link unknown transcripts to biological functions we have combined EST sequencing and microarray analysis to characterize female salmon louse transcriptomes during post molting maturation and egg production. Results EST sequence analysis shows that 43% of the ESTs have no significant hits in GenBank. Sequenced ESTs assembled into 556 contigs and 1614 singletons and whenever homologous genes were identified no clear correlation with homologous genes from any specific animal group was evident. Sequence comparison of 27 L. salmonis proteins with homologous proteins in humans, zebrafish, insects and crustaceans revealed an almost identical sequence identity with all species. Microarray analysis of maturing female adult salmon lice revealed two major transcription patterns: up-regulation during the final molting followed by down regulation, and female specific up regulation during post molting growth and egg production. For a third minor group of ESTs transcription decreased during molting from pre-adult II to immature adults. Genes regulated during molting typically gave hits with cuticula proteins whilst transcripts up regulated during post molting growth were female specific, including two vitellogenins. Conclusion The copepod L. salmonis contains a high level of novel genes. Among analyzed L. salmonis proteins, sequence identities with homologous proteins in crustaceans are no higher than to homologous proteins in humans. Three distinct processes, molting, post molting growth and egg production correlate with transcriptional regulation of three groups of transcripts; two including genes related to growth, one including genes related to egg production. The function of the regulated

  1. Discovering approximate-associated sequence patterns for protein-DNA interactions

    KAUST Repository

    Chan, Tak Ming

    2010-12-30

    Motivation: The bindings between transcription factors (TFs) and transcription factor binding sites (TFBSs) are fundamental protein-DNA interactions in transcriptional regulation. Extensive efforts have been made to better understand the protein-DNA interactions. Recent mining on exact TF-TFBS-associated sequence patterns (rules) has shown great potentials and achieved very promising results. However, exact rules cannot handle variations in real data, resulting in limited informative rules. In this article, we generalize the exact rules to approximate ones for both TFs and TFBSs, which are essential for biological variations. Results: A progressive approach is proposed to address the approximation to alleviate the computational requirements. Firstly, similar TFBSs are grouped from the available TF-TFBS data (TRANSFAC database). Secondly, approximate and highly conserved binding cores are discovered from TF sequences corresponding to each TFBS group. A customized algorithm is developed for the specific objective. We discover the approximate TF-TFBS rules by associating the grouped TFBS consensuses and TF cores. The rules discovered are evaluated by matching (verifying with) the actual protein-DNA binding pairs from Protein Data Bank (PDB) 3D structures. The approximate results exhibit many more verified rules and up to 300% better verification ratios than the exact ones. The customized algorithm achieves over 73% better verification ratios than traditional methods. Approximate rules (64-79%) are shown statistically significant. Detailed variation analysis and conservation verification on NCBI records demonstrate that the approximate rules reveal both the flexible and specific protein-DNA interactions accurately. The approximate TF-TFBS rules discovered show great generalized capability of exploring more informative binding rules. © The Author 2010. Published by Oxford University Press. All rights reserved.

  2. Sequence tagging reveals unexpected modifications in toxicoproteomics

    Science.gov (United States)

    Dasari, Surendra; Chambers, Matthew C.; Codreanu, Simona G.; Liebler, Daniel C.; Collins, Ben C.; Pennington, Stephen R.; Gallagher, William M.; Tabb, David L.

    2010-01-01

    Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications. PMID:21214251

  3. Ontogeny of hepatic energy metabolism genes in mice as revealed by RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Helen J Renaud

    Full Text Available The liver plays a central role in metabolic homeostasis by coordinating synthesis, storage, breakdown, and redistribution of nutrients. Hepatic energy metabolism is dynamically regulated throughout different life stages due to different demands for energy during growth and development. However, changes in gene expression patterns throughout ontogeny for factors important in hepatic energy metabolism are not well understood. We performed detailed transcript analysis of energy metabolism genes during various stages of liver development in mice. Livers from male C57BL/6J mice were collected at twelve ages, including perinatal and postnatal time points (n = 3/age). The mRNA was quantified by RNA-Sequencing, with transcript abundance estimated by Cufflinks. One thousand sixty energy metabolism genes were examined; 794 were above detection, of which 627 were significantly changed during at least one developmental age compared to adult liver. Two-way hierarchical clustering revealed three major clusters dependent on age: GD17.5-Day 5 (perinatal-enriched), Day 10-Day 20 (pre-weaning-enriched), and Day 25-Day 60 (adolescence/adulthood-enriched). Clustering analysis of cumulative mRNA expression values for individual pathways of energy metabolism revealed three patterns of enrichment: glycolysis, ketogenesis, and glycogenesis were all perinatally-enriched; glycogenolysis was the only pathway enriched during pre-weaning ages; whereas lipid droplet metabolism, cholesterol and bile acid metabolism, gluconeogenesis, and lipid metabolism were all enriched in adolescence/adulthood. This study reveals novel findings such as the divergent expression of the fatty acid β-oxidation enzymes Acyl-CoA oxidase 1 and Carnitine palmitoyltransferase 1a, indicating a switch from mitochondrial to peroxisomal β-oxidation after weaning; as well as the dynamic ontogeny of genes implicated in obesity such as Stearoyl-CoA desaturase 1 and Elongation of very long chain fatty

  4. Foundations for a syntatic pattern recognition system for genomic DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Searles, D.B.

    1993-03-01

    The goal of the proposed work is the creation of a software system that will perform sophisticated pattern recognition and related functions at a level of abstraction and with expressive power beyond current general-purpose pattern-matching systems for biological sequences; and with a more uniform language, environment, and graphical user interface, and with greater flexibility, extensibility, embeddability, and ability to incorporate other algorithms, than current special-purpose analytic software.

  5. Whale phylogeny and rapid radiation events revealed using novel retroposed elements and their flanking sequences.

    Science.gov (United States)

    Chen, Zhuo; Xu, Shixia; Zhou, Kaiya; Yang, Guang

    2011-10-27

    A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales), and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Monodontidae), and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Monodontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae), whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving cetacean phylogenetic relationships in the future.

  6. Whale phylogeny and rapid radiation events revealed using novel retroposed elements and their flanking sequences

    Directory of Open Access Journals (Sweden)

    Zhou Kaiya

    2011-10-01

    Full Text Available Abstract Background A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales), and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. Results An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Monodontidae), and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Monodontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Conclusions Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae), whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving

  7. WildSpan: mining structured motifs from protein sequences

    Directory of Open Access Journals (Sweden)

    Chen Chien-Yu

    2011-03-01

    of WildSpan is developed for discovering functional regions of a single protein by referring to a set of related sequences (e.g. its homologues). The discovered W-patterns are used to characterize the protein sequence and the results are compared with the conserved positions identified by multiple sequence alignment (MSA). The family-based mining mode of WildSpan is developed for extracting sequence signatures for a group of related proteins (e.g. a protein family) for protein function classification. In this situation, the discovered W-patterns are compared with PROSITE patterns as well as the patterns generated by three existing methods performing a similar task. Finally, analysis of the execution time of WildSpan reveals that the proposed pruning strategy is effective in improving the scalability of the proposed algorithm. Conclusions The mining results of this study reveal that WildSpan is efficient and effective in discovering functional signatures of proteins directly from sequences. The proposed pruning strategy is effective in improving the scalability of WildSpan. It is demonstrated in this study that the W-patterns discovered by WildSpan provide useful information in characterizing protein sequences. The WildSpan executable and open source codes are available on the web (http://biominer.csie.cyu.edu.tw/wildspan).

  8. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    Directory of Open Access Journals (Sweden)

    Piltz Sandra

    2011-04-01

    Full Text Available Abstract Background MicroRNAs (miRNAs) are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5) mouse brain. Results We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354 each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent and was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5 followed by a near complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development.

  9. The Slow:Fast substitution ratio reveals changing patterns of natural selection in gamma-proteobacterial genomes

    Energy Technology Data Exchange (ETDEWEB)

    Alm, Eric; Shapiro, B. Jesse

    2009-04-15

    Different microbial species are thought to occupy distinct ecological niches, subjecting each species to unique selective constraints, which may leave a recognizable signal in their genomes. Thus, it may be possible to extract insight into the genetic basis of ecological differences among lineages by identifying unusual patterns of substitutions in orthologous gene or protein sequences. We use the ratio of substitutions in slow versus fast-evolving sites (nucleotides in DNA, or amino acids in protein sequence) to quantify deviations from the typical pattern of selective constraint observed across bacterial lineages. We propose that elevated S:F in one branch (an excess of slow-site substitutions) can indicate a functionally-relevant change, due to either positive selection or relaxed evolutionary constraint. In a genome-wide comparative study of gamma-proteobacterial proteins, we find that cell-surface proteins involved with motility and secretion functions often have high S:F ratios, while information-processing genes do not. Change in evolutionary constraints in some species is evidenced by increased S:F ratios within functionally-related sets of genes (e.g., energy production in Pseudomonas fluorescens), while other species apparently evolve mostly by drift (e.g., uniformly elevated S:F across most genes in Buchnera spp.). Overall, S:F reveals several species-specific, protein-level changes with potential functional/ecological importance. As microbial genome projects yield more species-rich gene-trees, the S:F ratio will become an increasingly powerful tool for uncovering functional genetic differences among species.
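    The slow:fast ratio described above can be illustrated with a minimal sketch. It assumes each alignment site has already been labelled slow- or fast-evolving and that the substituted sites on one branch are known; the abstract does not say how sites are classified or how the ratio is normalised against the typical pattern, so the function name, inputs, and toy data below are hypothetical.

```python
import math


def slow_fast_ratio(site_is_slow, substituted_sites):
    """Ratio of substitutions at slow sites to substitutions at fast sites.

    site_is_slow: list of booleans, one per alignment site (assumed given).
    substituted_sites: indices of sites substituted on the branch of interest.
    """
    sites = set(substituted_sites)          # count each site at most once
    slow = sum(1 for i in sites if site_is_slow[i])
    fast = len(sites) - slow
    if fast == 0:
        # Undefined with no substitutions at all; unbounded if only slow sites changed.
        return math.inf if slow else math.nan
    return slow / fast


# Toy data, not taken from the study: sites 0, 1 and 5 are slow-evolving.
site_is_slow = [True, True, False, False, False, True]
balanced = slow_fast_ratio(site_is_slow, [0, 2, 3, 5])   # 2 slow, 2 fast
elevated = slow_fast_ratio(site_is_slow, [0, 1, 5, 2])   # 3 slow, 1 fast
print(balanced, elevated)
```

    In this toy case the second branch has the higher ratio, which is the kind of excess of slow-site substitutions the abstract associates with a change in selective constraint.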

  10. Multi-species sequence comparison reveals dynamic evolution of the elastin gene that has involved purifying selection and lineage-specific insertions/deletions

    Directory of Open Access Journals (Sweden)

    Green Eric D

    2004-05-01

    Full Text Available Abstract Background The elastin gene (ELN) is implicated as a factor in both supravalvular aortic stenosis (SVAS) and Williams Beuren Syndrome (WBS), two diseases involving pronounced complications in mental or physical development. Although the complete spectrum of functional roles of the processed gene product remains to be established, these roles are inferred to be analogous in human and mouse. This view is supported by genomic sequence comparison, in which there are no large-scale differences in the ~1.8 Mb sequence block encompassing the common region deleted in WBS, with the exception of an overall reversed physical orientation between human and mouse. Results Conserved synteny around ELN does not translate to a high level of conservation in the gene itself. In fact, ELN orthologs in mammals show more sequence divergence than expected for a gene with a critical role in development. The pattern of divergence is non-conventional due to an unusually high ratio of gaps to substitutions. Specifically, multi-sequence alignments of eight mammalian sequences reveal numerous non-aligning regions caused by species-specific insertions and deletions, in spite of the fact that the vast majority of aligning sites appear to be conserved and undergoing purifying selection. Conclusions The pattern of lineage-specific, in-frame insertions/deletions in the coding exons of ELN orthologous genes is unusual and has led to unique features of the gene in each lineage. These differences may indicate that the gene has a slightly different functional mechanism in mammalian lineages, or that the corresponding regions are functionally inert. Identified regions that undergo purifying selection reflect a functional importance associated with evolutionary pressure to retain those features.

  11. Re-Analysis of Metagenomic Sequences from Acute Flaccidmyelitis Patients Reveals Alternatives to Enterovirus D68 Infection

    Science.gov (United States)

    2015-07-13

    ...caused in some cases by infection with enterovirus D68. We found that among the patients whose symptoms were previously attributed to enterovirus D68...

  12. Pervasive within-Mitochondrion Single-Nucleotide Variant Heteroplasmy as Revealed by Single-Mitochondrion Sequencing

    Directory of Open Access Journals (Sweden)

    Jacqueline Morris

    2017-12-01

    Full Text Available Summary: A number of mitochondrial diseases arise from single-nucleotide variant (SNV) accumulation in multiple mitochondria. Here, we present a method for identification of variants present at the single-mitochondrion level in individual mouse and human neuronal cells, allowing for extremely high-resolution study of mitochondrial mutation dynamics. We identified extensive heteroplasmy between individual mitochondria, along with three high-confidence variants in mouse and one in human that were present in multiple mitochondria across cells. The pattern of variation revealed by single-mitochondrion data shows surprisingly pervasive levels of heteroplasmy in inbred mice. Distribution of SNV loci suggests inheritance of variants across generations, resulting in Poisson jackpot lines with large SNV load. Comparison of human and mouse variants suggests that the two species might employ distinct modes of somatic segregation. Single-mitochondrion resolution revealed mitochondria mutational dynamics that we hypothesize to affect risk probabilities for mutations reaching disease thresholds. Morris et al. use independent sequencing of multiple individual mitochondria from mouse and human brain cells to show high pervasiveness of mutations. The mutations are heteroplasmic within single mitochondria and within and between cells. These findings suggest mechanisms by which mutations accumulate over time, resulting in mitochondrial dysfunction and disease. Keywords: single mitochondrion, single cell, human neuron, mouse neuron, single-nucleotide variation

  13. Movement Pattern Analysis Based on Sequence Signatures

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Chavoshi

    2015-09-01

    Full Text Available Increased affordability and deployment of advanced tracking technologies have led researchers from various domains to analyze the resulting spatio-temporal movement data sets for the purpose of knowledge discovery. Two different approaches can be considered in the analysis of moving objects: quantitative analysis and qualitative analysis. This research focuses on the latter and uses the qualitative trajectory calculus (QTC), a type of calculus that represents qualitative data on moving point objects (MPOs), and establishes a framework to analyze the relative movement of multiple MPOs. A visualization technique called sequence signature (SESI) is used, which makes it possible to map QTC patterns in a 2D indexed rasterized space in order to evaluate the similarity of relative movement patterns of multiple MPOs. The applicability of the proposed methodology is illustrated by means of two practical examples of interacting MPOs: cars on a highway and body parts of a samba dancer. The results show that the proposed method can be effectively used to analyze interactions of multiple MPOs in different domains.

  14. Recognition of depositional sequences and stacking patterns, Late Devonian (Frasnian) carbonate platforms, Alberta basin

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, J.H.; Reeckmann, S.A.; Sarg, J.F.; Greenlee, S.M.

    1987-05-01

    Six depositional sequences bounded by regional unconformities or their correlative equivalents (sequence boundaries) have been recognized in Late Devonian (Frasnian) carbonate platforms in the Alberta basin. These sequences consist of a predictable vertical succession of smaller scale shoaling-upward cycles (parasequences). Parasequences are arranged in retrogradational, aggradational, and progradational stacking patterns that can be modeled as a sediment response to relative changes in sea level. Sequence boundaries are recognized by onlap onto underlying shelf or shelf margin strata. This onlap includes shelf margin wedges and deep marine onlap. In outcrop sections shelf margin wedges exhibit an abrupt juxtaposition of shallow water facies over deeper water deposits with no gradational facies changes at the boundaries. High on the platform, subaerial exposure fabrics may be present. The shelf margin wedges are interpreted to have formed during lowstands in sea level and typically exhibit an aggradational stacking pattern. On the platform, two types of sequences are recognized. A type 1 cycle occurs where the sequence boundary is overlain by a flooding surface and subsequent parasequences exhibit retrogradational stacking. In a type 2 cycle the sequence boundary is overlain by an aggradational package of shallow water parasequences, followed by a retrogradational package. These two types of sequences can be modeled using a sinusoidal eustatic sea level curve superimposed on thermo-tectonic subsidence.

  15. Targeted Genome Sequencing Reveals Varicella-Zoster Virus Open Reading Frame 12 Deletion.

    Science.gov (United States)

    Cohrs, Randall J; Lee, Katherine S; Beach, Addilynn; Sanford, Bridget; Baird, Nicholas L; Como, Christina; Graybill, Chiharu; Jones, Dallas; Tekeste, Eden; Ballard, Mitchell; Chen, Xiaomi; Yalacki, David; Frietze, Seth; Jones, Kenneth; Lenac Rovis, Tihana; Jonjić, Stipan; Haas, Jürgen; Gilden, Don

    2017-10-15

    The neurotropic herpesvirus varicella-zoster virus (VZV) establishes a lifelong latent infection in humans following primary infection. The low abundance of VZV nucleic acids in human neurons has hindered an understanding of the mechanisms that regulate viral gene transcription during latency. To overcome this critical barrier, we optimized a targeted capture protocol to enrich VZV DNA and cDNA prior to whole-genome/transcriptome sequence analysis. Since the VZV genome is remarkably stable, it was surprising to detect that VZV32, a VZV laboratory strain with no discernible growth defect in tissue culture, contained a 2,158-bp deletion in open reading frame (ORF) 12. Consequently, ORF 12 and 13 protein expression was abolished and Akt phosphorylation was inhibited. The discovery of the ORF 12 deletion, revealed through targeted genome sequencing analysis, points to the need to authenticate the VZV genome when the virus is propagated in tissue culture. IMPORTANCE Viruses isolated from clinical samples often undergo genetic modifications when cultured in the laboratory. Historically, VZV is among the most genetically stable herpesviruses, a notion supported by more than 60 complete genome sequences from multiple isolates and following multiple in vitro passages. However, application of enrichment protocols to targeted genome sequencing revealed the unexpected deletion of a significant portion of VZV ORF 12 following propagation in cultured human fibroblast cells. While the enrichment protocol did not introduce bias in either the virus genome or transcriptome, the findings indicate the need for authentication of VZV by sequencing when the virus is propagated in tissue culture. Copyright © 2017 American Society for Microbiology.

  16. Which MRI sequence of the spine best reveals bone-marrow metastases of neuroblastoma?

    International Nuclear Information System (INIS)

    Meyer, James S.; Jaramillo, Diego; Siegel, Marilyn J.; Farooqui, Saleem O.; Fletcher, Barry D.; Hoffer, Fredric A.

    2005-01-01

    MRI is an effective tool in evaluating bone marrow metastases. However, no study has defined which MRI sequences or image characteristics best correlate with bone-marrow metastases in neuroblastoma. The objective was to identify and refine MRI criteria and sequence selection for the diagnosis of bone-marrow metastases in children with neuroblastoma. Ninety-one children (mean age: 3.2 years; standard deviation: 2.8 years) enrolled in the RDOG IV study participated in our study. Forty-five children had bone metastases determined by bone-marrow aspiration or biopsy (n=4), radionuclide imaging (n=2), or both (n=39). Spine lesions were characterized using coronal T1-weighted (T1W), sagittal short tau inversion recovery (STIR), and coronal gadolinium-enhanced T1-weighted (GAD) MR sequences. Contingency table analysis was performed to determine which MRI sequences and characteristics were associated with metastases. The MRI criteria for metastatic disease were then developed for each imaging sequence. The sensitivity, specificity, predictive values, and accuracy of these criteria were determined for the whole group, children younger than 12 months old, and children 12 months and older. The MR characteristics that had significant (P ≤ 0.05) associations with metastases were homogeneous low T1-signal intensity, homogeneous high STIR-signal intensity, and heterogeneous pattern on T1, STIR, or GAD. Homogeneous low T1-signal had the highest sensitivity (88%), but a specificity of 62% for detecting metastases. A heterogeneous pattern on GAD was highly specific (97%), but relatively insensitive (65%) for detecting metastases. These MR characteristics were most accurate in children 12 months and older. The combination of non-contrast-enhanced T1W and GAD sequences can be used to determine the presence of spinal metastases in children with neuroblastoma, particularly those children who are 1 year and older. (orig.)
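    The sensitivity, specificity, predictive values, and accuracy reported above are all derived from a 2x2 contingency table. The following minimal sketch shows the standard definitions; the function name is hypothetical and the counts are illustrative only, not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic-test metrics from a 2x2 contingency table.

    tp/fn: diseased cases with a positive/negative finding.
    fp/tn: disease-free cases with a positive/negative finding.
    """
    return {
        "sensitivity": tp / (tp + fn),   # positives found among the diseased
        "specificity": tn / (tn + fp),   # negatives found among the disease-free
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Illustrative counts only (assumed for this example, not from the abstract).
metrics = diagnostic_metrics(tp=30, fp=5, fn=10, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    A criterion can score well on one metric and poorly on another, which is why the abstract reports a highly sensitive finding and a highly specific finding separately.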

  17. Judgments relative to patterns: how temporal sequence patterns affect judgments and memory.

    Science.gov (United States)

    Kusev, Petko; Ayton, Peter; van Schaik, Paul; Tsaneva-Atanasova, Krasimira; Stewart, Neil; Chater, Nick

    2011-12-01

    Six experiments studied relative frequency judgment and recall of sequentially presented items drawn from 2 distinct categories (i.e., city and animal). The experiments show that judged frequencies of categories of sequentially encountered stimuli are affected by certain properties of the sequence configuration. We found (a) a first-run effect whereby people overestimated the frequency of a given category when that category was the first repeated category to occur in the sequence and (b) a dissociation between judgments and recall; respondents may judge 1 event more likely than the other and yet recall more instances of the latter. Specifically, the distribution of recalled items does not correspond to the frequency estimates for the event categories, indicating that participants do not make frequency judgments by sampling their memory for individual items as implied by other accounts such as the availability heuristic (Tversky & Kahneman, 1973) and the availability process model (Hastie & Park, 1986). We interpret these findings as reflecting the operation of a judgment heuristic sensitive to sequential patterns and offer an account for the relationship between memory and judged frequencies of sequentially encountered stimuli.

  18. Prediction of Human Activity by Discovering Temporal Sequence Patterns.

    Science.gov (United States)

    Li, Kang; Fu, Yun

    2014-08-01

    Early prediction of ongoing human activity has become more valuable in a large variety of time-critical applications. To build an effective representation for prediction, human activities can be characterized by a complex temporal composition of constituent simple actions and interacting objects. Different from early detection on short-duration simple actions, we propose a novel framework for long-duration complex activity prediction by discovering three key aspects of activity: Causality, Context-cue, and Predictability. The major contributions of our work include: (1) a general framework is proposed to systematically address the problem of complex activity prediction by mining temporal sequence patterns; (2) probabilistic suffix tree (PST) is introduced to model causal relationships between constituent actions, where both large and small order Markov dependencies between action units are captured; (3) the context-cue, especially interactive objects information, is modeled through sequential pattern mining (SPM), where a series of action and object co-occurrence are encoded as a complex symbolic sequence; (4) we also present a predictive accumulative function (PAF) to depict the predictability of each kind of activity. The effectiveness of our approach is evaluated on two experimental scenarios with two data sets for each: action-only prediction and context-aware prediction. Our method achieves superior performance for predicting global activity classes and local action units.

  19. Master stability functions reveal diffusion-driven pattern formation in networks

    Science.gov (United States)

    Brechtel, Andreas; Gramlich, Philipp; Ritterskamp, Daniel; Drossel, Barbara; Gross, Thilo

    2018-03-01

    We study diffusion-driven pattern formation in networks of networks, a class of multilayer systems, where different layers have the same topology, but different internal dynamics. Agents are assumed to disperse within a layer by undergoing random walks, while they can be created or destroyed by reactions between or within a layer. We show that the stability of homogeneous steady states can be analyzed with a master stability function approach that reveals a deep analogy between pattern formation in networks and pattern formation in continuous space. For illustration, we consider a generalized model of ecological meta-food webs. This fairly complex model describes the dispersal of many different species across a region consisting of a network of individual habitats while subject to realistic, nonlinear predator-prey interactions. In this example, the method reveals the intricate dependence of the dynamics on the spatial structure. The ability of the proposed approach to deal with this fairly complex system highlights it as a promising tool for ecology and other applications.

  20. RNAPattMatch: a web server for RNA sequence/structure motif detection based on pattern matching with flexible gaps

    Science.gov (United States)

    Drory Retwitzer, Matan; Polishchuk, Maya; Churkin, Elena; Kifer, Ilona; Yakhini, Zohar; Barash, Danny

    2015-01-01

    Searching for RNA sequence-structure patterns is becoming an essential tool for RNA practitioners. Novel discoveries of regulatory non-coding RNAs in targeted organisms and the motivation to find them across a wide range of organisms have prompted the use of computational RNA pattern matching as an enhancement to sequence similarity. State-of-the-art programs differ by the flexibility of patterns allowed as queries and by their simplicity of use. In particular, no existing method is available as a user-friendly web server. A general program that searches for RNA sequence-structure patterns is RNA Structator. However, it is not available as a web server and does not provide the option to allow flexible gap pattern representation with an upper bound of the gap length being specified at any position in the sequence. Here, we introduce RNAPattMatch, a web-based application that is user friendly and makes sequence/structure RNA queries accessible to practitioners of various backgrounds and proficiency. It also extends RNA Structator and allows a more flexible variable gaps representation, in addition to analysis of results using energy minimization methods. RNAPattMatch service is available at http://www.cs.bgu.ac.il/rnapattmatch. A standalone version of the search tool is also available to download at the site. PMID:25940619
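    The idea of a gap whose length is bounded from above can be illustrated with an ordinary regular expression. This is a sequence-only sketch using Python's standard re module; it is not RNAPattMatch's algorithm or query syntax and it ignores secondary structure. The function name, motif halves, and test sequence are assumptions made for the example.

```python
import re


def find_gapped_motif(sequence, left, right, max_gap):
    """Find occurrences of left + gap + right, where the gap has 0..max_gap bases.

    Returns (start, end, gap_length) tuples. A lookahead is used so that
    overlapping occurrences are also reported; for each start position only
    the shortest admissible gap is returned (lazy quantifier).
    """
    pattern = re.compile(
        "(?=(" + re.escape(left)
        + "([ACGU]{0," + str(max_gap) + "}?)"
        + re.escape(right) + "))"
    )
    hits = []
    for match in pattern.finditer(sequence):
        hits.append((match.start(1), match.end(1), len(match.group(2))))
    return hits


# Hypothetical RNA sequence containing the motif twice, with gaps of 4 and 0.
rna = "AAGGACUUCGGUCCAAGGACGUCCUU"
hits_gap4 = find_gapped_motif(rna, "GGAC", "GUCC", max_gap=4)
hits_gap3 = find_gapped_motif(rna, "GGAC", "GUCC", max_gap=3)
print(hits_gap4)
print(hits_gap3)
```

    Lowering the upper bound from 4 to 3 drops the first occurrence, whose gap is four bases long, which shows how the bound restricts the matches.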

  1. Whole Exome Sequencing Reveals Genetic Predisposition in a Large Family with Retinitis Pigmentosa

    Directory of Open Access Journals (Sweden)

    Juan Wu

    2014-01-01

    Full Text Available Next-generation sequencing has become more widely used to reveal genetic defects in monogenic disorders. Retinitis pigmentosa (RP), the leading cause of hereditary blindness worldwide, has been attributed to more than 67 disease-causing genes. Due to the extreme genetic heterogeneity, using general molecular screening alone is inadequate for identifying genetic predispositions in susceptible individuals. In order to identify the underlying mutation rapidly, we utilized next-generation sequencing in a four-generation Chinese family with RP. Two affected patients and an unaffected sibling were subjected to whole exome sequencing. Through bioinformatics analysis and direct sequencing confirmation, we identified the p.R135W transition in the rhodopsin gene. The mutation was subsequently confirmed to cosegregate with the disease in the family. In this study, our results suggest that whole exome sequencing is a robust method in diagnosing familial hereditary disease.

  2. Massively parallel amplicon sequencing reveals isotype-specific variability of antimicrobial peptide transcripts in Mytilus galloprovincialis.

    Directory of Open Access Journals (Sweden)

    Umberto Rosani

    Full Text Available BACKGROUND: Effective innate responses against potential pathogens are essential in the living world and possibly contributed to the evolutionary success of invertebrates. Taken together, antimicrobial peptide (AMP) precursors of defensin, mytilin, myticin and mytimycin can represent about 40% of the hemocyte transcriptome in mussels injected with viral-like and bacterial preparations, and unique profiles of myticin C variants are expressed in single mussels. Based on amplicon pyrosequencing, we have ascertained and compared the natural and Vibrio-induced diversity of AMP transcripts in mussel hemocytes from three European regions. METHODOLOGY/PRINCIPAL FINDINGS: Hemolymph was collected from mussels farmed in the coastal regions of Palavas (France), Vigo (Spain) and Venice (Italy). To represent the AMP families known in M. galloprovincialis, nine transcript sequences have been selected, amplified from hemocyte RNA and subjected to pyrosequencing. Hemolymph from farmed (offshore) and wild (lagoon) Venice mussels, both injected with 10^7 Vibrio cells, were similarly processed. Amplicon pyrosequencing emphasized the AMP transcript diversity, with Single Nucleotide Changes (SNC) minimal for mytilin B/C and maximal for arthropod-like defensin and myticin C. Ratio of non-synonymous vs. synonymous changes also greatly differed between AMP isotypes. Overall, each amplicon revealed similar levels of nucleotidic variation across geographical regions, with two main sequence patterns confirmed for mytimycin and no substantial changes after immunostimulation. CONCLUSIONS/SIGNIFICANCE: Barcoding and bidirectional pyrosequencing allowed us to map and compare the transcript diversity of known mussel AMPs. Though most of the genuine cds variation was common to the analyzed samples we could estimate from 9 to 106 peptide variants in hemolymph pools representing 100 mussels, depending on the AMP isoform and sampling site. In this study, no prevailing SNC patterns related

  3. HIERARCHICAL ADAPTIVE ROOD PATTERN SEARCH FOR MOTION ESTIMATION AT VIDEO SEQUENCE ANALYSIS

    Directory of Open Access Journals (Sweden)

    V. T. Nguyen

    2016-05-01

    Full Text Available Subject of Research. The paper deals with motion estimation algorithms for the analysis of video sequences in the compression standards MPEG-4 Visual and H.264. A new algorithm is proposed, based on an analysis of the advantages and disadvantages of existing algorithms. Method. The algorithm is called hierarchical adaptive rood pattern search (Hierarchical ARPS, HARPS). It combines the classic adaptive rood pattern search (ARPS) with hierarchical search MP (hierarchical search, or mean pyramid). All motion estimation algorithms were implemented in MATLAB and tested with several video sequences. Main Results. The criteria for evaluating the algorithms were speed, peak signal-to-noise ratio, mean square error and mean absolute deviation. The proposed method was much faster at a comparable error and deviation. Its peak signal-to-noise ratio was better than that of known algorithms on some video sequences and worse on others, so it requires further investigation. Practical Relevance. Using this algorithm in MPEG-4 and H.264 codecs instead of the standard one can significantly reduce compression time. This makes it suitable for telecommunication systems that store, transmit and process multimedia data.
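    The three quality criteria named in the abstract have standard definitions, which can be sketched as follows. This is only an illustration of the metrics, not the authors' MATLAB code; the function name `block_errors`, the 8-bit peak value of 255, and the sample blocks are assumptions.

```python
import math

def block_errors(block_a, block_b, peak=255):
    """Compare two equally sized pixel blocks (lists of rows).

    Returns (mse, mad, psnr):
      mse  - mean square error
      mad  - mean absolute deviation (mean absolute difference)
      psnr - peak signal-to-noise ratio in dB, infinite for identical blocks
    """
    diffs = [a - b
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    n = len(diffs)
    mse = sum(d * d for d in diffs) / n
    mad = sum(abs(d) for d in diffs) / n
    psnr = math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
    return mse, mad, psnr

# Hypothetical 2x2 blocks: the original and its motion-compensated prediction.
original = [[10, 20], [30, 40]]
predicted = [[12, 18], [30, 44]]
mse, mad, psnr = block_errors(original, predicted)
print(mse, mad, round(psnr, 2))
```

    In a block-matching search, a criterion such as MAD is typically evaluated for each candidate displacement, and the candidate with the smallest value gives the motion vector.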

  4. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  5. ICRPfinder: a fast pattern design algorithm for coding sequences and its application in finding potential restriction enzyme recognition sites

    Directory of Open Access Journals (Sweden)

    Stafford Phillip

    2009-09-01

    Full Text Available Abstract Background Restriction enzymes can produce easily definable segments from DNA sequences by using a variety of cut patterns. There are, however, no software tools that can aid in gene building -- that is, modifying wild-type DNA sequences to express the same wild-type amino acid sequences but with enhanced codons, specific cut sites, unique post-translational modifications, and other engineered-in components for recombinant applications. A fast DNA pattern design algorithm, ICRPfinder, is provided in this paper and applied to find or create potential recognition sites in target coding sequences. Results ICRPfinder is applied to find or create restriction enzyme recognition sites by introducing silent mutations. The algorithm is shown capable of mapping existing cut-sites but importantly it also can generate specified new unique cut-sites within a specified region that are guaranteed not to be present elsewhere in the DNA sequence. Conclusion ICRPfinder is a powerful tool for finding or creating specific DNA patterns in a given target coding sequence. ICRPfinder finds or creates patterns, which can include restriction enzyme recognition sites, without changing the translated protein sequence. ICRPfinder is a browser-based JavaScript application and it can run on any platform, in on-line or off-line mode.
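    The core idea, creating a recognition site through silent mutations, can be sketched by brute force: overwrite each window of the coding sequence with the site and keep the result only if the translated protein is unchanged and the site occurs exactly once. This is not ICRPfinder's algorithm, which the abstract does not describe; the function names, the example sequence, and the choice of the EcoRI site GAATTC are assumptions for illustration. The input is assumed to be an in-frame, upper-case DNA string.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def count_site(dna, site):
    """Count occurrences of site in dna, including overlapping ones."""
    return sum(1 for j in range(len(dna) - len(site) + 1)
               if dna.startswith(site, j))

def silent_site_insertions(cds, site):
    """List (position, mutated_cds) pairs where the site can be created
    without changing the protein and without a second copy of the site."""
    protein = translate(cds)
    results = []
    for i in range(len(cds) - len(site) + 1):
        candidate = cds[:i] + site + cds[i + len(site):]
        if (candidate != cds
                and translate(candidate) == protein
                and count_site(candidate, site) == 1):
            results.append((i, candidate))
    return results

# Hypothetical coding sequence for Met-Glu-Phe-Lys.
found = silent_site_insertions("ATGGAGTTTAAA", "GAATTC")
print(found)
```

    Here the codons GAG TTT are rewritten to GAA TTC, which still encode Glu-Phe, so the site appears at position 3 while the protein stays MEFK.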

  6. Whole-exome sequencing reveals GPIHBP1 mutations in infantile colitis with severe hypertriglyceridemia.

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Mir, Sabina; Penney, Samantha; Jhangiani, Shalini; Midgen, Craig; Finegold, Milton; Muzny, Donna M; Wang, Min; Bacino, Carlos A; Gibbs, Richard A; Lupski, James R; Kellermayer, Richard; Hanchard, Neil A

    2014-07-01

    Severe congenital hypertriglyceridemia (HTG) is a rare disorder caused by mutations in genes affecting lipoprotein lipase (LPL) activity. Here we report a 5-week-old Hispanic girl with severe HTG (12,031 mg/dL, normal limit 150 mg/dL) who presented with the unusual combination of lower gastrointestinal bleeding and milky plasma. Initial colonoscopy was consistent with colitis, which resolved with reduction of triglycerides. After negative sequencing of the LPL gene, whole-exome sequencing revealed novel compound heterozygous mutations in GPIHBP1. Our study broadens the phenotype of GPIHBP1-associated HTG, reinforces the effectiveness of whole-exome sequencing in Mendelian diagnoses, and implicates triglycerides in gastrointestinal mucosal injury.

  7. Next-generation sequencing can reveal in vitro-generated PCR crossover products: some artifactual sequences correspond to HLA alleles in the IMGT/HLA database.

    Science.gov (United States)

    Holcomb, C L; Rastrou, M; Williams, T C; Goodridge, D; Lazaro, A M; Tilanus, M; Erlich, H A

    2014-01-01

    The high-resolution human leukocyte antigen (HLA) genotyping assay that we developed using 454 sequencing and Conexio software uses generic polymerase chain reaction (PCR) primers for DRB exon 2. Occasionally, we observed low abundance DRB amplicon sequences that resulted from in vitro PCR 'crossing over' between DRB1 and DRB3/4/5. These hybrid sequences, revealed by the clonal sequencing property of the 454 system, were generally observed at a read depth of 5%-10% of the true alleles. They usually contained at least one mismatch with the IMGT/HLA database, and consequently, were easily recognizable and did not cause a problem for HLA genotyping. Sometimes, however, these artifactual sequences matched a rare allele and the automatic genotype assignment was incorrect. These observations raised two issues: (1) could PCR conditions be modified to reduce such artifacts? and (2) could some of the rare alleles listed in the IMGT/HLA database be artifacts rather than true alleles? Because PCR crossing over occurs during late cycles of PCR, we compared DRB genotypes resulting from 28 and (our standard) 35 cycles of PCR. For all 21 cell line DNAs amplified for 35 cycles, crossover products were detected. In 33% of the cases, these hybrid sequences corresponded to named alleles. With amplification for only 28 cycles, these artifactual sequences were not detectable. To investigate whether some rare alleles in the IMGT/HLA database might be due to PCR artifacts, we analyzed four samples obtained from the investigators who submitted the sequences. In three cases, the sequences were generated from true alleles. In one case, our 454 sequencing revealed an error in the previously submitted sequence. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Science.gov (United States)

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  9. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Directory of Open Access Journals (Sweden)

    Sara Kangaspeska

    Full Text Available RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  10. A SOM clustering pattern sequence-based next symbol prediction method for day-ahead direct electricity load and price forecasting

    International Nuclear Information System (INIS)

    Jin, Cheng Hao; Pok, Gouchol; Lee, Yongmi; Park, Hyun-Woo; Kim, Kwang Deuk; Yun, Unil; Ryu, Keun Ho

    2015-01-01

    Highlights: • A novel pattern sequence-based direct time series forecasting method was proposed. • Due to the use of SOM’s topology preserving property, only SOM can be applied. • SCPSNSP only deals with the cluster patterns, not each specific time series value. • SCPSNSP performs better than recently developed forecasting algorithms. - Abstract: In this paper, we propose a new day-ahead direct time series forecasting method for competitive electricity markets based on clustering and next symbol prediction. In the clustering step, pattern sequences and their topology relations are obtained from self-organizing map time series clustering. In the next symbol prediction step, with each cluster label in the pattern sequence represented as a pair of its topologically identical coordinates, an artificial neural network is used to predict the topological coordinates of the next day by training the relationship between the previous daily pattern sequence and its next day pattern. According to the obtained topology relations, the nearest nonzero hits pattern is assigned to the next day so that the whole time series values can be directly forecasted from the assigned cluster pattern. The proposed method was evaluated on Spanish, Australian and New York electricity markets and compared with PSF and some of the most recently published forecasting methods. Experimental results show that the proposed method outperforms the best forecasting methods by at least 3.64%

  11. Markovian Model in High Order Sequence Prediction From Log-Motif Patterns in Agbada Paralic Section, Niger Delta, Nigeria

    International Nuclear Information System (INIS)

    Olabode, S. O.; Adekoya, J. A.

    2002-01-01

    A Markovian model for the elucidation of high order sequences was applied to repetitive events of regressive and transgressive phases in the Agbada paralic section, Niger Delta. The repetitive events are made up of delta front, delta topset and fluvio-deltaic sediments. The sediments consist of sands, sandstones, siltstones and shales in various proportions. Five wells, MN1, AA1, NP2, NP6 and NP8, were studied. A summary of the biostratigraphic report and well log-motif patterns was used to delineate the third order depositional sequences in the wells. Various Markovian properties (observed transition frequency matrix, observed transition probability matrix, fixed probability vector, expected random matrix (randomised transition matrix) and difference matrix) were determined for stacked high order sequences (high frequency cyclic events) nested within the third-order sequences, using the log-motif patterns for the various sand bodies and shales. Flow diagrams were constructed for each of the depositional sequences to determine the likely number of cycles. The upward transition matrix between the log-motif patterns and the flow diagrams used to elucidate cyclicity show that the overall regressive sequence of the Niger Delta has been modified by deltaic depositional elements and fluctuations in sea level. The predictions of higher order sequences within third order sequences from the Markovian properties provide a good basis for correlation within the depositional sequences. The model has also been used to decipher the dominant depositional processes during the formation of the sequences. Discrete reservoir intervals and seal potentials within the sequences were also predicted from the flow diagrams constructed
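    The Markovian properties listed in the abstract can be computed from a vertical succession of log-motif labels roughly as follows. This is a sketch under stated assumptions, not the authors' procedure: the fixed probability vector is taken here as the column totals divided by the total number of transitions, every row of the expected random matrix is set equal to that vector, and the difference matrix is observed minus expected probability. Other conventions exist. The labels "B", "F" and "S" and the sample succession are hypothetical.

```python
def markov_matrices(succession):
    """Derive Markov chain summaries from an upward succession of labels.

    Returns (states, counts, prob, fixed, diff):
      counts - observed transition frequency matrix
      prob   - observed transition probability matrix (rows sum to 1)
      fixed  - fixed probability vector (assumed: column totals / total)
      diff   - difference matrix, prob[i][j] - fixed[j]
    """
    if len(succession) < 2:
        raise ValueError("need at least two beds to observe a transition")
    states = sorted(set(succession))
    index = {s: k for k, s in enumerate(states)}
    n = len(states)
    counts = [[0] * n for _ in range(n)]
    # Count each upward transition from one bed to the next.
    for lower, upper in zip(succession, succession[1:]):
        counts[index[lower]][index[upper]] += 1
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    prob = [[c / rs if rs else 0.0 for c in row]
            for row, rs in zip(counts, row_sums)]
    fixed = [cs / total for cs in col_sums]
    diff = [[prob[i][j] - fixed[j] for j in range(n)] for i in range(n)]
    return states, counts, prob, fixed, diff

# Hypothetical succession, base to top: S = shale, F = funnel motif, B = bell motif.
beds = ["S", "F", "B", "S", "F", "B", "S", "F", "S"]
states, counts, prob, fixed, diff = markov_matrices(beds)
print(states)
print(counts)
print(fixed)
```

    Large positive entries in the difference matrix mark transitions that occur more often than a random arrangement would suggest; these are the links usually drawn in a flow diagram of the preferred cycle.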

  12. Ananke: temporal clustering reveals ecological dynamics of microbial communities

    Directory of Open Access Journals (Sweden)

    Michael W. Hall

    2017-09-01

    Full Text Available Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that complements existing sequence-identity-based clustering approaches by clustering marker-gene data based on time-series profiles and provides interactive visualization of clusters, including highlighting of internal OTU inconsistencies. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is free and open-source software available at https://github.com/beiko-lab/ananke.

  13. Whole-genome sequencing reveals a potential causal mutation for dwarfism in the Miniature Shetland pony.

    Science.gov (United States)

    Metzger, Julia; Gast, Alana Christina; Schrimpf, Rahel; Rau, Janina; Eikelberg, Deborah; Beineke, Andreas; Hellige, Maren; Distl, Ottmar

    2017-04-01

    The Miniature Shetland pony represents a horse breed with an extremely small body size. Clinical examination of a dwarf Miniature Shetland pony revealed a lowered size at the withers, malformed skull and brachygnathia superior. Computed tomography (CT) showed a shortened maxilla and a cleft of the hard and soft palate which protruded into the nasal passage leading to breathing difficulties. Pathological examination confirmed these findings but did not reveal histopathological signs of premature ossification in limbs or cranial sutures. Whole-genome sequencing of this dwarf Miniature Shetland pony and comparative sequence analysis using 26 reference equids from NCBI Sequence Read Archive revealed three probably damaging missense variants which could be exclusively found in the affected foal. Validation of these three missense mutations in 159 control horses from different horse breeds and five donkeys revealed only the aggrecan (ACAN)-associated g.94370258G>C variant as homozygous wild-type in all control samples. The dwarf Miniature Shetland pony had the homozygous mutant genotype C/C of the ACAN:g.94370258G>C variant and the normal parents were heterozygous G/C. An unaffected full sib and 3/5 unaffected half-sibs were heterozygous G/C for the ACAN:g.94370258G>C variant. In summary, we could demonstrate a dwarf phenotype in a miniature pony breed perfectly associated with a missense mutation within the ACAN gene.

  14. Peripheral blood transcriptome sequencing reveals rejection-relevant genes in long-term heart transplantation.

    Science.gov (United States)

    Chen, Yan; Zhang, Haibo; Xiao, Xue; Jia, Yixin; Wu, Weili; Liu, Licheng; Jiang, Jun; Zhu, Baoli; Meng, Xu; Chen, Weijun

    2013-10-03

    Peripheral blood-based gene expression patterns have been investigated as biomarkers to monitor the immune system and rule out rejection after heart transplantation. Recent advances in the high-throughput deep sequencing (HTS) technologies provide new leads in transcriptome analysis. By performing Solexa/Illumina's digital gene expression (DGE) profiling, we analyzed gene expression profiles of PBMCs from 6 quiescent (grade 0) and 6 rejection (grade 2R&3R) heart transplant recipients at more than 6 months after transplantation. Subsequently, quantitative real-time polymerase chain reaction (qRT-PCR) was carried out in an independent validation cohort of 47 individuals from three rejection groups (ISHLT, grade 0,1R, 2R&3R). Through DGE sequencing and qPCR validation, 10 genes were identified as informative genes for detection of cardiac transplant rejection. A further clustering analysis showed that the 10 genes were not only effective for distinguishing patients with acute cardiac allograft rejection, but also informative for discriminating patients with renal allograft rejection based on both blood and biopsy samples. Moreover, PPI network analysis revealed that the 10 genes were connected to each other within a short interaction distance. We proposed a 10-gene signature for heart transplant patients at high-risk of developing severe rejection, which was found to be effective as well in other organ transplant. Moreover, we supposed that these genes function systematically as biomarkers in long-time allograft rejection. Further validation in broad transplant population would be required before the non-invasive biomarkers can be generally utilized to predict the risk of transplant rejection. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  15. Descriptive parameters for revealing substitution patterns of sugar beet pectins using pectolytic enzymes.

    Science.gov (United States)

    Remoroza, C; Buchholt, H C; Gruppen, H; Schols, H A

    2014-01-30

    Enzymatic fingerprinting was applied to sugar beet pectins (SBPs) modified by either plant or fungal pectin methyl esterases and alkali catalyzed de-esterification to reveal the ester distributions over the pectin backbone. Adding a simultaneous pectin lyase (PL) treatment to the commonly used endo-polygalacturonase (endo-PG) degradation proved effective in degrading both high and low methylesterified and/or acetylated homogalacturonan regions of SBP. Using LC-HILIC-MS/ELSD, we studied in detail all the diagnostic oligomers present, enabling us to discriminate between differently prepared sugar beet pectins having various levels of methylesterification and acetylation. Furthermore, distinction between commercially extracted and de-esterified sugar beet pectin having different patterns of substitution was achieved by using novel descriptive pectin parameters. In addition to the DBabs approach for nonmethylesterified sequences degradable by endo-PG, the "degree of hydrolysis" (DHPG) representing all partially saturated methylesterified and/or acetylated galacturonic acid (GalA) moieties was introduced as a new parameter. Correspondingly, the descriptor DHPL has been introduced to quantify all esterified unsaturated GalA oligomers. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation

    Directory of Open Access Journals (Sweden)

    Papaloukas Costas

    2009-04-01

    Full Text Available Abstract Background Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins are found to occur in the trans conformation. In spite of their infrequent occurrence, cis peptide bonds play a key role in protein structure and function, as well as in many significant biological processes. Results We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose an efficient pattern discovery algorithm is employed which discovers regular expression-type patterns that are overrepresented (i.e. appear frequently repeated) in a set of sequences. Four types of pattern discovery are performed: (i) exact pattern discovery, (ii) pattern discovery using a chemical equivalency set, (iii) pattern discovery using a structural equivalency set and (iv) pattern discovery using certain amino acids' physicochemical properties. The extracted patterns are carefully validated using a specially implemented scoring function and a significance measure (i.e. a log-probability estimate) indicative of their specificity. The score threshold for the first three types of pattern discovery is 0.90, while for the last type it is 0.80. Regarding the significance measure, all patterns yielded values in the range [-9, -31], which ensures that the derived patterns are highly unlikely to have emerged by chance. Most of the highest scoring patterns are consistent with previous investigations concerning the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database, in order to gain insight into the functional implications of cis prolyl bonds. Conclusion Cis patterns with matches in the PROSITE database fell mostly into two

  17. Transcriptome Sequencing Revealed Significant Alteration of Cortical Promoter Usage and Splicing in Schizophrenia

    Science.gov (United States)

    Wu, Jing Qin; Wang, Xi; Beveridge, Natalie J.; Tooney, Paul A.; Scott, Rodney J.; Carr, Vaughan J.; Cairns, Murray J.

    2012-01-01

    Background While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. Methodology/Principal Findings The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDRschizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia. PMID:22558445

  18. Deep sequencing reveals double mutations in cis of MPL exon 10 in myeloproliferative neoplasms.

    Science.gov (United States)

    Pietra, Daniela; Brisci, Angela; Rumi, Elisa; Boggi, Sabrina; Elena, Chiara; Pietrelli, Alessandro; Bordoni, Roberta; Ferrari, Maurizio; Passamonti, Francesco; De Bellis, Gianluca; Cremonesi, Laura; Cazzola, Mario

    2011-04-01

    Somatic mutations of MPL exon 10, mainly involving a W515 substitution, have been described in JAK2 (V617F)-negative patients with essential thrombocythemia and primary myelofibrosis. We used direct sequencing and high-resolution melt analysis to identify mutations of MPL exon 10 in 570 patients with myeloproliferative neoplasms, and allele specific PCR and deep sequencing to further characterize a subset of mutated patients. Somatic mutations were detected in 33 of 221 patients (15%) with JAK2 (V617F)-negative essential thrombocythemia or primary myelofibrosis. Only one patient with essential thrombocythemia carried both JAK2 (V617F) and MPL (W515L). High-resolution melt analysis identified abnormal patterns in all the MPL mutated cases, while direct sequencing did not detect the mutant MPL in one fifth of them. In 3 cases carrying double MPL mutations, deep sequencing analysis showed identical load and location in cis of the paired lesions, indicating their simultaneous occurrence on the same chromosome.

  19. Some maternal lineages of domestic horses may have origins in East Asia revealed with further evidence of mitochondrial genomes and HVR-1 sequences

    Directory of Open Access Journals (Sweden)

    Hongying Ma

    2018-06-01

    Full Text Available Objectives There are large populations of indigenous horse (Equus caballus) in China and some other parts of East Asia. However, their matrilineal genetic diversity and origin remain poorly understood. Using a combination of mitochondrial DNA (mtDNA) and hypervariable region (HVR-1) sequences, we aim to investigate the origin of matrilineal inheritance in these domestic horses. Methods To investigate patterns of matrilineal inheritance in domestic horses, we conducted a phylogenetic study using 31 de novo mtDNA genomes together with 317 others from GenBank. In terms of the updated phylogeny, a total of 5,180 horse mitochondrial HVR-1 sequences were analyzed. Results Eighteen haplogroups (Aw-Rw) were uncovered from the analysis of the whole mitochondrial genomes. Most of them have a divergence time before the earliest domestication of wild horses (about 5,800 years ago) and during the Upper Paleolithic (35–10 KYA). The distribution of some haplogroups shows geographic patterns. The Lw haplogroup contained a significantly higher proportion of European horses than of horses from other regions, while haplogroups Jw, Rw, and some maternal lineages of Cw have a higher frequency in horses from East Asia. The 5,180 sequences of horse mitochondrial HVR-1 form nine major haplogroups (A-I). We revealed a corresponding relationship between the haplotypes of HVR-1 and those of whole mitochondrial DNA sequences. The HVR-1 sequence data also suggest that Jw, Rw, and some haplotypes of Cw may have originated in East Asia while Lw probably formed in Europe. Conclusions Our study supports the hypothesis of multiple origins of the maternal lineages of domestic horses and suggests that some maternal lineages of domestic horses may have originated in East Asia.

  20. Complete genome sequence analysis of novel human bocavirus reveals genetic recombination between human bocavirus 2 and human bocavirus 4.

    Science.gov (United States)

    Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat

    2013-07-01

    Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of the VP1 gene, an unusual strain of HBoV (CMH-S011-11) was initially identified as HBoV4. Complete genome sequencing of CMH-S011-11 was performed and the sequence analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of the complete genome sequence revealed that the coding sequence starting from NS1, NP1 to VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of the NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted the initial genotyping result for the partial VP1 region in the previous study. In addition, comparison of the NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, the nucleotide sequence of the VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that CMH-S011-11 grouped within the HBoV4 species, but was located in a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains, with the break point located near the start codon of VP2. Copyright © 2013 Elsevier B.V. All rights reserved.

  1. Taxonomy of anaerobic digestion microbiome reveals biases associated with the applied high throughput sequencing strategies

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2018-01-01

    In the past few years, many studies investigated the anaerobic digestion microbiome by means of 16S rRNA amplicon sequencing. Results obtained from these studies were compared to each other without taking into consideration the followed procedure for amplicons preparation and data analysis...... specifically, the microbial compositions of three laboratory scale biogas reactors were analyzed before and after addition of sodium oleate by sequencing the microbiome with three different approaches: 16S rRNA amplicon sequencing, shotgun DNA and shotgun RNA. This comparative analysis revealed that......, in amplicon sequencing, abundance of some taxa (Euryarchaeota and Spirochaetes) was biased by the inefficiency of universal primers to hybridize all the templates. Reliability of the results obtained was also influenced by the number of hypervariable regions under investigation. Finally, amplicon sequencing...

  2. Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data.

    Science.gov (United States)

    Nuel, Gregory; Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude

    2010-01-26

    In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence. Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of
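    The idea of embedding a pattern count in a Markov chain through a finite automaton can be illustrated with a small sketch. This is not one of the three algorithms from the record above: it is a naive dynamic program for a single word under a homogeneous order-1 Markov model, and it combines independent sequences by plain convolution (the step that Algorithm 1 is said to avoid). The function names, the two-letter alphabet and the probabilities are assumptions chosen for illustration.

```python
def build_dfa(pattern, alphabet):
    """KMP-style automaton: state k means the last k letters read
    match the first k letters of the pattern (longest such k)."""
    m = len(pattern)
    dfa = [{} for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            text = pattern[:state] + ch
            k = min(m, len(text))
            while k > 0 and text[-k:] != pattern[:k]:
                k -= 1
            dfa[state][ch] = k
    return dfa


def count_distribution(pattern, length, start, trans):
    """Return [P(N=0), P(N=1), ...] for the number N of (overlapping)
    occurrences of `pattern` in a sequence of `length` >= 1 letters.
    `start[a]` is the probability of the first letter a, and
    `trans[a][b]` the probability of letter b following letter a."""
    alphabet = sorted(start)
    dfa = build_dfa(pattern, alphabet)
    m = len(pattern)
    # Probability mass indexed by (last letter, automaton state, count).
    mass = {}
    for ch in alphabet:
        s = dfa[0][ch]
        key = (ch, s, int(s == m))
        mass[key] = mass.get(key, 0.0) + start[ch]
    for _ in range(length - 1):
        nxt = {}
        for (last, state, count), p in mass.items():
            for ch in alphabet:
                q = p * trans[last][ch]
                if q == 0.0:
                    continue
                s = dfa[state][ch]
                key = (ch, s, count + int(s == m))
                nxt[key] = nxt.get(key, 0.0) + q
        mass = nxt
    dist = [0.0] * (max(c for _, _, c in mass) + 1)
    for (_, _, c), p in mass.items():
        dist[c] += p
    return dist


def convolve(a, b):
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# Hypothetical model: two letters, every transition equally likely.
start = {"A": 0.5, "C": 0.5}
trans = {"A": {"A": 0.5, "C": 0.5}, "C": {"A": 0.5, "C": 0.5}}

single = count_distribution("AA", 3, start, trans)
# Total count over a set of two independent sequences of lengths 3 and 4.
total = convolve(single, count_distribution("AA", 4, start, trans))
```

    The state space here grows with the sequence length times the maximum count, which is fine for short sequences but is exactly the cost that more refined embedding techniques aim to reduce.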

  3. Molecular characterization of HCV in a Swedish county over 8 years (2002–2009) reveals distinct transmission patterns

    Directory of Open Access Journals (Sweden)

    Josefine Ederth

    2016-02-01

    Full Text Available Background: Hepatitis C virus (HCV) is a major public health concern and data on its molecular epidemiology in Sweden are scarce. We carried out an 8-year population-based study of newly diagnosed HCV cases in one of Sweden's centrally situated counties, Södermanland (D-county). The aim was to characterize the HCV strains circulating, analyze their genetic relatedness to detect networks, and in combination with demographic data learn more about transmission. Methods: Molecular analyses of serum samples from 91% (N=557) of all newly notified cases in D-county, 2002–2009, were performed. Phylogenetic analysis (NS5B gene, 300 bp) was linked to demographic data from the national surveillance database, SmiNet, to characterize D-county transmission clusters. The linear-by-linear association test (LBL) was used to analyze trends over time. Results: The most prevalent subtypes were 1a (38%) and 3a (34%). Subtype 1a was most prevalent among cases transmitted via sexual contact, via contaminated blood, or blood products, while subtype 3a was most prevalent among people who inject drugs (PWIDs). Phylogenetic analysis revealed that the subtype 3a sequences formed more and larger transmission clusters (50% of the sequences clustered), while the 1a sequences formed smaller clusters (19% of the sequences clustered), possibly suggesting different epidemics. Conclusion: We found different transmission patterns in D-county which may, from a public health perspective, have implications for how to control virus infections by targeted interventions.

  4. The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae

    Directory of Open Access Journals (Sweden)

    David B. Neale

    2017-09-01

    Full Text Available A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.
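    For readers unfamiliar with the N50 statistic quoted in this record, the following is a minimal sketch of how it is commonly computed: sort the contig (or scaffold) lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the example lengths are hypothetical and are not taken from the Douglas-fir assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths:
    the largest length L such that sequences of length >= L together
    cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in bp).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
```

    Because N50 depends only on the length distribution, it measures contiguity rather than correctness, so it is usually read alongside other quality indicators.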

  5. Comparative analysis of codon usage bias and codon context patterns between dipteran and hymenopteran sequenced genomes.

    Directory of Open Access Journals (Sweden)

    Susanta K Behura

    Full Text Available BACKGROUND: Codon bias is a phenomenon of non-uniform usage of codons whereas codon context generally refers to sequential pairs of codons in a gene. Although genome sequencing of multiple species of dipteran and hymenopteran insects has been completed, only a few of these species have been analyzed for codon usage bias. METHODS AND PRINCIPAL FINDINGS: Here, we use bioinformatics approaches to analyze codon usage bias and codon context patterns in a genome-wide manner among 15 dipteran and 7 hymenopteran insect species. Results show that GAA is the most frequent codon in the dipteran species whereas GAG is the most frequent codon in the hymenopteran species. The data reveal that codons ending with C or G are frequently used in the dipteran genomes whereas codons ending with A or T are frequently used in the hymenopteran genomes. Synonymous codon usage orders (SCUO) vary within genomes in a pattern that seems to be distinct for each species. Based on comparison of 30 one-to-one orthologous genes among 17 species, the fruit fly Drosophila willistoni shows the least codon usage bias whereas the honey bee (Apis mellifera) shows the highest bias. Analysis of codon context patterns of these insects shows that specific codons are frequently used as the 3'- and 5'-context of start and stop codons, respectively. CONCLUSIONS: Codon bias patterns are distinct between dipteran and hymenopteran insects. While codon bias is favored by high GC content of dipteran genomes, high AT content of genes favors biased usage of synonymous codons in the hymenopteran insects. Also, codon context patterns vary among these species largely according to their phylogeny.
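    As a rough illustration of what measuring codon usage bias involves, the sketch below counts codons in a coding sequence and computes relative synonymous codon usage (RSCU), a widely used measure. RSCU is an assumption here: the record reports SCUO, which is a different statistic, and the function names and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(cds):
    """Count in-frame codons of a DNA coding sequence, skipping any
    trailing partial codon and codons with non-ACGT characters."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(counts):
    """RSCU = observed codon count divided by the count expected if all
    synonymous codons of that amino acid were used equally. Stop codons
    and single-codon amino acids (Met, Trp) are left out."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue
        expected = total / len(codons)
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values


# Hypothetical toy coding sequence: Met-Glu-Glu-Glu-stop.
toy = codon_counts("ATGGAAGAAGAGTAA")
toy_rscu = rscu(toy)
```

    An RSCU above 1 means a codon is used more often than expected under uniform synonymous usage; in the toy sequence GAA is favored over GAG for glutamate.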

  6. Proteomic Analysis of Lysine Acetylation Sites in Rat Tissues Reveals Organ Specificity and Subcellular Patterns

    Directory of Open Access Journals (Sweden)

    Alicia Lundby

    2012-08-01

    Full Text Available Lysine acetylation is a major posttranslational modification involved in a broad array of physiological functions. Here, we provide an organ-wide map of lysine acetylation sites from 16 rat tissues analyzed by high-resolution tandem mass spectrometry. We quantify 15,474 modification sites on 4,541 proteins and provide the data set as a web-based database. We demonstrate that lysine acetylation displays site-specific sequence motifs that diverge between cellular compartments, with a significant fraction of nuclear sites conforming to the consensus motifs G-AcK and AcK-P. Our data set reveals that the subcellular acetylation distribution is tissue-type dependent and that acetylation targets tissue-specific pathways involved in fundamental physiological processes. We compare lysine acetylation patterns for rat as well as human skeletal muscle biopsies and demonstrate its general involvement in muscle contraction. Furthermore, we illustrate that acetylation of fructose-bisphosphate aldolase and glycerol-3-phosphate dehydrogenase serves as a cellular mechanism to switch off enzymatic activity.

  7. [Exome sequencing revealed Allan-Herndon-Dudley syndrome underlying multiple disabilities].

    Science.gov (United States)

    Arvio, Maria; Philips, Anju K; Ahvenainen, Minna; Somer, Mirja; Kalscheuer, Vera; Järvelä, Irma

    2014-01-01

    Normal function of the thyroid gland is the cornerstone of a child's mental development and physical growth. We describe a Finnish family, in which the diagnosis of three brothers became clear after investigations that lasted for more than 30 years. Two of the sons have already died. DNA analysis of the third one, a 16-year-old boy, revealed in exome sequencing of the complete X chromosome a mutation in the SLC16A2 gene, i.e. MCT8, coding for a thyroid hormone transport protein. Allan-Herndon-Dudley syndrome was thus shown to be the cause of multiple disabilities.

  8. Electromyographic Patterns during Golf Swing: Activation Sequence Profiling and Prediction of Shot Effectiveness.

    Science.gov (United States)

    Verikas, Antanas; Vaiciukynas, Evaldas; Gelzinis, Adas; Parker, James; Olsson, M Charlotte

    2016-04-23

    This study analyzes muscle activity, recorded in an eight-channel electromyographic (EMG) signal stream, during the golf swing using a 7-iron club and exploits information extracted from EMG dynamics to predict the success of the resulting shot. Muscles of the arm and shoulder on both the left and right sides, namely flexor carpi radialis, extensor digitorum communis, rhomboideus and trapezius, are considered for 15 golf players (∼5 shots each). The method using Gaussian filtering is outlined for EMG onset time estimation in each channel and activation sequence profiling. Shots of each player revealed a persistent pattern of muscle activation. Profiles were plotted and insights with respect to player effectiveness were provided. Inspection of EMG dynamics revealed a pair of highest peaks in each channel as the hallmark of golf swing, and a custom application of peak detection for automatic extraction of swing segment was introduced. Various EMG features, encompassing 22 feature sets, were constructed. Feature sets were used individually and also in decision-level fusion for the prediction of shot effectiveness. The prediction of the target attribute, such as club head speed or ball carry distance, was investigated using random forest as the learner in detection and regression tasks. Detection evaluates the personal effectiveness of a shot with respect to the player-specific average, whereas regression estimates the value of target attribute, using EMG features as predictors. Fusion after decision optimization provided the best results: the equal error rate in detection was 24.3% for the speed and 31.7% for the distance; the mean absolute percentage error in regression was 3.2% for the speed and 6.4% for the distance. Proposed EMG feature sets were found to be useful, especially when used in combination. Rankings of feature sets indicated statistics for muscle activity in both the left and right body sides, correlation-based analysis of EMG dynamics and features
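    The regression results in this record are reported as mean absolute percentage error (MAPE). A minimal sketch of that metric follows; the function name and the sample values are hypothetical and are not data from the study.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent: the mean of |actual - predicted| / |actual|.
    Every actual value must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * sum(errors) / len(actual)


# Hypothetical measured and predicted values of a target attribute.
measured = [100.0, 200.0]
estimated = [110.0, 190.0]
mape = mean_absolute_percentage_error(measured, estimated)
```

    Because each error is scaled by the true value, MAPE allows targets with different units, such as speed and distance, to be compared on one percentage scale.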

  9. Electromyographic Patterns during Golf Swing: Activation Sequence Profiling and Prediction of Shot Effectiveness

    Directory of Open Access Journals (Sweden)

    Antanas Verikas

    2016-04-01

    Full Text Available This study analyzes muscle activity, recorded in an eight-channel electromyographic (EMG) signal stream, during the golf swing using a 7-iron club and exploits information extracted from EMG dynamics to predict the success of the resulting shot. Muscles of the arm and shoulder on both the left and right sides, namely flexor carpi radialis, extensor digitorum communis, rhomboideus and trapezius, are considered for 15 golf players (∼5 shots each). The method using Gaussian filtering is outlined for EMG onset time estimation in each channel and activation sequence profiling. Shots of each player revealed a persistent pattern of muscle activation. Profiles were plotted and insights with respect to player effectiveness were provided. Inspection of EMG dynamics revealed a pair of highest peaks in each channel as the hallmark of golf swing, and a custom application of peak detection for automatic extraction of swing segment was introduced. Various EMG features, encompassing 22 feature sets, were constructed. Feature sets were used individually and also in decision-level fusion for the prediction of shot effectiveness. The prediction of the target attribute, such as club head speed or ball carry distance, was investigated using random forest as the learner in detection and regression tasks. Detection evaluates the personal effectiveness of a shot with respect to the player-specific average, whereas regression estimates the value of target attribute, using EMG features as predictors. Fusion after decision optimization provided the best results: the equal error rate in detection was 24.3% for the speed and 31.7% for the distance; the mean absolute percentage error in regression was 3.2% for the speed and 6.4% for the distance. Proposed EMG feature sets were found to be useful, especially when used in combination. Rankings of feature sets indicated statistics for muscle activity in both the left and right body sides, correlation-based analysis of EMG

  10. Whole-exome sequencing revealed two novel mutations in Usher syndrome.

    Science.gov (United States)

    Koparir, Asuman; Karatas, Omer Faruk; Atayoglu, Ali Timucin; Yuksel, Bayram; Sagiroglu, Mahmut Samil; Seven, Mehmet; Ulucan, Hakan; Yuksel, Adnan; Ozen, Mustafa

    2015-06-01

    Usher syndrome is a clinically and genetically heterogeneous autosomal recessive inherited disorder accompanied by hearing loss and retinitis pigmentosa (RP). Since the associated genes are various and quite large, we utilized whole-exome sequencing (WES) as a diagnostic tool to identify the molecular basis of Usher syndrome. DNA from a 12-year-old male diagnosed with Usher syndrome was analyzed by WES. Mutations detected were confirmed by Sanger sequencing. The pathogenicity of these mutations was determined by in silico analysis. A maternally inherited deleterious frameshift mutation, c.14439_14454del in exon 66, and a paternally inherited nonsense c.10830G>A stop-gain SNV in exon 55 of USH2A were found as two novel compound heterozygous mutations. Both of these mutations disrupt the C-terminus of the USH2A protein. As a result, WES revealed two novel compound heterozygous mutations in a Turkish USH2A patient. This approach gave us an opportunity to make an appropriate diagnosis and provide genetic counseling to the family within a reasonable time. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Widespread alternative and aberrant splicing revealed by lariat sequencing

    Science.gov (United States)

    Stepankiw, Nicholas; Raghavan, Madhura; Fogarty, Elizabeth A.; Grimson, Andrew; Pleiss, Jeffrey A.

    2015-01-01

    Alternative splicing is an important and ancient feature of eukaryotic gene structure, the existence of which has likely facilitated eukaryotic proteome expansions. Here, we have used intron lariat sequencing to generate a comprehensive profile of splicing events in Schizosaccharomyces pombe, amongst the simplest organisms that possess mammalian-like splice site degeneracy. We reveal an unprecedented level of alternative splicing, including alternative splice site selection for over half of all annotated introns, hundreds of novel exon-skipping events, and thousands of novel introns. Moreover, the frequency of these events is far higher than previous estimates, with alternative splice sites on average activated at ∼3% the rate of canonical sites. Although a subset of alternative sites are conserved in related species, implying functional potential, the majority are not detectably conserved. Interestingly, the rate of aberrant splicing is inversely related to expression level, with lowly expressed genes more prone to erroneous splicing. Although we validate many events with RNAseq, the proportion of alternative splicing discovered with lariat sequencing is far greater, a difference we attribute to preferential decay of aberrantly spliced transcripts. Together, these data suggest the spliceosome possesses far lower fidelity than previously appreciated, highlighting the potential contributions of alternative splicing in generating novel gene structures. PMID:26261211

  12. Patterns of hybrid loss of imprinting reveal tissue- and cluster-specific regulation.

    Directory of Open Access Journals (Sweden)

    Christopher D Wiley

    Full Text Available Crosses between natural populations of two species of deer mice, Peromyscus maniculatus (BW) and P. polionotus (PO), produce parent-of-origin effects on growth and development. BW females mated to PO males (bwxpo) produce growth-retarded but otherwise healthy offspring. In contrast, PO females mated to BW males (POxBW) produce overgrown and severely defective offspring. The hybrid phenotypes are pronounced in the placenta and include POxBW conceptuses which lack embryonic structures. Evidence to date links variation in control of genomic imprinting with the hybrid defects, particularly in the POxBW offspring. Establishment of genomic imprinting is typically mediated by gametic DNA methylation at sites known as gDMRs. However, imprinted gene clusters vary in their regulation by gDMR sequences. Here we further assess imprinted gene expression and DNA methylation at different cluster types in order to discern patterns. These data reveal POxBW misexpression at the Kcnq1ot1 and Peg3 clusters, both of which lose ICR methylation in placental tissues. In contrast, some embryonic transcripts (Peg10, Kcnq1ot1) reactivated the silenced allele with little or no loss of DNA methylation. Hybrid brains also display different patterns of imprinting perturbations. Several cluster pairs thought to use analogous regulatory mechanisms are differentially affected in the hybrids. These data reinforce the hypothesis that placental and somatic gene regulation differs significantly, as does that between imprinted gene clusters and between species. That such epigenetic regulatory variation exists in recently diverged species suggests a role in reproductive isolation, and that this variation is likely to be adaptive.

  13. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

    Science.gov (United States)

    Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

    2017-11-06

    Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.

  14. Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

    Directory of Open Access Journals (Sweden)

    Regad Leslie

    2010-01-01

    Full Text Available Abstract Background In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence. Conclusions Our algorithms prove to be effective and able to handle real data sets with

  15. Design Pattern Mining Using Distributed Learning Automata and DNA Sequence Alignment

    Science.gov (United States)

    Esmaeilpour, Mansour; Naderifar, Vahideh; Shukur, Zarina

    2014-01-01

    Context Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. Objective This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, and a related tool, DLA-DNA, for all implemented patterns and all projects used for evaluation. DLA-DNA, which is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, achieves acceptable precision and recall compared with the other evaluated tools. Method The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other sections of the code for parameter passing and modular programming. The proposed model can detect design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. Results The results demonstrate that, whether the source code is built in a standard or non-standard way with respect to the design patterns, the results of the proposed method are close to those of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source code bases and compared with other related models and available tools; the results show that the precision and recall of the proposed method are, on average, 20% and 9.6% higher than Pinot, 27% and 31% higher than PTIDEJ, and 3.3% and 2% higher than DPJF, respectively. Conclusion The proposed method is organized in two steps: in the first step, elemental design patterns are identified; in the second step, they are composed to recognize actual design patterns. PMID:25243670

  16. Design pattern mining using distributed learning automata and DNA sequence alignment.

    Directory of Open Access Journals (Sweden)

    Mansour Esmaeilpour

    Full Text Available CONTEXT: Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. OBJECTIVE: This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, and a related tool, DLA-DNA, for all implemented patterns and all projects used for evaluation. DLA-DNA, which is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, achieves acceptable precision and recall compared with the other evaluated tools. METHOD: The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other sections of the code for parameter passing and modular programming. The proposed model can detect design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. RESULTS: The results demonstrate that, whether the source code is built in a standard or non-standard way with respect to the design patterns, the results of the proposed method are close to those of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source code bases and compared with other related models and available tools; the results show that the precision and recall of the proposed method are, on average, 20% and 9.6% higher than Pinot, 27% and 31% higher than PTIDEJ, and 3.3% and 2% higher than DPJF, respectively. CONCLUSION: The proposed method is organized in two steps: in the first step, elemental design patterns are identified; in the second step, they are composed to recognize actual design patterns.

  17. Design pattern mining using distributed learning automata and DNA sequence alignment.

    Science.gov (United States)

    Esmaeilpour, Mansour; Naderifar, Vahideh; Shukur, Zarina

    2014-01-01

    Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, and a related tool, DLA-DNA, for all implemented patterns and all projects used for evaluation. DLA-DNA, which is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, achieves acceptable precision and recall compared with the other evaluated tools. The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other sections of the code for parameter passing and modular programming. The proposed model can detect design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. The results demonstrate that, whether the source code is built in a standard or non-standard way with respect to the design patterns, the results of the proposed method are close to those of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source code bases and compared with other related models and available tools; the results show that the precision and recall of the proposed method are, on average, 20% and 9.6% higher than Pinot, 27% and 31% higher than PTIDEJ, and 3.3% and 2% higher than DPJF, respectively. The proposed method is organized in two steps: in the first step, elemental design patterns are identified; in the second step, they are composed to recognize actual design patterns.

  18. Phylogenetic relationships among Synallaxini spinetails (Aves: Furnariidae) reveal a new biogeographic pattern across the Amazon and Paraná river basins.

    Science.gov (United States)

    Claramunt, Santiago

    2014-09-01

    Relationships among genera in the tribe Synallaxini have proved difficult to resolve. In this study, I investigate relationships among Synallaxis, Certhiaxis and Schoeniophylax using DNA sequences from the mitochondrion and three nuclear regions. I implemented novel primers and protocols for amplifying and sequencing autosomal and sex-linked introns in Furnariidae that resolved basal relationships in the Synallaxini with strong support. Synallaxis propinqua is sister to Schoeniophylax phryganophilus, and together they form a clade with Certhiaxis. The results are robust to analytical approaches when all genomic regions are analyzed jointly (parsimony, maximum likelihood, and species-tree analysis) and the same basal relationships are recovered by most genomic regions when analyzed separately. A sister relationship between S. propinqua, an Amazonian river island specialist, and S. phryganophilus, from the Paraná River basin region, reveals a new biogeographic pattern shared by at least four other pairs of taxa with similar distributions and ecologies. Estimates of divergence times for these five pairs span from the late Miocene to the Pleistocene. Identification of the historical events that produced this pattern is difficult and further advances will require additional studies of the taxa involved and a better understanding of the recent environmental history of South America. A new classification is proposed for the Synallaxini, including the description of a new genus for S. propinqua. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. RNA-Sequencing Reveals Unique Transcriptional Signatures of Running and Running-Independent Environmental Enrichment in the Adult Mouse Dentate Gyrus.

    Science.gov (United States)

    Grégoire, Catherine-Alexandra; Tobin, Stephanie; Goldenstein, Brianna L; Samarut, Éric; Leclerc, Andréanne; Aumont, Anne; Drapeau, Pierre; Fulton, Stephanie; Fernandes, Karl J L

    2018-01-01

    Environmental enrichment (EE) is a powerful stimulus of brain plasticity and is among the most accessible treatment options for brain disease. In rodents, EE is modeled using multi-factorial environments that include running, social interactions, and/or complex surroundings. Here, we show that running and running-independent EE differentially affect the hippocampal dentate gyrus (DG), a brain region critical for learning and memory. Outbred male CD1 mice housed individually with a voluntary running disk showed improved spatial memory in the radial arm maze compared to individually- or socially-housed mice with a locked disk. We therefore used RNA sequencing to perform an unbiased interrogation of DG gene expression in mice exposed to either a voluntary running disk (RUN), a locked disk (LD), or a locked disk plus social enrichment and tunnels [i.e., a running-independent complex environment (CE)]. RNA sequencing revealed that RUN and CE mice showed distinct, non-overlapping patterns of transcriptomic changes versus the LD control. Bio-informatics uncovered that the RUN and CE environments modulate separate transcriptional networks, biological processes, cellular compartments and molecular pathways, with RUN preferentially regulating synaptic and growth-related pathways and CE altering extracellular matrix-related functions. Within the RUN group, high-distance runners also showed selective stress pathway alterations that correlated with a drastic decline in overall transcriptional changes, suggesting that excess running causes a stress-induced suppression of running's genetic effects. Our findings reveal stimulus-dependent transcriptional signatures of EE on the DG, and provide a resource for generating unbiased, data-driven hypotheses for novel mediators of EE-induced cognitive changes.

  20. Cross-shelf investigation of coral reef cryptic benthic organisms reveals diversity patterns of the hidden majority

    KAUST Repository

    Pearman, John K.

    2018-05-18

    Coral reefs harbor diverse assemblages of organisms, yet the majority of this diversity is hidden within the three-dimensional structure of the reef and neglected using standard visual surveys. This study uses Autonomous Reef Monitoring Structures (ARMS) and amplicon sequencing methodologies, targeting mitochondrial cytochrome oxidase I and 18S rRNA genes, to investigate changes in the cryptic reef biodiversity. ARMS, deployed at 11 sites across a near- to off-shore gradient in the Red Sea, were dominated by Porifera (sessile fraction), Arthropoda and Annelida (mobile fractions). The two primer sets detected different taxa lists, but patterns in community composition and structure were similar. While the microhabitat of the ARMS deployment affected the community structure, a clear cross-shelf gradient was observed for all fractions investigated. The partitioning of beta-diversity revealed that replacement (i.e. the substitution of species) made the highest contribution with richness playing a smaller role. Hence, different reef habitats across the shelf are relevant to regional diversity, as they harbor different communities, a result with clear implications for the design of Marine Protected Areas. ARMS can be vital tools to assess biodiversity patterns in the generally neglected but species-rich cryptic benthos, providing invaluable information for the management and conservation of hard-bottomed habitats over local and global scales.
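    The abstract does not say which framework was used to partition beta-diversity. As an illustration only, the sketch below assumes a Jaccard-based partition of the dissimilarity between two sites into a replacement component and a richness-difference component. The function name and the species lists are hypothetical.

```python
def partition_beta(site_a, site_b):
    """Split Jaccard dissimilarity into (total, replacement, richness).

    shared = species in both sites; only_a / only_b = species unique to
    one site. Replacement counts species swapped one-for-one; richness
    counts the leftover difference in species number.
    """
    a_set, b_set = set(site_a), set(site_b)
    shared = len(a_set & b_set)
    only_a = len(a_set - b_set)
    only_b = len(b_set - a_set)
    pooled = shared + only_a + only_b
    if pooled == 0:
        return 0.0, 0.0, 0.0
    replacement = 2 * min(only_a, only_b) / pooled
    richness = abs(only_a - only_b) / pooled
    return replacement + richness, replacement, richness


# Hypothetical species lists for a nearshore and an offshore site.
nearshore = {"sp1", "sp2", "sp3"}
offshore = {"sp3", "sp4"}
total, repl, rich = partition_beta(nearshore, offshore)
print(f"total={total:.2f} replacement={repl:.2f} richness={rich:.2f}")
```

    With these toy lists the replacement term is larger than the richness term, which is the qualitative pattern the abstract reports.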

  1. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins.  Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2...

  2. Transcriptome sequencing revealed significant alteration of cortical promoter usage and splicing in schizophrenia.

    Directory of Open Access Journals (Sweden)

    Jing Qin Wu

    Full Text Available While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDR<0.05). Both types of transcriptional isoforms were exemplified by reads aligned to the neurodevelopmentally significant doublecortin-like kinase 1 (DCLK1) gene. This study provided the first deep and un-biased analysis of schizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia.

  3. Comparative analysis of taxonomic, functional, and metabolic patterns of microbiomes from 14 full-scale biogas reactors by metagenomic sequencing and radioisotopic analysis.

    Science.gov (United States)

    Luo, Gang; Fotidis, Ioannis A; Angelidaki, Irini

    2016-01-01

    Biogas production is a very complex process due to the high complexity in diversity and interactions of the microorganisms mediating it, and only limited and diffuse knowledge exists about the variation of taxonomic and functional patterns of microbiomes across different biogas reactors, and their relationships with the metabolic patterns. The present study used metagenomic sequencing and radioisotopic analysis to assess the taxonomic, functional, and metabolic patterns of microbiomes from 14 full-scale biogas reactors operated under various conditions treating either sludge or manure. The results from metagenomic analysis showed that the dominant methanogenic pathway revealed by radioisotopic analysis was not always correlated with the taxonomic and functional compositions. It was found by radioisotopic experiments that the aceticlastic methanogenic pathway was dominant, while metagenomics analysis showed higher relative abundance of hydrogenotrophic methanogens. Principal coordinates analysis showed the sludge-based samples were clearly distinct from the manure-based samples for both taxonomic and functional patterns, and canonical correspondence analysis showed that both temperature and free ammonia were crucial environmental variables shaping the taxonomic and functional patterns. The study further showed that the overall patterns of functional genes were strongly correlated with overall patterns of taxonomic composition across different biogas reactors. A discrepancy was found between the metabolic patterns determined by metagenomic analysis and the metabolic pathways determined by radioisotopic analysis. In addition, a clear correlation between taxonomic and functional patterns was demonstrated for biogas reactors, and the environmental factors shaping both taxonomic and functional gene patterns were identified.

  4. Multivariate pattern classification reveals autonomic and experiential representations of discrete emotions.

    Science.gov (United States)

    Kragel, Philip A; Labar, Kevin S

    2013-08-01

    Defining the structural organization of emotions is a central unresolved question in affective science. In particular, the extent to which autonomic nervous system activity signifies distinct affective states remains controversial. Most prior research on this topic has used univariate statistical approaches in attempts to classify emotions from psychophysiological data. In the present study, electrodermal, cardiac, respiratory, and gastric activity, as well as self-report measures were taken from healthy subjects during the experience of fear, anger, sadness, surprise, contentment, and amusement in response to film and music clips. Information pertaining to affective states present in these response patterns was analyzed using multivariate pattern classification techniques. Overall accuracy for classifying distinct affective states was 58.0% for autonomic measures and 88.2% for self-report measures, both of which were significantly above chance. Further, examining the error distribution of classifiers revealed that the dimensions of valence and arousal selectively contributed to decoding emotional states from self-report, whereas a categorical configuration of affective space was evident in both self-report and autonomic measures. Taken together, these findings extend recent multivariate approaches to study emotion and indicate that pattern classification tools may improve upon univariate approaches to reveal the underlying structure of emotional experience and physiological expression. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  5. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    Science.gov (United States)

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed on the single-cell level. We found that major SCNAs were early events in cancer development and inherited steadily. Single-cell sequencing revealed mutations and SCNAs which were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of two treatment-naïve patients of the same molecular subtype are quite different. Our results suggest each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  6. A base composition analysis of natural patterns for the preprocessing of metagenome sequences.

    Science.gov (United States)

    Bonham-Carter, Oliver; Ali, Hesham; Bastola, Dhundy

    2013-01-01

    On the premise that sequence reads and contigs often exhibit the same kinds of base usage that are also observed in the sequences from which they are derived, we offer a base composition analysis tool. Our tool uses these natural patterns to determine relatedness across sequence data. We introduce spectrum sets (sets of motifs) which are permutations of bacterial restriction sites and the base composition analysis framework to measure their proportional content in sequence data. We suggest that this framework will increase the efficiency during the pre-processing stages of metagenome sequencing and assembly projects. Our method is able to differentiate organisms and their reads or contigs. The framework shows how to successfully determine the relatedness between these reads or contigs by comparison of base composition. In particular, we show that two types of organismal-sequence data are fundamentally different by analyzing their spectrum set motif proportions (coverage). By the application of one of the four possible spectrum sets, encompassing all known restriction sites, we provide the evidence to claim that each set has a different ability to differentiate sequence data. Furthermore, we show that the spectrum set selection having relevance to one organism, but not to the others of the data set, will greatly improve performance of sequence differentiation even if the fragment size of the read, contig or sequence is not lengthy. We show the proof of concept of our method by its application to ten trials of two or three freshly selected sequence fragments (reads and contigs) for each experiment across the six organisms of our set. Here we describe a novel and computationally effective pre-processing step for metagenome sequencing and assembly tasks. Furthermore, our base composition method has applications in phylogeny where it can be used to infer evolutionary distances between organisms based on the notion that related organisms often have much conserved code.
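    The idea of measuring how much of a sequence is covered by a spectrum set can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' tool: it assumes a spectrum set is the set of distinct orderings of the bases in one restriction site (here GATC), and it defines coverage as the fraction of positions hit by any motif. Both function names and the test sequence are hypothetical.

```python
from itertools import permutations


def spectrum_set(site):
    """All distinct orderings of the bases in a restriction site."""
    return {"".join(p) for p in permutations(site)}


def motif_coverage(sequence, motifs):
    """Fraction of positions in `sequence` covered by at least one motif."""
    sequence = sequence.upper()
    covered = [False] * len(sequence)
    for motif in motifs:
        start = sequence.find(motif)
        while start != -1:
            for i in range(start, start + len(motif)):
                covered[i] = True
            # Step by one so overlapping occurrences are also found.
            start = sequence.find(motif, start + 1)
    return sum(covered) / len(sequence) if sequence else 0.0


# Hypothetical read; GATC is used only as an example site.
motifs = spectrum_set("GATC")
coverage = motif_coverage("TTGATCTTTT", motifs)
print(len(motifs), coverage)
```

    Comparing such coverage values between reads or contigs is one simple way to picture the relatedness measure the abstract describes.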

  7. Levels of integration in cognitive control and sequence processing in the prefrontal cortex.

    Science.gov (United States)

    Bahlmann, Jörg; Korb, Franziska M; Gratton, Caterina; Friederici, Angela D

    2012-01-01

    Cognitive control is necessary to flexibly act in changing environments. Sequence processing is needed in language comprehension to build the syntactic structure in sentences. Functional imaging studies suggest that sequence processing engages the left ventrolateral prefrontal cortex (PFC). In contrast, cognitive control processes additionally recruit bilateral rostral lateral PFC regions. The present study aimed to investigate these two types of processes in one experimental paradigm. Sequence processing was manipulated using two different sequencing rules varying in complexity. Cognitive control was varied with different cue-sets that determined the choice of a sequencing rule. Univariate analyses revealed distinct PFC regions for the two types of processing (i.e. sequence processing: left ventrolateral PFC and cognitive control processing: bilateral dorsolateral and rostral PFC). Moreover, in a common brain network (including left lateral PFC and intraparietal sulcus) no interaction between sequence and cognitive control processing was observed. In contrast, a multivariate pattern analysis revealed an interaction of sequence and cognitive control processing, such that voxels in left lateral PFC and parietal cortex showed different tuning functions for tasks involving different sequencing and cognitive control demands. These results suggest that the difference between the process of rule selection (i.e. cognitive control) and the process of rule-based sequencing (i.e. sequence processing) finds its neuronal underpinnings in distinct activation patterns in lateral PFC. Moreover, the combination of rule selection and rule sequencing can shape the response of neurons in lateral PFC and parietal cortex.

  8. MicroRNAs in Amoebozoa: deep sequencing of the small RNA population in the social amoeba Dictyostelium discoideum reveals developmentally regulated microRNAs.

    Science.gov (United States)

    Avesson, Lotta; Reimegård, Johan; Wagner, E Gerhart H; Söderbom, Fredrik

    2012-10-01

    The RNA interference machinery has served as a guardian of eukaryotic genomes since the divergence from prokaryotes. Although the basic components have a shared origin, silencing pathways directed by small RNAs have evolved in diverse directions in different eukaryotic lineages. Micro (mi)RNAs regulate protein-coding genes and play vital roles in plants and animals, but less is known about their functions in other organisms. Here, we report, for the first time, deep sequencing of small RNAs from the social amoeba Dictyostelium discoideum. RNA from growing single-cell amoebae as well as from two multicellular developmental stages was sequenced. Computational analyses combined with experimental data reveal the expression of miRNAs, several of them exhibiting distinct expression patterns during development. To our knowledge, this is the first report of miRNAs in the Amoebozoa supergroup. We also show that overexpressed miRNA precursors generate miRNAs and, in most cases, miRNA* sequences, whose biogenesis is dependent on the Dicer-like protein DrnB, further supporting the presence of miRNAs in D. discoideum. In addition, we find miRNAs processed from hairpin structures originating from an intron as well as from a class of repetitive elements. We believe that these repetitive elements are sources for newly evolved miRNAs.

  9. Transcriptional analysis of the HeT-A retrotransposon in mutant and wild type stocks reveals high sequence variability at Drosophila telomeres and other unusual features

    Directory of Open Access Journals (Sweden)

    Piñeyro David

    2011-11-01

    Full Text Available Abstract Background Telomere replication in Drosophila depends on the transposition of a domesticated retroelement, the HeT-A retrotransposon. The sequence of the HeT-A retrotransposon changes rapidly resulting in differentiated subfamilies. This pattern of sequence change contrasts with the essential function with which the HeT-A is entrusted and brings about questions concerning the extent of sequence variability, the telomere contribution of different subfamilies, and whether wild type and mutant Drosophila stocks show different HeT-A scenarios. Results A detailed study on the variability of HeT-A reveals that both the level of variability and the number of subfamilies are higher than previously reported. Comparisons between GIII, a strain with longer telomeres, and its parental strain Oregon-R indicate that both strains have the same set of HeT-A subfamilies. Finally, the presence of a highly conserved splicing pattern only in its antisense transcripts indicates a putative regulatory, functional or structural role for the HeT-A RNA. Interestingly, our results also suggest that most HeT-A copies are actively expressed regardless of which telomere and where in the telomere they are located. Conclusions Our study demonstrates how the HeT-A sequence changes much faster than previously reported resulting in at least nine different subfamilies most of which could actively contribute to telomere extension in Drosophila. Interestingly, the only significant difference observed between Oregon-R and GIII resides in the nature and proportion of the antisense transcripts, suggesting a possible mechanism that would in part explain the longer telomeres of the GIII stock.

  10. Social patterns revealed through random matrix theory

    Science.gov (United States)

    Sarkar, Camellia; Jalan, Sarika

    2014-11-01

    Despite the tremendous advancements in the field of network theory, very few studies have taken into consideration the weights in interactions that emerge naturally in all real-world systems. Using random matrix analysis of a weighted social network, we demonstrate the profound impact of weights in interactions on emerging structural properties. The analysis reveals that randomness existing in a particular time frame affects the decisions of individuals, rendering them more freedom of choice in situations of financial security. While the structural organization of networks remains the same throughout all datasets, random matrix theory provides insight into the interaction pattern of individuals of the society in situations of crisis. It has also been contemplated that individual accountability in terms of weighted interactions remains a key to success unless segregation of tasks comes into play.

  11. Sequence exploration reveals information bias among molecular markers used in phylogenetic reconstruction for Colletotrichum species.

    Science.gov (United States)

    Rampersad, Sephra N; Hosein, Fazeeda N; Carrington, Christine Vf

    2014-01-01

    The Colletotrichum gloeosporioides species complex is among the most destructive fungal plant pathogens in the world; however, identification of isolates of quarantine importance to the intra-specific level is confounded by a number of factors that affect phylogenetic reconstruction. Information bias and quality parameters were investigated to determine whether nucleotide sequence alignments and phylogenetic trees accurately reflect the genetic diversity and phylogenetic relatedness of individuals. Sequence exploration of GAPDH, ACT, TUB2 and ITS markers indicated that the query sequences had different patterns of nucleotide substitution but were without evidence of base substitution saturation. Regions of high entropy were much more dispersed in the ACT and GAPDH marker alignments than for the ITS and TUB2 markers. A discernible bimodal gap in the genetic distance frequency histograms was produced for the ACT and GAPDH markers, which indicated successful separation of intra- and inter-specific sequences in the data set. Overall, analyses indicated clear differences in the ability of these markers to phylogenetically separate individuals to the intra-specific level, which coincided with information bias.
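    The abstract refers to regions of high entropy in the marker alignments without defining the measure. A common choice, assumed here, is the Shannon entropy of each alignment column. The sketch below uses a hypothetical four-sequence toy alignment and treats any gap character as just another symbol; it is not the authors' pipeline.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(aligned):
    """Per-column entropy for equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned)]


# Hypothetical toy alignment (four sequences, four columns).
aligned = ["ACGT", "ACGA", "ACTA", "ACTC"]
profile = alignment_entropy(aligned)
print(profile)
```

    Conserved columns score 0 bits and variable columns score higher, so plotting the profile along a marker shows whether variable sites are clustered or dispersed.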

  12. Deep sequencing of the oral microbiome reveals signatures of periodontal disease.

    Directory of Open Access Journals (Sweden)

    Bo Liu

    Full Text Available The oral microbiome, the complex ecosystem of microbes inhabiting the human mouth, harbors several thousands of bacterial types. The proliferation of pathogenic bacteria within the mouth gives rise to periodontitis, an inflammatory disease known to also constitute a risk factor for cardiovascular disease. While much is known about individual species associated with pathogenesis, the system-level mechanisms underlying the transition from health to disease are still poorly understood. Through the sequencing of the 16S rRNA gene and of whole community DNA we provide a glimpse at the global genetic, metabolic, and ecological changes associated with periodontitis in 15 subgingival plaque samples, four from each of two periodontitis patients, and the remaining samples from three healthy individuals. We also demonstrate the power of whole-metagenome sequencing approaches in characterizing the genomes of key players in the oral microbiome, including an unculturable TM7 organism. We reveal the disease microbiome to be enriched in virulence factors, and adapted to a parasitic lifestyle that takes advantage of the disrupted host homeostasis. Furthermore, diseased samples share a common structure that was not found in completely healthy samples, suggesting that the disease state may occupy a narrow region within the space of possible configurations of the oral microbiome. Our pilot study demonstrates the power of high-throughput sequencing as a tool for understanding the role of the oral microbiome in periodontal disease. Despite a modest level of sequencing (~2 lanes Illumina 76 bp PE) and high human DNA contamination (up to ~90%), we were able to partially reconstruct several oral microbes and to preliminarily characterize some systems-level differences between the healthy and diseased oral microbiomes.

  13. RNA-Sequencing Reveals Unique Transcriptional Signatures of Running and Running-Independent Environmental Enrichment in the Adult Mouse Dentate Gyrus

    Directory of Open Access Journals (Sweden)

    Catherine-Alexandra Grégoire

    2018-04-01

    Full Text Available Environmental enrichment (EE) is a powerful stimulus of brain plasticity and is among the most accessible treatment options for brain disease. In rodents, EE is modeled using multi-factorial environments that include running, social interactions, and/or complex surroundings. Here, we show that running and running-independent EE differentially affect the hippocampal dentate gyrus (DG), a brain region critical for learning and memory. Outbred male CD1 mice housed individually with a voluntary running disk showed improved spatial memory in the radial arm maze compared to individually- or socially-housed mice with a locked disk. We therefore used RNA sequencing to perform an unbiased interrogation of DG gene expression in mice exposed to either a voluntary running disk (RUN), a locked disk (LD), or a locked disk plus social enrichment and tunnels [i.e., a running-independent complex environment (CE)]. RNA sequencing revealed that RUN and CE mice showed distinct, non-overlapping patterns of transcriptomic changes versus the LD control. Bio-informatics uncovered that the RUN and CE environments modulate separate transcriptional networks, biological processes, cellular compartments and molecular pathways, with RUN preferentially regulating synaptic and growth-related pathways and CE altering extracellular matrix-related functions. Within the RUN group, high-distance runners also showed selective stress pathway alterations that correlated with a drastic decline in overall transcriptional changes, suggesting that excess running causes a stress-induced suppression of running’s genetic effects. Our findings reveal stimulus-dependent transcriptional signatures of EE on the DG, and provide a resource for generating unbiased, data-driven hypotheses for novel mediators of EE-induced cognitive changes.

  14. Molecular evolution and diversification of snake toxin genes, revealed by analysis of intron sequences.

    Science.gov (United States)

    Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T

    2003-08-14

    The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.

  15. Distribution patterns of firearm discharge residues as revealed by neutron activation analysis

    International Nuclear Information System (INIS)

    Pillay, K.K.S.; Driscoll, D.C.; Jester, W.A.

    1975-01-01

    A systematic investigation using a variety of handguns has revealed the existence of distinguishable distribution patterns of firearm discharge residues on surfaces below the flight path of a bullet. The residues are identifiable even at distances of 12 meters from the gun using nondestructive neutron activation analysis. The results of these investigations show that the distribution pattern for a gun is reproducible using similar ammunition and that there exist two distinct regions to the patterns developed between the firearm and the target: one with respect to the position of the gun and the other in the vicinity of the target. The judicious applications of these findings could be of significant value in criminal investigations. (T.G.)

  16. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Full Text Available Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV, Petuvirus genus). ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  17. A Filtering Method to Reveal Crystalline Patterns from Atom Probe Microscopy Desorption Maps

    Science.gov (United States)

    2016-03-26

    reveal crystalline patterns from atom probe microscopy desorption maps Lan Yao Department of Materials Science and Engineering, University of Michigan, Ann...reveal the crystallographic information present in Atom Probe Microscopy (APM) data is presented. The method filters atoms based on the time difference...between their evaporation and the evaporation of the previous atom. Since this time difference correlates with the location and the local structure of

  18. Sequence-based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families.

    Directory of Open Access Journals (Sweden)

    Janine Maimanakos

    2016-08-01

    Full Text Available Arylmalonate-Decarboxylases (AMDases, EC 4.1.1.76) are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods, 58 novel AMDase candidate genes could be identified in this work. AMDases with the conserved sequence pattern of Bordetella bronchiseptica’s prototype appeared to be limited to the classes of Alpha-, Beta- and Gammaproteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the TTT family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99%) of the (R)-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originating from soil, and AmdP from Polymorphum gilvum, found by a database search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes.

  19. Genome sequence of the pattern forming Paenibacillus vortex bacterium reveals potential for thriving in complex environments

    NARCIS (Netherlands)

    Sirota-Madi, A.; Olender, T.; Helman, Y.; Ingham, C.; Brainis, I.; Roth, D.; Hagi, E.; Brodsky, L.; Leshkowitz, D.; Galatenko, V.; Nikolaev, V.; Mugasimangalam, R.C.; Bransburg-Zabary, S.; Gutnick, D.L.; Lancet, D.; Ben-Jacob, E.

    2010-01-01

    Background: The pattern-forming bacterium Paenibacillus vortex is notable for its advanced social behavior, which is reflected in development of colonies with highly intricate architectures. Prior to this study, only two other Paenibacillus species (Paenibacillus sp. JDR-2 and Paenibacillus larvae)

  20. First full-length genome sequence of the polerovirus luffa aphid-borne yellows virus (LABYV) reveals the presence of at least two consensus sequences in an isolate from Thailand.

    Science.gov (United States)

    Knierim, Dennis; Maiss, Edgar; Kenyon, Lawrence; Winter, Stephan; Menzel, Wulf

    2015-10-01

    Luffa aphid-borne yellows virus (LABYV) was proposed as the name for a previously undescribed polerovirus based on partial genome sequences obtained from samples of cucurbit plants collected in Thailand between 2008 and 2013. In this study, we determined the first full-length genome sequence of LABYV. Based on phylogenetic analysis and genome properties, it is clear that this virus represents a distinct species in the genus Polerovirus. Analysis of sequences from sample TH24, which was collected in 2010 from a luffa plant in Thailand, reveals the presence of two different full-length genome consensus sequences.

  1. How Next-Generation Sequencing Has Aided Our Understanding of the Sequence Composition and Origin of B Chromosomes

    Directory of Open Access Journals (Sweden)

    Alevtina Ruban

    2017-10-01

    Full Text Available Accessory, supernumerary, or, most simply, B chromosomes are found in many eukaryotic karyotypes. These small chromosomes do not follow the usual pattern of segregation, but rather are transmitted at a higher than expected frequency. As is increasingly being demonstrated by next-generation sequencing (NGS), their structure comprises fragments of standard (A) chromosomes, although in some plant species, their sequence also includes contributions from organellar genomes. Transcriptomic analyses of various animal and plant species have revealed that, contrary to what used to be the common belief, some of the B chromosome DNA is protein-encoding. This review summarizes the progress in understanding B chromosome biology enabled by the application of next-generation sequencing technology and state-of-the-art bioinformatics. In particular, a contrast is drawn between a direct sequencing approach and a strategy based on comparative genomics as alternative routes that can be taken towards the identification of B chromosome sequences.

  2. Molecular cloning, sequence characterization and expression pattern of Rab18 gene from watermelon (Citrullus lanatus).

    Science.gov (United States)

    Xinli, Xiao; Lei, Peng

    2015-03-04

    The complete mRNA sequence of watermelon Rab18 gene was amplified through the rapid amplification of cDNA ends (RACE) method. The full-length mRNA was 1010 bp containing a 645 bp open reading frame, which encodes a protein of 214 amino acids. Sequence analysis revealed that watermelon Rab18 protein shares high homology with the Rab18 of cucumber (99%), muskmelon (98%), Morus notabilis (90%), tomato (89%), wine grape (89%) and potato (88%). Phylogenetic analysis revealed that watermelon Rab18 gene has a closer genetic relationship with Rab18 gene of cucumber and muskmelon. Tissue expression profile analysis indicated that watermelon Rab18 gene was highly expressed in root, stem and leaf, moderately expressed in flower and weakly expressed in fruit.
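
    The reported figures (a 645 bp open reading frame encoding 214 amino acids) are consistent with simple codon arithmetic. The sketch below is an illustration only; it assumes the stated ORF length includes the stop codon, which the abstract does not say explicitly, and the function name is hypothetical.

```python
def protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length (in bp) includes the stop codon.

    Assumption: the stop codon is counted in orf_bp; it encodes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3   # 645 bp -> 215 codons
    return codons - 1      # drop the stop codon


print(protein_length(645))
```

    Under that assumption, 645 bp corresponds to 215 codons, one of which is the stop codon, leaving 214 residues.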

  3. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Energy Technology Data Exchange (ETDEWEB)

    Shi, CY; Yang, H; Wei, CL; Yu, O; Zhang, ZZ; Sun, J; Wan, XC

    2011-01-01

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real
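
    The abstract reports an N50 of 506 bp without defining the statistic. As a reminder of the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases), here is a minimal sketch. The function name and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence taken is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (made-up lengths): total = 1500, half = 750,
# and 500 + 400 = 900 is the first running sum to reach it.
print(n50([100, 200, 300, 400, 500]))
```

    The unigene counts in the abstract can be checked the same way: 788 contig clusters plus 126,306 singletons gives the stated 127,094 unigenes.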

  4. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Directory of Open Access Journals (Sweden)

    Chen Qi

    2011-02-01

    Full Text Available Background: Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results: Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were

  5. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  6. Capacity for patterns and sequences in Kanerva's SDM as compared to other associative memory models

    Science.gov (United States)

    Keeler, James D.

    1987-01-01

    The information capacity of Kanerva's Sparse Distributed Memory (SDM) and Hopfield-type neural networks is investigated. Under the approximations used, it is shown that the total information stored in these systems is proportional to the number of connections in the network. The proportionality constant is the same for the SDM and Hopfield-type models, independent of the particular model or the order of the model. The approximations are checked numerically. This same analysis can be used to show that the SDM can store sequences of spatiotemporal patterns, and the addition of time-delayed connections allows the retrieval of context-dependent temporal patterns. A minor modification of the SDM can be used to store correlated patterns.

  7. TreeNetViz: revealing patterns of networks over tree structures.

    Science.gov (United States)

    Gou, Liang; Zhang, Xiaolong Luke

    2011-12-01

    Network data often contain important attributes from various dimensions such as social affiliations and areas of expertise in a social network. If such attributes exhibit a tree structure, visualizing a compound graph consisting of tree and network structures becomes complicated. How to visually reveal patterns of a network over a tree has not been fully studied. In this paper, we propose a compound graph model, TreeNet, to support visualization and analysis of a network at multiple levels of aggregation over a tree. We also present a visualization design, TreeNetViz, to offer the multiscale and cross-scale exploration and interaction of a TreeNet graph. TreeNetViz uses a Radial, Space-Filling (RSF) visualization to represent the tree structure, a circle layout with novel optimization to show aggregated networks derived from TreeNet, and an edge bundling technique to reduce visual complexity. Our circular layout algorithm reduces both total edge-crossings and edge length and also considers hierarchical structure constraints and edge weight in a TreeNet graph. These experiments illustrate that the algorithm can reduce visual cluttering in TreeNet graphs. Our case study also shows that TreeNetViz has the potential to support the analysis of a compound graph by revealing multiscale and cross-scale network patterns. © 2011 IEEE

  8. Characteristics of MHC class I genes in house sparrows Passer domesticus as revealed by long cDNA transcripts and amplicon sequencing.

    Science.gov (United States)

    Karlsson, Maria; Westerdahl, Helena

    2013-08-01

    In birds the major histocompatibility complex (MHC) organization differs both among and within orders; chickens Gallus gallus of the order Galliformes have a simple arrangement, while many songbirds of the order Passeriformes have a more complex arrangement with larger numbers of MHC class I and II genes. Chicken MHC genes are found at two independent loci, classical MHC-B and non-classical MHC-Y, whereas non-classical MHC genes are yet to be verified in passerines. Here we characterize MHC class I transcripts (α1 to α3 domain) and perform amplicon sequencing using a next-generation sequencing technique on exon 3 from house sparrow Passer domesticus (a passerine) families. Then we use phylogenetic, selection, and segregation analyses to gain a better understanding of the MHC class I organization. Trees based on the α1 and α2 domain revealed a distinct cluster with short terminal branches for transcripts with a 6-bp deletion. Interestingly, this cluster was not seen in the tree based on the α3 domain. 21 exon 3 sequences were verified in a single individual and the average numbers within an individual were nine and five for sequences with and without a 6-bp deletion, respectively. All individuals had exon 3 sequences with and without a 6-bp deletion. The sequences with a 6-bp deletion have many characteristics in common with non-classical MHC, e.g., highly conserved amino acid positions were substituted compared with the other alleles, low nucleotide diversity and just a single site was subject to positive selection. However, these alleles also have characteristics that suggest they could be classical, e.g., complete linkage and absence of a distinct cluster in a tree based on the α3 domain. Thus, we cannot determine for certain whether or not the alleles with a 6-bp deletion are non-classical based on our present data. Further analyses on segregation patterns of these alleles in combination with dating the 6-bp deletion through MHC characterization across the

  9. A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing.

    Science.gov (United States)

    Chen, Shi-Yi; Deng, Feilong; Jia, Xianbo; Li, Cao; Lai, Song-Jia

    2017-08-09

    It is widely acknowledged that transcriptional diversity largely contributes to biological regulation in eukaryotes. Since the advent of second-generation sequencing technologies, a large number of RNA sequencing studies have considerably improved our understanding of transcriptome complexity. However, obtaining full-length transcripts remains a major challenge because of difficulties in short read-based assembly. In the present study we employ PacBio single-molecule long-read sequencing technology for whole-transcriptome profiling in rabbit (Oryctolagus cuniculus). In total, we obtain 36,186 high-confidence transcripts from 14,474 genic loci, among which more than 23% of genic loci and 66% of isoforms have not yet been annotated within the current reference genome. Furthermore, about 17% of transcripts are computationally revealed to be non-coding RNAs. Up to 24,797 alternative splicing (AS) and 11,184 alternative polyadenylation (APA) events, respectively, are detected within this de novo constructed transcriptome. The results provide a comprehensive set of reference transcripts and hence contribute to the improved annotation of the rabbit genome.

  10. Sequence Dependencies of DNA Deformability and Hydration in the Minor Groove

    Science.gov (United States)

    Yonetani, Yoshiteru; Kono, Hidetoshi

    2009-01-01

    DNA deformability and hydration are both sequence-dependent and are essential in specific DNA sequence recognition by proteins. However, the relationship between the two is not well understood. Here, systematic molecular dynamics simulations of 136 DNA sequences that differ from each other in their central tetramer revealed that sequence dependence of hydration is clearly correlated with that of deformability. We show that this correlation can be illustrated by four typical cases. Most rigid basepair steps are highly likely to form an ordered hydration pattern composed of one water molecule forming a bridge between the bases of distinct strands, but a few exceptions favor another ordered hydration composed of two water molecules forming such a bridge. Steps with medium deformability can display both of these hydration patterns with frequent transition. Highly flexible steps do not have any stable hydration pattern. A detailed picture of this correlation demonstrates that motions of hydration water molecules and DNA bases are tightly coupled with each other at the atomic level. These results contribute to our understanding of the entropic contribution from water molecules in protein or drug binding and could be applied for the purpose of predicting binding sites. PMID:19686662
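
    The figure of 136 sequences matches the number of distinct tetramers in double-stranded DNA once a tetramer and its reverse complement are treated as the same step. The abstract does not spell out this counting argument, so the sketch below is offered as a plausible reading rather than the authors' stated procedure; the function names are hypothetical.

```python
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_tetramers():
    """Enumerate tetramers, keeping one representative per reverse-complement pair."""
    representatives = set()
    for bases in product("ACGT", repeat=4):
        tetramer = "".join(bases)
        # A tetramer on one strand is its reverse complement on the other,
        # so keep the alphabetically smaller of the two as the canonical form.
        representatives.add(min(tetramer, reverse_complement(tetramer)))
    return sorted(representatives)


print(len(unique_tetramers()))
```

    Of the 4^4 = 256 tetramers, 16 are their own reverse complement, so the count is (256 - 16) / 2 + 16 = 136.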

  11. High-throughput sequencing of the B-cell receptor in African Burkitt lymphoma reveals clues to pathogenesis.

    Science.gov (United States)

    Lombardo, Katharine A; Coffey, David G; Morales, Alicia J; Carlson, Christopher S; Towlerton, Andrea M H; Gerdts, Sarah E; Nkrumah, Francis K; Neequaye, Janet; Biggar, Robert J; Orem, Jackson; Casper, Corey; Mbulaiteye, Sam M; Bhatia, Kishor G; Warren, Edus H

    2017-03-28

    Burkitt lymphoma (BL), the most common pediatric cancer in sub-Saharan Africa, is a malignancy of antigen-experienced B lymphocytes. High-throughput sequencing (HTS) of the immunoglobulin heavy (IGH) and light chain (IGK/IGL) loci was performed on genomic DNA from 51 primary BL tumors: 19 from Uganda and 32 from Ghana. Reverse transcription polymerase chain reaction analysis and tumor RNA sequencing (RNAseq) were performed on the Ugandan tumors to confirm and extend the findings from the HTS of tumor DNA. Clonal IGH and IGK/IGL rearrangements were identified in 41 and 46 tumors, respectively. Evidence for rearrangement of the second IGH allele was observed in only 6 of 41 tumor samples with a clonal IGH rearrangement, suggesting that the normal process of biallelic IGHD to IGHJ diversity-joining (DJ) rearrangement is often disrupted in BL progenitor cells. Most tumors, including those with a sole dominant, nonexpressed DJ rearrangement, contained many IGH and IGK/IGL sequences that differed from the dominant rearrangement by < 10 nucleotides, suggesting that the target of ongoing mutagenesis of these loci in BL tumor cells is not limited to expressed alleles. IGHV usage in both BL tumor cohorts revealed enrichment for IGHV genes that are infrequently used in memory B cells from healthy subjects. Analysis of publicly available DNA sequencing and RNAseq data revealed that these same IGHV genes were overrepresented in dominant tumor-associated IGH rearrangements in several independent BL tumor cohorts. These data suggest that BL derives from an abnormal B-cell progenitor and that aberrant mutational processes are active on the immunoglobulin loci in BL cells.

  12. Homozygosity mapping and targeted Sanger sequencing reveal genetic defects underlying inherited retinal disease in families from Pakistan.

    Directory of Open Access Journals (Sweden)

    Maleeha Maria

    Full Text Available Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost- and time-effective manner to screen frequent population-specific genetic variations associated with diseases such as inherited retinal disease (IRD). We genetically screened 13 families from a cohort of 81 Pakistani IRD families diagnosed with Leber congenital amaurosis (LCA), retinitis pigmentosa (RP), congenital stationary night blindness (CSNB), or cone dystrophy (CD). We employed genome-wide single nucleotide polymorphism (SNP) array analysis to identify homozygous regions shared by affected individuals and performed Sanger sequencing of IRD-associated genes located in the sizeable homozygous regions. In addition, based on population-specific mutation data we performed targeted Sanger sequencing (TSS) of frequent variants in AIPL1, CEP290, CRB1, GUCY2D, LCA5, RPGRIP1 and TULP1, in probands from 28 LCA families. Homozygosity mapping and Sanger sequencing of IRD-associated genes revealed the underlying mutations in 10 families. TSS revealed causative variants in three families. In these 13 families four novel mutations were identified in CNGA1, CNGB1, GUCY2D, and RPGRIP1. Homozygosity mapping and TSS revealed the underlying genetic cause in 13 IRD families, which is useful for genetic counseling as well as therapeutic interventions that are likely to become available in the near future.

  13. Investigation of Double-Band Electrophoretic Pattern of ITS-rDNA Region in Iranian Isolates of Leishmania Tropica

    Directory of Open Access Journals (Sweden)

    MA Ghatee

    2013-06-01

    Full Text Available Background: Leishmania tropica is a genetically divergent species. Amplification of the entire internal transcribed spacer (ITS) region of L. tropica isolates obtained from Bam district, a well-known focus of anthroponotic cutaneous leishmaniasis (ACL) in Iran, revealed a double-band pattern in agarose gel electrophoresis. This study explains how this pattern occurs. Methods: Twenty-seven L. tropica smear preparations were collected from Bam district, southeast Iran, and eight L. major and one L. infantum smear preparations were gathered from Shiraz, southwest Iran. Furthermore, one L. major and one L. infantum cultured standard strains were tested using entire ITS-PCR to survey their electrophoretic pattern. The ITS sequences of L. tropica, L. major, and L. infantum already deposited in GenBank were analyzed. Analysis of GenBank sequences of L. tropica revealed two groups of sequences based on length, one group having a 100 bp gap. Therefore, a new reverse primer, LITS-MG, was designed to exclude this gap in PCR products. Results: Whole ITS fragment amplification resulted in a double-band pattern in all L. tropica cases, while a sharp single band was observed for L. infantum and L. major isolates. This result corresponded to the result obtained from in silico analysis of GenBank sequences. As expected, use of the LITS-MG primer resulted in a single band including ITS1, 5.8S and partial ITS2 product for L. tropica, which is appropriate for subsequent molecular studies such as sequencing or restriction analysis. Conclusion: Sequence analysis of GenBank L. tropica sequences and subsequent practical laboratory tests revealed at least two alleles in L. tropica, which were confirmed in Bam isolates. This particular double-band pattern is due to a 100 bp fragment difference within ITS-rDNA alleles

  14. Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus.

    Directory of Open Access Journals (Sweden)

    Kui Lin

    2014-01-01

    Full Text Available Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbp, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to their postulated sister Dikarya.

  15. Significant strain accumulation between the deformation front and landward out-of-sequence thrusts in accretionary wedge of SW Taiwan revealed by cGPS and SAR interferometry

    Science.gov (United States)

    Tsai, M. C.

    2017-12-01

    High strain accumulation across the fold-and-thrust belt in southwestern Taiwan is revealed by continuous GPS (cGPS) and SAR interferometry. This high strain is generally accommodated by the major active structures in the fold-and-thrust belt of the Western Foothills in SW Taiwan, which connects to the accretionary wedge in the incipient arc-continent collision zone. The active structures across the zone of high strain accumulation include the deformation front around the Tainan Tableland and the Hochiali, Hsiaokangshan, Fangshan and Chishan faults. Among these active structures, the deformation pattern revealed by cGPS and SAR interferometry suggests that the Fangshan transfer fault may be a left-lateral fault zone with a thrust component, accommodating the westward differential motion of the thrust sheets on both sides of the fault. In addition, the Chishan fault is connected to the splay fault bordering the lower slope and upper slope of the accretionary wedge, and could be the major seismogenic fault and an out-of-sequence thrust fault in SW Taiwan. Large earthquakes resulting from the reactivation of out-of-sequence thrusts have been observed along the Nankai accretionary wedge; thus the assessment of the major seismogenic structures by strain accumulation between the frontal décollement and out-of-sequence thrusts is a crucial topic. According to the background seismicity, low seismicity and mid-crust to mantle events are observed inland and in the lower- and upper-slope domains offshore SW Taiwan, which rheologically implies that the upper crust of the accretionary wedge is more or less aseismic. This result may suggest that the excess fluid pressure from the accretionary wedge has not only significantly weakened the prism materials and the major fault zones, but also causes the accretionary wedge to extend landward, which is why low seismicity is observed in the SW Taiwan area. Key words: Continuous GPS, SAR interferometry, strain rate, out-of-sequence thrust.

  16. An investigation of application of the golden ratio and Fibonacci sequence in fashion design and pattern making

    Science.gov (United States)

    Kazlacheva, Z. I.

    2017-10-01

    The Golden ratio and Fibonacci sequence are used as proportions in design as symbols of beauty and harmony. That symbolism is a result of the strong connections in their mathematical nature. The Golden section is a number, denoted by the Greek letter φ, which is found by dividing a line into two parts such that the longer part divided by the smaller part is equal to the whole length divided by the longer part. The Fibonacci sequence is a series of numbers in which every number is equal to the sum of the two numbers before it. The main aim of the paper is an investigation of the application of proportions based on the Golden ratio and Fibonacci sequence in the fashion design and pattern making of ladies’ clothing. Based on the study it may be concluded that in fashion design and pattern making the Golden ratio and Fibonacci sequence can be used to create beautiful and harmonic forms, either directly or with the help of geometrical figures. In direct use, the Golden and Fibonacci proportions can be applied in one and the same direction or in different directions. In the application with the help of geometrical shapes, the Golden and Fibonacci figures combine proportioning and form creation. The Golden and Fibonacci shapes can be used directly as forms or as frames for creating the forms of elements and pieces. They can be applied in different directions and locations relative to the bodice. The Golden section and Fibonacci sequence can combine proportions with other principles of design such as symmetry, rhythm, etc.
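
    The mathematical connection the abstract alludes to can be made concrete: the ratio of consecutive Fibonacci numbers approaches φ = (1 + √5)/2, the positive solution of (a + b)/a = a/b. The sketch below is a generic illustration of that standard result, not code from the paper; the function name and the choice of 15 terms are arbitrary.

```python
import math

# Golden ratio: the positive root of phi**2 = phi + 1.
PHI = (1 + math.sqrt(5)) / 2


def fibonacci(n: int) -> list[int]:
    """Return the first n Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    sequence = [1, 1]
    while len(sequence) < n:
        # Each term is the sum of the two terms before it.
        sequence.append(sequence[-1] + sequence[-2])
    return sequence[:n]


fib = fibonacci(15)
# Ratios of consecutive terms: 1/1, 2/1, 3/2, 5/3, ...
ratios = [later / earlier for earlier, later in zip(fib, fib[1:])]

for earlier, later, ratio in zip(fib, fib[1:], ratios):
    print(f"{later}/{earlier} = {ratio:.6f}  (difference from phi: {abs(ratio - PHI):.6f})")
```

    This is why a pair of adjacent Fibonacci measurements, for example 5 and 8 or 8 and 13, can stand in for the Golden proportion when whole-number dimensions are convenient.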

  17. Genome-Wide Methylome Analyses Reveal Novel Epigenetic Regulation Patterns in Schizophrenia and Bipolar Disorder

    Science.gov (United States)

    Li, Yongsheng; Camarillo, Cynthia; Xu, Juan; Arana, Tania Bedard; Xiao, Yun; Zhao, Zheng; Chen, Hong; Ramirez, Mercedes; Zavala, Juan; Escamilla, Michael A.; Armas, Regina; Mendoza, Ricardo; Ontiveros, Alfonso; Nicolini, Humberto; Jerez Magaña, Alvaro Antonio; Rubin, Lewis P.; Li, Xia; Xu, Chun

    2015-01-01

    Schizophrenia (SZ) and bipolar disorder (BP) are complex genetic disorders. Their appearance is also likely informed by as yet only partially described epigenetic contributions. Using a sequencing-based method for genome-wide analysis, we quantitatively compared the blood DNA methylation landscapes in SZ and BP subjects to control, both in an understudied population, Hispanics along the US-Mexico border. Remarkably, we identified thousands of differentially methylated regions for SZ and BP preferentially located in promoters, 3′-UTRs and 5′-UTRs of genes. Distinct patterns of aberrant methylation of promoter sequences were located surrounding transcription start sites. In these instances, aberrant methylation occurred in CpG islands (CGIs), in flanking regions, and in CGI-sparse promoters. Pathway analysis of genes displaying these distinct aberrant promoter methylation patterns showed enhancement of epigenetic changes in numerous genes previously related to psychiatric disorders and neurodevelopment. Integration of gene expression data further suggests that in SZ aberrant promoter methylation is significantly associated with altered gene transcription. In particular, we found significant associations between (1) promoter CGI hypermethylation and gene repression and (2) CGI 3′-shore hypomethylation and increased gene expression. Finally, we constructed a specific methylation analysis platform that facilitates viewing and comparing aberrant genome methylation in human neuropsychiatric disorders. PMID:25734057

  18. Time-Resolved Transposon Insertion Sequencing Reveals Genome-Wide Fitness Dynamics during Infection.

    Science.gov (United States)

    Yang, Guanhua; Billings, Gabriel; Hubbard, Troy P; Park, Joseph S; Yin Leung, Ka; Liu, Qin; Davis, Brigid M; Zhang, Yuanxing; Wang, Qiyao; Waldor, Matthew K

    2017-10-03

    Transposon insertion sequencing (TIS) is a powerful high-throughput genetic technique that is transforming functional genomics in prokaryotes, because it enables genome-wide mapping of the determinants of fitness. However, current approaches for analyzing TIS data assume that selective pressures are constant over time and thus do not yield information regarding changes in the genetic requirements for growth in dynamic environments (e.g., during infection). Here, we describe structured analysis of TIS data collected as a time series, termed pattern analysis of conditional essentiality (PACE). From a temporal series of TIS data, PACE derives a quantitative assessment of each mutant's fitness over the course of an experiment and identifies mutants with related fitness profiles. In so doing, PACE circumvents major limitations of existing methodologies, specifically the need for artificial effect size thresholds and enumeration of bacterial population expansion. We used PACE to analyze TIS samples of Edwardsiella piscicida (a fish pathogen) collected over a 2-week infection period from a natural host (the flatfish turbot). PACE uncovered more genes that affect E. piscicida's fitness in vivo than were detected using a cutoff at a terminal sampling point, and it identified subpopulations of mutants with distinct fitness profiles, one of which informed the design of new live vaccine candidates. Overall, PACE enables efficient mining of time series TIS data and enhances the power and sensitivity of TIS-based analyses. IMPORTANCE Transposon insertion sequencing (TIS) enables genome-wide mapping of the genetic determinants of fitness, typically based on observations at a single sampling point. Here, we move beyond analysis of endpoint TIS data to create a framework for analysis of time series TIS data, termed pattern analysis of conditional essentiality (PACE). We applied PACE to identify genes that contribute to colonization of a natural host by the fish pathogen

  19. Sequence analysis of chromosome 1 revealed different selection patterns between Chinese wild mice and laboratory strains.

    Science.gov (United States)

    Xu, Fuyi; Hu, Shixian; Chao, Tianzhu; Wang, Maochun; Li, Kai; Zhou, Yuxun; Xu, Hongyan; Xiao, Junhua

    2017-10-01

    Both natural and artificial selection play a critical role in animals' adaptation to the environment. Detection of the signature of selection in genomic regions can provide insights for understanding the function of specific phenotypes. It is generally assumed that laboratory mice experience intense artificial selection, whereas wild mice experience more natural selection. However, the differences in selection signatures in the mouse genome and in the underlying genes between wild and laboratory mice remain unclear. In this study, we used two mouse populations, chromosome 1 (Chr 1) substitution lines (C1SLs) derived from Chinese wild mice and mouse genome project (MGP) sequenced inbred strains, and two selection detection statistics, Fst and Tajima's D, to identify the signatures of selection on Chr 1. For the differentiation between the C1SLs and MGP, 110 candidate selection regions containing 47 protein coding genes were detected. A total of 149 selection regions encompassing 7.215 Mb were identified in the C1SLs by the Tajima's D approach. For the MGP, we identified nearly twice as many selection regions (243) as in the C1SLs, accounting for 13.27 Mb of Chr 1 sequence. Through functional annotation, we identified several biological processes with significant enrichment, including seven genes in the olfactory transduction pathway. In addition, we searched the phenotypes associated with the 47 candidate selection genes identified by Fst. These genes were involved in behavior, growth or body weight, mortality or aging, and immune systems, which aligns well with the phenotypic differences between wild and laboratory mice. Therefore, the findings should be helpful for understanding the phenotypic differences between wild and laboratory mice and for applying this new mouse resource (C1SLs) in further genetics studies.
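
    For readers unfamiliar with Tajima's D, the statistic compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. The sketch below uses the standard constants from Tajima (1989); it is not the authors' pipeline. The function name tajimas_d is hypothetical, the input is assumed to be a small list of equal-length aligned sequences with no missing data, and windowing along Chr 1 is not shown.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length sequences (illustrative sketch).

    Returns None when there are no segregating sites, because D is undefined.
    """
    n = len(sequences)
    if n < 4:
        # For n = 2 or 3 the variance term below is zero, so D cannot be computed.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one allele.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of differences over all pairs of sequences.
    pair_diffs = [
        sum(x != y for x, y in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    ]
    pi = sum(pair_diffs) / len(pair_diffs)

    # Sample-size constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

    In this sketch, a sample dominated by rare variants gives a negative value and a sample with variants at intermediate frequency gives a positive value. How the study thresholded D to call selection regions is not stated in the abstract.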

  20. Whole-Exome Sequencing Reveals Clinically Relevant Variants in Family Affected with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Jiaxiu Zhou

    2016-10-01

    Full Text Available Chromosomal microarray (CMA) has been suggested as a first-tier clinical diagnostic test for ASD. High-throughput sequencing (HTS) has associated hundreds of genes with ASD. Whole Exome Sequencing (WES) was used in combination with CMA to identify clinically relevant ASD variants. In prior work, trio-based (father, mother, and proband) WGS (Whole Genome Sequencing) was used to reveal clinically relevant de novo, or inherited, rare variants in half (16/32) of the ASD families in which all probands had normal, or VOUS (Variant of Uncertain Clinical Significance), CMA results. In this study, after CMA screening for chromosome structural abnormalities in a proband affected with ASD, WES was performed on the patient and parents. Some rare de novo and inherited variants were detected using trio-based bioinformatics analysis. ASD variants were ranked by SFARI Gene score, HPO (human phenotype ontology), protein function damage, and manual searching of PubMed. Sanger sequencing was used to validate some candidate variants in family members. A de novo homozygous mutation in SPG11 (p.C209F) and two inherited compound-heterozygous mutations in SCN9A (p.Q10R and p.R1893H) and BEST1 (p.A135V and p.A297V) were confirmed. Heterozygous mutations in TSC1 (p.S487C) and SHANK2 (p.Arg569His) inherited from the mother were also confirmed.

  1. Atypical case of Wolfram syndrome revealed through targeted exome sequencing in a patient with suspected mitochondrial disease.

    Science.gov (United States)

    Lieber, Daniel S; Vafai, Scott B; Horton, Laura C; Slate, Nancy G; Liu, Shangtao; Borowsky, Mark L; Calvo, Sarah E; Schmahmann, Jeremy D; Mootha, Vamsi K

    2012-01-06

    Mitochondrial diseases comprise a diverse set of clinical disorders that affect multiple organ systems with varying severity and age of onset. Due to their clinical and genetic heterogeneity, these diseases are difficult to diagnose. We have developed a targeted exome sequencing approach to improve our ability to properly diagnose mitochondrial diseases and apply it here to an individual patient. Our method targets mitochondrial DNA (mtDNA) and the exons of 1,600 nuclear genes involved in mitochondrial biology or Mendelian disorders with multi-system phenotypes, thereby allowing for simultaneous evaluation of multiple disease loci. Targeted exome sequencing was performed on a patient initially suspected to have a mitochondrial disorder. The patient presented with diabetes mellitus, diffuse brain atrophy, autonomic neuropathy, optic nerve atrophy, and a severe amnestic syndrome. Further work-up revealed multiple heteroplasmic mtDNA deletions as well as profound thiamine deficiency without a clear nutritional cause. Targeted exome sequencing revealed a homozygous c.1672C > T (p.R558C) missense mutation in exon 8 of WFS1 that has previously been reported in a patient with Wolfram syndrome. This case demonstrates how clinical application of next-generation sequencing technology can enhance the diagnosis of patients suspected to have rare genetic disorders. Furthermore, the finding of unexplained thiamine deficiency in a patient with Wolfram syndrome suggests a potential link between WFS1 biology and thiamine metabolism that has implications for the clinical management of Wolfram syndrome patients.

  2. Atypical case of Wolfram syndrome revealed through targeted exome sequencing in a patient with suspected mitochondrial disease

    Directory of Open Access Journals (Sweden)

    Lieber Daniel S

    2012-01-01

    Full Text Available Abstract Background: Mitochondrial diseases comprise a diverse set of clinical disorders that affect multiple organ systems with varying severity and age of onset. Due to their clinical and genetic heterogeneity, these diseases are difficult to diagnose. We have developed a targeted exome sequencing approach to improve our ability to properly diagnose mitochondrial diseases and apply it here to an individual patient. Our method targets mitochondrial DNA (mtDNA) and the exons of 1,600 nuclear genes involved in mitochondrial biology or Mendelian disorders with multi-system phenotypes, thereby allowing for simultaneous evaluation of multiple disease loci. Case Presentation: Targeted exome sequencing was performed on a patient initially suspected to have a mitochondrial disorder. The patient presented with diabetes mellitus, diffuse brain atrophy, autonomic neuropathy, optic nerve atrophy, and a severe amnestic syndrome. Further work-up revealed multiple heteroplasmic mtDNA deletions as well as profound thiamine deficiency without a clear nutritional cause. Targeted exome sequencing revealed a homozygous c.1672C > T (p.R558C) missense mutation in exon 8 of WFS1 that has previously been reported in a patient with Wolfram syndrome. Conclusion: This case demonstrates how clinical application of next-generation sequencing technology can enhance the diagnosis of patients suspected to have rare genetic disorders. Furthermore, the finding of unexplained thiamine deficiency in a patient with Wolfram syndrome suggests a potential link between WFS1 biology and thiamine metabolism that has implications for the clinical management of Wolfram syndrome patients.

  3. Inheritance Patterns in Citation Networks Reveal Scientific Memes

    Directory of Open Access Journals (Sweden)

    Tobias Kuhn

    2014-11-01

    Full Text Available Memes are the cultural equivalent of genes that spread across human culture by means of imitation. What makes a meme and what distinguishes it from other forms of information, however, is still poorly understood. Our analysis of memes in the scientific literature reveals that they are governed by a surprisingly simple relationship between frequency of occurrence and the degree to which they propagate along the citation graph. We propose a simple formalization of this pattern and validate it with data from close to 50 million publication records from the Web of Science, PubMed Central, and the American Physical Society. Evaluations relying on human annotators, citation network randomizations, and comparisons with several alternative approaches confirm that our formula is accurate and effective, without a dependence on linguistic or ontological knowledge and without the application of arbitrary thresholds or filters.

  4. Inheritance Patterns in Citation Networks Reveal Scientific Memes

    Science.gov (United States)

    Kuhn, Tobias; Perc, Matjaž; Helbing, Dirk

    2014-10-01

    Memes are the cultural equivalent of genes that spread across human culture by means of imitation. What makes a meme and what distinguishes it from other forms of information, however, is still poorly understood. Our analysis of memes in the scientific literature reveals that they are governed by a surprisingly simple relationship between frequency of occurrence and the degree to which they propagate along the citation graph. We propose a simple formalization of this pattern and validate it with data from close to 50 million publication records from the Web of Science, PubMed Central, and the American Physical Society. Evaluations relying on human annotators, citation network randomizations, and comparisons with several alternative approaches confirm that our formula is accurate and effective, without a dependence on linguistic or ontological knowledge and without the application of arbitrary thresholds or filters.

  5. Extracting gene expression patterns and identifying co-expressed genes from microarray data reveals biologically responsive processes

    Directory of Open Access Journals (Sweden)

    Paules Richard S

    2007-11-01

    Full Text Available Abstract Background: A common observation in the analysis of gene expression data is that many genes display similarity in their expression patterns and therefore appear to be co-regulated. However, the variation associated with microarray data and the complexity of the experimental designs make the acquisition of co-expressed genes a challenge. We developed a novel method for Extracting microarray gene expression Patterns and Identifying co-expressed Genes, designated as EPIG. The approach utilizes the underlying structure of gene expression data to extract patterns and identify co-expressed genes that are responsive to experimental conditions. Results: Through evaluation of the correlations among profiles, the magnitude of variation in gene expression profiles, and profile signal-to-noise ratios, EPIG extracts a set of patterns representing co-expressed genes. The method is shown to work well with a simulated data set and microarray data obtained from time-series studies of dauer recovery and L1 starvation in C. elegans and after ultraviolet (UV) or ionizing radiation (IR)-induced DNA damage in diploid human fibroblasts. With the simulated data set, EPIG extracted the appropriate number of patterns, which were more stable and homogeneous than the set of patterns that were determined using the CLICK or CAST clustering algorithms. However, CLICK performed better than EPIG and CAST with respect to the average correlation between clusters/patterns of the simulated data. With real biological data, EPIG extracted more dauer-specific patterns than CLICK. Furthermore, analysis of the IR/UV data revealed 18 unique patterns and 2661 genes out of approximately 17,000 that were identified as significantly expressed and categorized to the patterns by EPIG. The time-dependent patterns displayed similar and dissimilar responses between IR and UV treatments. Gene Ontology analysis applied to each pattern-related subset of co-expressed genes revealed underlying

  6. Instantaneous, Simple, and Reversible Revealing of Invisible Patterns Encrypted in Robust Hollow Sphere Colloidal Photonic Crystals.

    Science.gov (United States)

    Zhong, Kuo; Li, Jiaqi; Liu, Liwang; Van Cleuvenbergen, Stijn; Song, Kai; Clays, Koen

    2018-05-04

    The colors of photonic crystals are based on their periodic crystalline structure. They show clear advantages over conventional chromophores for many applications, mainly due to their anti-photobleaching and responsiveness to stimuli. More specifically, combining colloidal photonic crystals and invisible patterns is important in steganography and watermarking for anticounterfeiting applications. Here a convenient way to imprint robust invisible patterns in colloidal crystals of hollow silica spheres is presented. While these patterns remain invisible under static environmental humidity, even up to near 100% relative humidity, they are unveiled immediately (≈100 ms) and fully reversibly by dynamic humid flow, e.g., human breath. They reveal themselves due to the extreme wettability of the patterned (etched) regions, as confirmed by contact angle measurements. The liquid surface tension threshold to induce wetting (revealing the imprinted invisible images) is evaluated by thermodynamic predictions and subsequently verified by exposure to various vapors with different surface tension. The color of the patterned regions is furthermore independently tuned by vapors with different refractive indices. Such a system can play a key role in applications such as anticounterfeiting, identification, and vapor sensing. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    Science.gov (United States)

    Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

    2014-07-04

    Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In the comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from, or found at different locations in, the Jeju mitochondrial genome were combined together. The incorporation of repeats and the overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was

  8. ACME: A scalable parallel system for extracting frequent patterns from a very long sequence

    KAUST Repository

    Sahli, Majed

    2014-10-02

    Modern applications, including bioinformatics, time series, and web log analysis, require the extraction of frequent patterns, called motifs, from one very long (i.e., several gigabytes) sequence. Existing approaches are either heuristics that are error-prone, or exact (also called combinatorial) methods that are extremely slow, therefore, applicable only to very small sequences (i.e., in the order of megabytes). This paper presents ACME, a combinatorial approach that scales to gigabyte-long sequences and is the first to support supermaximal motifs. ACME is a versatile parallel system that can be deployed on desktop multi-core systems, or on thousands of CPUs in the cloud. However, merely using more compute nodes does not guarantee efficiency, because of the related overheads. To this end, ACME introduces an automatic tuning mechanism that suggests the appropriate number of CPUs to utilize, in order to meet the user constraints in terms of run time, while minimizing the financial cost of cloud resources. Our experiments show that, compared to the state of the art, ACME supports three orders of magnitude longer sequences (e.g., DNA for the entire human genome); handles large alphabets (e.g., English alphabet for Wikipedia); scales out to 16,384 CPUs on a supercomputer; and supports elastic deployment in the cloud.
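
    To make the notion of a frequent pattern concrete, the sketch below counts exact fixed-length substrings (k-mers) in one sequence and keeps those that occur at least a minimum number of times. This is a deliberately naive illustration, not ACME's algorithm: the function name frequent_kmers and its parameters are hypothetical, only exact matches are counted, and memory use grows with the number of distinct k-mers, which is exactly the kind of cost that a scalable system has to avoid.

```python
from collections import Counter


def frequent_kmers(sequence, k, min_count):
    """Return {k-mer: count} for exact k-mers seen at least min_count times.

    Naive sketch: every overlapping window of length k is counted once.
    """
    if k <= 0 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    counts = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    return {kmer: c for kmer, c in counts.items() if c >= min_count}
```

    For the toy input "ACGTACGTAC" with k = 4 and min_count = 2, the windows ACGT, CGTA and GTAC each occur twice and TACG occurs once, so only the first three are reported.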

  9. Working in Separate Silos? What Citation Patterns Reveal about Higher Education Research Internationally

    Science.gov (United States)

    Tight, Malcolm

    2014-01-01

    Higher education research is a growing, inter-disciplinary and increasingly international field of study. This article examines the citation patterns of articles published in six leading higher education journals--three published in the United States and three published elsewhere in the world--for what they reveal about the development of this…

  10. Mitochondrial DNA analyses reveal low genetic diversity in Culex quinquefasciatus from residential areas in Malaysia.

    Science.gov (United States)

    Low, V L; Lim, P E; Chen, C D; Lim, Y A L; Tan, T K; Norma-Rashid, Y; Lee, H L; Sofian-Azirun, M

    2014-06-01

    The present study explored the intraspecific genetic diversity, dispersal patterns and phylogeographic relationships of Culex quinquefasciatus Say (Diptera: Culicidae) in Malaysia using reference data available in GenBank in order to reveal this species' phylogenetic relationships. A statistical parsimony network of 70 taxa aligned as 624 characters of the cytochrome c oxidase subunit I (COI) gene and 685 characters of the cytochrome c oxidase subunit II (COII) gene revealed three haplotypes (A1-A3) and four haplotypes (B1-B4), respectively. The concatenated sequences of both COI and COII genes with a total of 1309 characters revealed seven haplotypes (AB1-AB7). Analysis using TCS indicated that haplotype AB1 was the common ancestor and the most widespread haplotype in Malaysia. The genetic distance based on concatenated sequences of both COI and COII genes ranged from 0.00076 to 0.00229. Sequence alignment of Cx. quinquefasciatus from Malaysia and other countries revealed four haplotypes (AA1-AA4) by the COI gene and nine haplotypes (BB1-BB9) by the COII gene. Phylogenetic analyses demonstrated that Malaysian Cx. quinquefasciatus share the same genetic lineage as East African and Asian Cx. quinquefasciatus. This study has inferred the genetic lineages, dispersal patterns and hypothetical ancestral genotypes of Cx. quinquefasciatus. © 2013 The Royal Entomological Society.
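
    The haplotype counts and distances reported here rest on two simple operations: grouping identical aligned sequences, and measuring the proportion of sites at which two sequences differ. The sketch below illustrates both. It assumes an uncorrected p-distance, which the abstract does not confirm, and the function names are hypothetical. Under that assumption, one and three differing sites out of 1309 characters give about 0.00076 and 0.00229, which matches the reported range.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ (uncorrected)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def collapse_haplotypes(samples):
    """Group sample names by identical sequence.

    samples: dict mapping sample name -> aligned sequence (illustrative input).
    Returns a dict mapping each distinct sequence to the list of its samples.
    """
    groups = {}
    for name, seq in samples.items():
        groups.setdefault(seq, []).append(name)
    return groups
```

    This sketch does not build a statistical parsimony network; it only shows how distinct haplotypes and pairwise distances could be tabulated before such an analysis.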

  11. DNA sequencing reveals limited heterogeneity in the 16S rRNA gene from the rrnB operon among five Mycoplasma hominis isolates

    DEFF Research Database (Denmark)

    Mygind, T; Birkelund, Svend; Christiansen, Gunna

    1998-01-01

    To investigate the intraspecies heterogeneity within the 16S rRNA gene of Mycoplasma hominis, five isolates with diverse antigenic profiles, variable/identical P120 hypervariable domains, and different 16S rRNA gene RFLP patterns were analysed. The 16S rRNA gene from the rrnB operon was amplified...... by PCR and the PCR products were sequenced. Three isolates had identical 16S rRNA sequences and two isolates had sequences that differed from the others by only one nucleotide....

  12. A tobacco cDNA reveals two different transcription patterns in vegetative and reproductive organs

    Directory of Open Access Journals (Sweden)

    I. da Silva

    2002-08-01

    Full Text Available In order to identify genes expressed in the pistil that may have a role in the reproduction process, we have established an expressed sequence tags project to randomly sequence clones from a Nicotiana tabacum stigma/style cDNA library. A cDNA clone (MTL-8) showing high sequence similarity to genes encoding glycine-rich RNA-binding proteins was chosen for further characterization. Based on the extensive identity of MTL-8 to the RGP-1a sequence of N. sylvestris, a primer was defined to extend the 5' sequence of MTL-8 by RT-PCR from stigma/style RNAs. The amplification product was sequenced and it was confirmed that MTL-8 corresponds to an mRNA encoding a glycine-rich RNA-binding protein. Two transcripts of different sizes and expression patterns were identified when the MTL-8 cDNA insert was used as a probe in RNA blots. The largest is 1,100 nucleotides (nt) long and markedly predominant in ovaries. The smaller transcript, with 600 nt, is ubiquitous to the vegetative and reproductive organs analyzed (roots, stems, leaves, sepals, petals, stamens, stigmas/styles and ovaries). Plants submitted to stress (wounding, virus infection and ethylene treatment) presented an increased level of the 600-nt transcript in leaves, especially after tobacco necrosis virus infection. In contrast, the level of the 1,100-nt transcript seems to be unaffected by the stress conditions tested. Results of Southern blot experiments have suggested that MTL-8 is present in one or two copies in the tobacco genome. Our results suggest that the shorter transcript is related to stress while the larger one is a flower predominant and nonstress-inducible messenger.

  13. High quality maize centromere 10 sequence reveals evidence of frequent recombination events

    Directory of Open Access Journals (Sweden)

    Thomas Kai Wolfgruber

    2016-03-01

    Full Text Available The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb of the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length centromeric retrotransposons from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. This repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to facilitate the repair of frequent DSBs in centromeres.

  14. Proteomic analysis of lysine acetylation sites in rat tissues reveals organ specificity and subcellular patterns

    DEFF Research Database (Denmark)

    Lundby, Alicia; Hansen, Kasper Lage; Weinert, Brian Tate

    2012-01-01

    ,541 proteins and provide the data set as a web-based database. We demonstrate that lysine acetylation displays site-specific sequence motifs that diverge between cellular compartments, with a significant fraction of nuclear sites conforming to the consensus motifs G-AcK and AcK-P. Our data set reveals...

  15. Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization

    CSIR Research Space (South Africa)

    Gcebe, N

    2017-04-01

    Full Text Available Journal of Systematic and Evolutionary Microbiology: DOI 10.1099/ijsem.0.001678 Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization Gcebe N Rutten V Gey...

  16. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    Science.gov (United States)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  17. Mouse Nkrp1-Clr gene cluster sequence and expression analyses reveal conservation of tissue-specific MHC-independent immunosurveillance.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Full Text Available The Nkrp1 (Klrb1)-Clr (Clec2) genes encode a receptor-ligand system utilized by NK cells as an MHC-independent immunosurveillance strategy for innate immune responses. The related Ly49 family of MHC-I receptors displays extreme allelic polymorphism and haplotype plasticity. In contrast, previous BAC-mapping and aCGH studies in the mouse suggest the neighboring and related Nkrp1-Clr cluster is evolutionarily stable. To definitively compare the relative evolutionary rate of Nkrp1-Clr vs. Ly49 gene clusters, the Nkrp1-Clr gene clusters from two Ly49 haplotype-disparate inbred mouse strains, BALB/c and 129S6, were sequenced. Both Nkrp1-Clr gene cluster sequences are highly similar to the C57BL/6 reference sequence, displaying the same gene numbers and order, complete pseudogenes, and gene fragments. The Nkrp1-Clr clusters contain a strikingly dissimilar proportion of repetitive elements compared to the Ly49 clusters, suggesting that certain elements may be partly responsible for the highly disparate Ly49 vs. Nkrp1 evolutionary rate. Focused allelic polymorphisms were found within the Nkrp1b/d (Klrb1b), Nkrp1c (Klrb1c), and Clr-c (Clec2f) genes, suggestive of possible immune selection. Cell-type specific transcription of Nkrp1-Clr genes in a large panel of tissues/organs was determined. Clr-b (Clec2d) and Clr-g (Clec2i) showed wide expression, while other Clr genes showed more tissue-specific expression patterns. In situ hybridization revealed specific expression of various members of the Clr family in leukocytes/hematopoietic cells of immune organs, various tissue-restricted epithelial cells (including intestinal, kidney tubular, lung, and corneal progenitor epithelial cells), as well as myocytes. In summary, the Nkrp1-Clr gene cluster appears to evolve more slowly relative to the related Ly49 cluster, and likely regulates innate immunosurveillance in a tissue-specific manner.

  18. Multivoxel Patterns Reveal Functionally Differentiated Networks Underlying Auditory Feedback Processing of Speech

    DEFF Research Database (Denmark)

    Zheng, Zane Z.; Vicente-Grabovetsky, Alejandro; MacDonald, Ewen N.

    2013-01-01

    The everyday act of speaking involves the complex processes of speech motor control. An important component of control is monitoring, detection, and processing of errors when auditory feedback does not correspond to the intended motor gesture. Here we show, using fMRI and converging operations...... within a multivoxel pattern analysis framework, that this sensorimotor process is supported by functionally differentiated brain networks. During scanning, a real-time speech-tracking system was used to deliver two acoustically different types of distorted auditory feedback or unaltered feedback while...... human participants were vocalizing monosyllabic words, and to present the same auditory stimuli while participants were passively listening. Whole-brain analysis of neural-pattern similarity revealed three functional networks that were differentially sensitive to distorted auditory feedback during...

  19. Re-inspection of small RNA sequence datasets reveals several novel human miRNA genes.

    Directory of Open Access Journals (Sweden)

    Thomas Birkballe Hansen

    Full Text Available BACKGROUND: miRNAs are key players in gene expression regulation. To fully understand the complex nature of cellular differentiation or initiation and progression of disease, it is important to assess the expression patterns of as many miRNAs as possible. Identifying novel miRNAs is therefore an essential prerequisite for a comprehensive and coherent understanding of cellular biology. METHODOLOGY/PRINCIPAL FINDINGS: Based on two extensive, previously published small RNA sequence datasets from human embryonic stem cells and human embryoid bodies, respectively [1], we identified 112 novel miRNA-like structures and were able to validate miRNA processing in 12 out of 17 investigated cases. Several miRNA candidates were furthermore substantiated by including additional available small RNA datasets, thereby demonstrating the power of combining datasets to identify miRNAs that otherwise may be dismissed as experimental noise. CONCLUSIONS/SIGNIFICANCE: Our analysis highlights that existing datasets have not yet been exhaustively studied and that continuous re-analysis of the available data is important to uncover all features of small RNA sequencing.

  20. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences

    OpenAIRE

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-01

    Background The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Results Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consi...

  1. Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing.

    Science.gov (United States)

    Conway, Tyrrell; Creecy, James P; Maddox, Scott M; Grissom, Joe E; Conkle, Trevor L; Shadid, Tyler M; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada; Wanner, Barry L

    2014-07-08

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3' transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5' ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. Importance: We precisely mapped the 5' and 3' ends of RNA transcripts across the E. coli K-12 genome by using a single-nucleotide analytical approach. Our resulting high-resolution transcriptome maps show that ca. one-third of E. coli operons are

  2. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  3. Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca M. Davidson

    2011-11-01

    Full Text Available Transcriptome sequencing is a powerful method for studying global expression patterns in large, complex genomes. Evaluation of sequence-based expression profiles during reproductive development would provide functional annotation to genes underlying agronomic traits. We generated transcriptome profiles for 12 diverse maize (Zea mays L.) reproductive tissues representing male, female, developing seed, and leaf tissues using high throughput transcriptome sequencing. Overall, ∼80% of annotated genes were expressed. Comparative analysis between sequence and hybridization-based methods demonstrated the utility of ribonucleic acid sequencing (RNA-seq) for expression determination and differentiation of paralogous genes (∼85% of maize genes). Analysis of 4975 gene families across reproductive tissues revealed expression divergence is proportional to family size. In all pairwise comparisons between tissues, 7 (pre- vs. postemergence cobs) to 48% (pollen vs. ovule) of genes were differentially expressed. Genes with expression restricted to a single tissue within this study were identified with the highest numbers observed in leaves, endosperm, and pollen. Coexpression network analysis identified 17 gene modules with complex and shared expression patterns containing many previously described maize genes. The data and analyses in this study provide valuable tools through improved gene annotation, gene family characterization, and a core set of candidate genes to further characterize maize reproductive development and improve grain yield potential.

  4. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Full Text Available Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNPs) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region) is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  5. Relative role of transfer zones in controlling sequence stacking patterns and facies distribution: insights from the Fushan Depression, South China Sea

    Science.gov (United States)

    Liu, Entao; Wang, Hua; Li, Yuan; Huang, Chuanyan

    2015-04-01

    In sedimentary basins, a transfer zone can be defined as a coordinated system of deformational features which has good prospects for hydrocarbon exploration. Although the term 'transfer zone' has been widely applied to the study of extensional basins, little attention has been paid to its controlling effect on sequence stacking pattern and depositional facies distribution. Fushan Depression is a half-graben rift sub-basin, located in the southeast of the Beibuwan Basin, South China Sea. In this study, comparative analysis of seismic reflection, palaeogeomorphology, fault activity and depositional facies distribution in the southern slope indicates that three different types of sequence stacking patterns (i.e. multi-level step-fault belt in the western area, flexure slope belt in the central area, gentle slope belt in the eastern area) were developed along the southern slope, together with a large-scale transfer zone in the central area, at the intersection of the western and eastern fault systems. Further analysis shows that the transfer zone played an important role in the diversity of sequence stacking patterns in the southern slope by dividing the Fushan Depression into two non-interfering tectonic systems forming different sequence patterns, and leading to the formation of the flexure slope belt in the central area. The transfer zone had an important controlling effect on not only the diversity of sequence stacking patterns, but also the facies distribution on the relay ramp. During the high-stand stage, under the controlling effect of the transfer zone, the sediments contain a significant proportion of coarser material accumulated and distributed along the ramp axis. By contrast, during the low-stand stage, the transfer zone did not seem to contribute significantly to the low-stand fan distribution which was mainly controlled by the slope gradient (palaeogeomorphology). Therefore, analysis of the transfer zone can provide a new perspective for basin analysis

  6. Detection of M-Sequences from Spike Sequence in Neuronal Networks

    Directory of Open Access Journals (Sweden)

    Yoshi Nishitani

    2012-01-01

    Full Text Available In circuit theory, it is well known that a linear feedback shift register (LFSR) circuit generates pseudorandom bit sequences (PRBS), including M-sequences, which have the maximum possible period. In this study, we tried to detect M-sequences, the pseudorandom sequences generated by an LFSR circuit, in time series patterns of stimulated action potentials. Stimulated action potentials were recorded from dissociated cultures of hippocampal neurons grown on a multielectrode array. We found several M-sequences from a 3-stage LFSR circuit (M3). These results show the possibility that LFSR circuits or their equivalents are assembled in a neuronal network. However, since the M3 pattern was composed of only four spike intervals, the possibility of an accidental detection was not zero. We therefore also searched for M-sequences in random spike sequences that were not generated by an LFSR circuit and compared the result with the number of M-sequences found in the originally observed raster data. As a result, a significant difference was confirmed: a greater number of “0–1” reversed 3-stage M-sequences occurred than would have been detected by accident. This result suggests that some LFSR equivalent circuits are assembled in neuronal networks.
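
    As an illustration of the 3-stage LFSR mentioned in this abstract, the following minimal Python sketch generates one period of an M-sequence. The tap positions (stages 3 and 2), the seed state, and the function names are assumptions chosen for the example; the abstract does not say which feedback polynomial or detection procedure the authors used.

```python
def lfsr_sequence(taps, state, length):
    """Run a Fibonacci-style LFSR and return `length` output bits.

    taps   -- 1-based stage numbers XORed together to form the feedback bit
              (illustrative choice, not taken from the abstract)
    state  -- initial register contents, stage 1 first; must not be all zeros
    """
    state = list(state)
    out = []
    for _ in range(length):
        out.append(state[-1])            # output is the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]     # XOR of the tapped stages
        state = [feedback] + state[:-1]  # shift right, feed back into stage 1
    return out


def lfsr_period(taps, state):
    """Count the steps until the register returns to its initial state."""
    start = list(state)
    current = list(state)
    steps = 0
    while True:
        feedback = 0
        for t in taps:
            feedback ^= current[t - 1]
        current = [feedback] + current[:-1]
        steps += 1
        if current == start:
            return steps


if __name__ == "__main__":
    bits = lfsr_sequence(taps=(3, 2), state=(1, 0, 0), length=7)
    print(bits, "period:", lfsr_period((3, 2), (1, 0, 0)))
```

    With these taps the register visits every non-zero 3-bit state before repeating, so the period is 2^3 - 1 = 7, the maximum for three stages.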

  7. Patterns of mutation and selection at synonymous sites in Drosophila

    DEFF Research Database (Denmark)

    Singh, Nadia D; Bauer DuMont, Vanessa L; Hubisz, Melissa J

    2007-01-01

    , when applied to 18 coding sequences in 3 species of Drosophila, confirmed an earlier report that the Notch gene in Drosophila melanogaster was evolving under selection in favor of those codons defined as unpreferred in this species. This finding opened the possibility that synonymous sites may...... be subject to a variety of selective pressures beyond weak selection for increased frequencies of the codons currently defined as "preferred" in D. melanogaster. To further explore patterns of synonymous site evolution in Drosophila in a lineage-specific manner, we expanded the application of the maximum...... likelihood framework to 8,452 protein coding sequences with well-defined orthology in D. melanogaster, Drosophila sechellia, and Drosophila yakuba. Our analyses reveal intragenomic and interspecific variation in mutational patterns as well as in patterns and intensity of selection on synonymous sites. In D...

  8. Foundations for a syntactic pattern recognition system for genomic DNA sequences. [Annual] report, 1 December 1991--31 March 1993

    Energy Technology Data Exchange (ETDEWEB)

    Searles, D.B.

    1993-03-01

    The goal of the proposed work is the creation of a software system that will perform sophisticated pattern recognition and related functions at a level of abstraction and with expressive power beyond current general-purpose pattern-matching systems for biological sequences; and with a more uniform language, environment, and graphical user interface, and with greater flexibility, extensibility, embeddability, and ability to incorporate other algorithms, than current special-purpose analytic software.

  9. Mitochondrial genome sequences reveal deep divergences among Anopheles punctulatus sibling species in Papua New Guinea

    Directory of Open Access Journals (Sweden)

    Logue Kyle

    2013-02-01

    Full Text Available Abstract Background Members of the Anopheles punctulatus group (AP group) are the primary vectors of human malaria in Papua New Guinea. The AP group includes 13 sibling species, most of them morphologically indistinguishable. Understanding why only certain species are able to transmit malaria requires a better comprehension of their evolutionary history. In particular, understanding relationships and divergence times among Anopheles species may enable assessing how malaria-related traits (e.g. blood feeding behaviours, vector competence) have evolved. Methods Fourteen mitochondrial (mt) genomes from five AP sibling species and two species of the Anopheles dirus complex of Southeast Asia were sequenced. DNA sequences from all concatenated protein coding genes (10,770 bp) were then analysed using a Bayesian approach to reconstruct phylogenetic relationships and date the divergence of the AP sibling species. Results Phylogenetic reconstruction using the concatenated DNA sequence of all mitochondrial protein coding genes indicates that the ancestors of the AP group arrived in Papua New Guinea 25 to 54 million years ago and rapidly diverged to form the current sibling species. Conclusion Through evaluation of newly described mt genome sequences, this study has revealed a divergence among members of the AP group in Papua New Guinea that would significantly predate the arrival of humans in this region, 50 thousand years ago. The divergence observed among the mtDNA sequences studied here may have resulted from reproductive isolation during historical changes in sea-level through glacial minima and maxima. This leads to a hypothesis that the AP sibling species have evolved independently for potentially thousands of generations. This suggests that the evolution of many phenotypes, such as insecticide resistance, will arise independently in each of the AP sibling species studied here.

  10. Thermodynamics of complexity and pattern manipulation

    Science.gov (United States)

    Garner, Andrew J. P.; Thompson, Jayne; Vedral, Vlatko; Gu, Mile

    2017-04-01

    Many organisms capitalize on their ability to predict the environment to maximize available free energy and reinvest this energy to create new complex structures. This functionality relies on the manipulation of patterns—temporally ordered sequences of data. Here, we propose a framework to describe pattern manipulators—devices that convert thermodynamic work to patterns or vice versa—and use them to build a "pattern engine" that facilitates a thermodynamic cycle of pattern creation and consumption. We show that the least heat dissipation is achieved by the provably simplest devices, the ones that exhibit desired operational behavior while maintaining the least internal memory. We derive the ultimate limits of this heat dissipation and show that it is generally nonzero and connected with the pattern's intrinsic crypticity—a complexity theoretic quantity that captures the puzzling difference between the amount of information the pattern's past behavior reveals about its future and the amount one needs to communicate about this past to optimally predict the future.

  11. Transcriptome sequencing of two phenotypic mosaic Eucalyptus trees reveals large scale transcriptome re-modelling.

    Directory of Open Access Journals (Sweden)

    Amanda Padovan

    Full Text Available Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon), which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome-wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s) that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry, however this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation.

  12. Effects of Temporal Sequencing and Auditory Discrimination on Children's Memory Patterns for Tones, Numbers, and Nonsense Words

    Science.gov (United States)

    Gromko, Joyce Eastlund; Hansen, Dee; Tortora, Anne Halloran; Higgins, Daniel; Boccia, Eric

    2009-01-01

    The purpose of this study was to determine whether children's recall of tones, numbers, and words was supported by a common temporal sequencing mechanism; whether children's patterns of memory for tones, numbers, and nonsense words were the same despite differences in symbol systems; and whether children's recall of tones, numbers, and nonsense…

  13. Identification of Y-Chromosome Sequences in Turner Syndrome.

    Science.gov (United States)

    Silva-Grecco, Roseane Lopes da; Trovó-Marqui, Alessandra Bernadete; Sousa, Tiago Alves de; Croce, Lilian Da; Balarin, Marly Aparecida Spadotto

    2016-05-01

    To investigate the presence of Y-chromosome sequences and determine their frequency in patients with Turner syndrome. The study included 23 patients with Turner syndrome from Brazil, who gave written informed consent for participating in the study. Cytogenetic analyses were performed in peripheral blood lymphocytes, with 100 metaphases per patient. Genomic DNA was also extracted from peripheral blood lymphocytes, and gene sequences DYZ1, DYZ3, ZFY and SRY were amplified by Polymerase Chain Reaction. The cytogenetic analysis showed a 45,X karyotype in 9 patients (39.2 %) and a mosaic pattern in 14 (60.8 %). In 8.7 % (2 out of 23) of the patients, Y-chromosome sequences were found. This prevalence is very similar to those reported previously. The initial karyotype analysis of these patients did not reveal Y-chromosome material, but they were found positive for Y-specific sequences in the lymphocyte DNA analysis. The PCR technique showed that 2 (8.7 %) of the patients with Turner syndrome had Y-chromosome sequences, both presenting marker chromosomes on cytogenetic analysis.

  14. Deep sequencing reveals exceptional diversity and modes of transmission for bacterial sponge symbionts.

    Science.gov (United States)

    Webster, Nicole S; Taylor, Michael W; Behnam, Faris; Lücker, Sebastian; Rattei, Thomas; Whalan, Stephen; Horn, Matthias; Wagner, Michael

    2010-08-01

    Marine sponges contain complex bacterial communities of considerable ecological and biotechnological importance, with many of these organisms postulated to be specific to sponge hosts. Testing this hypothesis in light of the recent discovery of the rare microbial biosphere, we investigated three Australian sponges by massively parallel 16S rRNA gene tag pyrosequencing. Here we show bacterial diversity that is unparalleled in an invertebrate host, with more than 250,000 sponge-derived sequence tags being assigned to 23 bacterial phyla and revealing up to 2996 operational taxonomic units (95% sequence similarity) per sponge species. Of the 33 previously described 'sponge-specific' clusters that were detected in this study, 48% were found exclusively in adults and larvae - implying vertical transmission of these groups. The remaining taxa, including 'Poribacteria', were also found at very low abundance among the 135,000 tags retrieved from surrounding seawater. Thus, members of the rare seawater biosphere may serve as seed organisms for widely occurring symbiont populations in sponges and their host association might have evolved much more recently than previously thought. © 2009 Society for Applied Microbiology and Blackwell Publishing Ltd.

  15. Unexpected allelic heterogeneity and spectrum of mutations in Fowler syndrome revealed by next-generation exome sequencing.

    Science.gov (United States)

    Lalonde, Emilie; Albrecht, Steffen; Ha, Kevin C H; Jacob, Karine; Bolduc, Nathalie; Polychronakos, Constantin; Dechelotte, Pierre; Majewski, Jacek; Jabado, Nada

    2010-08-01

    Protein coding genes constitute approximately 1% of the human genome but harbor 85% of the mutations with large effects on disease-related traits. Therefore, efficient strategies for selectively sequencing complete coding regions (i.e., "whole exome") have the potential to contribute to our understanding of human diseases. We used a method for whole-exome sequencing coupling Agilent whole-exome capture to the Illumina DNA-sequencing platform, and investigated two unrelated fetuses from nonconsanguineous families with Fowler Syndrome (FS), a lethal disease with a stereotyped phenotype. We report novel germline mutations in feline leukemia virus subgroup C cellular-receptor-family member 2, FLVCR2, which has recently been shown to cause FS. Using this technology, we identified three types of genetic abnormalities: point-mutations, insertions-deletions, and intronic splice-site changes (first pathogenic report using this technology), in the fetuses who both were compound heterozygotes for the disease. Although revealing a high level of allelic heterogeneity and mutational spectrum in FS, this study further illustrates the successful application of whole-exome sequencing to uncover genetic defects in rare Mendelian disorders. Of importance, we show that we can identify genes underlying rare, monogenic and recessive diseases using a limited number of patients (n=2), in the absence of shared genetic heritage and in the presence of allelic heterogeneity.

  16. Deep sequencing of foot-and-mouth disease virus reveals RNA sequences involved in genome packaging.

    Science.gov (United States)

    Logan, Grace; Newman, Joseph; Wright, Caroline F; Lasecka-Dykes, Lidia; Haydon, Daniel T; Cottam, Eleanor M; Tuthill, Tobias J

    2017-10-18

    Non-enveloped viruses protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. Packaging and capsid assembly in RNA viruses can involve interactions between capsid proteins and secondary structures in the viral genome as exemplified by the RNA bacteriophage MS2 and as proposed for other RNA viruses of plants, animals and human. In the picornavirus family of non-enveloped RNA viruses, the requirements for genome packaging remain poorly understood. Here we show a novel and simple approach to identify predicted RNA secondary structures involved in genome packaging in the picornavirus foot-and-mouth disease virus (FMDV). By interrogating deep sequencing data generated from both packaged and unpackaged populations of RNA we have determined multiple regions of the genome with constrained variation in the packaged population. Predicted secondary structures of these regions revealed stem loops with conservation of structure and a common motif at the loop. Disruption of these features resulted in attenuation of virus growth in cell culture due to a reduction in assembly of mature virions. This study provides evidence for the involvement of predicted RNA structures in picornavirus packaging and offers a readily transferable methodology for identifying packaging requirements in many other viruses. Importance In order to transmit their genetic material to a new host, non-enveloped viruses must protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. For many non-enveloped RNA viruses the requirements for this critical part of the viral life cycle remain poorly understood. We have identified RNA sequences involved in genome packaging of the picornavirus foot-and-mouth disease virus. This virus causes an economically devastating disease of livestock affecting both the developed and developing world. The experimental methods developed to carry out this work are novel, simple and transferable to the

  17. Modified Pattern Sequence-based Forecasting for Electric Vehicle Charging Stations

    Energy Technology Data Exchange (ETDEWEB)

    Majidpour, Mostafa; Qiu, Charlie; Chu, Peter; Gadh, Rajit; Pota, Hemanshu R.

    2014-11-03

    Three algorithms for the forecasting of energy consumption at individual EV charging outlets have been applied to real world data from the UCLA campus. Of these three algorithms, namely k-Nearest Neighbor (kNN), ARIMA, and Pattern Sequence Forecasting (PSF), kNN with k=1 was the best and PSF the worst performing algorithm with respect to the SMAPE measure. The advantage of PSF is its increased robustness to noise, achieved by substituting the real valued time series with an integer valued one, and the advantage of NN is having the least SMAPE for our data. We propose a Modified PSF algorithm (MPSF) which is a combination of PSF and NN; it could be interpreted as NN on integer valued data or as PSF that considers only the most recent neighbor to produce the output. Some other shortcomings of PSF are also addressed in the MPSF. Results show that MPSF has improved the forecast performance.
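
    The two ingredients named in this abstract, the SMAPE error measure and a nearest-neighbor forecast with k=1, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the exact SMAPE variant or how days are compared, so the formula below (absolute error divided by the mean of the absolute actual and forecast values, times 100) and the Euclidean day-to-day distance are assumed, and the function names are hypothetical.

```python
import math


def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: |F - A| / ((|A| + |F|) / 2), averaged over all points.
    Points where both values are zero contribute zero error.
    """
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        terms.append(0.0 if denom == 0 else abs(f - a) / denom)
    return 100.0 * sum(terms) / len(terms)


def nn_forecast(days):
    """Forecast the next day's profile with a 1-nearest-neighbor rule.

    days -- list of equal-length daily consumption profiles, oldest first.
    The most recent day is the query; the forecast is the day that followed
    the most similar earlier day (Euclidean distance, assumed here).
    """
    query = days[-1]
    best_idx, best_dist = None, math.inf
    for i in range(len(days) - 1):       # only days that have a successor
        dist = math.dist(days[i], query)
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return days[best_idx + 1]


if __name__ == "__main__":
    # Made-up toy history, not data from the study.
    history = [[1, 2], [3, 4], [1, 2.1], [3.2, 4.1], [1, 2]]
    prediction = nn_forecast(history)
    print(prediction, smape([3.1, 4.0], prediction))
```

    Replacing the real-valued profiles with integer cluster labels before the neighbor search would move this sketch in the direction the abstract describes for MPSF, though the details of that algorithm are not given here.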

  18. Targeted exome sequencing reveals novel USH2A mutations in Chinese patients with simplex Usher syndrome.

    Science.gov (United States)

    Shu, Hai-Rong; Bi, Huai; Pan, Yang-Chun; Xu, Hang-Yu; Song, Jian-Xin; Hu, Jie

    2015-09-16

    Usher syndrome (USH) is an autosomal recessive disorder characterized by hearing impairment and vision dysfunction due to retinitis pigmentosa. Phenotypic and genetic heterogeneities of this disease make it impractical to obtain a genetic diagnosis by conventional Sanger sequencing. In this study, we applied a next-generation sequencing approach to detect genetic abnormalities in patients with USH. Two unrelated Chinese families were recruited, consisting of two USH-affected patients and four unaffected relatives. We selected 199 genes related to inherited retinal diseases as targets for deep exome sequencing. Through systematic data analysis using an established bioinformatics pipeline, all variants that passed filter criteria were validated by Sanger sequencing and co-segregation analysis. A homozygous frameshift mutation (c.4382delA, p.T1462Lfs*2) was revealed in exon 20 of the USH2A gene in the F1 family. Two compound heterozygous mutations, IVS47 + 1G > A and c.13156A > T (p.I4386F), located in intron 48 and exon 63 respectively, of USH2A, were identified as causative mutations for the F2 family. Of note, the missense mutation c.13156A > T has not been reported so far. In conclusion, targeted exome sequencing precisely and rapidly identified the genetic defects in two Chinese USH families and this technique can be applied as a routine examination for these disorders with significant clinical and genetic heterogeneity.

  19. Comparative sequence analyses of the major quantitative trait locus phosphorus uptake 1 (Pup1) reveal a complex genetic structure.

    Science.gov (United States)

    Heuer, Sigrid; Lu, Xiaochun; Chin, Joong Hyoun; Tanaka, Juan Pariasca; Kanamori, Hiroyuki; Matsumoto, Takashi; De Leon, Teresa; Ulat, Victor Jun; Ismail, Abdelbagi M; Yano, Masahiro; Wissuwa, Matthias

    2009-06-01

    The phosphorus uptake 1 (Pup1) locus was identified as a major quantitative trait locus (QTL) for tolerance of phosphorus deficiency in rice. Near-isogenic lines with the Pup1 region from tolerant donor parent Kasalath typically show threefold higher phosphorus uptake and grain yield in phosphorus-deficient field trials than the intolerant parent Nipponbare. In this study, we report the fine mapping of the Pup1 locus to the long arm of chromosome 12 (15.31-15.47 Mb). Genes in the region were initially identified on the basis of the Nipponbare reference genome, but did not reveal any obvious candidate genes related to phosphorus uptake. Kasalath BAC clones were therefore sequenced and revealed a 278-kbp sequence significantly different from the syntenic regions in Nipponbare (145 kb) and in the indica reference genome of 93-11 (742 kbp). Size differences are caused by large insertions or deletions (INDELs), and an exceptionally large number of retrotransposon and transposon-related elements (TEs) present in all three sequences (45%-54%). About 46 kb of the Kasalath sequence did not align with the entire Nipponbare genome, and only three Nipponbare genes (fatty acid alpha-dioxygenase, dirigent protein and aspartic proteinase) are highly conserved in Kasalath. Two Nipponbare genes (expressed proteins) might have evolved by at least three TE integrations in an ancestor gene that is still present in Kasalath. Several predicted Kasalath genes are novel or unknown genes that are mainly located within INDEL regions. Our results highlight the importance of sequencing QTL regions in the respective donor parent, as important genes might not be present in the current reference genomes.

  20. Capacity for patterns and sequences in Kanerva's SDM as compared to other associative memory models. [Sparse, Distributed Memory]

    Science.gov (United States)

    Keeler, James D.

    1988-01-01

    The information capacity of Kanerva's Sparse Distributed Memory (SDM) and Hopfield-type neural networks is investigated. Under the approximations used here, it is shown that the total information stored in these systems is proportional to the number of connections in the network. The proportionality constant is the same for the SDM and Hopfield-type models, independent of the particular model or the order of the model. The approximations are checked numerically. This same analysis can be used to show that the SDM can store sequences of spatiotemporal patterns, and the addition of time-delayed connections allows the retrieval of context-dependent temporal patterns. A minor modification of the SDM can be used to store correlated patterns.
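
    For readers unfamiliar with Hopfield-type associative memories, the following minimal sketch shows how patterns are stored in the connections and later retrieved from a noisy cue. It is a generic textbook-style illustration, not the capacity analysis of this record; the Hebbian storage rule, the function names, and the sweep limit are assumptions of the sketch.

```python
def train_hopfield(patterns):
    """Build a Hebbian weight matrix from patterns of +1/-1 values.

    The network has n * (n - 1) connections (no self-connections), which
    is the quantity the stored information is said to scale with.
    """
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, sweeps=5):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new = 1 if field >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state
```

    With two orthogonal eight-unit patterns stored, a cue that differs from one of them in a single unit settles back to that stored pattern after one sweep.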

  1. Personal sleep pattern visualization using sequence-based kernel self-organizing map on sound data.

    Science.gov (United States)

    Wu, Hongle; Kato, Takafumi; Yamada, Tomomi; Numao, Masayuki; Fukui, Ken-Ichi

    2017-07-01

    We propose a method to discover sleep patterns via clustering of sound events recorded during sleep. The proposed method extends the conventional self-organizing map algorithm by kernelization and sequence-based technologies to obtain a fine-grained map that visualizes the distribution and changes of sleep-related events. We introduced features widely applied in sound processing and popular kernel functions to the proposed method to evaluate and compare performance. The proposed method provides a new aspect of sleep monitoring because the results demonstrate that sound events can be directly correlated to an individual's sleep patterns. In addition, by visualizing the transition of cluster dynamics, sleep-related sound events were found to relate to the various stages of sleep. Therefore, these results empirically warrant future study into the assessment of personal sleep quality using sound data.

  2. Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA

    Science.gov (United States)

    Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev

    2012-01-01

    B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350

  3. SHARAKU: an algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing.

    Science.gov (United States)

    Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi

    2016-06-15

    Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After sequence reads are obtained, the short reads are mapped to a reference genome, and specific mapping patterns, called read mapping profiles, can be detected that are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5'-end processing and 3'-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA004502. Contact: yasu@bio.keio.ac.jp. Supplementary data are available

  4. RNA Sequencing and Coexpression Analysis Reveal Key Genes Involved in α-Linolenic Acid Biosynthesis in Perilla frutescens Seed

    Directory of Open Access Journals (Sweden)

    Tianyuan Zhang

    2017-11-01

    Full Text Available Perilla frutescens is used as traditional food and medicine in East Asia. Its seeds contain high levels of α-linolenic acid (ALA), which is important for health but is scarce in our daily meals. Previous reports on RNA-seq of perilla seed had identified fatty acid (FA) and triacylglycerol (TAG) synthesis genes, but the underlying mechanism of ALA biosynthesis and its regulation still need to be further explored. We therefore conducted Illumina RNA sequencing at seven temporal developmental stages of perilla seeds. Sequencing generated a total of 127 million clean reads, containing 15.88 Gb of valid data. The de novo assembly of sequence reads yielded 64,156 unigenes with an average length of 777 bp. A total of 39,760 unigenes were annotated and 11,693 unigenes were found to be differentially expressed in all samples. According to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, 486 unigenes were annotated in the “lipid metabolism” pathway. Of these, 150 unigenes were found to be involved in fatty acid (FA) biosynthesis and triacylglycerol (TAG) assembly in perilla seeds. A coexpression analysis showed that a total of 104 genes were highly coexpressed (r > 0.95). The coexpression network could be divided into two main subnetworks showing over expression in the medium or earlier and late phases, respectively. In order to identify the putative regulatory genes, a transcription factor (TF) analysis was performed. This led to the identification of 45 gene families, mainly including the AP2-EREBP, bHLH, MYB, and NAC families. After coexpression analysis of TFs with the highly expressed FAD2 and FAD3 genes, 162 TFs were found to be significantly associated with two FAD genes (r > 0.95). Those TFs were predicted to be the key regulatory factors in ALA biosynthesis in perilla seed. The qRT-PCR analysis also verified the relevance of expression pattern between two FAD genes and partial candidate TFs. Although it has been reported that some TFs

  5. An interspecific fungal hybrid reveals cross-kingdom rules for allopolyploid gene expression patterns.

    Directory of Open Access Journals (Sweden)

    Murray P Cox

    2014-03-01

    Full Text Available Polyploidy, a state in which the chromosome complement has undergone an increase, is a major force in evolution. Understanding the consequences of polyploidy has received much attention, and allopolyploids, which result from the union of two different parental genomes, are of particular interest because they must overcome a suite of biological responses to this merger, known as "genome shock." A key question is what happens to gene expression of the two gene copies following allopolyploidization, but until recently the tools to answer this question on a genome-wide basis were lacking. Here we utilize high throughput transcriptome sequencing to produce the first genome-wide picture of gene expression response to allopolyploidy in fungi. A novel pipeline for assigning sequence reads to the gene copies was used to quantify their expression in a fungal allopolyploid. We find that the transcriptional response to allopolyploidy is predominantly conservative: both copies of most genes are retained; over half the genes inherit parental gene expression patterns; and parental differential expression is often lost in the allopolyploid. Strikingly, the patterns of gene expression change are highly concordant with the genome-wide expression results of a cotton allopolyploid. The very different nature of these two allopolyploids implies a conserved, eukaryote-wide transcriptional response to genome merger. We provide evidence that the transcriptional responses we observe are mostly driven by intrinsic differences between the regulatory systems in the parent species, and from this propose a mechanistic model in which the cross-kingdom conservation in transcriptional response reflects conservation of the mutational processes underlying eukaryotic gene regulatory evolution. 
This work provides a platform to develop a universal understanding of gene expression response to allopolyploidy and suggests that allopolyploids are an exceptional system to investigate gene

  6. Sequencing of 50 human exomes reveals adaptation to high altitude

    DEFF Research Database (Denmark)

    Yi, Xin; Liang, Yu; Huerta-Sanchez, Emilia

    2010-01-01

    Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18x per individual. Genes showing population-specific allele frequency changes, which represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1), a transcription factor involved in response to hypoxia. One single-nucleotide polymorphism (SNP) at EPAS1 shows a 78% frequency... in genetic adaptation to high altitude.

  7. Visually driven chaining of elementary swim patterns into a goal-directed motor sequence: a virtual reality study of zebrafish prey capture

    Directory of Open Access Journals (Sweden)

    Chintan A Trivedi

    2013-05-01

    Full Text Available Prey capture behavior critically depends on rapid processing of sensory input in order to track, approach and catch the target. When using vision, the nervous system faces the problem of extracting relevant information from a continuous stream of input in order to detect and categorize visible objects as potential prey and to select appropriate motor patterns for approach. For prey capture, many vertebrates exhibit intermittent locomotion, in which discrete motor patterns are chained into a sequence, interrupted by short periods of rest. Here, using high-speed recordings of full-length prey capture sequences performed by freely swimming zebrafish larvae in the presence of a single paramecium, we provide a detailed kinematic analysis of first and subsequent swim bouts during prey capture. Using Fourier analysis, we show that individual swim bouts represent an elementary motor pattern. Changes in orientation are directed towards the target on a graded scale and are implemented by an asymmetric tail bend component superimposed on this basic motor pattern. To further investigate the role of visual feedback on the efficiency and speed of this complex behavior, we developed a closed-loop virtual reality setup in which minimally restrained larvae recapitulated interconnected swim patterns closely resembling those observed during prey capture in freely moving fish. Systematic variation of stimulus properties showed that prey capture is initiated within a narrow range of stimulus size and velocity. Furthermore, variations in the delay and location of swim-triggered visual feedback showed that the reaction time of secondary and later swims is shorter for stimuli that appear within a narrow spatio-temporal window following a swim. This suggests that the larva may generate an expectation of stimulus position, which enables accelerated motor sequencing if the expectation is met by appropriate visual feedback.

  8. Visually driven chaining of elementary swim patterns into a goal-directed motor sequence: a virtual reality study of zebrafish prey capture

    Science.gov (United States)

    Trivedi, Chintan A.; Bollmann, Johann H.

    2013-01-01

    Prey capture behavior critically depends on rapid processing of sensory input in order to track, approach, and catch the target. When using vision, the nervous system faces the problem of extracting relevant information from a continuous stream of input in order to detect and categorize visible objects as potential prey and to select appropriate motor patterns for approach. For prey capture, many vertebrates exhibit intermittent locomotion, in which discrete motor patterns are chained into a sequence, interrupted by short periods of rest. Here, using high-speed recordings of full-length prey capture sequences performed by freely swimming zebrafish larvae in the presence of a single paramecium, we provide a detailed kinematic analysis of first and subsequent swim bouts during prey capture. Using Fourier analysis, we show that individual swim bouts represent an elementary motor pattern. Changes in orientation are directed toward the target on a graded scale and are implemented by an asymmetric tail bend component superimposed on this basic motor pattern. To further investigate the role of visual feedback on the efficiency and speed of this complex behavior, we developed a closed-loop virtual reality setup in which minimally restrained larvae recapitulated interconnected swim patterns closely resembling those observed during prey capture in freely moving fish. Systematic variation of stimulus properties showed that prey capture is initiated within a narrow range of stimulus size and velocity. Furthermore, variations in the delay and location of swim triggered visual feedback showed that the reaction time of secondary and later swims is shorter for stimuli that appear within a narrow spatio-temporal window following a swim. This suggests that the larva may generate an expectation of stimulus position, which enables accelerated motor sequencing if the expectation is met by appropriate visual feedback. PMID:23675322

  9. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs, are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by

  10. Genome-wide analysis of ABA-responsive elements ABRE and CE3 reveals divergent patterns in Arabidopsis and rice

    Directory of Open Access Journals (Sweden)

    Riaño-Pachón Diego

    2007-08-01

    Full Text Available Abstract Background In plants, complex regulatory mechanisms are at the core of physiological and developmental processes. The phytohormone abscisic acid (ABA) is involved in the regulation of various such processes, including stomatal closure, seed and bud dormancy, and physiological responses to cold, drought and salinity stress. The underlying tissue or plant-wide control circuits often include combinatorial gene regulatory mechanisms and networks that we are only beginning to unravel with the help of new molecular tools. The increasing availability of genomic sequences and gene expression data enables us to dissect ABA regulatory mechanisms at the individual gene expression level. In this paper we used an in-silico-based approach directed towards genome-wide prediction and identification of specific features of ABA-responsive elements. In particular we analysed the genome-wide occurrence and positional arrangements of two well-described ABA-responsive cis-regulatory elements (CREs), ABRE and CE3, in thale cress (Arabidopsis thaliana) and rice (Oryza sativa). Results Our results show that Arabidopsis and rice use the ABA-responsive elements ABRE and CE3 distinctively. Earlier reports for various monocots have identified CE3 as a coupling element (CE) associated with ABRE. Surprisingly, we found that while ABRE is equally abundant in both species, CE3 is practically absent in Arabidopsis. ABRE-ABRE pairs are common in both genomes, suggesting that these can form functional ABA-responsive complexes (ABRCs) in Arabidopsis and rice. Furthermore, we detected distinct combinations, orientation patterns and DNA strand preferences of ABRE and CE3 motifs in rice gene promoters. Conclusion Our computational analyses revealed distinct recruitment patterns of ABA-responsive CREs in upstream sequences of Arabidopsis and rice. The apparent absence of CE3s in Arabidopsis suggests that another CE pairs with ABRE to establish a functional ABRC capable of

  11. Genome-wide analysis of ABA-responsive elements ABRE and CE3 reveals divergent patterns in Arabidopsis and rice.

    Science.gov (United States)

    Gómez-Porras, Judith L; Riaño-Pachón, Diego Mauricio; Dreyer, Ingo; Mayer, Jorge E; Mueller-Roeber, Bernd

    2007-08-01

    In plants, complex regulatory mechanisms are at the core of physiological and developmental processes. The phytohormone abscisic acid (ABA) is involved in the regulation of various such processes, including stomatal closure, seed and bud dormancy, and physiological responses to cold, drought and salinity stress. The underlying tissue or plant-wide control circuits often include combinatorial gene regulatory mechanisms and networks that we are only beginning to unravel with the help of new molecular tools. The increasing availability of genomic sequences and gene expression data enables us to dissect ABA regulatory mechanisms at the individual gene expression level. In this paper we used an in-silico-based approach directed towards genome-wide prediction and identification of specific features of ABA-responsive elements. In particular we analysed the genome-wide occurrence and positional arrangements of two well-described ABA-responsive cis-regulatory elements (CREs), ABRE and CE3, in thale cress (Arabidopsis thaliana) and rice (Oryza sativa). Our results show that Arabidopsis and rice use the ABA-responsive elements ABRE and CE3 distinctively. Earlier reports for various monocots have identified CE3 as a coupling element (CE) associated with ABRE. Surprisingly, we found that while ABRE is equally abundant in both species, CE3 is practically absent in Arabidopsis. ABRE-ABRE pairs are common in both genomes, suggesting that these can form functional ABA-responsive complexes (ABRCs) in Arabidopsis and rice. Furthermore, we detected distinct combinations, orientation patterns and DNA strand preferences of ABRE and CE3 motifs in rice gene promoters. Our computational analyses revealed distinct recruitment patterns of ABA-responsive CREs in upstream sequences of Arabidopsis and rice. The apparent absence of CE3s in Arabidopsis suggests that another CE pairs with ABRE to establish a functional ABRC capable of interacting with transcription factors. 
Further studies will be

  12. Genetic and genomic diversity studies of Acacia symbionts in Senegal reveal new species of Mesorhizobium with a putative geographical pattern.

    Science.gov (United States)

    Diouf, Fatou; Diouf, Diegane; Klonowska, Agnieszka; Le Queré, Antoine; Bakhoum, Niokhor; Fall, Dioumacor; Neyra, Marc; Parrinello, Hugues; Diouf, Mayecor; Ndoye, Ibrahima; Moulin, Lionel

    2015-01-01

    Acacia senegal (L) Willd. and Acacia seyal Del. are highly nitrogen-fixing and moderately salt tolerant species. In this study we focused on the genetic and genomic diversity of Acacia mesorhizobia symbionts from diverse origins in Senegal and investigated possible correlations between the genetic diversity of the strains, their soil of origin, and their tolerance to salinity. We first performed a multi-locus sequence analysis (MLSA) on fragments of five marker genes in a collection of 47 mesorhizobia strains of A. senegal and A. seyal from 8 localities. Most of the strains (60%) clustered with the M. plurifarium type strain ORS 1032T, while the others formed four new clades (MSP1 to MSP4). We sequenced and assembled seven draft genomes: four in the M. plurifarium clade (ORS3356, ORS3365, STM8773 and ORS1032T), and one each in MSP1 (STM8789), MSP2 (ORS3359) and MSP3 (ORS3324). The average nucleotide identities between these genomes, together with the MLSA, reveal three new species of Mesorhizobium. A great variability of salt tolerance was found among the strains, with a lack of correlation between the genetic diversity of mesorhizobia, their salt tolerance and the soil sample characteristics. A putative geographical pattern of A. senegal symbionts between the dryland north part and the center of Senegal was found, reflecting adaptations to specific local conditions such as the water regime. However, the presence of salt does not seem to be an important structuring factor of Mesorhizobium species.

  13. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Directory of Open Access Journals (Sweden)

    Jie Qiu

    Full Text Available Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19-0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  14. High throughput deep degradome sequencing reveals microRNAs and their targets in response to drought stress in mulberry (Morus alba).

    Science.gov (United States)

    Li, Ruixue; Chen, Dandan; Wang, Taichu; Wan, Yizhen; Li, Rongfang; Fang, Rongjun; Wang, Yuting; Hu, Fei; Zhou, Hong; Li, Long; Zhao, Weiguo

    2017-01-01

    MicroRNAs (miRNAs) play important regulatory roles by targeting mRNAs for cleavage or translational repression. Identification of miRNA targets is essential to better understanding the roles of miRNAs. miRNA targets have not been well characterized in mulberry (Morus alba). To anatomize miRNA guided gene regulation under drought stress, transcriptome-wide high throughput degradome sequencing was used in this study to directly detect drought stress responsive miRNA targets in mulberry. A drought library (DL) and a contrast library (CL) were constructed to capture the cleaved mRNAs for sequencing. In CL, 409 target genes of 30 conserved miRNA families and 990 target genes of 199 novel miRNAs were identified. In DL, 373 target genes of 30 conserved miRNA families and 950 target genes of 195 novel miRNAs were identified. Of the conserved miRNA families in DL, mno-miR156, mno-miR172, and mno-miR396 had the highest number of targets with 54, 52 and 41 transcripts, respectively, indicating that these three miRNA families and their target genes might play important functions in response to drought stress in mulberry. Additionally, we found that many of the target genes were transcription factors. By analyzing the miRNA-target molecular network, we found that the DL independent networks consisted of 838 miRNA-mRNA pairs (63.34%). The expression patterns of 11 target genes and 12 correspondent miRNAs were detected using qRT-PCR. Six miRNA targets were further verified by RNA ligase-mediated 5' rapid amplification of cDNA ends (RLM-5' RACE). Gene Ontology (GO) annotations and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that these target transcripts were implicated in a broad range of biological processes and various metabolic pathways. This is the first study to comprehensively characterize target genes and their associated miRNAs in response to drought stress by degradome sequencing in mulberry. 
This study provides a framework for understanding

  15. Next-generation sequencing of multiple individuals per barcoded library by deconvolution of sequenced amplicons using endonuclease fragment analysis

    DEFF Research Database (Denmark)

    Andersen, Jeppe D; Pereira, Vania; Pietroni, Carlotta

    2014-01-01

    The simultaneous sequencing of samples from multiple individuals increases the efficiency of next-generation sequencing (NGS) while also reducing costs. Here we describe a novel and simple approach for sequencing DNA from multiple individuals per barcode. Our strategy relies on the endonuclease digestion of PCR amplicons prior to library preparation, creating a specific fragment pattern for each individual that can be resolved after sequencing. By using both barcodes and restriction fragment patterns, we demonstrate the ability to sequence the human melanocortin 1 receptor (MC1R) genes from 72 individuals using only 24 barcoded libraries.

  16. Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing.

    Science.gov (United States)

    Anvar, Seyed Yahya; Allard, Guy; Tseng, Elizabeth; Sheynkman, Gloria M; de Klerk, Eleonora; Vermaat, Martijn; Yin, Raymund H; Johansson, Hans E; Ariyurek, Yavuz; den Dunnen, Johan T; Turner, Stephen W; 't Hoen, Peter A C

    2018-03-29

    The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtained evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells. Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing.

  17. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    Science.gov (United States)

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.

  18. Structural details (kinks and non-α conformations) in transmembrane helices are intrahelically determined and can be predicted by sequence pattern descriptors

    Science.gov (United States)

    Rigoutsos, Isidore; Riek, Peter; Graham, Robert M.; Novotny, Jiri

    2003-01-01

    One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular α-helical character (i.e. π-helices, 3₁₀-helices and kinks). A ‘search engine’ derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above ‘non-canonical’ helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from α-helicity are encoded locally in sequence patterns only about 7–9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure–function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html. PMID:12888523

  19. Structural details (kinks and non-alpha conformations) in transmembrane helices are intrahelically determined and can be predicted by sequence pattern descriptors.

    Science.gov (United States)

    Rigoutsos, Isidore; Riek, Peter; Graham, Robert M; Novotny, Jiri

    2003-08-01

    One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular alpha-helical character (i.e. pi-helices, 3(10)-helices and kinks). A 'search engine' derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above 'non-canonical' helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from alpha-helicity are encoded locally in sequence patterns only about 7-9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure-function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html.

  20. Sequence memory based on coherent spin-interaction neural networks.

    Science.gov (United States)

    Xia, Min; Wong, W K; Wang, Zhijie

    2014-12-01

    Sequence information processing, for instance sequence memory, plays an important role in many functions of the brain. In the workings of the human brain, the steady-state period is alterable. However, in the existing sequence memory models using heteroassociations, the steady-state period cannot be changed during sequence recall. In this work, a novel neural network model for sequence memory with a controllable steady-state period based on coherent spin-interaction is proposed. In the proposed model, neurons fire collectively in a phase-coherent manner, which lets a neuron group respond differently to different patterns and also lets different neuron groups respond differently to one pattern. Simulation results demonstrating the performance of the sequence memory are presented. By introducing a new coherent spin-interaction sequence memory model, the steady-state period can be controlled by dimension parameters and the overlap between the input pattern and the stored patterns. The sequence storage capacity is enlarged by coherent spin-interaction compared with the existing sequence memory models. Furthermore, the sequence storage capacity has an exponential relationship to the dimension of the neural network.

  1. Geometric Mechanics Reveals Optimal Complex Terrestrial Undulation Patterns

    Science.gov (United States)

    Gong, Chaohui; Astley, Henry; Schiebel, Perrin; Dai, Jin; Travers, Matthew; Goldman, Daniel; Choset, Howie; CMU Team; GT Team

    Geometric mechanics offers useful tools for intuitively analyzing biological and robotic locomotion. However, the utility of these tools was previously restricted to systems that have only two internal degrees of freedom and move in uniform media. We show that the kinematics of complex locomotors that make intermittent contacts with substrates can be approximated as a linear combination of two shape bases, and can be represented using two variables. Therefore, the tools of geometric mechanics can be used to analyze motions of locomotors with many degrees of freedom. To demonstrate the proposed technique, we present studies on two different types of snake gaits which utilize combinations of waves in the horizontal and vertical planes: sidewinding (in the sidewinder rattlesnake C. cerastes) and lateral undulation (in the desert specialist snake C. occipitalis). C. cerastes moves by generating posteriorly traveling body waves in the horizontal and vertical directions, with a relative phase offset equal to ±π/2, while C. occipitalis maintains a π/2 offset of a frequency-doubled vertical wave. Geometric analysis reveals that these coordination patterns enable optimal movement in the two different styles of undulatory terrestrial locomotion. More broadly, these examples demonstrate the utility of geometric mechanics in analyzing realistic biological and robotic locomotion.
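
    The two coordination patterns described above can be sketched as a pair of traveling sine waves along the body. This is a minimal illustration, not the authors' model: the unit amplitudes, the wavenumber k, the angular frequency omega, and the reading of "frequency doubled" as doubling the whole traveling-wave phase are all assumptions made here.

```python
import math

def body_waves(s, t, phase_offset, vertical_multiplier=1,
               k=2 * math.pi, omega=2 * math.pi):
    """Horizontal and vertical wave values at body position s and time t.

    s runs from head (0) to tail (1), so sin(k*s - omega*t) travels
    toward the tail, i.e. posteriorly. Amplitudes are set to 1.
    """
    theta = k * s - omega * t
    horizontal = math.sin(theta)
    vertical = math.sin(vertical_multiplier * theta + phase_offset)
    return horizontal, vertical

# Sidewinding-like pattern: same frequency, vertical wave offset by pi/2.
sidewinding = body_waves(0.0, 0.0, phase_offset=math.pi / 2)

# Lateral-undulation-like pattern: vertical wave at doubled frequency,
# offset by pi/2 (assumed interpretation of the abstract).
undulation = body_waves(0.125, 0.0, phase_offset=math.pi / 2,
                        vertical_multiplier=2)
```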

  2. Full Genome Sequencing Reveals New Southern African Territories Genotypes Bringing Us Closer to Understanding True Variability of Foot-and-Mouth Disease Virus in Africa

    Science.gov (United States)

    Lasecka-Dykes, Lidia; Wright, Caroline F.; Di Nardo, Antonello; Logan, Grace; Mioulet, Valerie; Jackson, Terry; Tuthill, Tobias J.; Knowles, Nick J.; King, Donald P.

    2018-01-01

    Foot-and-mouth disease virus (FMDV) causes a highly contagious disease of cloven-hooved animals that poses a constant burden on farmers in endemic regions and threatens the livestock industries in disease-free countries. Despite the increased number of publicly available whole genome sequences, FMDV data are biased by the opportunistic nature of sampling. Since whole genomic sequences of Southern African Territories (SAT) are particularly underrepresented, this study sequenced 34 isolates from eastern and southern Africa. Phylogenetic analyses revealed two novel genotypes (that comprised 8/34 of these SAT isolates) which contained unusual 5′ untranslated and non-structural encoding regions. While recombination has occurred between these sequences, phylogeny violation analyses indicated that the high degree of sequence diversity for the novel SAT genotypes has not solely arisen from recombination events. Based on estimates of the timing of ancestral divergence, these data are interpreted as being representative of un-sampled FMDV isolates that have been subjected to geographical isolation within Africa by the effects of the Great African Rinderpest Pandemic (1887–1897), which caused a mass die-out of FMDV-susceptible hosts. These findings demonstrate that further sequencing of African FMDV isolates is likely to reveal more unusual genotypes and will allow for better understanding of natural variability and evolution of FMDV. PMID:29652800

  3. Electrophoretic mobility shift assay reveals a novel recognition sequence for Setaria italica NAC protein.

    Science.gov (United States)

    Puranik, Swati; Kumar, Karunesh; Srivastava, Prem S; Prasad, Manoj

    2011-10-01

    The NAC (NAM/ATAF1,2/CUC2) proteins form one of the largest families of plant transcription factors. Its members have been associated with diverse plant processes and intricately regulate the expression of several genes. In spite of this immense progress, knowledge of their DNA-binding properties is still limited. In our recent publication, we reported isolation of a membrane-associated NAC domain protein from Setaria italica (SiNAC). Transactivation analysis revealed that it was a functionally active transcription factor as it could stimulate expression of reporter genes in vivo. Truncation of the transmembrane region of the protein led to its nuclear localization. Here we describe expression and purification of the SiNAC DNA-binding domain. We further report identification of a novel DNA-binding site, [C/G][A/T][T/A][G/C]TC[C/G][A/T][C/G][G/C], for SiNAC by electrophoretic mobility shift assay. The SiNAC-GST protein could bind to the NAC recognition sequence in vitro as well as to sequences where some bases had been reshuffled. The results presented here contribute to our understanding of the DNA-binding specificity of SiNAC protein.
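
    The bracketed binding site above maps directly onto a regular expression, which gives a simple way to scan a DNA sequence for candidate sites. The sketch below is illustrative: the helper names and the test sequence are made up, and only the forward strand is searched.

```python
import re

# Binding site exactly as written in the abstract.
SITE = "[C/G][A/T][T/A][G/C]TC[C/G][A/T][C/G][G/C]"

def site_to_regex(site):
    """Turn bracketed 'X/Y' alternatives into regex character classes."""
    return re.compile(site.replace("/", ""))

def find_sites(sequence, pattern):
    """Return 0-based start positions of all matches, including overlaps."""
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]

pattern = site_to_regex(SITE)

# Hypothetical test sequence, not from the paper.
hits = find_sites("aaCATGTCCAGCtt", pattern)
```

    A fuller scan would also search the reverse complement, since a binding site can lie on either strand.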

  4. Effects of tonal language background on tests of temporal sequencing in children.

    Science.gov (United States)

    Mukari, Siti Zamratol-Mai S; Yu, Xuan; Ishak, Wan Syafira; Mazlan, Rafidah

    2015-01-01

    The aims of the present study were to determine the effects of language background on the performance of the pitch pattern sequence test (PPST) and duration pattern sequence test (DPST). As temporal order sequencing may be affected by age and working memory, these factors were also studied. Performance of tonal and non-tonal language speakers on the PPST and DPST was compared. Twenty-eight native Mandarin (tonal language) speakers and twenty-nine native Malay (non-tonal language) speakers between seven and nine years old participated in this study. The results revealed that relative to native Malay speakers, native Mandarin speakers demonstrated better scores on the PPST in both humming and verbal labeling responses. However, a similar language effect was not apparent in the DPST. An age effect was only significant in the PPST (verbal labeling). Finally, no significant effect of working memory was found on the PPST and the DPST. These findings suggest that the PPST is affected by tonal language background, and highlight the importance of developing different normative values for tonal and non-tonal language speakers.

  5. Evolutionary patterns in the sequence and structure of transfer RNA: early origins of archaea and viruses.

    Directory of Open Access Journals (Sweden)

    Feng-Jie Sun

    2008-03-01

    Full Text Available Transfer RNAs (tRNAs) are ancient molecules that are central to translation. Since they probably carry evolutionary signatures that were left behind when the living world diversified, we reconstructed phylogenies directly from the sequence and structure of tRNA using well-established phylogenetic methods. The trees placed tRNAs with long variable arms charging Sec, Tyr, Ser, and Leu consistently at the base of the rooted phylogenies, but failed to reveal groupings that would indicate clear evolutionary links to organismal origin or molecular functions. In order to uncover evolutionary patterns in the trees, we forced tRNAs into monophyletic groups using constraint analyses to generate timelines of organismal diversification and test competing evolutionary hypotheses. Remarkably, organismal timelines showed Archaea was the most ancestral superkingdom, followed by viruses, then superkingdoms Eukarya and Bacteria, in that order, supporting conclusions from recent phylogenomic studies of protein architecture. Strikingly, constraint analyses showed that the origin of viruses was not only ancient, but was linked to Archaea. Our findings have important implications. They support the notion that the archaeal lineage was very ancient, resulted in the first organismal divide, and predated diversification of tRNA function and specificity. Results are also consistent with the concept that viruses contributed to the development of the DNA replication machinery during the early diversification of the living world.

  6. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing.

    Science.gov (United States)

    Zhang, Yanying; Yang, Qingsong; Ling, Juan; Van Nostrand, Joy D; Shi, Zhou; Zhou, Jizhong; Dong, Junde

    2017-01-01

    Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangrove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  7. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Yanying Zhang

    2017-10-01

    Full Text Available Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangrove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  8. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition

    Directory of Open Access Journals (Sweden)

    O'Brien Kimberly

    2008-06-01

    Full Text Available Abstract Background The Solanaceae family contains a number of important crop species including potato (Solanum tuberosum), which is grown for its underground storage organ known as a tuber. Albeit the 4th most important food crop in the world, other than a collection of ~220,000 Expressed Sequence Tags, limited genomic sequence information is currently available for potato and advances in potato yield and nutrition content would be greatly assisted through access to a complete genome sequence. While morphologically diverse, Solanaceae species such as potato, tomato, pepper, and eggplant share not only genes but also gene order thereby permitting highly informative comparative genomic analyses. Results In this study, we report on analysis of 89.9 Mb of potato genomic sequence representing 10.2% of the genome generated through end sequencing of a potato bacterial artificial chromosome (BAC) clone library (87 Mb) and sequencing of 22 potato BAC clones (2.9 Mb). The GC content of potato is very similar to Solanum lycopersicon (tomato) and other dicotyledonous species yet distinct from the monocotyledonous grass species, Oryza sativa. Parallel analyses of repetitive sequences in potato and tomato revealed substantial differences in their abundance, 34.2% in potato versus 46.3% in tomato, which is consistent with the increased genome size per haploid genome of these two Solanum species. Specific classes and types of repetitive sequences were also differentially represented between these two species including a telomeric-related repetitive sequence, ribosomal DNA, and a number of unclassified repetitive sequences. Comparative analyses between tomato and potato at the gene level revealed a high level of conservation of gene content, genic feature, and gene order although discordances in synteny were observed.
Conclusion Genomic level analyses of potato and tomato confirm that gene sequence and gene order are conserved between these solanaceous species and that

  9. Uncommon nucleotide excision repair phenotypes revealed by targeted high-throughput sequencing.

    Science.gov (United States)

    Calmels, Nadège; Greff, Géraldine; Obringer, Cathy; Kempf, Nadine; Gasnier, Claire; Tarabeux, Julien; Miguet, Marguerite; Baujat, Geneviève; Bessis, Didier; Bretones, Patricia; Cavau, Anne; Digeon, Béatrice; Doco-Fenzy, Martine; Doray, Bérénice; Feillet, François; Gardeazabal, Jesus; Gener, Blanca; Julia, Sophie; Llano-Rivas, Isabel; Mazur, Artur; Michot, Caroline; Renaldo-Robin, Florence; Rossi, Massimiliano; Sabouraud, Pascal; Keren, Boris; Depienne, Christel; Muller, Jean; Mandel, Jean-Louis; Laugel, Vincent

    2016-03-22

    Deficient nucleotide excision repair (NER) activity causes a variety of autosomal recessive diseases including xeroderma pigmentosum (XP), a disorder which predisposes to skin cancer, and the severe multisystem condition known as Cockayne syndrome (CS). In view of the clinical overlap between NER-related disorders, as well as the existence of multiple phenotypes and the numerous genes involved, we developed a new diagnostic approach based on the enrichment of 16 NER-related genes by multiplex amplification coupled with next-generation sequencing (NGS). Our test cohort consisted of 11 DNA samples, all with known mutations and/or non-pathogenic SNPs in two of the tested genes. We then used the same technique to analyse samples from a prospective cohort of 40 patients. Multiplex amplification and sequencing were performed using the AmpliSeq protocol on the Ion Torrent PGM (Life Technologies). We identified causative mutations in 17 out of the 40 patients (43%). Four patients showed biallelic mutations in the ERCC6(CSB) gene, five in the ERCC8(CSA) gene: most of them had classical CS features but some had very mild and incomplete phenotypes. A small cohort of 4 unrelated classic XP patients from the Basque country (Northern Spain) revealed a common splicing mutation in POLH (XP-variant), demonstrating a new founder effect in this population. Interestingly, our results also found ERCC2(XPD), ERCC3(XPB) or ERCC5(XPG) mutations in two cases of UV-sensitive syndrome and in two cases with mixed XP/CS phenotypes. Our study confirms that NGS is an efficient technique for the analysis of NER-related disorders on a molecular level. It is particularly useful for phenotypes with combined features or unusually mild symptoms. Targeted NGS used in conjunction with DNA repair functional tests and precise clinical evaluation permits rapid and cost-effective diagnosis in patients with NER-defects.

  10. Long-term acoustical observations of the mesopelagic fish Maurolicus muelleri reveal novel and varied vertical migration patterns

    KAUST Repository

    Staby, A; Røstad, Anders; Kaartvedt, Stein

    2011-01-01

    . The data revealed known patterns as normal diel vertical migration (DVM), midnight sinking between dusk and dawn, and periods without migrations, as well as novel behaviours consisting of early morning ascents, reverse diel vertical migrations

  11. Analysis of sequences from field samples reveals the presence of the recently described pepper vein yellows virus (genus Polerovirus) in six additional countries.

    Science.gov (United States)

    Knierim, Dennis; Tsai, Wen-Shi; Kenyon, Lawrence

    2013-06-01

    Polerovirus infection was detected by reverse transcription polymerase chain reaction (RT-PCR) in 29 pepper plants (Capsicum spp.) and one black nightshade plant (Solanum nigrum) sample collected from fields in India, Indonesia, Mali, Philippines, Thailand and Taiwan. At least two representative samples for each country were selected to generate a general polerovirus RT-PCR product of 1.4 kb length for sequencing. Sequence analysis of the partial genome sequences revealed the presence of pepper vein yellows virus (PeVYV) in all 13 samples. A 1990 Australian herbarium sample of pepper described by serological means as infected with capsicum yellows virus (CYV) was identified by sequence analysis of a partial CP sequence as probably infected with a potato leaf roll virus (PLRV) isolate.

  12. Transcriptome analysis reveals novel patterning and pigmentation genes underlying Heliconius butterfly wing pattern variation

    Directory of Open Access Journals (Sweden)

    Hines Heather M

    2012-06-01

    Full Text Available Abstract Background Heliconius butterfly wing pattern diversity offers a unique opportunity to investigate how natural genetic variation can drive the evolution of complex adaptive phenotypes. Positional cloning and candidate gene studies have identified a handful of regulatory and pigmentation genes implicated in Heliconius wing pattern variation, but little is known about the greater developmental networks within which these genes interact to pattern a wing. Here we took a large-scale transcriptomic approach to identify the network of genes involved in Heliconius wing pattern development and variation. This included applying over 140 transcriptome microarrays to assay gene expression in dissected wing pattern elements across a range of developmental stages and wing pattern morphs of Heliconius erato. Results We identified a number of putative early prepattern genes with color-pattern related expression domains. We also identified 51 genes differentially expressed in association with natural color pattern variation. Of these, the previously identified color pattern “switch gene” optix was recovered as the first transcript to show color-specific differential expression. Most differentially expressed genes were transcribed late in pupal development and have roles in cuticle formation or pigment synthesis. These include previously undescribed transporter genes associated with ommochrome pigmentation. Furthermore, we observed upregulation of melanin-repressing genes such as ebony and Dat1 in non-melanic patterns. Conclusions This study identifies many new genes implicated in butterfly wing pattern development and provides a glimpse into the number and types of genes affected by variation in genes that drive color pattern evolution.

  13. How They Move Reveals What Is Happening: Understanding the Dynamics of Big Events from Human Mobility Pattern

    Directory of Open Access Journals (Sweden)

    Jean Damascène Mazimpaka

    2017-01-01

    Full Text Available The context in which a moving object moves contributes to the movement pattern observed. Likewise, the movement pattern reflects the properties of the movement context. In particular, big events influence human mobility depending on the dynamics of the events. However, this influence has not been explored to understand big events. In this paper, we propose a methodology for learning about big events from human mobility pattern. The methodology involves extracting and analysing the stopping, approaching, and moving-away interactions between public transportation vehicles and the geographic context. The analysis is carried out at two different temporal granularity levels to discover global and local patterns. The results of evaluating this methodology on bus trajectories demonstrate that it can discover occurrences of big events from mobility patterns, roughly estimate the event start and end time, and reveal the temporal patterns of arrival and departure of event attendees. This knowledge can be usefully applied in transportation and event planning and management.

  14. Initial uncertainty impacts statistical learning in sound sequence processing.

    Science.gov (United States)

    Todd, Juanita; Provost, Alexander; Whitson, Lisa; Mullens, Daniel

    2016-11-01

    This paper features two studies confirming a lasting impact of first learning on how subsequent experience is weighted in early relevance-filtering processes. In both studies participants were exposed to sequences of sound that contained a regular pattern on two different timescales. Regular patterning in sound is readily detected by the auditory system and used to form "prediction models" that define the most likely properties of sound to be encountered in a given context. The presence and strength of these prediction models is inferred from changes in automatically elicited components of auditory evoked potentials. Both studies employed sound sequences that contained both a local and longer-term pattern. The local pattern was defined by a regular repeating pure tone occasionally interrupted by a rare deviating tone (p=0.125) that was physically different (a 30 ms vs. 60 ms duration difference in one condition and a 1000 Hz vs. 1500 Hz frequency difference in the other). The longer-term pattern was defined by the rate at which the two tones alternated probabilities (i.e., the tone that was first rare became common and the tone that was first common became rare). There was no task related to the tones and participants were asked to ignore them while focussing attention on a movie with subtitles. Auditory-evoked potentials revealed long lasting modulatory influences based on whether the tone was initially encountered as rare and unpredictable or common and predictable. The results are interpreted as evidence that probability (or indeed predictability) assigns a differential information-value to the two tones that in turn affects the extent to which prediction models are updated and imposed. These effects are exposed for both common and rare occurrences of the tones. The studies contribute to a body of work that reveals that probabilistic information is not faithfully represented in these early evoked potentials and instead exposes that predictability (or conversely
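
    The two-timescale stimulus structure described in this abstract can be sketched as blocks of tones in which one tone is rare (p = 0.125) and the two tones swap roles from block to block. This is an illustration only: the block length, the number of blocks, the tone labels, and the unconstrained shuffling are assumptions, and the study's actual sequencing rules are not given in the abstract.

```python
import random

def oddball_block(common, rare, n_tones, p_rare=0.125, rng=None):
    """Return a shuffled block with exactly round(n_tones * p_rare) rare tones."""
    rng = rng or random.Random()
    n_rare = round(n_tones * p_rare)
    block = [rare] * n_rare + [common] * (n_tones - n_rare)
    rng.shuffle(block)
    return block

def alternating_sequence(tone_a, tone_b, n_blocks, n_tones, seed=0):
    """Concatenate blocks in which the two tones swap common and rare roles."""
    rng = random.Random(seed)
    sequence = []
    for i in range(n_blocks):
        # Even blocks: tone_a is common; odd blocks: tone_b is common.
        common, rare = (tone_a, tone_b) if i % 2 == 0 else (tone_b, tone_a)
        sequence.extend(oddball_block(common, rare, n_tones, rng=rng))
    return sequence

# Hypothetical parameters: two blocks of 80 tones, duration condition.
seq = alternating_sequence("30ms", "60ms", n_blocks=2, n_tones=80)
```

    Changing n_tones changes how quickly the roles alternate, which corresponds to the longer-term pattern in the description above.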

  15. Profiling of wheat class III peroxidase genes derived from powdery mildew-attacked epidermis reveals distinct sequence-associated expression patterns.

    Science.gov (United States)

    Liu, Guosheng; Sheng, Xiaoyan; Greenshields, David L; Ogieglo, Adam; Kaminskyj, Susan; Selvaraj, Gopalan; Wei, Yangdou

    2005-07-01

    A cDNA library was constructed from leaf epidermis of diploid wheat (Triticum monococcum) infected with the powdery mildew fungus (Blumeria graminis f. sp. tritici) and was screened for genes encoding peroxidases. From 2,500 expressed sequence tags (ESTs), 36 cDNAs representing 10 peroxidase genes (designated TmPRX1 to TmPRX10) were isolated and further characterized. Alignment of the deduced amino acid sequences and phylogenetic clustering with peroxidases from other plant species demonstrated that these peroxidases fall into four distinct groups. Differential expression and tissue-specific localization among the members were observed during the B. graminis f. sp. tritici attack using Northern blots and reverse-transcriptase polymerase chain reaction analyses. Consistent with its abundance in the EST collection, TmPRX1 expression showed the highest induction during pathogen attack and fluctuated in response to the fungal parasitic stages. TmPRX1 to TmPRX6 were expressed predominantly in mesophyll cells, whereas TmPRX7 to TmPRX10, which feature a putative C-terminal propeptide, were detectable mainly in epidermal cells. Using TmPRX8 as a representative, we demonstrated that its C-terminal propeptide was sufficient to target a green fluorescent protein fusion protein to the vacuoles in onion cells. Finally, differential expression profiles of the TmPRXs after abiotic stresses and signal molecule treatments were used to dissect the potential role of these peroxidases in multiple stress and defense pathways.

  16. Movement patterns and dispersal potential of Pecos bluntnose shiner (Notropis simus pecosensis) revealed using otolith microchemistry

    Science.gov (United States)

    Chase, Nathan M.; Caldwell, Colleen A.; Carleton, Scott A.; Gould, William R.; Hobbs, James A.

    2015-01-01

    Natal origin and dispersal potential of the federally threatened Pecos bluntnose shiner (Notropis simus pecosensis) were successfully characterized using otolith microchemistry and swimming performance trials. Strontium isotope ratios (87Sr:86Sr) of otoliths within the resident plains killifish (Fundulus zebrinus) were successfully used as a surrogate for strontium isotope ratios in water and revealed three isotopically distinct reaches throughout 297 km of the Pecos River, New Mexico, USA. Two different life history movement patterns were revealed in Pecos bluntnose shiner. Eggs and fry were either retained in upper river reaches or passively dispersed downriver followed by upriver movement during the first year of life, with some fish achieving a minimum movement of 56 km. Swimming ability of Pecos bluntnose shiner confirmed upper critical swimming speeds (Ucrit) as high as 43.8 cm·s−1 and 20.6 body lengths·s−1 in 30 days posthatch fish. Strong swimming ability early in life supports our observations of upriver movement using otolith microchemistry and confirms movement patterns that were previously unknown for the species. Understanding patterns of dispersal of this and other small-bodied fishes using otolith microchemistry may help redirect conservation and management efforts for Great Plains fishes.

  17. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences.

    Science.gov (United States)

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-26

    The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consisting of N. ampullaria, N. mirabilis, N. gracilis and N. rafflesiana, and another containing both intermediately distributed species (N. albomarginata and N. benstonei) and four highland species (N. sanguinea, N. macfarlanei, N. ramispina and N. alba). The trnL intron and ITS sequences proved to provide phylogenetic informative characters for deriving a phylogeny of Nepenthes species in Peninsular Malaysia. To our knowledge, this is the first molecular phylogenetic study of Nepenthes species occurring along an altitudinal gradient in Peninsular Malaysia.

  18. Genetic and genomic diversity studies of Acacia symbionts in Senegal reveal new species of Mesorhizobium with a putative geographical pattern.

    Directory of Open Access Journals (Sweden)

    Fatou Diouf

    Full Text Available Acacia senegal (L.) Willd. and Acacia seyal Del. are highly nitrogen-fixing and moderately salt tolerant species. In this study we focused on the genetic and genomic diversity of Acacia mesorhizobia symbionts from diverse origins in Senegal and investigated possible correlations between the genetic diversity of the strains, their soil of origin, and their tolerance to salinity. We first performed a multi-locus sequence analysis on five marker gene fragments on a collection of 47 mesorhizobia strains of A. senegal and A. seyal from 8 localities. Most of the strains (60%) clustered with the M. plurifarium type strain ORS 1032T, while the others form four new clades (MSP1 to MSP4). We sequenced and assembled seven draft genomes: four in the M. plurifarium clade (ORS3356, ORS3365, STM8773 and ORS1032T), and one each in MSP1 (STM8789), MSP2 (ORS3359) and MSP3 (ORS3324). The average nucleotide identities between these genomes together with the MLSA analysis reveal three new species of Mesorhizobium. A great variability of salt tolerance was found among the strains with a lack of correlation between the genetic diversity of mesorhizobia, their salt tolerance and the soil sample characteristics. A putative geographical pattern of A. senegal symbionts between the dryland north part and the center of Senegal was found, reflecting adaptations to specific local conditions such as the water regime. However, the presence of salt does not seem to be an important structuring factor of Mesorhizobium species.

  19. Coseismic deformation pattern of the Emilia 2012 seismic sequence imaged by Radarsat-1 interferometry

    Directory of Open Access Journals (Sweden)

    Christian Bignami

    2012-10-01

    Full Text Available On May 20 and 29, 2012, two earthquakes of magnitudes 5.9 and 5.8 (Mw), respectively, and their aftershock sequences hit the central Po Plain (Italy), about 40 km north of Bologna. More than 2,000 sizable aftershocks were recorded by the Istituto Nazionale di Geofisica e Vulcanologia (INGV; National Institute of Geophysics and Volcanology) National Seismic Network (http://iside.rm.ingv.it/). The sequence was generated by pure compressional faulting over blind thrusts of the western Ferrara Arc, and it involved a 50-km-long stretch of this buried outer front of the northern Apennines. The focal mechanisms of the larger shocks agree with available structural data and with present-day tectonic stress indicators, which show locally a maximum horizontal stress oriented ca. N-S; i.e. oriented perpendicular to the main structural trends. Most of the sequence occurred between 1 km and 12 km in depth, above the local basal detachment of the outer thrust fronts of the northern Apennines. We measured the surface displacement patterns associated with the mainshocks and some of the larger aftershocks (some of which had Mw >5.0) by applying the Interferometric Synthetic Aperture Radar (InSAR) technique to a pair of C-Band Radarsat-1 images. We then used the coseismic motions detected over the epicentral region as input information, to obtain the best-fit model fault for the two largest shocks. […
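
    The InSAR measurement mentioned in the record above rests on a simple relation between unwrapped interferometric phase and displacement along the radar line of sight (LOS). The sketch below is only an illustration of that relation, not the authors' processing chain. The wavelength value (about 5.66 cm for a C-band sensor) and the sign convention are assumptions made here; real processors differ on sign.

```python
import math

# Assumption: approximate C-band radar wavelength in metres (not from the record).
C_BAND_WAVELENGTH_M = 0.0566


def los_displacement(unwrapped_phase_rad, wavelength_m=C_BAND_WAVELENGTH_M):
    """Convert unwrapped interferometric phase to line-of-sight displacement.

    The signal travels to the ground and back, so a phase change of 2*pi
    corresponds to half a wavelength of LOS motion:
        d = wavelength * phase / (4 * pi)
    The sign convention (towards or away from the sensor) is an assumption.
    """
    return wavelength_m * unwrapped_phase_rad / (4 * math.pi)


if __name__ == "__main__":
    # Displacement represented by a single interferometric fringe.
    one_fringe = los_displacement(2 * math.pi)
    print(f"one fringe = {one_fringe * 100:.2f} cm of LOS displacement")
```

    Under these assumptions each full fringe in a C-band interferogram represents roughly 2.8 cm of motion along the line of sight.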

  20. Patterns of Limnohabitans Microdiversity across a Large Set of Freshwater Habitats as Revealed by Reverse Line Blot Hybridization

    Science.gov (United States)

    Jezbera, Jan; Jezberová, Jitka; Kasalický, Vojtěch; Šimek, Karel; Hahn, Martin W.

    2013-01-01

    Among abundant freshwater Betaproteobacteria, only few groups are considered to be of central ecological importance. One of them is the well-studied genus Limnohabitans and mainly its R-BT subcluster, investigated previously mainly by fluorescence in situ hybridization methods. We designed, based on sequences from a large Limnohabitans culture collection, 18 RLBH (Reverse Line Blot Hybridization) probes specific for different groups within the genus Limnohabitans by targeting diagnostic sequences on their 16 S–23 S rRNA ITS regions. The developed probes covered in sum 92% of the available isolates. This set of probes was applied to environmental DNA originating from 161 different European standing freshwater habitats to reveal the microdiversity (intra-genus) patterns of the Limnohabitans genus along a pH gradient. Investigated habitats differed in various physicochemical parameters, and represented a very broad range of standing freshwater habitats. The Limnohabitans microdiversity, assessed as number of RLBH-defined groups detected, increased significantly along the gradient of rising pH of habitats. 14 out of 18 probes returned detection signals that allowed predictions on the distribution of distinct Limnohabitans groups. Most probe-defined Limnohabitans groups showed preferences for alkaline habitats, one for acidic, and some seemed to lack preferences. Complete niche-separation was indicated for some of the probe-targeted groups. Moreover, bimodal distributions observed for some groups of Limnohabitans, suggested further niche separation between genotypes within the same probe-defined group. Statistical analyses suggested that different environmental parameters such as pH, conductivity, oxygen and altitude influenced the distribution of distinct groups. The results of our study do not support the hypothesis that the wide ecological distribution of Limnohabitans bacteria in standing freshwater habitats results from generalist adaptations of these bacteria

  1. Patterns of Limnohabitans microdiversity across a large set of freshwater habitats as revealed by Reverse Line Blot Hybridization.

    Directory of Open Access Journals (Sweden)

    Jan Jezbera

    Full Text Available Among abundant freshwater Betaproteobacteria, only few groups are considered to be of central ecological importance. One of them is the well-studied genus Limnohabitans and mainly its R-BT subcluster, investigated previously mainly by fluorescence in situ hybridization methods. We designed, based on sequences from a large Limnohabitans culture collection, 18 RLBH (Reverse Line Blot Hybridization) probes specific for different groups within the genus Limnohabitans by targeting diagnostic sequences on their 16 S-23 S rRNA ITS regions. The developed probes covered in sum 92% of the available isolates. This set of probes was applied to environmental DNA originating from 161 different European standing freshwater habitats to reveal the microdiversity (intra-genus) patterns of the Limnohabitans genus along a pH gradient. Investigated habitats differed in various physicochemical parameters, and represented a very broad range of standing freshwater habitats. The Limnohabitans microdiversity, assessed as number of RLBH-defined groups detected, increased significantly along the gradient of rising pH of habitats. 14 out of 18 probes returned detection signals that allowed predictions on the distribution of distinct Limnohabitans groups. Most probe-defined Limnohabitans groups showed preferences for alkaline habitats, one for acidic, and some seemed to lack preferences. Complete niche-separation was indicated for some of the probe-targeted groups. Moreover, bimodal distributions observed for some groups of Limnohabitans, suggested further niche separation between genotypes within the same probe-defined group. Statistical analyses suggested that different environmental parameters such as pH, conductivity, oxygen and altitude influenced the distribution of distinct groups. The results of our study do not support the hypothesis that the wide ecological distribution of Limnohabitans bacteria in standing freshwater habitats results from generalist adaptations of

  2. Multiple oxygen tension environments reveal diverse patterns of transcriptional regulation in primary astrocytes.

    Directory of Open Access Journals (Sweden)

    Wayne Chadwick

    Full Text Available The central nervous system normally functions at O2 levels which would be regarded as hypoxic by most other tissues. However, most in vitro studies of neurons and astrocytes are conducted under hyperoxic conditions without consideration of O2-dependent cellular adaptation. We analyzed the reactivity of astrocytes to 1, 4 and 9% O2 tensions compared to the cell culture standard of 20% O2, to investigate their ability to sense and translate this O2 information to transcriptional activity. Variance of ambient O2 tension for rat astrocytes resulted in profound changes in ribosomal activity, cytoskeletal and energy-regulatory mechanisms and cytokine-related signaling. Clustering of transcriptional regulation patterns revealed four distinct response pattern groups that directionally pivoted around the 4% O2 tension, or demonstrated coherent ascending/decreasing gene expression patterns in response to diverse oxygen tensions. Immune response and cell cycle/cancer-related signaling pathway transcriptomic subsets were significantly activated with increasing hypoxia, whilst hemostatic and cardiovascular signaling mechanisms were attenuated with increasing hypoxia. Our data indicate that variant O2 tensions induce specific and physiologically-focused transcript regulation patterns that may underpin important physiological mechanisms that connect higher neurological activity to astrocytic function and ambient oxygen environments. These strongly defined patterns demonstrate a strong bias for physiological transcript programs to pivot around the 4% O2 tension, while uni-modal programs that do not, appear more related to pathological actions. The functional interaction of these transcriptional 'programs' may serve to regulate the dynamic vascular responsivity of the central nervous system during periods of stress or heightened activity.

  3. Robust and efficient multi-frequency temporal phase unwrapping: optimal fringe frequency and pattern sequence selection.

    Science.gov (United States)

    Zhang, Minliang; Chen, Qian; Tao, Tianyang; Feng, Shijie; Hu, Yan; Li, Hui; Zuo, Chao

    2017-08-21

    Temporal phase unwrapping (TPU) is an essential algorithm in fringe projection profilometry (FPP), especially when measuring complex objects with discontinuities and isolated surfaces. Among others, the multi-frequency TPU has been proven to be the most reliable algorithm in the presence of noise. For a practical FPP system, in order to achieve an accurate, efficient, and reliable measurement, one needs to make wise choices about three key experimental parameters: the highest fringe frequency, the phase-shifting steps, and the fringe pattern sequence. However, there has been very little research on how to optimize these parameters quantitatively, especially considering all three aspects from a theoretical and analytical perspective simultaneously. In this work, we propose a new scheme to determine simultaneously the optimal fringe frequency, phase-shifting steps and pattern sequence under multi-frequency TPU, robustly achieving high accuracy measurement by a minimum number of fringe frames. Firstly, noise models regarding phase-shifting algorithms as well as 3-D coordinates are established under a projector defocusing condition, which leads to the optimal highest fringe frequency for an FPP system. Then, a new concept termed frequency-to-frame ratio (FFR) that evaluates the magnitude of the contribution of each frame for TPU is defined, on which an optimal phase-shifting combination scheme is proposed. Finally, a judgment criterion is established, which can be used to judge whether the ratio between adjacent fringe frequencies is conducive to stably and efficiently unwrapping the phase. The proposed method provides a simple and effective theoretical framework to improve the accuracy, efficiency, and robustness of a practical FPP system in actual measurement conditions. The correctness of the derived models as well as the validity of the proposed schemes have been verified through extensive simulations and experiments. Based on a normal monocular 3-D FPP hardware system
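
    For readers unfamiliar with multi-frequency TPU, the following is a minimal sketch of the generic hierarchical form of the technique: each wrapped phase is unwrapped using the already-unwrapped phase of the next lower fringe frequency. It is not the optimized scheme proposed in the record above, and it assumes the lowest frequency is a single fringe across the field (so its phase needs no unwrapping). All function names and the frequency set are hypothetical.

```python
import math

TWO_PI = 2 * math.pi


def wrap(phase):
    """Wrap a phase value into the interval [0, 2*pi)."""
    return phase % TWO_PI


def unwrap_with_reference(phi_high, ref_unwrapped, f_high, f_low):
    """Unwrap one high-frequency phase using a lower-frequency reference.

    phi_high      -- wrapped phase at frequency f_high (radians)
    ref_unwrapped -- unwrapped (absolute) phase at frequency f_low
    """
    # Scale the reference phase up to predict the high-frequency absolute phase.
    predicted = ref_unwrapped * (f_high / f_low)
    # Fringe order: the integer number of 2*pi periods missing from phi_high.
    order = round((predicted - phi_high) / TWO_PI)
    return phi_high + TWO_PI * order


def multi_frequency_unwrap(wrapped_phases, frequencies):
    """Unwrap hierarchically, from the lowest to the highest frequency.

    Assumption: frequencies are ascending and frequencies[0] is a single
    fringe period, so wrapped_phases[0] is already an absolute phase.
    """
    absolute = wrapped_phases[0]
    for i in range(1, len(frequencies)):
        absolute = unwrap_with_reference(
            wrapped_phases[i], absolute, frequencies[i], frequencies[i - 1]
        )
    return absolute


if __name__ == "__main__":
    frequencies = [1, 8, 64]   # hypothetical fringe counts across the field
    x = 0.3721                 # normalized position along the projector axis
    wrapped = [wrap(TWO_PI * f * x) for f in frequencies]
    print(multi_frequency_unwrap(wrapped, frequencies))
```

    In this noise-free sketch the rounding step always recovers the correct fringe order. With noisy phases, the ratio between adjacent frequencies limits how much noise the rounding tolerates, which is the kind of trade-off the record's judgment criterion addresses.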

  4. The complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic specialist.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    Full Text Available Fibrobacter succinogenes is an important member of the rumen microbial community that converts plant biomass into nutrients usable by its host. This bacterium, which is also one of only two cultivated species in its phylum, is an efficient and prolific degrader of cellulose. Specifically, it has a particularly high activity against crystalline cellulose that requires close physical contact with this substrate. However, unlike other known cellulolytic microbes, it does not degrade cellulose using a cellulosome or by producing high extracellular titers of cellulase enzymes. To better understand the biology of F. succinogenes, we sequenced the genome of the type strain S85 to completion. A total of 3,085 open reading frames were predicted from its 3.84 Mbp genome. Analysis of sequences predicted to encode for carbohydrate-degrading enzymes revealed an unusually high number of genes that were classified into 49 different families of glycoside hydrolases, carbohydrate binding modules (CBMs), carbohydrate esterases, and polysaccharide lyases. Of the 31 identified cellulases, none contain CBMs in families 1, 2, and 3, typically associated with crystalline cellulose degradation. Polysaccharide hydrolysis and utilization assays showed that F. succinogenes was able to hydrolyze a number of polysaccharides, but could only utilize the hydrolytic products of cellulose. This suggests that F. succinogenes uses its array of hemicellulose-degrading enzymes to remove hemicelluloses to gain access to cellulose. This is reflected in its genome, as F. succinogenes lacks many of the genes necessary to transport and metabolize the hydrolytic products of non-cellulose polysaccharides. The F. succinogenes genome reveals a bacterium that specializes in cellulose as its sole energy source, and provides insight into a novel strategy for cellulose degradation.

  5. Self-Organization of Spatio-Temporal Hierarchy via Learning of Dynamic Visual Image Patterns on Action Sequences.

    Science.gov (United States)

    Jung, Minju; Hwang, Jungsik; Tani, Jun

    2015-01-01

    It is well known that the visual cortex efficiently processes high-dimensional spatial information by using a hierarchical structure. Recently, computational models that were inspired by the spatial hierarchy of the visual cortex have shown remarkable performance in image recognition. Up to now, however, most biological and computational modeling studies have mainly focused on the spatial domain and do not discuss temporal domain processing of the visual cortex. Several studies on the visual cortex and other brain areas associated with motor control support that the brain also uses its hierarchical structure as a processing mechanism for temporal information. Based on the success of previous computational models using spatial hierarchy and temporal hierarchy observed in the brain, the current report introduces a novel neural network model for the recognition of dynamic visual image patterns based solely on the learning of exemplars. This model is characterized by the application of both spatial and temporal constraints on local neural activities, resulting in the self-organization of a spatio-temporal hierarchy necessary for the recognition of complex dynamic visual image patterns. The evaluation with the Weizmann dataset in recognition of a set of prototypical human movement patterns showed that the proposed model is significantly robust in recognizing dynamically occluded visual patterns compared to other baseline models. Furthermore, an evaluation test for the recognition of concatenated sequences of those prototypical movement patterns indicated that the model is endowed with a remarkable capability for the contextual recognition of long-range dynamic visual image patterns.

  6. Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

    Science.gov (United States)

    Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

    2008-01-01

    Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.

  7. Process time optimization of robotic remote laser cutting by utilizing customized beam patterns and redundancy space task sequencing

    DEFF Research Database (Denmark)

    Villumsen, Sigurd

    This dissertation is written as a part of the ROBOCUT project which concerns the development of a new laser cutting technology that seeks to increase the performance of traditional and remote laser cutting by using beam shaping technologies. The resulting customized beam patterns are obtained by ...... axes of the laser cutting system and transforming the sequencing problem into a generalized traveling salesman problem (GTSP)....

  8. Prokaryotic caspase homologs: phylogenetic patterns and functional characteristics reveal considerable diversity.

    Directory of Open Access Journals (Sweden)

    Johannes Asplund-Samuelsson

    Full Text Available Caspases accomplish initiation and execution of apoptosis, a programmed cell death process specific to metazoans. The existence of prokaryotic caspase homologs, termed metacaspases, has been known for slightly more than a decade. Despite their potential connection to the evolution of programmed cell death in eukaryotes, the phylogenetic distribution and functions of these prokaryotic metacaspase sequences are largely uncharted, while a few experiments imply involvement in programmed cell death. Aiming at providing a more detailed picture of prokaryotic caspase homologs, we applied a computational approach based on Hidden Markov Model search profiles to identify and functionally characterize putative metacaspases in bacterial and archaeal genomes. Out of the total of 1463 analyzed genomes, merely 267 (18%) were identified to contain putative metacaspases, but their taxonomic distribution included most prokaryotic phyla and a few archaea (Euryarchaeota). Metacaspases were particularly abundant in Alphaproteobacteria, Deltaproteobacteria and Cyanobacteria, which harbor many morphologically and developmentally complex organisms, and a distinct correlation was found between abundance and phenotypic complexity in Cyanobacteria. Notably, Bacillus subtilis and Escherichia coli, known to undergo genetically regulated autolysis, lacked metacaspases. Pfam domain architecture analysis combined with operon identification revealed rich and varied configurations among the metacaspase sequences. These imply roles in programmed cell death, but also e.g. in signaling, various enzymatic activities and protein modification. Together our data show a wide and scattered distribution of caspase homologs in prokaryotes with structurally and functionally diverse sub-groups, and with a potentially intriguing evolutionary role. These features will help delineate future characterizations of death pathways in prokaryotes.

  9. Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform.

    Science.gov (United States)

    Schirmer, Melanie; Ijaz, Umer Z; D'Amore, Rosalinda; Hall, Neil; Sloan, William T; Quince, Christopher

    2015-03-31

    With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. The online database MaarjAM reveals global and ecosystemic distribution patterns in arbuscular mycorrhizal fungi (Glomeromycota).

    Science.gov (United States)

    Opik, M; Vanatoa, A; Vanatoa, E; Moora, M; Davison, J; Kalwij, J M; Reier, U; Zobel, M

    2010-10-01

    • Here, we describe a new database, MaarjAM, that summarizes publicly available Glomeromycota DNA sequence data and associated metadata. The goal of the database is to facilitate the description of distribution and richness patterns in this group of fungi. • Small subunit (SSU) rRNA gene sequences and available metadata were collated from all suitable taxonomic and ecological publications. These data have been made accessible in an open-access database (http://maarjam.botany.ut.ee). • Two hundred and eighty-two SSU rRNA gene virtual taxa (VT) were described based on a comprehensive phylogenetic analysis of all collated Glomeromycota sequences. Two-thirds of VT showed limited distribution ranges, occurring in single current or historic continents or climatic zones. Those VT that associated with a taxonomically wide range of host plants also tended to have a wide geographical distribution, and vice versa. No relationships were detected between VT richness and latitude, elevation or vascular plant richness. • The collated Glomeromycota molecular diversity data suggest limited distribution ranges in most Glomeromycota taxa and a positive relationship between the width of a taxon's geographical range and its host taxonomic range. Inconsistencies between molecular and traditional taxonomy of Glomeromycota, and shortage of data from major continents and ecosystems, are highlighted.

  11. Deep sequencing of small RNA libraries from human prostate epithelial and stromal cells reveal distinct pattern of microRNAs primarily predicted to target growth factors.

    Science.gov (United States)

    Singh, Savita; Zheng, Yun; Jagadeeswaran, Guru; Ebron, Jey Sabith; Sikand, Kavleen; Gupta, Sanjay; Sunker, Ramanjulu; Shukla, Girish C

    2016-02-28

    Complex epithelial and stromal cell interactions are required during the development and progression of prostate cancer. Regulatory small non-coding microRNAs (miRNAs) participate in the spatiotemporal regulation of messenger RNA (mRNA) and regulation of translation affecting a large number of genes involved in prostate carcinogenesis. In this study, through deep-sequencing of size fractionated small RNA libraries we profiled the miRNAs of prostate epithelial (PrEC) and stromal (PrSC) cells. Over 50 million reads were obtained for PrEC in which 860,468 were unique sequences. Similarly, nearly 76 million reads for PrSC were obtained in which over 1 million were unique reads. Expression of many miRNAs of broadly conserved and poorly conserved miRNA families were identified. Sixteen highly expressed miRNAs with significant change in expression in PrSC relative to PrEC were further analyzed in silico. ConsensusPathDB showed the target genes of these miRNAs were significantly involved in adherens junction, cell adhesion, EGFR, TGF-β and androgen signaling. Let-7 family of tumor-suppressor miRNAs expression was highly pervasive in both PrEC and PrSC cells. In addition, we have also identified several miRNAs that are unique to PrEC or PrSC cells and their predicted putative targets are a group of transcription factors. This study provides perspective on the miRNA expression in PrEC and PrSC, and reveals a global trend in miRNA interactome. We conclude that the most abundant miRNAs are potential regulators of development and differentiation of the prostate gland by targeting a set of growth factors. Additionally, high-level expression of most members of the let-7 family of miRNAs suggests their role in the fine tuning of the growth and proliferation of prostate epithelial and stromal cells. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  12. Expression profile of the Schistosoma japonicum degradome reveals differential protease expression patterns and potential anti-schistosomal intervention targets.

    Directory of Open Access Journals (Sweden)

    Shuai Liu

    2014-10-01

    Full Text Available Blood fluke proteases play pivotal roles in the processes of invasion, nutrition acquisition, immune evasion, and other host-parasite interactions. Hundreds of genes encoding putative proteases have been identified in the recently published schistosome genomes. However, the expression profiles of these proteases in Schistosoma species have not yet been systematically analyzed. We retrieved and culled the redundant protease sequences of Schistosoma japonicum, Schistosoma mansoni, Echinococcus multilocularis, and Clonorchis sinensis from public databases utilizing bioinformatic approaches. The degradomes of the four parasitic organisms and Homo sapiens were then comparatively analyzed. A total of 262 S. japonicum protease sequences were obtained and the expression profiles generated using whole-genome microarray. Four main clusters of protease genes with different expression patterns were identified: proteases up-regulated in hepatic schistosomula and adult worms, egg-specific or predominantly expressed proteases, cercaria-specific or predominantly expressed proteases, and constantly expressed proteases. A subset of protease genes with different expression patterns were further validated using real-time quantitative PCR. The present study represents the most comprehensive analysis of a degradome in Schistosoma species to date. These results provide a firm foundation for future research on the specific function(s) of individual proteases and may help to refine anti-proteolytic strategies in blood flukes.

  13. A sequence identification measurement model to investigate the implicit learning of metrical temporal patterns.

    Directory of Open Access Journals (Sweden)

    Benjamin G Schultz

    Full Text Available Implicit learning (IL) occurs unconsciously and without intention. Perceptual fluency is the ease of processing elicited by previous exposure to a stimulus. It has been assumed that perceptual fluency is associated with IL. However, the role of perceptual fluency following IL has not been investigated in temporal pattern learning. Two experiments by Schultz, Stevens, Keller, and Tillmann demonstrated the IL of auditory temporal patterns using a serial reaction-time task and a generation task based on the process dissociation procedure. The generation task demonstrated that learning was implicit in both experiments via motor fluency, that is, the inability to suppress learned information. With the aim to disentangle conscious and unconscious processes, we analyze unreported recognition data associated with the Schultz et al. experiments using the sequence identification measurement model. The model assumes that perceptual fluency reflects unconscious processes and IL. For Experiment 1, the model indicated that conscious and unconscious processes contributed to recognition of temporal patterns, but that unconscious processes had a greater influence on recognition than conscious processes. In the model implementation of Experiment 2, there was equal contribution of conscious and unconscious processes in the recognition of temporal patterns. As Schultz et al. demonstrated IL in both experiments using a generation task, and the conditions reported here in Experiments 1 and 2 were identical, two explanations are offered for the discrepancy in model and behavioral results based on the two tasks: (1) perceptual fluency may not be necessary to infer IL, or (2) conscious control over implicitly learned information may vary as a function of perceptual fluency and motor fluency.

  14. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2011-10-01

    Full Text Available Abstract Background: Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding, and whether barcoding of BACs is dispensable. Results: Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion: Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  15. Multilocus sequence typing reveals a novel subspeciation of Lactobacillus delbrueckii.

    Science.gov (United States)

    Tanigawa, Kana; Watanabe, Koichi

    2011-03-01

    Currently, the species Lactobacillus delbrueckii is divided into four subspecies, L. delbrueckii subsp. delbrueckii, L. delbrueckii subsp. bulgaricus, L. delbrueckii subsp. indicus and L. delbrueckii subsp. lactis. These classifications were based mainly on phenotypic identification methods and few studies have used genotypic identification methods. As a result, these subspecies have not yet been reliably delineated. In this study, the four subspecies of L. delbrueckii were discriminated by phenotype and by genotypic identification [amplified-fragment length polymorphism (AFLP) and multilocus sequence typing (MLST)] methods. The MLST method developed here was based on the analysis of seven housekeeping genes (fusA, gyrB, hsp60, ileS, pyrG, recA and recG). The MLST method had good discriminatory ability: the 41 strains of L. delbrueckii examined were divided into 34 sequence types, with 29 sequence types represented by only a single strain. The sequence types were divided into eight groups. These groups could be discriminated as representing different subspecies. The results of the AFLP and MLST analyses were consistent. The type strain of L. delbrueckii subsp. delbrueckii, YIT 0080(T), was clearly discriminated from the other strains currently classified as members of this subspecies, which were located close to strains of L. delbrueckii subsp. lactis. The MLST scheme developed in this study should be a useful tool for the identification of strains of L. delbrueckii to the subspecies level.
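    The MLST bookkeeping described in this record (seven housekeeping loci, with strains grouped into sequence types) can be illustrated with a minimal Python sketch. It assumes the common MLST convention that strains with identical allele profiles across all loci share one sequence type; the strain names and allele numbers below are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

# The seven housekeeping loci named in the abstract.
LOCI = ("fusA", "gyrB", "hsp60", "ileS", "pyrG", "recA", "recG")


def assign_sequence_types(profiles: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Map each strain to a sequence type (ST) number.

    Strains whose allele numbers match at every locus receive the same ST.
    ST numbers are assigned in order of first appearance (an assumption made
    for illustration, not a rule from the study).
    """
    st_by_profile: Dict[Tuple[int, ...], int] = {}
    result: Dict[str, int] = {}
    for strain, alleles in profiles.items():
        key = tuple(alleles[locus] for locus in LOCI)
        if key not in st_by_profile:
            st_by_profile[key] = len(st_by_profile) + 1
        result[strain] = st_by_profile[key]
    return result


# Hypothetical allele profiles: strain_A and strain_C are identical,
# strain_B differs at hsp60 only.
example = {
    "strain_A": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1))),
    "strain_B": dict(zip(LOCI, (1, 1, 2, 1, 1, 1, 1))),
    "strain_C": dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1))),
}
sts = assign_sequence_types(example)
```

    Under this convention, a single allele difference at any one locus is enough to define a new sequence type, which would be consistent with 41 strains yielding 34 types.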

  16. Sequence and expression pattern of the germ line marker vasa in honey bees and stingless bees

    Science.gov (United States)

    2009-01-01

    Queens and workers of social insects differ in the rates of egg laying. Using genomic information we determined the sequence of vasa, a highly conserved gene specific to the germ line of metazoans, for the honey bee and four stingless bees. The vasa sequence of social bees differed from that of other insects in two motifs. By RT-PCR we confirmed the germ line specificity of Amvasa expression in honey bees. In situ hybridization on ovarioles showed that Amvasa is expressed throughout the germarium, except for the transition zone beneath the terminal filament. A diffuse vasa signal was also seen in terminal filaments suggesting the presence of germ line cells. Oocytes showed elevated levels of Amvasa transcripts in the lower germarium and after follicles became segregated. In previtellogenic follicles, Amvasa transcription was detected in the trophocytes, which appear to supply its mRNA to the growing oocyte. A similar picture was obtained for ovarioles of the stingless bee Melipona quadrifasciata, except that Amvasa expression was higher in the oocytes of previtellogenic follicles. The social bees differ in this respect from Drosophila, the model system for insect oogenesis, suggesting that changes in the sequence and expression pattern of vasa may have occurred during social evolution. PMID:21637523

  17. Sequence and expression pattern of the germ line marker vasa in honey bees and stingless bees

    Directory of Open Access Journals (Sweden)

    Érica Donato Tanaka

    2009-01-01

    Full Text Available Queens and workers of social insects differ in the rates of egg laying. Using genomic information we determined the sequence of vasa, a highly conserved gene specific to the germ line of metazoans, for the honey bee and four stingless bees. The vasa sequence of social bees differed from that of other insects in two motifs. By RT-PCR we confirmed the germ line specificity of Amvasa expression in honey bees. In situ hybridization on ovarioles showed that Amvasa is expressed throughout the germarium, except for the transition zone beneath the terminal filament. A diffuse vasa signal was also seen in terminal filaments suggesting the presence of germ line cells. Oocytes showed elevated levels of Amvasa transcripts in the lower germarium and after follicles became segregated. In previtellogenic follicles, Amvasa transcription was detected in the trophocytes, which appear to supply its mRNA to the growing oocyte. A similar picture was obtained for ovarioles of the stingless bee Melipona quadrifasciata, except that Amvasa expression was higher in the oocytes of previtellogenic follicles. The social bees differ in this respect from Drosophila, the model system for insect oogenesis, suggesting that changes in the sequence and expression pattern of vasa may have occurred during social evolution.

  18. Contrasting patterns of evolutionary constraint and novelty revealed by comparative sperm proteomic analysis in Lepidoptera.

    Science.gov (United States)

    Whittington, Emma; Forsythe, Desiree; Borziak, Kirill; Karr, Timothy L; Walters, James R; Dorus, Steve

    2017-12-02

    Rapid evolution is a hallmark of reproductive genetic systems and arises through the combined processes of sequence divergence, gene gain and loss, and changes in gene and protein expression. While studies aiming to disentangle the molecular ramifications of these processes are progressing, we still know little about the genetic basis of evolutionary transitions in reproductive systems. Here we conduct the first comparative analysis of sperm proteomes in Lepidoptera, a group that exhibits dichotomous spermatogenesis, in which males produce a functional fertilization-competent sperm (eupyrene) and an incompetent sperm morph lacking nuclear DNA (apyrene). Through the integrated application of evolutionary proteomics and genomics, we characterize the genomic patterns potentially associated with the origination and evolution of this unique spermatogenic process and assess the importance of genetic novelty in Lepidopteran sperm biology. Comparison of the newly characterized Monarch butterfly (Danaus plexippus) sperm proteome to those of the Carolina sphinx moth (Manduca sexta) and the fruit fly (Drosophila melanogaster) demonstrated conservation at the level of protein abundance and post-translational modification within Lepidoptera. In contrast, comparative genomic analyses across insects reveal significant divergence at two levels that differentiate the genetic architecture of sperm in Lepidoptera from other insects. First, a significant reduction in orthology among Monarch sperm genes relative to the remainder of the genome in non-Lepidopteran insect species was observed. Second, a substantial number of sperm proteins were found to be specific to Lepidoptera, in that they lack detectable homology to the genomes of more distantly related insects. Lastly, the functional importance of Lepidoptera-specific sperm proteins is broadly supported by their increased abundance relative to proteins conserved across insects. Our results identify a burst of genetic novelty

  19. Genome sequencing and analysis reveals possible determinants of Staphylococcus aureus nasal carriage

    Directory of Open Access Journals (Sweden)

    Cole Alexander M

    2008-09-01

    Full Text Available Abstract Background: Nasal carriage of Staphylococcus aureus is a major risk factor in clinical and community settings due to the range of etiologies caused by the organism. We have identified unique immunological and ultrastructural properties associated with nasal carriage isolates denoting a role for bacterial factors in nasal carriage. However, despite extensive molecular level characterizations by several groups suggesting factors necessary for colonization on nasal epithelium, genetic determinants of nasal carriage are unknown. Herein, we have set a genomic foundation for unraveling the bacterial determinants of nasal carriage in S. aureus. Results: MLST analysis revealed no lineage-specific differences between carrier and non-carrier strains, suggesting a role for mobile genetic elements. We completely sequenced a model carrier isolate (D30) and a model non-carrier strain (930918-3) to identify differential gene content. Comparison revealed the presence of 84 genes unique to the carrier strain and strongly suggests a role for Type VII secretion systems in nasal carriage. These genes, along with a putative pathogenicity island (SaPIBov) present uniquely in the carrier strains, are likely important in affecting carriage. Further, PCR-based genotyping of other clinical isolates for a specific subset of these 84 genes raises the possibility of nasal carriage being caused by multiple gene sets. Conclusion: Our data suggest that carriage is likely a heterogeneic phenotypic trait and imply a role for nucleotide level polymorphism in carriage. Complete genome level analyses of multiple carriage strains of S. aureus will be important in clarifying molecular determinants of S. aureus nasal carriage.

  20. Structure-Related Roles for the Conservation of the HIV-1 Fusion Peptide Sequence Revealed by Nuclear Magnetic Resonance.

    Science.gov (United States)

    Serrano, Soraya; Huarte, Nerea; Rujas, Edurne; Andreu, David; Nieva, José L; Jiménez, María Angeles

    2017-10-17

    Despite extensive characterization of the human immunodeficiency virus type 1 (HIV-1) hydrophobic fusion peptide (FP), the structure-function relationships underlying its extraordinary degree of conservation remain poorly understood. Specifically, the fact that the tandem repeat of the FLGFLG tripeptide is absolutely conserved suggests that high hydrophobicity may not suffice to unleash FP function. Here, we have compared the nuclear magnetic resonance (NMR) structures adopted in nonpolar media by two FP surrogates, wtFP-tag and scrFP-tag, which had equal hydrophobicity but contained wild-type and scrambled core sequences LFLGFLG and FGLLGFL, respectively. In addition, these peptides were tagged at their C-termini with an epitope sequence that folded independently, thereby allowing Western blot detection without interfering with FP structure. We observed similar α-helical FP conformations for both specimens dissolved in the low-polarity medium 25% (v/v) 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP), but important differences in contact with micelles of the membrane mimetic dodecylphosphocholine (DPC). Thus, whereas wtFP-tag preserved a helix displaying a Gly-rich ridge, the scrambled sequence lost in great part the helical structure upon being solubilized in DPC. Western blot analyses further revealed the capacity of wtFP-tag to assemble trimers in membranes, whereas membrane oligomers were not observed in the case of the scrFP-tag sequence. We conclude that, beyond hydrophobicity, preserving sequence order is an important feature for defining the secondary structures and oligomeric states adopted by the HIV FP in membranes.

  1. Deep amplicon sequencing reveals mixed phytoplasma infection within single grapevine plants

    DEFF Research Database (Denmark)

    Nicolaisen, Mogens; Contaldo, Nicoletta; Makarova, Olga

    2011-01-01

    The diversity of phytoplasmas within single plants has not yet been fully investigated. In this project, deep amplicon sequencing was used to generate 50,926 phytoplasma sequences from 11 phytoplasma-infected grapevine samples from a PCR amplicon in the 5' end of the 16S region. After clustering ...

  2. Complete sequence analysis reveals two distinct poleroviruses infecting cucurbits in China.

    Science.gov (United States)

    Xiang, Hai-ying; Shang, Qiao-xia; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

    2008-01-01

    The complete RNA genomes of a Chinese isolate of cucurbit aphid-borne yellows virus (CABYV-CHN) and a new polerovirus tentatively referred to as melon aphid-borne yellows virus (MABYV) were determined. The entire genome of CABYV-CHN shared 89.0% nucleotide sequence identity with the French CABYV isolate. In contrast, nucleotide sequence identities between MABYV and CABYV and other poleroviruses were in the range of 50.7-74.2%, with amino acid sequence identities ranging from 24.8 to 82.9% for individual gene products. We propose that CABYV-CHN is a strain of CABYV and that MABYV is a member of a tentative distinct species within the genus Polerovirus.
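    The identity percentages quoted in this record are pairwise comparisons of aligned sequences. A minimal Python sketch of one way to compute such a figure follows. It assumes the two sequences are already aligned to equal length and that columns containing a gap in either sequence are excluded from the count; the abstract does not state which convention the authors used, and other definitions of the denominator exist. The short sequences are made up for illustration.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are skipped (an assumed
    convention; other definitions divide by the full alignment length).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


# Hypothetical toy alignments, not real polerovirus sequences.
identity_no_gaps = percent_identity("ACGUACGU", "ACGUACGA")    # 7 of 8 columns match
identity_with_gap = percent_identity("AC-GU", "ACAGA")         # 3 of 4 ungapped columns match
```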

  3. MR of normal pancreas : comparison of five pulse sequences and enhancing patterns on dynamic imaging

    International Nuclear Information System (INIS)

    Jang, Hyun Jung; Kim, Tae Kyoung; Hong, Sung Hwan; Han, Joon Koo; Choi, Byung Ihn

    1997-01-01

    To compare T1-weighted FLASH and turbo spin echo (SE) T2-weighted sequences with conventional T1- and T2-weighted sequences in imaging normal pancreas and to describe the enhancing patterns on dynamic MR imaging. Forty-four patients with presumed hepatic hemangiomas were studied at 1.0T or 1.5T by using conventional SE sequences (T1-weighted, T2-weighted, and heavily T2-weighted), turbo-SE T2-weighted sequences, and breath-hold T1-weighted FLASH sequences acquired before, immediately on, and at 1, 2, 3, and 5 or 10 minutes after injection of a bolus of gadopentetate dimeglumine. No patient had a history or clinical features of pancreatic disease. Images were quantitatively analyzed for signal-difference-to-noise ratios (SD/Ns) between the pancreas and peripancreatic fat. Percentage enhancement of the pancreas was measured on each dynamic MR image. Conspicuity of the pancreatic border was qualitatively evaluated by consensus of three radiologists. Turbo-SE T2-weighted images had a significantly higher SD/N ratio (p<0.001) and better conspicuity of the pancreatic border (p<0.001) than SE T2- and heavily T2-weighted images; T1-weighted SE images had a significantly higher SD/N ratio than T1-weighted FLASH images (p<0.001), but there was no significant difference between them in qualitative analysis (p=0.346). Percentage enhancement immediately on and at 1, 2, 3, 5, and 10 minutes after administration of contrast material was 39.9%, 44.5%, 42.9%, 40.8%, 36.3%, 29.9%, respectively, with peak enhancement at 1 minute. In MR imaging of normal pancreas, turbo-SE T2-weighted imaging is superior to SE T2- and heavily T2-weighted imaging, and SE T1-weighted imaging is superior to T1-weighted FLASH imaging. On serial gadolinium-enhanced FLASH imaging, normal pancreas shows peak enhancement at 1 minute

  4. Genome-wide analysis of WRKY transcription factors in white pear (Pyrus bretschneideri) reveals evolution and patterns under drought stress.

    Science.gov (United States)

    Huang, Xiaosan; Li, Kongqing; Xu, Xiaoyong; Yao, Zhenghong; Jin, Cong; Zhang, Shaoling

    2015-12-24

    WRKY transcription factors (TFs) constitute one of the largest protein families in higher plants, and its members contain one or two conserved WRKY domains, about 60 amino acid residues with the WRKYGQK sequence followed by a C2H2 or C2HC zinc finger motif. WRKY proteins play significant roles in plant development, and in responses to biotic and abiotic stresses. Pear (Pyrus bretschneideri) is one of the most important fruit crops in the world and is frequently threatened by abiotic stress, such as drought, affecting growth, development and productivity. Although the pear genome sequence has been released, little is known about the WRKY TFs in pear, especially in response to drought stress at the genome-wide level. We identified a total of 103 WRKY TFs in the pear genome. Based on the structural features of WRKY proteins and topology of the phylogenetic tree, the pear WRKY (PbWRKY) family was classified into seven groups (Groups 1, 2a-e, and 3). The microsynteny analysis indicated that 33 (32%) PbWRKY genes were tandemly duplicated and 57 genes (55.3%) were segmentally duplicated. RNA-seq experiment data and quantitative real-time reverse transcription PCR revealed that PbWRKY genes in different groups were induced by drought stress, and Group 2a and 3 were mainly involved in the biological pathways in response to drought stress. Furthermore, adaptive evolution analysis detected a significant positive selection for Pbr001425 in Group 3, and its expression pattern differed from that of other members in this group. The present study provides a solid foundation for further functional dissection and molecular evolution of WRKY TFs in pear, especially for improving the water-deficient resistance of pear through manipulation of the PbWRKYs.

  5. Compact flow diagrams for state sequences

    NARCIS (Netherlands)

    Buchin, K.A.; Buchin, M.E.; Gudmundsson, J.; Horton, M.J.; Sijben, S.

    2016-01-01

    We introduce the concept of compactly representing a large number of state sequences, e.g., sequences of activities, as a flow diagram. We argue that the flow diagram representation gives an intuitive summary that allows the user to detect patterns among large sets of state sequences. Simplified,

  6. Magnetic nanoparticle imaging by random and maximum length sequences of inhomogeneous activation fields.

    Science.gov (United States)

    Baumgarten, Daniel; Eichardt, Roland; Crevecoeur, Guillaume; Supriyanto, Eko; Haueisen, Jens

    2013-01-01

    Biomedical applications of magnetic nanoparticles require a precise knowledge of their biodistribution. From multi-channel magnetorelaxometry measurements, this distribution can be determined by means of inverse methods. It was recently shown that the combination of sequential inhomogeneous excitation fields in these measurements is favorable regarding the reconstruction accuracy when compared to homogeneous activation. In this paper, approaches for the determination of activation sequences for these measurements are investigated. To this end, consecutive activation of single coils, random activation patterns and families of m-sequences are examined in computer simulations involving a sample measurement setup and compared with respect to the relative condition number of the system matrix. We find that the values of this condition number decrease with a larger number of measurement samples for all approaches. Random sequences and m-sequences reveal similar results with a significant reduction of the required number of samples. We conclude that the application of pseudo-random sequences for sequential activation in the magnetorelaxometry imaging of magnetic nanoparticles considerably reduces the number of required sequences while preserving the relevant measurement information.
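    For readers unfamiliar with m-sequences (maximum length sequences), the following Python sketch generates one with a linear feedback shift register (LFSR). It is a generic textbook construction, not the authors' implementation: the 4-bit register, the tap positions and the seed are illustrative choices, and how the resulting bits would be mapped onto coil activations is not specified in the abstract.

```python
from typing import List, Sequence


def lfsr_sequence(taps: Sequence[int], seed: Sequence[int], length: int) -> List[int]:
    """Generate `length` output bits from a Fibonacci LFSR.

    `seed` is the initial register content (must not be all zeros) and
    `taps` are the register indices XORed together to form the feedback bit.
    """
    if not any(seed):
        raise ValueError("seed must contain at least one 1")
    state = list(seed)
    out: List[int] = []
    for _ in range(length):
        out.append(state[-1])          # emit the last register bit
        feedback = 0
        for t in taps:
            feedback ^= state[t]       # XOR of the tapped bits
        state = [feedback] + state[:-1]  # shift right, insert feedback at the front
    return out


# A 4-bit register with taps on the last two cells cycles through all
# 2**4 - 1 = 15 nonzero states, so the output repeats with period 15.
seq = lfsr_sequence(taps=(2, 3), seed=(1, 0, 0, 0), length=30)
one_period = seq[:15]
```

    One period of such a sequence contains eight ones and seven zeros, the near-balanced property that makes m-sequences behave like pseudo-random on/off patterns.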

  7. Bacterial Pathogens and Community Composition in Advanced Sewage Treatment Systems Revealed by Metagenomics Analysis Based on High-Throughput Sequencing

    Science.gov (United States)

    Lu, Xin; Zhang, Xu-Xiang; Wang, Zhu; Huang, Kailong; Wang, Yuan; Liang, Weigang; Tan, Yunfei; Liu, Bo; Tang, Junying

    2015-01-01

    This study used 454 pyrosequencing, Illumina high-throughput sequencing and metagenomic analysis to investigate bacterial pathogens and their potential virulence in a sewage treatment plant (STP) applying both conventional and advanced treatment processes. Pyrosequencing and Illumina sequencing consistently demonstrated that the Arcobacter genus occupied over 43.42% of total abundance of potential pathogens in the STP. At species level, potential pathogens Arcobacter butzleri, Aeromonas hydrophila and Klebsiella pneumoniae dominated in raw sewage, which was also confirmed by quantitative real time PCR. Illumina sequencing also revealed prevalence of various types of pathogenicity islands and virulence proteins in the STP. Most of the potential pathogens and virulence factors were eliminated in the STP, and the removal efficiency mainly depended on the oxidation ditch. Compared with sand filtration, magnetic resin seemed to achieve higher removal of most of the potential pathogens and virulence factors. However, the presence of residual A. butzleri in the final effluent still deserves more concern. The findings indicate that sewage acts as an important source of environmental pathogens, but STPs can effectively control their spread in the environment. Joint use of the high-throughput sequencing technologies is considered a reliable method for deep and comprehensive overview of environmental bacterial virulence. PMID:25938416

  8. Single-Cell RNA-Sequencing Reveals a Continuous Spectrum of Differentiation in Hematopoietic Cells

    Directory of Open Access Journals (Sweden)

    Iain C. Macaulay

    2016-02-01

    Full Text Available The transcriptional programs that govern hematopoiesis have been investigated primarily by population-level analysis of hematopoietic stem and progenitor cells, which cannot reveal the continuous nature of the differentiation process. Here we applied single-cell RNA-sequencing to a population of hematopoietic cells in zebrafish as they undergo thrombocyte lineage commitment. By reconstructing their developmental chronology computationally, we were able to place each cell along a continuum from stem cell to mature cell, refining the traditional lineage tree. The progression of cells along this continuum is characterized by a highly coordinated transcriptional program, displaying simultaneous suppression of genes involved in cell proliferation and ribosomal biogenesis as the expression of lineage specific genes increases. Within this program, there is substantial heterogeneity in the expression of the key lineage regulators. Overall, the total number of genes expressed, as well as the total mRNA content of the cell, decreases as the cells undergo lineage commitment.

  9. Extracellular DNA amplicon sequencing reveals high levels of benthic eukaryotic diversity in the central Red Sea

    KAUST Repository

    Pearman, John K.

    2015-11-01

    The present study aims to characterize the benthic eukaryotic biodiversity patterns at a coarse taxonomic level in three areas of the central Red Sea (a lagoon, an offshore area in Thuwal and a shallow coastal area near Jeddah) based on extracellular DNA. High-throughput amplicon sequencing targeting the V9 region of the 18S rRNA gene was undertaken for 32 sediment samples. High levels of alpha-diversity were detected with 16,089 operational taxonomic units (OTUs) being identified. The majority of the OTUs were assigned to Metazoa (29.2%), Alveolata (22.4%) and Stramenopiles (17.8%). Stramenopiles (Diatomea) and Alveolata (Ciliophora) were frequent in the lagoon and in shallower coastal stations, whereas metazoans (Arthropoda: Maxillopoda) were dominant in deeper offshore stations. Only 24.6% of total OTUs were shared among all areas. Beta-diversity was generally lower between the lagoon and Jeddah (nearshore) than between either of those and the offshore area, suggesting a nearshore–offshore biodiversity gradient. The current approach allowed a broad range of benthic eukaryotic biodiversity to be analysed with significantly less labour than would be required by traditional taxonomic approaches. Our findings suggest that next generation sequencing techniques have the potential to provide a fast and standardised screening of benthic biodiversity at large spatial and temporal scales.

  10. A diminutive perinate European Enantiornithes reveals an asynchronous ossification pattern in early birds.

    Science.gov (United States)

    Knoll, Fabien; Chiappe, Luis M; Sanchez, Sophie; Garwood, Russell J; Edwards, Nicholas P; Wogelius, Roy A; Sellers, William I; Manning, Phillip L; Ortega, Francisco; Serrano, Francisco J; Marugán-Lobón, Jesús; Cuesta, Elena; Escaso, Fernando; Sanz, Jose Luis

    2018-03-05

    Fossils of juvenile Mesozoic birds provide insight into the early evolution of avian development; however, such fossils are rare. The analysis of the ossification sequence in these early-branching birds has the potential to address important questions about their comparative developmental biology and to help understand their morphological evolution and ecological differentiation. Here we report on an early juvenile enantiornithine specimen from the Early Cretaceous of Europe, which sheds new light on the osteogenesis in this most species-rich clade of Mesozoic birds. Consisting of a nearly complete skeleton, it is amongst the smallest known Mesozoic avian fossils representing post-hatching stages of development. Comparisons between this new specimen and other known early juvenile enantiornithines support a clade-wide asynchronous pattern of osteogenesis in the sternum and the vertebral column, and strongly indicate that the hatchlings of these phylogenetically basal birds varied greatly in size and tempo of skeletal maturation.

  11. Next generation sequencing reveals distinct fecal pollution signatures in aquatic sediments across gradients of anthropogenic influence

    Directory of Open Access Journals (Sweden)

    Gian Marco Luna

    2016-11-01

    Full Text Available Aquatic sediments are the repository of a variety of anthropogenic pollutants, including bacteria of fecal origin, that reach the aquatic environment from a variety of sources. Although fecal bacteria can survive for long periods of time in aquatic sediments, the microbiological quality of sediments is almost entirely neglected when performing quality assessments of aquatic ecosystems. Here we investigated the relative abundance, patterns and diversity of fecal bacterial populations in two coastal areas in the Northern Adriatic Sea (Italy): the Po river prodelta (PRP), an estuarine area receiving significant contaminant discharge from one of the largest European rivers, and the Lagoon of Venice (LV), a transitional environment impacted by a multitude of anthropogenic stressors. From both areas, several indicators of fecal and sewage contamination were determined in the sediments using Next Generation Sequencing (NGS) of 16S rDNA amplicons. In both areas, fecal contamination was high, with fecal bacteria accounting for up to 3.96% and 1.12% of the sediment bacterial assemblages in PRP and LV, respectively. The magnitude of the fecal signature was highest in the PRP site, highlighting the major role of the Po river in spreading microbial contaminants into the adjacent coastal area. In the LV site, fecal pollution was highest in the urban area, and almost disappeared when moving to the open sea. Our analysis revealed a large number of fecal Operational Taxonomic Units (OTUs; 960 and 181 in PRP and LV, respectively) and showed a different fecal signature in the two areas, suggesting a diverse contribution of human and non-human sources of contamination. These results highlight the potential of NGS techniques to gain insights into the origin and fate of different fecal bacteria populations in aquatic sediments.

  12. DNA pattern recognition using canonical correlation algorithm.

    Science.gov (United States)

    Sarkar, B K; Chakraborty, Chiranjib

    2015-10-01

    We performed canonical correlation analysis as an unsupervised statistical tool to describe related views of the same semantic object for identifying patterns. A pattern recognition technique based on canonical correlation analysis (CCA) was proposed for finding a required genetic code in a DNA sequence. Two related but different objects were considered: one was a particular pattern, and the other was a test DNA sequence. CCA found correlations between two observations of the same semantic pattern and test sequence. It is concluded that the relationship possesses maximum value in the position where the pattern exists. As a case study, the potential of CCA was demonstrated on the sequence found from HIV-1 preferred integration sites. The subsequences on the left and right flanking from the integration site were considered as the two views, and statistically significant relationships were established between these two views to elucidate the viral preference as an important factor for the correlation.

  13. Multineuronal Spike Sequences Repeat with Millisecond Precision

    Directory of Open Access Journals (Sweden)

    Koki Matsumoto

    2013-06-01

    Full Text Available Cortical microcircuits are nonrandomly wired by neurons. As a natural consequence, spikes emitted by microcircuits are also nonrandomly patterned in time and space. One of the prominent spike organizations is a repetition of fixed patterns of spike series across multiple neurons. However, several questions remain unsolved, including how precisely spike sequences repeat, how the sequences are spatially organized, how many neurons participate in sequences, and how different sequences are functionally linked. To address these questions, we monitored spontaneous spikes of hippocampal CA3 neurons ex vivo using a high-speed functional multineuron calcium imaging technique that allowed us to monitor spikes with millisecond resolution and to record the location of spiking and nonspiking neurons. Multineuronal spike sequences were overrepresented in spontaneous activity compared to the statistical chance level. Approximately 75% of neurons participated in at least one sequence during our observation period. The participants were sparsely dispersed and did not show specific spatial organization. The number of sequences relative to the chance level decreased when larger time frames were used to detect sequences. Thus, sequences were precise at the millisecond level. Sequences often shared common spikes with other sequences; parts of sequences were subsequently relayed by following sequences, generating complex chains of multiple sequences.

  14. Protein sequences bound to mineral surfaces persist into deep time

    DEFF Research Database (Denmark)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa

    2016-01-01

    of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell......, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated...

  15. Sequence stratigraphy of the Nukumaruan stratotype (Pliocene-Pleistocene, c. 2.08-1.63 Ma), Wanganui Basin, New Zealand

    International Nuclear Information System (INIS)

    Abbott, S.T.; Naish, T.R.; Carter, R.M.; Pillans, B.J.

    2005-01-01

    Late Pliocene to Early Pleistocene (c. 2.08-1.63 Ma) strata exposed in coastal cliffs along Nukumaru and Ototoka beaches near Wanganui, between the top of the Nukumaru Limestone and the base of the Butlers Shell Conglomerate, comprise 11 depositional sequences of a total thickness of c. 86 m. The sequences consist predominantly of siliciclastic shoreline facies. Non-marine facies (including palaeosols), and a variety of shallow-marine shellbed facies, are also represented. Patterns in facies composition and sequence architecture reveal three sequence motifs (Maxwell, Nukumaru, and Birdgrove) that represent progressively increasing maximum palaeowater depths within a broadly basin-margin palaeogeographic setting. The sequence motif changes systematically up section and records a lower order tectonic influence on accommodation that has modulated the stacking patterns of individual sequences. Correlation of the sequences with oxygen isotope stages 77-57 is achieved using the basin-wide Ototoka tephra, and indicates that the sequences accumulated in response to obliquity driven (41 k.y. duration) glacio-eustatic sea-level oscillations. Correlation of the Nukumaru coast sequences with other sections along basin strike, and the global oxygen isotope record indicates that (i) 500 k.y. (δ18O stages MIS 56-34) is missing at the unconformity between the Nukumaruan and overlying Castlecliffian stratotypes on the Wanganui coast, and (ii) the Pliocene-Pleistocene boundary lies within sequence NC7 at the base of the Lower Maxwell Formation. (author). 52 refs., 15 figs., 2 tabs

  16. DNA sequence modeling based on context trees

    NARCIS (Netherlands)

    Kusters, C.J.; Ignatenko, T.; Roland, J.; Horlin, F.

    2015-01-01

    Genomic sequences contain instructions for protein and cell production. Therefore understanding and identification of biologically and functionally meaningful patterns in DNA sequences is of paramount importance. Modeling of DNA sequences in its turn can help to better understand and identify such

  17. Metagenome Sequence Analysis of Filamentous Microbial Communities Obtained from Geochemically Distinct Geothermal Channels Reveals Specialization of Three Aquificales Lineages

    Directory of Open Access Journals (Sweden)

    Cristina Takacs-Vesbach

    2013-05-01

    Full Text Available The Aquificales are thermophilic microorganisms that inhabit hydrothermal systems worldwide and are considered one of the earliest lineages of the domain Bacteria. We analyzed metagenome sequence obtained from six thermal ‘filamentous streamer’ communities (~40 Mbp per site), which targeted three different groups of Aquificales found in Yellowstone National Park (YNP). Unassembled metagenome sequence and PCR-amplified 16S rRNA gene libraries revealed that acidic, sulfidic sites were dominated by Hydrogenobaculum (Aquificaceae) populations, whereas the circumneutral pH (6.5 - 7.8) sites containing dissolved sulfide were dominated by Sulfurihydrogenibium spp. (Hydrogenothermaceae). Thermocrinis (Aquificaceae) populations were found primarily in the circumneutral sites with undetectable sulfide, and to a lesser extent in one sulfidic system at pH 8. Phylogenetic analysis of assembled sequence containing 16S rRNA genes as well as conserved protein-encoding genes revealed that the composition and function of these communities varied across geochemical conditions. Each Aquificales lineage contained genes for CO2 fixation by the reverse TCA cycle, but only the Sulfurihydrogenibium populations perform citrate cleavage using ATP citrate lyase (Acl). The Aquificaceae populations use an alternative pathway catalyzed by two separate enzymes, citryl CoA synthetase (Ccs) and citryl CoA lyase (Ccl). All three Aquificales lineages contained evidence of aerobic respiration, albeit due to completely different types of heme Cu oxidases (subunit I) involved in oxygen reduction. The distribution of Aquificales populations and differences among functional genes involved in energy generation and electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, H2, O2) have resulted in niche specialization among members of the Aquificales.

  18. Patterned biofilm formation reveals a mechanism for structural heterogeneity in bacterial biofilms.

    Science.gov (United States)

    Gu, Huan; Hou, Shuyu; Yongyat, Chanokpon; De Tore, Suzanne; Ren, Dacheng

    2013-09-03

    Bacterial biofilms are ubiquitous and are the major cause of chronic infections in humans and persistent biofouling in industry. Despite the significance of bacterial biofilms, the mechanism of biofilm formation and associated drug tolerance is still not fully understood. A major challenge in biofilm research is the intrinsic heterogeneity in the biofilm structure, which leads to temporal and spatial variation in cell density and gene expression. To understand and control such structural heterogeneity, surfaces with patterned functional alkanethiols were used in this study to obtain Escherichia coli cell clusters with systematically varied cluster size and distance between clusters. The results from quantitative imaging analysis revealed an interesting phenomenon in which multicellular connections can be formed between cell clusters depending on the size of interacting clusters and the distance between them. In addition, significant differences in patterned biofilm formation were observed between wild-type E. coli RP437 and some of its isogenic mutants, indicating that certain cellular and genetic factors are involved in interactions among cell clusters. In particular, autoinducer-2-mediated quorum sensing was found to be important. Collectively, these results provide missing information that links cell-to-cell signaling and interaction among cell clusters to the structural organization of bacterial biofilms.

  19. Network based approaches reveal clustering in protein point patterns

    Science.gov (United States)

    Parker, Joshua; Barr, Valarie; Aldridge, Joshua; Samelson, Lawrence E.; Losert, Wolfgang

    2014-03-01

    Recent advances in super-resolution imaging have allowed for the sub-diffraction measurement of the spatial location of proteins on the surfaces of T-cells. The challenge is to connect these complex point patterns to the internal processes and interactions, both protein-protein and protein-membrane. We begin analyzing these patterns by forming a geometric network amongst the proteins and looking at network measures, such as the degree distribution. This allows us to compare experimentally observed patterns to models. Specifically, we find that the experimental patterns differ from heterogeneous Poisson processes, highlighting an internal clustering structure. Further work will be to compare our results to simulated protein-protein interactions to determine clustering mechanisms.
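
    The geometric-network step described in this abstract can be illustrated with a minimal sketch. It assumes a fixed connection radius and 2-D coordinates; the function name `degree_distribution`, the radius value, and the toy points are hypothetical and are not taken from the record, which does not state how the network was built.

```python
from collections import Counter
from math import dist


def degree_distribution(points, radius):
    """Link every pair of points no farther apart than `radius`.

    Returns a Counter mapping degree -> number of points with that degree.
    The fixed-radius rule is an assumption made for this illustration.
    """
    degrees = [0] * len(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= radius:
                degrees[i] += 1
                degrees[j] += 1
    return Counter(degrees)


# Toy point pattern: three nearby points and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
hist = degree_distribution(points, radius=1.5)
print(sorted(hist.items()))
```

    A clustered pattern would be expected to show more high-degree points than a Poisson pattern of the same density, which is the kind of comparison the abstract refers to.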

  20. Classic selective sweeps revealed by massive sequencing in cattle.

    Directory of Open Access Journals (Sweden)

    Saber Qanbari

    2014-02-01

    Full Text Available Human driven selection during domestication and subsequent breed formation has likely left detectable signatures within the genome of modern cattle. The elucidation of these signatures of selection is of interest from the perspective of evolutionary biology, and for identifying domestication-related genes that ultimately may help to further genetically improve this economically important animal. To this end, we employed a panel of more than 15 million autosomal SNPs identified from re-sequencing of 43 Fleckvieh animals. We mainly applied two somewhat complementary statistics, the integrated Haplotype Homozygosity Score (iHS), reflecting primarily ongoing selection, and the Composite of Likelihood Ratio (CLR), having the most power to detect completed selection after fixation of the advantageous allele. We find 106 candidate selection regions, many of which are harboring genes related to phenotypes relevant in domestication, such as coat coloring pattern, neurobehavioral functioning and sensory perception including KIT, MITF, MC1R, NRG4, Erbb4, TMEM132D and TAS2R16, among others. To further investigate the relationship between genes with signatures of selection and genes identified in QTL mapping studies, we use a sample of 3062 animals to perform four genome-wide association analyses using appearance traits, body size and somatic cell count. We show that regions associated with coat coloring significantly (P<0.0001) overlap with the candidate selection regions, suggesting that the selection signals we identify are associated with traits known to be affected by selection during domestication. Results also provide further evidence regarding the complexity of the genetics underlying coat coloring in cattle. This study illustrates the potential of population genetic approaches for identifying genomic regions affecting domestication-related phenotypes and further helps to identify specific regions targeted by selection during speciation, domestication and

  1. Deep sequencing of natural and experimental populations of Drosophila melanogaster reveals biases in the spectrum of new mutations.

    Science.gov (United States)

    Assaf, Zoe June; Tilk, Susanne; Park, Jane; Siegal, Mark L; Petrov, Dmitri A

    2017-12-01

    Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on having precise measurements of mutational rates and patterns. We generate a data set for this purpose using (1) de novo mutations from mutation accumulation experiments and (2) extremely rare polymorphisms from natural populations. The first, mutation accumulation (MA) lines are the product of maintaining flies in tiny populations for many generations, therefore rendering natural selection ineffective and allowing new mutations to accrue in the genome. The second, rare genetic variation from natural populations allows the study of mutation because extremely rare polymorphisms are relatively unaffected by the filter of natural selection. We use both methods in Drosophila melanogaster, first generating our own novel data set of sequenced MA lines and performing a meta-analysis of all published MA mutations (∼2000 events) and then identifying a high quality set of ∼70,000 extremely rare (≤0.1%) polymorphisms that are fully validated with resequencing. We use these data sets to precisely measure mutational rates and patterns. Highlights of our results include: a high rate of multinucleotide mutation events at both short (∼5 bp) and long (∼1 kb) genomic distances, showing that mutation drives GC content lower in already GC-poor regions, and using our precise context-dependent mutation rates to predict long-term evolutionary patterns at synonymous sites. We also show that de novo mutations from independent MA experiments display similar patterns of single nucleotide mutation and well match the patterns of mutation found in natural populations. © 2017 Assaf et al.; Published by Cold Spring Harbor Laboratory Press.

  2. Genotyping by PCR and High-Throughput Sequencing of Commercial Probiotic Products Reveals Composition Biases.

    Directory of Open Access Journals (Sweden)

    Wesley Morovic

    2016-11-01

    Full Text Available Recent advances in microbiome research have brought renewed focus on beneficial bacteria, many of which are available in food and dietary supplements. Although probiotics have historically been defined as microorganisms that convey health benefits when ingested in sufficient viable amounts, this description now includes the stipulation of well-defined strains, encompassing definitive taxonomy for consumer consideration and regulatory oversight. Here, we evaluated 52 commercial dietary supplements covering a range of labeled species, and determined their content using plate counting and targeted genotyping. Additionally, strain identities were assessed using methods recently published by the United States Pharmacopeial Convention. We also determined the relative abundance of individual bacteria by high-throughput sequencing (HTS) of the 16S rRNA sequence using paired-end 2x250bp Illumina MiSeq technology. Using multiple methods, we tested the hypothesis that products do contain the quantitative amount of labeled bacteria, and qualitative list of labeled microbial species. We found that 17 samples (33%) were below label claim for CFU prior to their expiration dates. A multiplexed-PCR scheme showed that only 30/52 (58%) of the products contained a correctly labeled classification, with issues encompassing incorrect taxonomy, missing species and un-labeled species. The HTS revealed that many blended products consisted predominantly of Lactobacillus acidophilus and Bifidobacterium animalis subsp. lactis. These results highlight the need for reliable methods to qualitatively determine the correct taxonomy and quantitatively ascertain the relative amounts of mixed microbial populations in commercial probiotic products.

  3. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Full Text Available Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  4. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    Science.gov (United States)

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  5. Sequence and structural analysis of the chitinase insertion domain reveals two conserved motifs involved in chitin-binding.

    Directory of Open Access Journals (Sweden)

    Hai Li

    2010-01-01

    Full Text Available Chitinases are prevalent in life and are found in species including archaea, bacteria, fungi, plants, and animals. They break down chitin, which is the second most abundant carbohydrate in nature after cellulose. Hence, they are important for maintaining a balance between carbon and nitrogen trapped as insoluble chitin in biomass. Chitinases are classified into two families, 18 and 19 glycoside hydrolases. In addition to a catalytic domain, which is a triosephosphate isomerase barrel, many family 18 chitinases contain another module, i.e., chitinase insertion domain. While numerous studies focus on the biological role of the catalytic domain in chitinase activity, the function of the chitinase insertion domain is not completely understood. Bioinformatics offers an important avenue in which to facilitate understanding the role of residues within the chitinase insertion domain in chitinase function. Twenty-seven chitinase insertion domain sequences, which include four experimentally determined structures and span five kingdoms, were aligned and analyzed using a modified sequence entropy parameter. Thirty-two positions with conserved residues were identified. The role of these conserved residues was explored by conducting a structural analysis of a number of holo-enzymes. Hydrogen bonding and van der Waals calculations revealed a distinct subset of four conserved residues constituting two sequence motifs that interact with oligosaccharides. The other conserved residues may be key to the structure, folding, and stability of this domain. Sequence and structural studies of the chitinase insertion domains conducted within the framework of evolution identified four conserved residues which clearly interact with the substrates. Furthermore, evolutionary studies propose a link between the appearance of the chitinase insertion domain and the function of family 18 chitinases in the subfamily A.
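
    The entropy-based conservation analysis mentioned in this abstract can be sketched as follows. The record does not define its "modified sequence entropy parameter", so this example substitutes plain per-column Shannon entropy; the function names, the threshold, and the four toy sequences are hypothetical, and gap characters are not treated specially.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def conserved_positions(alignment, max_entropy):
    """Return 0-based column indices whose entropy is at most `max_entropy`.

    `alignment` is a list of equal-length aligned sequences.
    """
    columns = zip(*alignment)
    return [i for i, col in enumerate(columns)
            if column_entropy(col) <= max_entropy]


# Toy alignment (not real chitinase data): columns 0 and 3 are invariant.
alignment = ["GWDA", "GWEA", "GYDA", "GWKA"]
conserved = conserved_positions(alignment, max_entropy=0.0)
print(conserved)
```

    Raising `max_entropy` above zero would also admit columns that tolerate a few substitutions, which is usually closer to how conserved positions are selected in practice.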

  6. Judgments Relative to Patterns: How Temporal Sequence Patterns Affect Judgments and Memory

    Science.gov (United States)

    Kusev, Petko; Ayton, Peter; van Schaik, Paul; Tsaneva-Atanasova, Krasimira; Stewart, Neil; Chater, Nick

    2011-01-01

    Six experiments studied relative frequency judgment and recall of sequentially presented items drawn from 2 distinct categories (i.e., city and animal). The experiments show that judged frequencies of categories of sequentially encountered stimuli are affected by certain properties of the sequence configuration. We found (a) a "first-run…

  7. Identifying recombinants in human and primate immunodeficiency virus sequence alignments using quartet scanning

    Directory of Open Access Journals (Sweden)

    Martin Darren P

    2009-04-01

    Full Text Available Abstract Background Recombination has a profound impact on the evolution of viruses, but characterizing recombination patterns in molecular sequences remains a challenging endeavor. Despite its importance in molecular evolutionary studies, identifying the sequences that exhibit such patterns has received comparatively less attention in the recombination detection framework. Here, we extend a quartet-mapping based recombination detection method to enable identification of recombinant sequences without prior specification of either query or reference sequences. Through simulations we evaluate different recombinant identification statistics and significance tests. We compare the quartet approach with triplet-based methods that employ additional heuristic tests to identify parental and recombinant sequences. Results Analysis of phylogenetic simulations reveals that identifying the descendants of relatively old recombination events is a challenging task for all methods available, and that quartet scanning performs relatively well compared to the triplet-based methods. The use of quartet scanning is further demonstrated by analyzing both well-established and putative HIV-1 recombinant strains. In agreement with recent findings, we provide evidence that the presumed circulating recombinant CRF02_AG is a 'pure' lineage, whereas the presumed parental lineage subtype G has a recombinant origin. We also demonstrate HIV-1 intrasubtype recombination, confirm the hybrid origin of SIV in chimpanzees and further disentangle the recombinant history of SIV lineages in a primate immunodeficiency virus data set. Conclusion Quartet scanning makes a valuable addition to triplet-based methods for identifying recombinant sequences without prior specification of either query or reference sequences. The new method is available in the VisRD v.3.0 package http://www.cmp.uea.ac.uk/~vlm/visrd.

  8. Nuclear Species-Diagnostic SNP Markers Mined from 454 Amplicon Sequencing Reveal Admixture Genomic Structure of Modern Citrus Varieties

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  9. Identities among actin-encoding cDNAs of the Nile tilapia (Oreochromis niloticus) and other eukaryote species revealed by nucleotide and amino acid sequence analyses

    Directory of Open Access Journals (Sweden)

    Andréia B. Poletto

    2008-01-01

    Full Text Available Actin-encoding cDNAs of Nile tilapia (Oreochromis niloticus) were isolated by RT-PCR using total RNA samples of different tissues and further characterized by nucleotide sequencing and in silico amino acid (aa) sequence analysis. Comparisons among the actin gene sequences of O. niloticus and those of other species evidenced that the isolated genes present a high similarity to other fish and other vertebrate actin genes. The highest nucleotide resemblance was observed between O. niloticus and O. mossambicus α-actin and β-actin genes. Analysis of the predicted aa sequences revealed two distinct types of cytoplasmic actins, one cardiac muscle actin type and one skeletal muscle actin type that were expressed in different tissues of Nile tilapia. The evolutionary relationships between the Nile tilapia actin genes and those of diverse other organisms are discussed.

  10. Identification of a consistent pattern of mutations in neurovirulent variants derived from the sabin vaccine strain of poliovirus type 2.

    OpenAIRE

    Equestre, M; Genovese, D; Cavalieri, F; Fiore, L; Santoro, R; Perez Bercoff, R

    1991-01-01

    Complete nucleotide sequencing of the RNAs of two unrelated neurovirulent isolates of Sabin-related poliovirus type 2 revealed that two nucleotides and one amino acid (amino acid 143 in the major capsid protein VP1) consistently departed from the sequences of the nonneurovirulent poliovirus type 2 712 and Sabin vaccine strains. This pattern of mutation appeared to be a feature common to all neurovirulent variants of poliovirus type 2.

  11. Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries

    Science.gov (United States)

    Talkowski, Michael E.; Rosenfeld, Jill A.; Blumenthal, Ian; Pillalamarri, Vamsee; Chiang, Colby; Heilbut, Adrian; Ernst, Carl; Hanscom, Carrie; Rossin, Elizabeth; Lindgren, Amelia; Pereira, Shahrin; Ruderfer, Douglas; Kirby, Andrew; Ripke, Stephan; Harris, David; Lee, Ji-Hyun; Ha, Kyungsoo; Kim, Hyung-Goo; Solomon, Benjamin D.; Gropman, Andrea L.; Lucente, Diane; Sims, Katherine; Ohsumi, Toshiro K.; Borowsky, Mark L.; Loranger, Stephanie; Quade, Bradley; Lage, Kasper; Miles, Judith; Wu, Bai-Lin; Shen, Yiping; Neale, Benjamin; Shaffer, Lisa G.; Daly, Mark J.; Morton, Cynthia C.; Gusella, James F.

    2012-01-01

    SUMMARY Balanced chromosomal abnormalities (BCAs) represent a reservoir of single gene disruptions in neurodevelopmental disorders (NDD). We sequenced BCAs in autism and related NDDs, revealing disruption of 33 loci in four general categories: 1) genes associated with abnormal neurodevelopment (e.g., AUTS2, FOXP1, CDKL5), 2) single gene contributors to microdeletion syndromes (MBD5, SATB2, EHMT1, SNURF-SNRPN), 3) novel risk loci (e.g., CHD8, KIRREL3, ZNF507), and 4) genes associated with later onset psychiatric disorders (e.g., TCF4, ZNF804A, PDE10A, GRIN2B, ANK3). We also discovered profoundly increased burden of copy number variants among 19,556 neurodevelopmental cases compared to 13,991 controls (p = 2.07 × 10⁻⁴⁷) and enrichment of polygenic risk alleles from autism and schizophrenia genome-wide association studies (p = 0.0018 and 0.0009, respectively). Our findings suggest a polygenic risk model of autism incorporating loci of strong effect and indicate that some neurodevelopmental genes are sensitive to perturbation by multiple mutational mechanisms, leading to variable phenotypic outcomes that manifest at different life stages. PMID:22521361

  12. A Network Based Methodology to Reveal Patterns in Knowledge Transfer

    Directory of Open Access Journals (Sweden)

    Orlando López-Cruz

    2015-12-01

    Full Text Available This paper motivates, presents and demonstrates in use a methodology based on complex network analysis to support research aimed at identifying sources in the process of knowledge transfer at the interorganizational level. The importance of this methodology is that it provides a unified model to reveal knowledge-sharing patterns and to compare results from multiple studies on data from different periods of time and different sectors of the economy. The methodology does not address the underlying statistical processes; for these, national statistics departments (NSD) provide documents and tools on their websites. Instead, this proposal provides a guide to modeling the inferences gathered from data processing, revealing links between sources and recipients of the knowledge being transferred, where the recipient identifies the source as its main source for new knowledge creation. Some national statistics departments set as the objective of these surveys the characterization of innovation dynamics in firms and the analysis of the use of public support instruments. From this characterization scholars conduct different studies. Measures of the dimensions of the network composed of manufacturing firms and other organizations form the basis for inquiring into the structure that emerges from taking ideas from other organizations to initiate innovations. These two sets of data are the actors of a two-mode network. The link between two actors (network nodes), one acting as the source of the idea and the other as the destination, comes from organizations, or events organized by organizations, that “provide” ideas to another group of firms. The resulting demonstrated design satisfies the objective of being a methodological model to identify sources of transferred knowledge effectively used in innovation.
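
    The two-mode (source and recipient) network described in this abstract can be illustrated with a small sketch that ranks sources by how many distinct firms name them. The edge list, the node labels, and the function name `source_degree` are hypothetical; the record does not specify its data format or which network measures it uses.

```python
from collections import defaultdict


def source_degree(links):
    """Count the distinct recipients linked to each knowledge source.

    `links` is an iterable of (source, recipient) pairs; repeated pairs
    are counted once.
    """
    recipients_by_source = defaultdict(set)
    for source, recipient in links:
        recipients_by_source[source].add(recipient)
    return {s: len(r) for s, r in recipients_by_source.items()}


# Toy survey responses: each pair says "recipient took an idea from source".
links = [
    ("university", "firm_a"),
    ("trade_fair", "firm_a"),
    ("university", "firm_b"),
    ("university", "firm_b"),  # duplicate response, counted once
]
# Sort by descending degree, then by name for a stable order.
ranking = sorted(source_degree(links).items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking)
```

    Keeping sources and recipients as separate node sets is what makes the network two-mode; degree is only the simplest of the measures that could be compared across survey periods or sectors.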

  13. Phylogeny of bent-toed geckos (Cyrtodactylus) reveals a west to east pattern of diversification.

    Science.gov (United States)

    Wood, Perry L; Heinicke, Matthew P; Jackman, Todd R; Bauer, Aaron M

    2012-12-01

    The Asian/Pacific genus Cyrtodactylus is the most diverse and among the most widely distributed genera of geckos, and more species are continually being discovered. Major patterns in the evolutionary history of Cyrtodactylus have remained largely unknown because no published study has broadly sampled across the geographic range and morphological diversity of the genus. We assembled a data set including sequences from one mitochondrial and three nuclear loci for 68 Cyrtodactylus and 20 other gekkotan species to infer phylogenetic relationships within the genus and identify major biogeographic patterns. Our results indicate that Cyrtodactylus is monophyletic, but only if the Indian/Sri Lankan species sometimes recognized as Geckoella are included. Basal divergences divide Cyrtodactylus into three well-supported groups: the single species C. tibetanus, a clade of Myanmar/southern Himalayan species, and a large clade including all other Cyrtodactylus plus Geckoella. Within the largest major clade are several well-supported subclades, with separate subclades being most diverse in Thailand, Eastern Indochina, the Sunda region, the Papuan region, and the Philippines, respectively. The phylogenetic results, along with molecular clock and ancestral area analyses, show Cyrtodactylus to have originated in the circum-Himalayan region just after the Cretaceous/Paleogene boundary, with a generally west to east pattern of colonization and diversification progressing through the Cenozoic. Wallacean species are derived from within a Sundaland radiation, the Philippines were colonized from Borneo, and Australia was colonized twice, once via New Guinea and once via the Lesser Sundas. Overall, these results are consistent with past suggestions of a Palearctic origin for Cyrtodactylus, and highlight the key role of geography in diversification of the genus. Copyright © 2012 Elsevier Inc. All rights reserved.

  14. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf

    2011-08-12

    The blind subterranean mole rat (Spalax ehrenbergi superspecies) is a model animal for survival under extreme environments due to its ability to live in underground habitats under severe hypoxic stress and darkness. Here we report the transcriptome sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly of the sequences yielded over 51,000 isotigs with homology to ~12,000 mouse, rat or human genes. Based on these results, it was possible to detect large numbers of splice variants, SNPs, and novel transcribed regions. In addition, multiple differential expression patterns were detected between tissues and treatments. The results presented here will serve as a valuable resource for future studies aimed at identifying genes and gene regions evolved during the adaptive radiation associated with underground life of the blind mole rat. 2011 Malik et al.

  15. Dynamic Evolution of Pathogenicity Revealed by Sequencing and Comparative Genomics of 19 Pseudomonas syringae Isolates

    Science.gov (United States)

    Romanchuk, Artur; Chang, Jeff H.; Mukhtar, M. Shahid; Cherkis, Karen; Roach, Jeff; Grant, Sarah R.; Jones, Corbin D.; Dangl, Jeffery L.

    2011-01-01

    Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram-negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species. PMID:21799664

  16. Genotyping-by-sequencing for Populus population genomics: an assessment of genome sampling patterns and filtering approaches.

    Directory of Open Access Journals (Sweden)

    Martin P Schilling

    Continuing advances in nucleotide sequencing technology are inspiring a suite of genomic approaches in studies of natural populations. Researchers are faced with data management and analytical scales that are increasing by orders of magnitude. With such dramatic advances comes a need to understand biases and error rates, which can be propagated and magnified in large-scale data acquisition and processing. Here we assess genomic sampling biases and the effects of various population-level data filtering strategies in a genotyping-by-sequencing (GBS) protocol. We focus on data from two species of Populus, because this genus has a relatively small genome and is emerging as a target for population genomic studies. We estimate the proportions and patterns of genomic sampling by examining the Populus trichocarpa genome (Nisqually-1), and demonstrate a pronounced bias towards coding regions when using the methylation-sensitive ApeKI restriction enzyme in this species. Using population-level data from a closely related species (P. tremuloides), we also investigate various approaches for filtering GBS data to retain high-depth, informative SNPs that can be used for population genetic analyses. We find a data filter that includes the designation of ambiguous alleles resulted in metrics of population structure and Hardy-Weinberg equilibrium that were most consistent with previous studies of the same populations based on other genetic markers. Analyses of the filtered data (27,910 SNPs) also resulted in patterns of heterozygosity and population structure similar to a previous study using microsatellites. Our application demonstrates that technically and analytically simple approaches can readily be developed for population genomics of natural populations.

  17. Sequence selection by dynamical symmetry breaking in an autocatalytic binary polymer model

    DEFF Research Database (Denmark)

    Fellermann, Harold; Tanaka, Shinpei; Rasmussen, Steen

    2017-01-01

    Template-directed replication of nucleic acids is at the essence of all living beings and a major milestone for any origin of life scenario. We present an idealized model of prebiotic sequence replication, where binary polymers act as templates for their autocatalytic replication, thereby serving as each other's reactants and products in an intertwined molecular ecology. Our model demonstrates how autocatalysis alters the qualitative and quantitative system dynamics in counterintuitive ways. Most notably, numerical simulations reveal a very strong intrinsic selection mechanism that favors the appearance of a few population structures with highly ordered and repetitive sequence patterns when starting from a pool of monomers. We demonstrate both analytically and through simulation how this "selection of the dullest" is caused by continued symmetry breaking through random fluctuations...

  18. Aligning protein sequence and analysing substitution pattern using ...

    Indian Academy of Sciences (India)

    Prakash

    Aligning protein sequences using a score matrix has become a routine but valuable method in modern biological ..... the amino acids according to their substitution behaviour ...... which may cause great change (e.g. prolonging the helix) in.

  19. Characterization of Fasciola samples by ITS of rDNA sequences revealed the existence of Fasciola hepatica and Fasciola gigantica in Yunnan Province, China.

    Science.gov (United States)

    Shu, Fan-Fan; Lv, Rui-Qing; Zhang, Yi-Fang; Duan, Gang; Wu, Ding-Yu; Li, Bi-Feng; Yang, Jian-Fa; Zou, Feng-Cai

    2012-08-01

    In mainland China, liver flukes of Fasciola spp. (Digenea: Fasciolidae) can cause serious acute and chronic morbidity in numerous species of mammals such as sheep, goats, cattle, and humans. The objective of the present study was to examine the taxonomic identity of Fasciola species in Yunnan Province using sequences of the first and second internal transcribed spacers (ITS-1 and ITS-2) of nuclear ribosomal DNA (rDNA). The ITS rDNA was amplified by polymerase chain reaction (PCR) from 10 samples representing Fasciola species in cattle from 2 geographical locations in Yunnan Province, and the products were sequenced directly. The lengths of the ITS-1 and ITS-2 sequences were 422 and 361-362 base pairs, respectively, for all samples sequenced. Using ITS sequences, 2 Fasciola species were revealed, namely Fasciola hepatica and Fasciola gigantica. This is the first demonstration of F. gigantica in cattle in Yunnan Province, China using a molecular approach; our findings have implications for studying the population genetic characterization of the Chinese Fasciola species and for the prevention and control of Fasciola spp. in this province.

  20. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    Directory of Open Access Journals (Sweden)

    Yao Xu

    Whole-genome sequencing provides a powerful tool to capture more genetic variability, which could produce a range of benefits for the cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study we sequenced, for the first time, the genomes of four Nanyang and four Qinchuan cattle at 10- to 12-fold depth, covering on average 97.86% and 98.98% of the genome, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp) were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed higher genetic diversity than Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV) were identified by aligning the Nanyang to the Qinchuan genome, 783 of which (27%) encompassed the coding regions of 495 functional genes. The gene ontology (GO) analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, the Leptin receptor gene (LEPR), overlapping with CNV_1815, showed a remarkably higher copy number in Qinchuan than in Nanyang (log2 ratio = -2.34988; P value = 1.53E-102). Further qPCR and association analyses showed that the copy number of the LEPR gene was positively correlated with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs...

  1. The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in Pinaceae

    Science.gov (United States)

    David B. Neale; Patrick E. McGuire; Nicholas C. Wheeler; Kristian A. Stevens; Marc W. Crepeau; Charis Cardeno; Aleksey V. Zimin; Daniela Puiu; Geo M. Pertea; U. Uzay Sezen; Claudio Casola; Tomasz E. Koralewski; Robin Paul; Daniel Gonzalez-Ibeas; Sumaira Zaman; Richard Cronn; Mark Yandell; Carson Holt; Charles H. Langley; James A. Yorke; Steven L. Salzberg; Jill L. Wegrzyn

    2017-01-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50...
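    The contig and scaffold N50 values quoted in the record above are a standard contiguity statistic: the length L such that sequences of length L or longer account for at least half of the total assembly length. As a minimal sketch of that definition (the function name `n50` and the toy lengths are illustrative assumptions, not taken from the Douglas-fir assembly or its pipeline), it can be computed as follows:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (hypothetical, not real assembly data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

    A larger N50 indicates that more of the assembly sits in long sequences, which is why the statistic is reported when comparing assemblies.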

  2. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Background: Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences, makes such studies worthwhile. Results: I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions: I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their...

  3. Draft whole genome sequence of groundnut stem rot fungus Athelia rolfsii revealing genetic architect of its pathogenicity and virulence.

    Science.gov (United States)

    Iquebal, M A; Tomar, Rukam S; Parakhia, M V; Singla, Deepak; Jaiswal, Sarika; Rathod, V M; Padhiyar, S M; Kumar, Neeraj; Rai, Anil; Kumar, Dinesh

    2017-07-13

    Groundnut (Arachis hypogaea L.) is an important oilseed crop whose production is severely constrained by stem rot disease, caused by the fungus Athelia rolfsii, which leads to 25-80% losses in productivity. Because chemical and biological control strategies against this fungus are not very effective, genome sequencing can reveal virulence- and pathogenicity-related genes for a better understanding of the host-parasite interaction. We report a draft assembly of the Athelia rolfsii genome of ~73 Mb comprising 8919 contigs. Annotation analysis revealed 16830 genes, including those involved in fungicide resistance, virulence and pathogenicity along with putative effector and lethal genes. Secretome analysis revealed CAZY genes representing 1085 enzymatic genes, including glycoside hydrolases, carbohydrate esterases, carbohydrate-binding modules, auxiliary activities, glycosyl transferases and polysaccharide lyases. Repeat analysis revealed 11171 SSRs, LTR, GYPSY and COPIA elements. Comparative analysis with other existing ascomycotina genomes predicted conserved domain families of WD40, CYP450, Pkinase and ABC transporter, giving insight into the evolution of pathogenicity and virulence. This study would help in understanding pathogenicity and virulence at the molecular level and in developing new control strategies. Such an approach is imperative in the endeavour toward genome-based solutions for stem rot disease management, leading to better productivity of the groundnut crop in tropical regions of the world.

  4. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    Science.gov (United States)

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes, enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules with complementarity to human genes in plant tissue from crops that have a long history of safe consumption, such as lettuce, tomato, corn, soy and rice, support a conclusion that long dsRNAs do not present a significant dietary risk.
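    The criterion in the record above, perfect complementarity over at least 21 nucleotides, can be illustrated with a small k-mer comparison. This is only a sketch under stated assumptions: the function names are hypothetical, both sequences are assumed to be given 5' to 3' in the uppercase RNA alphabet (A, C, G, U), and the study's actual search procedure is not described in the abstract.

```python
# Watson-Crick complement table for the RNA alphabet (assumed input alphabet).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an uppercase RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if too short)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def has_perfect_complementarity(dsrna_strand, target, k=21):
    """True if some k-nt window of target pairs perfectly with dsrna_strand.

    A window of target is perfectly complementary to the strand exactly
    when it equals a k-mer of the strand's reverse complement.
    """
    return bool(kmers(reverse_complement(dsrna_strand), k) & kmers(target, k))


# Hypothetical sequences, constructed so that one 21-nt window matches.
target = "AUGGCUAGCUAGGCUAACGUAGCUAGGCAU"
strand = "CC" + reverse_complement(target[3:24]) + "GG"
print(has_perfect_complementarity(strand, target))
```

    Set intersection keeps the sketch short; for transcriptome-scale inputs an indexed search would be the more practical choice.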

  5. Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles.

    Science.gov (United States)

    Gadala-Maria, Daniel; Yaari, Gur; Uduman, Mohamed; Kleinstein, Steven H

    2015-02-24

    Individual variation in germline and expressed B-cell immunoglobulin (Ig) repertoires has been associated with aging, disease susceptibility, and differential response to infection and vaccination. Repertoire properties can now be studied at large-scale through next-generation sequencing of rearranged Ig genes. Accurate analysis of these repertoire-sequencing (Rep-Seq) data requires identifying the germline variable (V), diversity (D), and joining (J) gene segments used by each Ig sequence. Current V(D)J assignment methods work by aligning sequences to a database of known germline V(D)J segment alleles. However, existing databases are likely to be incomplete and novel polymorphisms are hard to differentiate from the frequent occurrence of somatic hypermutations in Ig sequences. Here we develop a Tool for Ig Genotype Elucidation via Rep-Seq (TIgGER). TIgGER analyzes mutation patterns in Rep-Seq data to identify novel V segment alleles, and also constructs a personalized germline database containing the specific set of alleles carried by a subject. This information is then used to improve the initial V segment assignments from existing tools, like IMGT/HighV-QUEST. The application of TIgGER to Rep-Seq data from seven subjects identified 11 novel V segment alleles, including at least one in every subject examined. These novel alleles constituted 13% of the total number of unique alleles in these subjects, and impacted 3% of V(D)J segment assignments. These results reinforce the highly polymorphic nature of human Ig V genes, and suggest that many novel alleles remain to be discovered. The integration of TIgGER into Rep-Seq processing pipelines will increase the accuracy of V segment assignments, thus improving B-cell repertoire analyses.

  6. Large-area imaging reveals biologically driven non-random spatial patterns of corals at a remote reef

    Science.gov (United States)

    Edwards, Clinton B.; Eynaud, Yoan; Williams, Gareth J.; Pedersen, Nicole E.; Zgliczynski, Brian J.; Gleason, Arthur C. R.; Smith, Jennifer E.; Sandin, Stuart A.

    2017-12-01

    For sessile organisms such as reef-building corals, differences in the degree of dispersion of individuals across a landscape may result from important differences in life-history strategies or may reflect patterns of habitat availability. Descriptions of spatial patterns can thus be useful not only for the identification of key biological and physical mechanisms structuring an ecosystem, but also by providing the data necessary to generate and test ecological theory. Here, we used an in situ imaging technique to create large-area photomosaics of 16 plots at Palmyra Atoll, central Pacific, each covering 100 m² of benthic habitat. We mapped the location of 44,008 coral colonies and identified each to the lowest taxonomic level possible. Using metrics of spatial dispersion, we tested for departures from spatial randomness. We also used targeted model fitting to explore candidate processes leading to differences in spatial patterns among taxa. Most taxa were clustered and the degree of clustering varied by taxon. A small number of taxa did not significantly depart from randomness and none revealed evidence of spatial uniformity. Importantly, taxa that readily fragment or tolerate stress through partial mortality were more clustered. With little exception, clustering patterns were consistent with models of fragmentation and dispersal limitation. In some taxa, dispersion was linearly related to abundance, suggesting density dependence of spatial patterning. The spatial patterns of stony corals are non-random and reflect fundamental life-history characteristics of the taxa, suggesting that the reef landscape may, in many cases, have important elements of spatial predictability.
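    The record above does not say which dispersion metrics were used. As one simple and widely used illustration of the idea, and not necessarily the authors' method, the variance-to-mean ratio of colony counts per quadrat separates clustered, random and uniform patterns. The function name and the quadrat counts below are hypothetical.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Variance-to-mean ratio of counts per equal-sized quadrat.

    Under complete spatial randomness the counts are roughly Poisson,
    so the ratio is close to 1; values well above 1 suggest clustering
    and values well below 1 suggest uniform spacing.
    Requires at least two quadrats and a non-zero mean count.
    """
    average = mean(counts)
    if average == 0:
        raise ValueError("no individuals counted in any quadrat")
    return variance(counts) / average  # sample variance (n - 1 denominator)


# Made-up quadrat counts for two hypothetical taxa.
clustered_counts = [0, 0, 0, 12]
uniform_counts = [3, 3, 3, 3]
print(index_of_dispersion(clustered_counts))
print(index_of_dispersion(uniform_counts))
```

    The ratio depends on quadrat size, so a formal test of spatial randomness would normally compare it against a null distribution or use distance-based statistics instead.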

  7. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sec...

  8. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments

    Directory of Open Access Journals (Sweden)

    Bruggmann Rémy

    2007-05-01

    Background: Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL). To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum) has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes. Results: To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome) contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber-bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF) in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5. Conclusion: Comparative sequence analysis revealed highly conserved collinear regions...

  9. Genomic sequencing and in vivo footprinting of an expression-specific DNase I-hypersensitive site of avian vitellogenin II promoter reveal a demethylation of a mCpG and a change in specific interactions of proteins with DNA.

    Science.gov (United States)

    Saluz, H P; Feavers, I M; Jiricny, J; Jost, J P

    1988-01-01

    Genomic sequencing was used to study the in vivo methylation pattern of two CpG sites in the promoter region of the avian vitellogenin gene. The CpG at position +10 was fully methylated in DNA isolated from tissues that do not express the gene but was unmethylated in the liver of mature hens and estradiol-treated roosters. In the latter tissue, this site became demethylated and DNase I hypersensitive after estradiol treatment. A second CpG (position -52) was unmethylated in all tissues examined. In vivo genomic footprinting with dimethyl sulfate revealed different patterns of DNA protection in silent and expressed genes. In rooster liver cells, at least 10 base pairs of DNA, including the methylated CpG, were protected by protein(s). Gel-shift assays indicated that a protein factor, present in rooster liver nuclear extract, bound at this site only when it was methylated. In hen liver cells, the same unmethylated CpG lies within a protected region of approximately 20 base pairs. In vitro DNase I protection and gel-shift assays indicate that this sequence is bound by a protein, which binds both double- and single-stranded DNA. For the latter substrate, this factor was shown to bind solely the noncoding (i.e., mRNA-like) strand. PMID:3413118

  10. Comparative Analysis of WUSCHEL-Related Homeobox Genes Revealed Their Parent-of-Origin and Cell Type-Specific Expression Pattern During Early Embryogenesis in Tobacco

    Directory of Open Access Journals (Sweden)

    Xuemei Zhou

    2018-03-01

    The WUSCHEL-related homeobox (WOX) genes form a plant-specific clade of homeobox transcription factors. Increasing evidence reveals that WOXs play critical roles in early embryogenesis, which involves zygote development, initiation of zygote division, and apical or basal cell lineage establishment. However, how WOXs regulate these developmental events remains largely unknown, and even a detailed expression pattern in gametes and early proembryos is not yet available. Here, 13 WOX family genes were identified in the Nicotiana tabacum genome. Comparative analysis of the 13 WOX family genes with their homologs in Arabidopsis thaliana reveals a relatively conserved expression pattern of WUS and WOX5 in the shoot/root apical meristem. However, variations were also found, e.g., the lack of a homolog of WOX8 (a marker for the suspensor cell) in the tobacco genome and the expression of WOX2/WOX9 in both the apical cell and the basal cell. Transient transcriptional activity analysis revealed that WOXs in the WUS clade have repressive activities for their targets' transcription, whereas WOXs in the ancient and intermediate clades have activation activities, giving a molecular basis for the phylogenetic classification of tobacco WOXs into three major clades. Expression pattern analysis revealed that some WOXs (e.g., WOX 13a) were expressed in both male and female gametes and some WOXs (e.g., WOX 11 and WOX 13b) displayed the characteristics of parent-of-origin genes. Interestingly, some WOXs (e.g., WOX2 and WOX9), which are essential for early embryo patterning, were de novo transcribed in the zygote, indicating that the relevant mechanism for embryo pattern formation is only established in the zygote right after fertilization and not carried in by gametes. We also found that most WOXs displayed a stage-specific and cell type-specific expression pattern. Taken together, this work provides a detailed landscape of WOXs in tobacco during fertilization and early embryogenesis, which will facilitate the understanding of their specific roles...

  11. Detecting change in stochastic sound sequences.

    Directory of Open Access Journals (Sweden)

    Benjamin Skerritt-Davis

    2018-05-01

    Our ability to parse our acoustic environment relies on the brain's capacity to extract statistical regularities from surrounding sounds. Previous work in regularity extraction has predominantly focused on the brain's sensitivity to predictable patterns in sound sequences. However, natural sound environments are rarely completely predictable, often containing some level of randomness, yet the brain is able to effectively interpret its surroundings by extracting useful information from stochastic sounds. It has been previously shown that the brain is sensitive to the marginal lower-order statistics of sound sequences (i.e., mean and variance). In this work, we investigate the brain's sensitivity to higher-order statistics describing temporal dependencies between sound events through a series of change detection experiments, where listeners are asked to detect changes in randomness in the pitch of tone sequences. Behavioral data indicate listeners collect statistical estimates to process incoming sounds, and a perceptual model based on Bayesian inference shows a capacity in the brain to track higher-order statistics. Further analysis of individual subjects' behavior indicates an important role of perceptual constraints in listeners' ability to track these sensory statistics with high fidelity. In addition, the inference model facilitates analysis of neural electroencephalography (EEG) responses, anchoring the analysis relative to the statistics of each stochastic stimulus. This reveals both a deviance response and a change-related disruption in phase of the stimulus-locked response that follow the higher-order statistics. These results shed light on the brain's ability to process stochastic sound sequences.

  12. ATRX mutation in two adult brothers with non-specific moderate intellectual disability identified by exome sequencing

    OpenAIRE

    Moncini, S.; Bedeschi, M.F.; Castronovo, P.; Crippa, M.; Calvello, M.; Garghentino, R.R.; Scuvera, G.; Finelli, P.; Venturin, M.

    2013-01-01

    In this report, we describe two adult brothers affected by moderate non-specific intellectual disability (ID). They showed minor facial anomalies, not clearly ascribable to any specific syndromic patterns, microcephaly, brachydactyly and broad toes. Both brothers presented seizures. Karyotype, subtelomeric and FMR1 analysis were normal in both cases. We performed array-CGH analysis that revealed no copy-number variations potentially associated with ID. Subsequent exome sequence analysis allow...

  13. Sequence-to-Sequence Prediction of Vehicle Trajectory via LSTM Encoder-Decoder Architecture

    OpenAIRE

    Park, Seong Hyeon; Kim, ByeongDo; Kang, Chang Mook; Chung, Chung Choo; Choi, Jun Won

    2018-01-01

    In this paper, we propose a deep learning based vehicle trajectory prediction technique which can generate the future trajectory sequence of surrounding vehicles in real time. We employ the encoder-decoder architecture which analyzes the pattern underlying in the past trajectory using the long short-term memory (LSTM) based encoder and generates the future trajectory sequence using the LSTM based decoder. This structure produces the $K$ most likely trajectory candidates over occupancy grid ma...

  14. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    OpenAIRE

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-01-01

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic...

  15. Amino acid and nucleotide recurrence in aligned sequences: synonymous substitution patterns in association with global and local base compositions.

    Science.gov (United States)

    Nishizawa, M; Nishizawa, K

    2000-10-01

    The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D. melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.

  16. Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

    Science.gov (United States)

    Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

    1984-03-26

    The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of methylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.

  17. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    Directory of Open Access Journals (Sweden)

    Weiguo Hou

    2018-05-01

    Full Text Available Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.

  18. Application of Reverse Engineering in Determining Sequence Diagram Interaction Patterns in Sample Android Applications

    Directory of Open Access Journals (Sweden)

    Vierdy Sulfianto Rahmadani

    2015-04-01

    Full Text Available The purpose of this research is to apply reverse engineering to determine interaction patterns of sequence diagrams that system analysts can use as templates for designing UML sequence diagrams. Sample Android applications are used as the dataset for reverse engineering and pattern identification. The first step is collecting the application datasets. The next stage is identifying the features and activities of the applications, reverse engineering them to obtain sequence diagram models, and then synthesizing all of the models into an interaction pattern of sequence diagrams. The final step is to test the patterns by implementing them in an application development case study. The evaluation results indicate that the sequence diagram interaction patterns obtained in the reverse engineering steps can be implemented in the development of software that contains features similar to those identified in this research.

  19. Visual guidance of forward flight in hummingbirds reveals control based on image features instead of pattern velocity.

    Science.gov (United States)

    Dakin, Roslyn; Fellows, Tyee K; Altshuler, Douglas L

    2016-08-02

    Information about self-motion and obstacles in the environment is encoded by optic flow, the movement of images on the eye. Decades of research have revealed that flying insects control speed, altitude, and trajectory by a simple strategy of maintaining or balancing the translational velocity of images on the eyes, known as pattern velocity. It has been proposed that birds may use a similar algorithm but this hypothesis has not been tested directly. We examined the influence of pattern velocity on avian flight by manipulating the motion of patterns on the walls of a tunnel traversed by Anna's hummingbirds. Contrary to prediction, we found that lateral course control is not based on regulating nasal-to-temporal pattern velocity. Instead, birds closely monitored feature height in the vertical axis, and steered away from taller features even in the absence of nasal-to-temporal pattern velocity cues. For vertical course control, we observed that birds adjusted their flight altitude in response to upward motion of the horizontal plane, which simulates vertical descent. Collectively, our results suggest that birds avoid collisions using visual cues in the vertical axis. Specifically, we propose that birds monitor the vertical extent of features in the lateral visual field to assess distances to the side, and vertical pattern velocity to avoid collisions with the ground. These distinct strategies may derive from greater need to avoid collisions in birds, compared with small insects.

  20. Fast convergence of spike sequences to periodic patterns in recurrent networks

    International Nuclear Information System (INIS)

    Jin, Dezhe Z.

    2002-01-01

    The dynamical attractors are thought to underlie many biological functions of recurrent neural networks. Here we show that stable periodic spike sequences with precise timings are the attractors of the spiking dynamics of recurrent neural networks with global inhibition. Almost all spike sequences converge within a finite number of transient spikes to these attractors. The convergence is fast, especially when the global inhibition is strong. These results support the possibility that precise spatiotemporal sequences of spikes are useful for information encoding and processing in biological neural networks.

  1. Molecular recognition of AT-DNA sequences by the induced CD pattern of dibenzotetraaza[14]annulene (DBTAA)–adenine derivatives

    OpenAIRE

    Stojković, Marijana Radić; Škugor, Marko; Dudek, Łukasz; Grolik, Jarosław; Eilmes, Julita; Piantanida, Ivo

    2014-01-01

    Summary An investigation of the interactions of two novel and several known DBTAA–adenine conjugates with double-stranded DNA and RNA has revealed the DNA/RNA groove as the dominant binding site, which is in contrast to the majority of previously studied DBTAA analogues (DNA/RNA intercalators). Only DBTAA–propyladenine conjugates revealed the molecular recognition of AT-DNA by an ICD band pattern > 300 nm, whereas significant ICD bands did not appear for other ds-DNA/RNA. A structure–activity...

  2. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity
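    The read-frequency cut-off described in this record can be illustrated with a minimal Python sketch. Only the 30 % threshold comes from the abstract; the function name, the dictionary layout, the read counts, and the choice of an inclusive comparison are assumptions made for illustration and do not reproduce the authors' in-house pipeline.

```python
def filter_variants(variants, min_fraction=0.30):
    """Keep variants whose fraction of supporting reads meets the cut-off.

    `variants` is a list of dicts with hypothetical keys 'alt_reads' and
    'total_reads'. Sites with zero coverage are dropped.
    """
    kept = []
    for v in variants:
        depth = v["total_reads"]
        if depth == 0:
            continue  # no coverage, frequency is undefined
        if v["alt_reads"] / depth >= min_fraction:  # inclusive bound is an assumption
            kept.append(v)
    return kept


# Toy, made-up read counts (not data from the study).
calls = [
    {"pos": 101, "alt_reads": 45, "total_reads": 100},
    {"pos": 202, "alt_reads": 10, "total_reads": 100},
    {"pos": 303, "alt_reads": 0, "total_reads": 0},
]
passed = filter_variants(calls)
```

    In this toy input only the call at position 101 passes, since 45 of 100 reads support it.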

  3. Whole-exome sequencing of a patient with severe and complex hemostatic abnormalities reveals a possible contributing frameshift mutation in C3AR1

    DEFF Research Database (Denmark)

    Leinøe, Eva; Nielsen, Ove Juul; Jønson, Lars

    2016-01-01

    -threatening coagulation disorder causing recurrent venous thromboembolic events, severe thrombocytopenia, and subdural hematomas. Whole-exome sequencing revealed a frameshift mutation (C3AR1 c.355-356dup, p.Asp119Alafs*19) resulting in a premature stop codon in C3AR1 (Complement Component 3a Receptor 1). Based...

  4. Complex (Nonstandard) Six-Layer Polytypes of Lizardite Revealed from Oblique-Texture Electron Diffraction Patterns

    International Nuclear Information System (INIS)

    Zhukhlistov, A.P.; Zinchuk, N.N.; Kotel'nikov, D.D.

    2004-01-01

    An association of simple (1T and 3R) and two complex (nonstandard) orthogonal polytypes of the serpentine mineral lizardite from the Catoca kimberlite pipe (West Africa) is revealed from oblique-texture electron diffraction patterns. A six-layer polytype with an ordered superposition of equally oriented layers (notation 3 2 3 2 3 4 3 4 3 6 3 6 or ++ - -00) belonging to the structural group A and a three-layer (336 or I,I,II) or a six-layer (336366 or I,I,II,I,II,II) polytype with alternating oppositely oriented layers and semi-disordered structure are identified using polytype analysis.

  5. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

    Science.gov (United States)

    Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M

    2010-12-15

    Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  6. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi

    Directory of Open Access Journals (Sweden)

    Huynen Leon

    2010-12-01

    Full Text Available Abstract Background Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Results Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. Conclusions The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  7. Patterns of Force, Sequences of Resistance

    DEFF Research Database (Denmark)

    Lindegaard, Marie Rosenkrantz; Daniël De Vries, Thomas; Bernasco, Wim

    2018-01-01

    Robberies are improvised encounters involving offender threat, sometimes force, and often victim resistance. While the association between threat, force, and resistance in robberies is well-established, sequential patterns are disputed due to biases of retrospective studies. To overcome these bia...... the likelihood of victim resistance despite having no effect on offender violence. By providing more reliable and detailed accounts of real-life behavior during robberies, our analysis illustrates the potential of a newly emergent field of studies of crimes caught on camera....

  8. Genome Sequence of the Bacterium Streptomyces davawensis JCM 4913 and Heterologous Production of the Unique Antibiotic Roseoflavin

    Science.gov (United States)

    Jankowitsch, Frank; Schwarz, Julia; Rückert, Christian; Gust, Bertolt; Szczepanowski, Rafael; Blom, Jochen; Pelzer, Stefan; Kalinowski, Jörn

    2012-01-01

    Streptomyces davawensis JCM 4913 synthesizes the antibiotic roseoflavin, a structural riboflavin (vitamin B2) analog. Here, we report the 9,466,619-bp linear chromosome of S. davawensis JCM 4913 and an 89,331-bp linear plasmid. The sequence has an average G+C content of 70.58% and contains six rRNA operons (16S-23S-5S) and 69 tRNA genes. The 8,616 predicted protein-coding sequences include 32 clusters coding for secondary metabolites, several of which are unique to S. davawensis. The chromosome contains long terminal inverted repeats of 33,255 bp each and atypical telomeres. Sequence analysis with regard to riboflavin biosynthesis revealed three different patterns of gene organization in Streptomyces species. Heterologous expression of a set of genes present on a subgenomic fragment of S. davawensis resulted in the production of roseoflavin by the host Streptomyces coelicolor M1152. Phylogenetic analysis revealed that S. davawensis is a close relative of Streptomyces cinnabarinus, and much to our surprise, we found that the latter bacterium is a roseoflavin producer as well. PMID:23043000

  9. Structural and sequence features of two residue turns in beta-hairpins.

    Science.gov (United States)

    Madan, Bharat; Seo, Sung Yong; Lee, Sun-Gu

    2014-09-01

    Beta-turns in beta-hairpins have been implicated as important sites in protein folding. In particular, two residue β-turns, the most abundant connecting elements in beta-hairpins, have been a major target for engineering protein stability and folding. In this study, we attempted to investigate and update the structural and sequence properties of two residue turns in beta-hairpins with a large data set. For this, 3977 beta-turns were extracted from 2394 nonhomologous protein chains and analyzed. First, the distribution, dihedral angles and twists of two residue turn types were determined, and compared with previous data. The trend of turn type occurrence and most structural features of the turn types were similar to previous results, but for the first time Type II turns in beta-hairpins were identified. Second, sequence motifs for the turn types were devised based on amino acid positional potentials of two-residue turns, and their distributions were examined. From this study, we could identify code-like sequence motifs for the two residue beta-turn types. Finally, structural and sequence properties of beta-strands in the beta-hairpins were analyzed, which revealed that the beta-strands showed no specific sequence and structural patterns for turn types. The analytical results in this study are expected to be a reference in the engineering or design of beta-hairpin turn structures and sequences. © 2014 Wiley Periodicals, Inc.

  10. High-resolution deep sequencing reveals biodiversity, population structure, and persistence of HIV-1 quasispecies within host ecosystems

    Directory of Open Access Journals (Sweden)

    Yin Li

    2012-12-01

    Full Text Available Abstract Background Deep sequencing provides the basis for analysis of biodiversity of taxonomically similar organisms in an environment. While extensively applied to microbiome studies, population genetics studies of viruses are limited. To define the scope of HIV-1 population biodiversity within infected individuals, a suite of phylogenetic and population genetic algorithms was applied to HIV-1 envelope hypervariable domain 3 (Env V3 within peripheral blood mononuclear cells from a group of perinatally HIV-1 subtype B infected, therapy-naïve children. Results Biodiversity of HIV-1 Env V3 quasispecies ranged from about 70 to 270 unique sequence clusters across individuals. Viral population structure was organized into a limited number of clusters that included the dominant variants combined with multiple clusters of low frequency variants. Next generation viral quasispecies evolved from low frequency variants at earlier time points through multiple non-synonymous changes in lineages within the evolutionary landscape. Minor V3 variants detected as long as four years after infection co-localized in phylogenetic reconstructions with early transmitting viruses or with subsequent plasma virus circulating two years later. Conclusions Deep sequencing defines HIV-1 population complexity and structure, reveals the ebb and flow of dominant and rare viral variants in the host ecosystem, and identifies an evolutionary record of low-frequency cell-associated viral V3 variants that persist for years. Bioinformatics pipeline developed for HIV-1 can be applied for biodiversity studies of virome populations in human, animal, or plant ecosystems.

  11. Candidate genes revealed by a genome scan for mosquito resistance to a bacterial insecticide: sequence and gene expression variations

    Directory of Open Access Journals (Sweden)

    David Jean-Philippe

    2009-11-01

    Full Text Available Abstract Background Genome scans are becoming an increasingly popular approach to study the genetic basis of adaptation and speciation, but on their own, they are often helpless at identifying the specific gene(s) or mutation(s) targeted by selection. This shortcoming is hopefully bound to disappear in the near future, thanks to the wealth of new genomic resources that are currently being developed for many species. In this article, we provide a foretaste of this exciting new era by conducting a genome scan in the mosquito Aedes aegypti with the aim to look for candidate genes involved in resistance to Bacillus thuringiensis subsp. israelensis (Bti) insecticidal toxins. Results The genomes of a Bti-resistant and a Bti-susceptible strain were surveyed using about 500 MITE-based molecular markers, and the loci showing the highest inter-strain genetic differentiation were sequenced and mapped on the Aedes aegypti genome sequence. Several good candidate genes for Bti-resistance were identified in the vicinity of these highly differentiated markers. Two of them, coding for a cadherin and a leucine aminopeptidase, were further examined at the sequence and gene expression levels. In the resistant strain, the cadherin gene displayed patterns of nucleotide polymorphisms consistent with the action of positive selection (e.g. an excess of high compared to intermediate frequency mutations), as well as a significant under-expression compared to the susceptible strain. Conclusion Both sequence and gene expression analyses agree to suggest a role for positive selection in the evolution of this cadherin gene in the resistant strain. However, it is unlikely that resistance to Bti is conferred by this gene alone, and further investigation will be needed to characterize other genes significantly associated with Bti resistance in Ae. aegypti. Beyond these results, this article illustrates how genome scans can build on the body of new genomic information (here, full

  12. Binning of shallowly sampled metagenomic sequence fragments reveals that low abundance bacteria play important roles in sulfur cycling and degradation of complex organic polymers in an acid mine drainage community

    Science.gov (United States)

    Dick, G. J.; Andersson, A.; Banfield, J. F.

    2007-12-01

    Our understanding of environmental microbiology has been greatly enhanced by community genome sequencing of DNA recovered directly from the environment. Community genomics provides insights into the diversity, community structure, metabolic function, and evolution of natural populations of uncultivated microbes, thereby revealing dynamics of how microorganisms interact with each other and their environment. Recent studies have demonstrated the potential for reconstructing near-complete genomes from natural environments while highlighting the challenges of analyzing community genomic sequence, especially from diverse environments. A major challenge of shotgun community genome sequencing is identification of DNA fragments from minor community members for which only low coverage of genomic sequence is present. We analyzed community genome sequence retrieved from biofilms in an acid mine drainage (AMD) system in the Richmond Mine at Iron Mountain, CA, with an emphasis on identification and assembly of DNA fragments from low-abundance community members. The Richmond mine hosts an extensive, relatively low diversity subterranean chemolithoautotrophic community that is sustained entirely by oxidative dissolution of pyrite. The activity of these microorganisms greatly accelerates the generation of AMD. Previous and ongoing work in our laboratory has focused on reconstructing genomes of dominant community members, including several bacteria and archaea. We binned contigs from several samples (including one new sample and two that had been previously analyzed) by tetranucleotide frequency with clustering by Self-Organizing Maps (SOM). The binning, evaluated by comparison with information from the manually curated assembly of the dominant organisms, was found to be very effective: fragments were correctly assigned with 95% accuracy. Improperly assigned fragments often contained sequences that are either evolutionarily constrained (e.g. 16S rRNA genes) or mobile elements that are
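    As a rough illustration of the tetranucleotide-frequency signal on which the binning in this record relies, the following Python sketch counts overlapping 4-mers in a contig and normalizes them to frequencies. The function name, the skipping of ambiguous bases, and the toy sequence are assumptions for illustration; the sketch does not merge reverse complements and does not reproduce the Self-Organizing Map clustering step or any other part of the authors' pipeline.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed (lexicographic) order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return a 256-element list of overlapping 4-mer frequencies.

    Windows containing characters other than A, C, G, T are skipped
    (an assumption made for this sketch).
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 4]
        for i in range(len(seq) - 3)
        if set(seq[i:i + 4]) <= set("ACGT")
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


# Toy contig (made up): windows are ACGT, CGTA, GTAC, TACG, ACGT.
freqs = tetranucleotide_frequencies("ACGTACGT")
```

    Vectors of this kind, one per contig, could then be passed to a clustering method of the reader's choice; the abstract names Self-Organizing Maps.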

  13. Clustering Patterns of Engagement in Massive Open Online Courses (MOOCs): The Use of Learning Analytics to Reveal Student Categories

    Science.gov (United States)

    Khalil, Mohammad; Ebner, Martin

    2017-01-01

    Massive Open Online Courses (MOOCs) are remote courses that excel in their students' heterogeneity and quantity. Because of this massiveness, the large datasets generated by MOOC platforms require advanced tools and techniques to reveal hidden patterns for purposes of enhancing learning and educational behaviors. This publication…

  14. Inactivity periods and postural change speed can explain atypical postural change patterns of Caenorhabditis elegans mutants.

    Science.gov (United States)

    Fukunaga, Tsukasa; Iwasaki, Wataru

    2017-01-19

    With rapid advances in genome sequencing and editing technologies, systematic and quantitative analysis of animal behavior is expected to be another key to facilitating data-driven behavioral genetics. The nematode Caenorhabditis elegans is a model organism in this field. Several video-tracking systems are available for automatically recording behavioral data for the nematode, but computational methods for analyzing these data are still under development. In this study, we applied the Gaussian mixture model-based binning method to time-series postural data for 322 C. elegans strains. We revealed that the occurrence patterns of the postural states and the transition patterns among these states have a relationship as expected, and such a relationship must be taken into account to identify strains with atypical behaviors that are different from those of wild type. Based on this observation, we identified several strains that exhibit atypical transition patterns that cannot be fully explained by their occurrence patterns of postural states. Surprisingly, we found that two simple factors, overall acceleration of postural movement and elimination of inactivity periods, explained the behavioral characteristics of strains with very atypical transition patterns; therefore, computational analysis of animal behavior must be accompanied by evaluation of the effects of these simple factors. Finally, we found that the npr-1 and npr-3 mutants have similar behavioral patterns that were not predictable by sequence homology, proving that our data-driven approach can reveal the functions of genes that have not yet been characterized. We propose that elimination of inactivity periods and overall acceleration of postural change speed can explain behavioral phenotypes of strains with very atypical postural transition patterns. Our methods and results constitute guidelines for effectively finding strains that show "truly" interesting behaviors and systematically uncovering novel gene

  15. Time-Resolved Transposon Insertion Sequencing Reveals Genome-Wide Fitness Dynamics during Infection

    Directory of Open Access Journals (Sweden)

    Guanhua Yang

    2017-10-01

    Full Text Available Transposon insertion sequencing (TIS) is a powerful high-throughput genetic technique that is transforming functional genomics in prokaryotes, because it enables genome-wide mapping of the determinants of fitness. However, current approaches for analyzing TIS data assume that selective pressures are constant over time and thus do not yield information regarding changes in the genetic requirements for growth in dynamic environments (e.g., during infection). Here, we describe structured analysis of TIS data collected as a time series, termed pattern analysis of conditional essentiality (PACE). From a temporal series of TIS data, PACE derives a quantitative assessment of each mutant’s fitness over the course of an experiment and identifies mutants with related fitness profiles. In so doing, PACE circumvents major limitations of existing methodologies, specifically the need for artificial effect size thresholds and enumeration of bacterial population expansion. We used PACE to analyze TIS samples of Edwardsiella piscicida (a fish pathogen) collected over a 2-week infection period from a natural host (the flatfish turbot). PACE uncovered more genes that affect E. piscicida’s fitness in vivo than were detected using a cutoff at a terminal sampling point, and it identified subpopulations of mutants with distinct fitness profiles, one of which informed the design of new live vaccine candidates. Overall, PACE enables efficient mining of time series TIS data and enhances the power and sensitivity of TIS-based analyses.

  16. Fetal anatomy revealed with fast MR sequences.

    Science.gov (United States)

    Levine, D; Hatabu, H; Gaa, J; Atkinson, M W; Edelman, R R

    1996-10-01

    Although all the imaging studies in this pictorial essay were done for maternal rather than fetal indications, fetal anatomy was well visualized. However, when scans are undertaken for fetal indications, fetal motion in between scout views and imaging sequences may make specific image planes difficult to obtain. Of the different techniques described in this review, we preferred the HASTE technique and use it almost exclusively for scanning pregnant patients. The T2-weighting is ideal for delineating fetal organs. Also, the HASTE technique allows images to be obtained in 430 msec, limiting artifacts arising from maternal and fetal motion. MR imaging should play a more important role in evaluating equivocal sonographic cases as fast scanning techniques are more widely used. Obstetric MR imaging no longer will be limited by fetal motion artifacts. When complex anatomy requires definition in a complicated pregnant patient, MR imaging should be considered as a useful adjunct to sonography.

  17. Molecular recognition of AT-DNA sequences by the induced CD pattern of dibenzotetraaza[14]annulene (DBTAA)-adenine derivatives.

    Science.gov (United States)

    Stojković, Marijana Radić; Skugor, Marko; Dudek, Lukasz; Grolik, Jarosław; Eilmes, Julita; Piantanida, Ivo

    2014-01-01

    An investigation of the interactions of two novel and several known DBTAA-adenine conjugates with double-stranded DNA and RNA has revealed the DNA/RNA groove as the dominant binding site, which is in contrast to the majority of previously studied DBTAA analogues (DNA/RNA intercalators). Only DBTAA-propyladenine conjugates revealed the molecular recognition of AT-DNA by an ICD band pattern > 300 nm, whereas significant ICD bands did not appear for other ds-DNA/RNA. A structure-activity relation for the studied series of compounds showed that the essential structural features for the ICD recognition are a) the presence of DNA-binding appendages (adenine side chain and positively charged side chain) on both DBTAA side chains, and b) the presence of a short propyl linker, which does not support intramolecular aromatic stacking between DBTAA and adenine. The observed AT-DNA-ICD pattern differs from previously reported ss-DNA (poly dT) ICD recognition by a strong negative ICD band at 350 nm, which allows for the dynamic differentiation between ss-DNA (poly dT) and coupled ds-AT-DNA.

  18. Trace metal depositional patterns from an open pit mining activity as revealed by archived avian gizzard contents

    Energy Technology Data Exchange (ETDEWEB)

    Bendell, L.I., E-mail: bendell@sfu.ca

    2011-02-15

    Archived samples of blue grouse (Dendragapus obscurus) gizzard contents, inclusive of grit, collected yearly between 1959 and 1970 were analyzed for cadmium, lead, zinc, and copper content. Approximately halfway through the 12-year sampling period, an open-pit copper mine began activities, then ceased operations 2 years later. Thus the archived samples provided a unique opportunity to determine if avian gizzard contents, inclusive of grit, could reveal patterns in the anthropogenic deposition of trace metals associated with mining activities. Gizzard concentrations of cadmium and copper strongly coincided with the opening and the closing of the pit mining activity. Gizzard zinc and lead demonstrated significant among-year variation; however, maximum concentrations did not correlate with mining activity. The archived gizzard contents did provide a useful tool for documenting trends in metal depositional patterns related to an anthropogenic activity. Further, blue grouse ingesting grit particles during the period of active mining would have been exposed to toxicologically significant levels of cadmium. Gizzard lead concentrations were also of toxicological significance but not related to mining activity. This type of 'pulse' toxic metal exposure as a consequence of open-pit mining activity would not necessarily have been revealed through a 'snap-shot' of soil, plant or avian tissue trace metal analysis post-mining activity. Research Highlights: Archived gizzard samples reveal mining history. Grit ingestion exposes grouse to cadmium and lead. Grit selection includes particles enriched in cadmium. Cadmium-enriched particles are of toxicological significance.

  19. Parallel motif extraction from very long sequences

    KAUST Repository

    Sahli, Majed; Mansour, Essam; Kalnis, Panos

    2013-01-01

    Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that focuses on collections of many short sequences, modern

  20. The paradox of HBV evolution as revealed from a 16th century mummy.

    Directory of Open Access Journals (Sweden)

    Zoe Patterson Ross

    2018-01-01

    Full Text Available Hepatitis B virus (HBV) is a ubiquitous viral pathogen associated with large-scale morbidity and mortality in humans. However, there is considerable uncertainty over the time-scale of its origin and evolution. Initial shotgun data from a mid-16th century Italian child mummy, that was previously paleopathologically identified as having been infected with Variola virus (VARV, the agent of smallpox), showed no DNA reads for VARV yet did for hepatitis B virus (HBV). Previously, electron microscopy provided evidence for the presence of VARV in this sample, although similar analyses conducted here did not reveal any VARV particles. We attempted to enrich and sequence for both VARV and HBV DNA. Although we did not recover any reads identified as VARV, we were successful in reconstructing an HBV genome at 163.8X coverage. Strikingly, both the HBV sequence and that of the associated host mitochondrial DNA displayed a nearly identical cytosine deamination pattern near the termini of DNA fragments, characteristic of an ancient origin. In contrast, phylogenetic analyses revealed a close relationship between the putative ancient virus and contemporary HBV strains (of genotype D), at first suggesting contamination. In addressing this paradox we demonstrate that HBV evolution is characterized by a marked lack of temporal structure. This confounds attempts to use molecular clock-based methods to date the origin of this virus over the time-frame sampled so far, and means that phylogenetic measures alone cannot yet be used to determine HBV sequence authenticity. If genuine, this phylogenetic pattern indicates that the genotypes of HBV diversified long before the 16th century, and enables comparison of potential pathogenic similarities between modern and ancient HBV. These results have important implications for our understanding of the emergence and evolution of this common viral pathogen.

  1. Angiogenesis interactome and time course microarray data reveal the distinct activation patterns in endothelial cells.

    Directory of Open Access Journals (Sweden)

    Liang-Hui Chu

    Full Text Available Angiogenesis involves stimulation of endothelial cells (EC) by various cytokines and growth factors, but the signaling mechanisms are not completely understood. Combining dynamic gene expression time-course data for stimulated EC with protein-protein interactions associated with angiogenesis (the "angiome") could reveal how different stimuli result in different patterns of network activation and could implicate signaling intermediates as points for control or intervention. We constructed the protein-protein interaction networks of positive and negative regulation of angiogenesis comprising 367 and 245 proteins, respectively. We used five published gene expression datasets derived from in vitro assays using different types of blood endothelial cells stimulated by VEGFA (vascular endothelial growth factor A). We used the Short Time-series Expression Miner (STEM) to identify significant temporal gene expression profiles. The statistically significant patterns between 2D fibronectin and 3D type I collagen substrates for telomerase-immortalized EC (TIME) show that different substrates could influence the temporal gene activation patterns in the same cell line. We investigated the different activation patterns among 18 transmembrane tyrosine kinase receptors, and experimentally measured the protein level of the tyrosine-kinase receptors VEGFR1, VEGFR2 and VEGFR3 in human umbilical vein EC (HUVEC) and human microvascular EC (MEC). The results show that VEGFR1-VEGFR2 levels are more closely coupled than VEGFR1-VEGFR3 or VEGFR2-VEGFR3 in HUVEC and MEC. This computational methodology can be extended to investigate other molecules or biological processes such as cell cycle.

  2. Use of cycle stacking patterns to define third-order depositional sequences: Middle to Late Cambrian Bonanza King Formation, southern Great Basin

    Energy Technology Data Exchange (ETDEWEB)

    Montanez, I.P.; Droser, M.L. (Univ. of California, Riverside (United States))

    1991-03-01

    The Middle to Late Cambrian Bonanza King Formation (CA, NV) is characterized by superimposed scales of cyclicity. Small-scale cycles (0.5 to 10 m) occur as shallowing-upward peritidal and subtidal cycles that repeat at high frequencies (10^4 to 10^5). Systematic changes in stacking patterns of meter-scale cycles define several large-scale (50-250 m) third-order depositional sequences in the Bonanza King Formation. Third-order depositional sequences can be traced within ranges and correlated regionally across the platform. Peritidal cycles in the Bonanza King Formation are both subtidal- and tidal flat-dominated. Tidal flat-dominated cycles consist of muddy bases grading upward into thrombolites or columnar stromatolites, all capped by planar stromatolites. Subtidal cycles in the Bonanza King Formation consist of grainstone bases that commonly fine upward and contain stacked hardgrounds. These are overlain by digitate-algal bioherms with grainstone channel fills and/or bioturbated ribbon carbonates with grainstone lenses. Transgressive depositional facies of third-order depositional sequences consist primarily of stacks of subtidal-dominated peritidal cycles and subtidal cycles, whereas regressive depositional facies are dominated by stacks of tidal flat-dominated peritidal cycles and regoliths developed over laminite cycle caps. The use of high frequency cycles in the Bonanza King Formation to delineate regionally developed third-order depositional sequences thus provides a link between cycle stratigraphy and sequence stratigraphy.

  3. Taxonomic evaluation of selected Ganoderma species and database sequence validation

    Directory of Open Access Journals (Sweden)

    Suldbold Jargalmaa

    2017-07-01

    Full Text Available Species in the genus Ganoderma include several ecologically important and pathogenic fungal species whose medicinal and economic value is substantial. Due to the highly similar morphological features within the Ganoderma, identification of species has relied heavily on DNA sequencing using BLAST searches, which are only reliable if the GenBank submissions are accurately labeled. In this study, we examined 113 specimens collected from 1969 to 2016 from various regions in Korea using morphological features and multigene analysis (internal transcribed spacer, translation elongation factor 1-α, and the second largest subunit of RNA polymerase II). These specimens were identified as four Ganoderma species: G. sichuanense, G. cf. adspersum, G. cf. applanatum, and G. cf. gibbosum. With the exception of G. sichuanense, these species were difficult to distinguish based solely on morphological features. However, phylogenetic analysis at three different loci yielded concordant phylogenetic information, and supported the four species distinctions with high bootstrap support. A survey of over 600 Ganoderma sequences available on GenBank revealed that 65% of sequences were either misidentified or ambiguously labeled. Here, we suggest corrected annotations for GenBank sequences based on our phylogenetic validation and provide updated global distribution patterns for these Ganoderma species.

  4. Taxonomic evaluation of selected Ganoderma species and database sequence validation

    Science.gov (United States)

    Jargalmaa, Suldbold; Eimes, John A.; Park, Myung Soo; Park, Jae Young; Oh, Seung-Yoon

    2017-01-01

    Species in the genus Ganoderma include several ecologically important and pathogenic fungal species whose medicinal and economic value is substantial. Due to the highly similar morphological features within the Ganoderma, identification of species has relied heavily on DNA sequencing using BLAST searches, which are only reliable if the GenBank submissions are accurately labeled. In this study, we examined 113 specimens collected from 1969 to 2016 from various regions in Korea using morphological features and multigene analysis (internal transcribed spacer, translation elongation factor 1-α, and the second largest subunit of RNA polymerase II). These specimens were identified as four Ganoderma species: G. sichuanense, G. cf. adspersum, G. cf. applanatum, and G. cf. gibbosum. With the exception of G. sichuanense, these species were difficult to distinguish based solely on morphological features. However, phylogenetic analysis at three different loci yielded concordant phylogenetic information, and supported the four species distinctions with high bootstrap support. A survey of over 600 Ganoderma sequences available on GenBank revealed that 65% of sequences were either misidentified or ambiguously labeled. Here, we suggest corrected annotations for GenBank sequences based on our phylogenetic validation and provide updated global distribution patterns for these Ganoderma species. PMID:28761785

  5. A theoretical justification for single molecule peptide sequencing.

    Directory of Open Access Journals (Sweden)

    Jagannath Swaminathan

    2015-02-01

    Full Text Available The proteomes of cells, tissues, and organisms reflect active cellular processes and change continuously in response to intracellular and extracellular cues. Deep, quantitative profiling of the proteome, especially if combined with mRNA and metabolite measurements, should provide an unprecedented view of cell state, better revealing functions and interactions of cell components. Molecular diagnostics and biomarker discovery should benefit particularly from the accurate quantification of proteomes, since complex diseases like cancer change protein abundances and modifications. Currently, shotgun mass spectrometry is the primary technology for high-throughput protein identification and quantification; while powerful, it lacks high sensitivity and coverage. We draw parallels with next-generation DNA sequencing and propose a strategy, termed fluorosequencing, for sequencing peptides in a complex protein sample at the level of single molecules. In the proposed approach, millions of individual fluorescently labeled peptides are visualized in parallel, monitoring changing patterns of fluorescence intensity as N-terminal amino acids are sequentially removed, and using the resulting fluorescence signatures (fluorosequences) to uniquely identify individual peptides. We introduce a theoretical foundation for fluorosequencing and, by using Monte Carlo computer simulations, we explore its feasibility, anticipate the most likely experimental errors, quantify their potential impact, and discuss the broad potential utility offered by a high-throughput peptide sequencing technology.
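
    The fluorescence-signature idea can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not taken from the abstract: only lysine (K) residues carry a dye, dyes never fail, and each degradation cycle removes the N-terminal residue with a fixed probability. The function names and the example peptide are hypothetical.

```python
import random


def ideal_fluorosequence(peptide, labeled="K"):
    """Dye count before any cycle and after each cycle, assuming every
    cycle removes exactly one N-terminal residue."""
    remaining = sum(1 for aa in peptide if aa in labeled)
    trace = [remaining]
    for aa in peptide:
        if aa in labeled:
            remaining -= 1  # a labeled residue leaves, intensity drops one step
        trace.append(remaining)
    return trace


def simulate_fluorosequence(peptide, cycles, p_cleave=0.95, labeled="K", rng=None):
    """Monte Carlo variant: each cycle succeeds only with probability p_cleave,
    so intensity drops can arrive later than in the ideal trace."""
    rng = rng or random.Random()
    position = 0  # index of the current N-terminal residue
    trace = [sum(1 for aa in peptide if aa in labeled)]
    for _ in range(cycles):
        if position < len(peptide) and rng.random() < p_cleave:
            position += 1
        trace.append(sum(1 for aa in peptide[position:] if aa in labeled))
    return trace


def drop_positions(trace):
    """Cycle numbers (1-based) at which the intensity decreases."""
    return [i for i in range(1, len(trace)) if trace[i] < trace[i - 1]]


if __name__ == "__main__":
    peptide = "GKAGKA"  # hypothetical peptide with lysines at positions 2 and 5
    print(ideal_fluorosequence(peptide))
    print(drop_positions(simulate_fluorosequence(peptide, 10, rng=random.Random(0))))
```

    Under these assumptions the list of cycles at which intensity drops plays the role of the signature; comparing simulated drop positions with the ideal ones gives a rough feel for how failed cycles blur the identification.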

  6. Genome Sequencing Reveals the Potential of Achromobacter sp. HZ01 for Bioremediation

    Directory of Open Access Journals (Sweden)

    Yue-Hui Hong

    2017-08-01

    Full Text Available Petroleum pollution is a severe environmental issue. Comprehensively revealing the genetic backgrounds of hydrocarbon-degrading microorganisms contributes to developing effective methods for bioremediation of crude oil-polluted environments. Marine bacterium Achromobacter sp. HZ01 is capable of degrading hydrocarbons and producing biosurfactants. In this study, the draft genome (5.5 Mbp) of strain HZ01 has been obtained by Illumina sequencing, containing 5,162 predicted genes. Genome annotation shows that "amino acid metabolism" is the most abundant metabolic pathway. Strain HZ01 is not capable of using some common carbohydrates as the sole carbon sources because it contains few genes associated with carbohydrate transport and lacks some important enzymes related to glycometabolism. It contains abundant proteins directly related to petroleum hydrocarbon degradation. AlkB hydroxylase and its homologs were not identified. It harbors a complete enzyme system of terminal oxidation pathway for n-alkane degradation, which may be initiated by cytochrome P450. The enzymes involved in the catechol pathway are relatively complete for the degradation of aromatic compounds. This bacterium lacks several essential enzymes for methane oxidation, and Baeyer-Villiger monooxygenase involved in the subterminal oxidation pathway and cycloalkane degradation was not identified. These results suggest that strain HZ01 degrades n-alkanes via the terminal oxidation pathway, degrades aromatic compounds primarily via the catechol pathway and cannot perform methane oxidation or cycloalkane degradation. Additionally, strain HZ01 possesses abundant genes related to the metabolism of secondary metabolites, including some genes involved in biosurfactant (such as glycolipids and lipopeptides) synthesis. The genome analysis also reveals its genetic basis for nitrogen metabolism, antibiotic resistance, regulatory responses to environmental changes, cell motility

  7. Whole genome sequencing reveals a novel deletion variant in the KIT gene in horses with white spotted coat colour phenotypes.

    Science.gov (United States)

    Dürig, N; Jude, R; Holl, H; Brooks, S A; Lafayette, C; Jagannathan, V; Leeb, T

    2017-08-01

    White spotting phenotypes in horses can range in severity from the common white markings up to completely white horses. EDNRB, KIT, MITF, PAX3 and TRPM1 represent known candidate genes for such phenotypes in horses. For the present study, we re-investigated a large horse family segregating a variable white spotting phenotype, for which conventional Sanger sequencing of the candidate genes' individual exons had failed to reveal the causative variant. We obtained whole genome sequence data from an affected horse and specifically searched for structural variants in the known candidate genes. This analysis revealed a heterozygous ~1.9-kb deletion spanning exons 10-13 of the KIT gene (chr3:77,740,239_77,742,136del1898insTATAT). In continuity with previously named equine KIT variants we propose to designate the newly identified deletion variant W22. We had access to 21 horses carrying the W22 allele. Four of them were compound heterozygous W20/W22 and had a completely white phenotype. Our data suggest that W22 represents a true null allele of the KIT gene, whereas the previously identified W20 leads to a partial loss of function. These findings will enable more precise genetic testing for depigmentation phenotypes in horses. © 2017 Stichting International Foundation for Animal Genetics.

  8. Global Carrier Rates of Rare Inherited Disorders Using Population Exome Sequences.

    Directory of Open Access Journals (Sweden)

    Kohei Fujikura

    Full Text Available Exome sequencing has revealed the causative mutations behind numerous rare, inherited disorders, but it is challenging to find reliable epidemiological values for rare disorders. Here, I provide a genetic epidemiology method to identify the causative mutations behind rare, inherited disorders using two population exome sequences (1000 Genomes and NHLBI). I created global maps of carrier rate distribution for 18 recessive disorders in 16 diverse ethnic populations. Out of a total of 161 mutations associated with 18 recessive disorders, I detected 24 mutations in either or both exome studies. The genetic mapping revealed strong international spatial heterogeneities in the carrier patterns of the inherited disorders. I next validated this methodology by statistically evaluating the carrier rate of one well-understood disorder, sickle cell anemia (SCA). The population exome-based epidemiology of SCA [African (allele frequency (AF) = 0.0454, N = 2447), Asian (AF = 0, N = 286), European (AF = 0.000214, N = 4677), and Hispanic (AF = 0.0111, N = 362)] was not significantly different from that obtained from a clinical prevalence survey. A pair-wise proportion test revealed no significant differences between the two exome projects in terms of AF (46/48 cases; P > 0.05). I conclude that population exome-based carrier rates can form the foundation for a prospectively maintained database of use to clinical geneticists. Similar modeling methods can be applied to many inherited disorders.
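
    The step from an allele frequency to a carrier rate can be sketched as follows. This is only an illustration under the Hardy-Weinberg assumption (carrier frequency 2q(1 - q) for a recessive allele of frequency q); the abstract does not state that this is how the carrier rates were derived. The allele frequencies are the SCA values quoted above, and the function names are hypothetical.

```python
def carrier_rate(allele_freq):
    """Expected heterozygote (carrier) frequency 2q(1 - q) under the
    Hardy-Weinberg assumption, for a recessive allele of frequency q."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * allele_freq * (1.0 - allele_freq)


# Sickle cell anemia allele frequencies as quoted in the abstract.
SCA_ALLELE_FREQ = {
    "African": 0.0454,
    "Asian": 0.0,
    "European": 0.000214,
    "Hispanic": 0.0111,
}


def carrier_table(freqs):
    """Map each population to its expected carrier rate."""
    return {population: carrier_rate(q) for population, q in freqs.items()}


if __name__ == "__main__":
    for population, rate in carrier_table(SCA_ALLELE_FREQ).items():
        print(f"{population}: expected carrier rate {rate:.4%}")
```

    For rare alleles the carrier rate is close to twice the allele frequency, which is why small differences in AF between populations translate almost directly into differences in carrier rate.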

  9. Sequencing of the Chlamydophila psittaci ompA Gene Reveals a New Genotype, E/B, and the Need for a Rapid Discriminatory Genotyping Method

    Science.gov (United States)

    Geens, Tom; Desplanques, Ann; Van Loock, Marnix; Bönner, Brigitte M.; Kaleta, Erhard F.; Magnino, Simone; Andersen, Arthur A.; Everett, Karin D. E.; Vanrompay, Daisy

    2005-01-01

    Twenty-one avian Chlamydophila psittaci isolates from different European countries were characterized using ompA restriction fragment length polymorphism, ompA sequencing, and major outer membrane protein serotyping. Results reveal the presence of a new genotype, E/B, in several European countries and stress the need for a discriminatory rapid genotyping method. PMID:15872282

  10. Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

    Science.gov (United States)

    Rao, Soumya; Nandineni, Madhusudan R

    2017-01-01

    Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.

  11. DNA sequencing, anatomy, and calcification patterns support a monophyletic, subarctic, carbonate reef-forming Clathromorphum (Hapalidiaceae, Corallinales, Rhodophyta).

    Science.gov (United States)

    Adey, Walter H; Hernandez-Kantun, Jazmin J; Johnson, Gabriel; Gabrielson, Paul W

    2015-02-01

    For the first time, morpho-anatomical characters that were congruent with DNA sequence data were used to characterize several genera in Hapalidiaceae, the major eco-engineers of Subarctic carbonate ecosystems. DNA sequencing of three genes (SSU, rbcL, ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit gene and psbA, photosystem II D1 protein gene), along with patterns of cell division, cell elongation, and calcification supported a monophyletic Clathromorphum. Two characters were diagnostic for this genus: (i) cell division, elongation, and primary calcification occurred only in intercalary meristematic cells and in a narrow vertical band (1-2 μm wide) resulting in a "meristem split" and (ii) a secondary calcification of interfilament crystals was also produced. Neopolyporolithon was resurrected for N. reclinatum, the generitype, and Clathromorphum loculosum was transferred to this genus. Like Clathromorphum, cell division, elongation, and calcification occurred only in intercalary meristematic cells, but in a wider vertical band (over 10-20 μm), and a "meristem split" was absent. Callilithophytum gen. nov. was proposed to accommodate Clathromorphum parcum, the obligate epiphyte of the northeast Pacific endemic geniculate coralline, Calliarthron. Diagnostic for this genus were epithallial cells terminating all cell filaments (no dorsi-ventrality was present), and a distinct "foot" was embedded in the host. Leptophytum, based on its generitype, L. laeve, was shown to be a distinct genus more closely related to Clathromorphum than to Phymatolithon. All names of treated species were applied unequivocally by linking partial rbcL sequences from holotype, isotype, or epitype specimens with field-collected material. Variation in rbcL and psbA sequences suggested that multiple species may be passing under each currently recognized species of Clathromorphum and Neopolyporolithon. © 2014 Phycological Society of America.

  12. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  13. Working memory for sequences of temporal durations reveals a volatile single-item store

    Directory of Open Access Journals (Sweden)

    Sanjay G Manohar

    2016-10-01

    Full Text Available When a sequence is held in working memory, different items are retained with differing fidelity. Here we ask whether a sequence of brief time intervals that must be remembered shows recency effects, similar to those observed in verbal and visuospatial working memory. It has been suggested that prioritising some items over others can be accounted for by a focus of attention, maintaining some items in a privileged state. We therefore also investigated whether such benefits are vulnerable to disruption by attention or expectation. Participants listened to sequences of one to five tones of varying durations (200 ms to 2 s). Subsequently, the length of one of the tones in the sequence had to be reproduced by holding a key. The discrepancy between the reproduced and actual durations quantified the fidelity of memory for auditory durations. Recall precision decreased with the number of items that had to be remembered, and was better for the first and last items of sequences, in line with set-size and serial position effects seen in other modalities. To test whether attentional filtering demands might impair performance, an irrelevant variation in pitch was introduced in some blocks of trials. In those blocks, memory precision was worse for sequences that consisted of only one item, i.e. the smallest memory set size. Thus, when irrelevant information was present, the benefit of having only one item in memory was attenuated. Finally, we examined whether expectation could interfere with memory. On half the trials, the number of items in the upcoming sequence was cued. When the number of items was known in advance, performance was paradoxically worse when the sequence consisted of only one item. Thus the benefit of having only one item to remember is stronger when it is unexpectedly the only item. Our results suggest that similar mechanisms are used to hold auditory time durations in working memory, as for visual or verbal stimuli. Further, solitary items were

  14. Heuristics Miner for E-Commerce Visitor Access Pattern Representation

    Directory of Open Access Journals (Sweden)

    Kartina Diah Kesuma Wardhani

    2017-06-01

    Full Text Available E-commerce click stream data can form a certain pattern that describes visitor behavior while surfing the e-commerce website. This pattern can be used to initiate a design to determine alternative access sequences on the website. This research uses the Heuristics Miner algorithm to determine the pattern. σ-Algorithm and Genetic Mining are methods used for pattern recognition with a frequent sequence item set approach. Heuristics Miner is an evolved form of those methods. σ-Algorithm assumes that an activity in a website that has been recorded in the data log is a complete sequence from start to finish, without any tolerance for incomplete or noisy data. On the other hand, Genetic Mining is a method that tolerates incomplete or noisy data, so it can generate a more detailed e-commerce visitor access pattern. In this study, the same sequence of events was obtained from six generated patterns. The resulting pattern of visitor access is that visitors often access the home page and then the product category page, or the home page and then the full text search page.
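
    The noise tolerance of Heuristics Miner comes from scoring each "page a is followed by page b" relation rather than accepting every observed transition. The sketch below uses the dependency measure commonly given for the algorithm, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| counts how often b directly follows a. The abstract does not spell out this formula, and the sessions, page names and threshold are hypothetical.

```python
from collections import Counter


def directly_follows(sessions):
    """Count how often page b is visited immediately after page a."""
    counts = Counter()
    for session in sessions:
        for a, b in zip(session, session[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure in (-1, 1); values near 1 suggest a reliably
    precedes b, values near 0 or below suggest noise or the reverse order."""
    ab = counts.get((a, b), 0)
    ba = counts.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)


def dependency_graph(sessions, threshold=0.5):
    """Keep only transitions whose dependency score reaches the threshold.
    Self-loops are skipped here; they need a separate measure."""
    counts = directly_follows(sessions)
    edges = {}
    for a, b in list(counts):
        if a != b:
            score = dependency(counts, a, b)
            if score >= threshold:
                edges[(a, b)] = score
    return edges


if __name__ == "__main__":
    # Hypothetical click-stream sessions, one list of pages per visitor.
    sessions = [
        ["home", "category", "product"],
        ["home", "search", "product"],
        ["home", "category", "product", "category"],
        ["home", "category"],
    ]
    for edge, score in sorted(dependency_graph(sessions).items()):
        print(edge, round(score, 2))
```

    The "+ 1" in the denominator keeps rarely observed transitions from scoring close to 1, so a single stray click is less likely to appear as an edge in the mined access pattern.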

  15. 18S rDNA Sequences from Microeukaryotes Reveal Oil Indicators in Mangrove Sediment

    Science.gov (United States)

    Santos, Henrique F.; Cury, Juliano C.; Carmo, Flavia L.; Rosado, Alexandre S.; Peixoto, Raquel S.

    2010-01-01

    Background Microeukaryotes are an effective indicator of the presence of environmental contaminants. However, the characterisation of these organisms by conventional tools is often inefficient, and recent molecular studies have revealed a great diversity of microeukaryotes. The full extent of this diversity is unknown, and therefore, the distribution, ecological role and responses to anthropogenic effects of microeukaryotes are rather obscure. The majority of oil from oceanic oil spills (e.g., the May 2010 accident in the Gulf of Mexico) converges on coastal ecosystems such as mangroves, which are threatened with worldwide disappearance, highlighting the need for efficient tools to indicate the presence of oil in these environments. However, no studies have used molecular methods to assess the effects of oil contamination in mangrove sediment on microeukaryotes as a group. Methodology/Principal Findings We evaluated the population dynamics and the prevailing 18S rDNA phylotypes of microeukaryotes in mangrove sediment microcosms with and without oil contamination, using PCR/DGGE and clone libraries. We found that microeukaryotes are useful for monitoring oil contamination in mangroves. Our clone library analysis revealed a decrease in both diversity and species richness after contamination. The phylogenetic group that showed the greatest sensitivity to oil was the Nematoda. After contamination, a large increase in the abundance of the groups Bacillariophyta (diatoms) and Biosoecida was detected. The oil-contaminated samples were almost entirely dominated by organisms related to Bacillariophyta sp. and Cafeteria minima, which indicates that these groups are possible targets for biomonitoring oil in mangroves. The DGGE fingerprints also indicated shifts in microeukaryote profiles; specific band sequencing indicated the appearance of Bacillariophyta sp. only in contaminated samples and Nematoda only in non-contaminated sediment. 
Conclusions/Significance We believe that

  16. Deep Sequencing of Myxilla (Ectyomyxilla) methanophila, an Epibiotic Sponge on Cold-Seep Tubeworms, Reveals Methylotrophic, Thiotrophic, and Putative Hydrocarbon-Degrading Microbial Associations

    KAUST Repository

    Arellano, Shawn M.

    2012-10-11

    The encrusting sponge Myxilla (Ectyomyxilla) methanophila (Poecilosclerida: Myxillidae) is an epibiont on vestimentiferan tubeworms at hydrocarbon seeps on the upper Louisiana slope of the Gulf of Mexico. It has long been suggested that this sponge harbors methylotrophic bacteria due to its low δ13C value and high methanol dehydrogenase activity, yet the full community of microbial associations in M. methanophila remained uncharacterized. In this study, we sequenced 16S rRNA genes representing the microbial community in M. methanophila collected from two hydrocarbon-seep sites (GC234 and Bush Hill) using both Sanger sequencing and next-generation 454 pyrosequencing technologies. Additionally, we compared the microbial community in M. methanophila to that of the biofilm collected from the associated tubeworm. Our results revealed that the microbial diversity in the sponges from both sites was low but the community structure was largely similar, showing a high proportion of methylotrophic bacteria of the genus Methylohalomonas and polycyclic aromatic hydrocarbon (PAH)-degrading bacteria of the genera Cycloclasticus and Neptunomonas. Furthermore, the sponge microbial clone library revealed the dominance of thioautotrophic gammaproteobacterial symbionts in M. methanophila. In contrast, the biofilm communities on the tubeworms were more diverse and dominated by the chemoorganotrophic Moritella at GC234 and methylotrophic Methylomonas and Methylohalomonas at Bush Hill. Overall, our study provides evidence to support the previous suggestion that M. methanophila harbors methylotrophic symbionts and also reveals the association of PAH-degrading and thioautotrophic microbes in the sponge. © 2012 Springer Science+Business Media New York.

  17. Deep sequencing of Myxilla (Ectyomyxilla) methanophila, an epibiotic sponge on cold-seep tubeworms, reveals methylotrophic, thiotrophic, and putative hydrocarbon-degrading microbial associations.

    Science.gov (United States)

    Arellano, Shawn M; Lee, On On; Lafi, Feras F; Yang, Jiangke; Wang, Yong; Young, Craig M; Qian, Pei-Yuan

    2013-02-01

    The encrusting sponge Myxilla (Ectyomyxilla) methanophila (Poecilosclerida: Myxillidae) is an epibiont on vestimentiferan tubeworms at hydrocarbon seeps on the upper Louisiana slope of the Gulf of Mexico. It has long been suggested that this sponge harbors methylotrophic bacteria due to its low δ(13)C value and high methanol dehydrogenase activity, yet the full community of microbial associations in M. methanophila remained uncharacterized. In this study, we sequenced 16S rRNA genes representing the microbial community in M. methanophila collected from two hydrocarbon-seep sites (GC234 and Bush Hill) using both Sanger sequencing and next-generation 454 pyrosequencing technologies. Additionally, we compared the microbial community in M. methanophila to that of the biofilm collected from the associated tubeworm. Our results revealed that the microbial diversity in the sponges from both sites was low but the community structure was largely similar, showing a high proportion of methylotrophic bacteria of the genus Methylohalomonas and polycyclic aromatic hydrocarbon (PAH)-degrading bacteria of the genera Cycloclasticus and Neptunomonas. Furthermore, the sponge microbial clone library revealed the dominance of thioautotrophic gammaproteobacterial symbionts in M. methanophila. In contrast, the biofilm communities on the tubeworms were more diverse and dominated by the chemoorganotrophic Moritella at GC234 and methylotrophic Methylomonas and Methylohalomonas at Bush Hill. Overall, our study provides evidence to support previous suggestion that M. methanophila harbors methylotrophic symbionts and also reveals the association of PAH-degrading and thioautotrophic microbes in the sponge.

  18. Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome

    Directory of Open Access Journals (Sweden)

    Vinod Kumar Singh

    2016-09-01

    Genome-wide experimental studies in Saccharomyces cerevisiae reveal that autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites, and the rest are considered dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content- and context-based analysis was performed on a set of replicating ACS sequences that bind to the origin-recognition complex (ORC), denoted ORC-ACS, and non-replicating ACS sequences (nrACS) that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence-dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in their vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in conformational features, such as DNA helical twist, inclination angle and stacking energy, between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS, which are found far away from the adjacent gene start position compared to nrACS, thereby enabling an accessible environment for ORC proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.
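The kind of genome-wide scan that yields "thousands of ACS-like patterns" can be illustrated with a degenerate-motif search. The following is only a minimal sketch: the abstract does not state the pattern used, so the commonly cited 11-bp consensus WTTTAYRTTTW is assumed here, and the function names and the toy sequence in the usage note are hypothetical.

```python
import re

# IUPAC codes needed for the assumed consensus (W = A/T, Y = C/T, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]"}

# Assumption: the commonly cited 11-bp ACS; not taken from the abstract.
ACS_CONSENSUS = "WTTTAYRTTTW"


def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that reports overlapping hits."""
    body = "".join(IUPAC[c] for c in consensus)
    return re.compile(f"(?=({body}))")  # lookahead keeps overlapping matches


def find_acs_like(seq):
    """Return sorted (start, strand, match) tuples in forward-strand coordinates."""
    pattern = consensus_to_regex(ACS_CONSENSUS)
    width = len(ACS_CONSENSUS)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(revcomp(seq)):
        # Map the position on the reverse strand back to the forward strand.
        hits.append((len(seq) - m.start() - width, "-", m.group(1)))
    return sorted(hits)
```

For example, `find_acs_like("GGGATTTATGTTTACCC")` reports a single forward-strand hit starting at index 3. A scan like this only finds pattern matches; as the abstract notes, most such sites are not active origins, which is why context features are needed to separate ORC-ACS from nrACS.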

  19. Characterization of Human Cytomegalovirus Genome Diversity in Immunocompromised Hosts by Whole-Genome Sequencing Directly From Clinical Specimens.

    Science.gov (United States)

    Hage, Elias; Wilkie, Gavin S; Linnenweber-Held, Silvia; Dhingra, Akshay; Suárez, Nicolás M; Schmidt, Julius J; Kay-Fedorov, Penelope C; Mischak-Weissinger, Eva; Heim, Albert; Schwarz, Anke; Schulz, Thomas F; Davison, Andrew J; Ganzenmueller, Tina

    2017-06-01

    Advances in next-generation sequencing (NGS) technologies allow comprehensive studies of genetic diversity over the entire genome of human cytomegalovirus (HCMV), a significant pathogen for immunocompromised individuals. Next-generation sequencing was performed on target enriched sequence libraries prepared directly from a variety of clinical specimens (blood, urine, breast milk, respiratory samples, biopsies, and vitreous humor) obtained longitudinally or from different anatomical compartments from 20 HCMV-infected patients (renal transplant recipients, stem cell transplant recipients, and congenitally infected children). De novo-assembled HCMV genome sequences were obtained for 57 of 68 sequenced samples. Analysis of longitudinal or compartmental HCMV diversity revealed various patterns: no major differences were detected among longitudinal, intraindividual blood samples from 9 of 15 patients and in most of the patients with compartmental samples, whereas a switch of the major HCMV population was observed in 6 individuals with sequential blood samples and upon compartmental analysis of 1 patient with HCMV retinitis. Variant analysis revealed additional aspects of minor virus population dynamics and antiviral-resistance mutations. In immunosuppressed patients, HCMV can remain relatively stable or undergo drastic genomic changes that are suggestive of the emergence of minor resident strains or de novo infection. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved.

  20. An EST screen from the annelid Pomatoceros lamarckii reveals patterns of gene loss and gain in animals

    Directory of Open Access Journals (Sweden)

    Chen Wei-Chung

    2009-09-01

    Abstract. Background: Since the drastic reorganisation of the phylogeny of the animal kingdom into three major clades of bilaterians (Ecdysozoa, Lophotrochozoa and Deuterostomia), it became glaringly obvious that the selection of model systems with extensive molecular resources was heavily biased towards only two of these three clades, namely the Ecdysozoa and Deuterostomia. Increasing efforts have been put towards redressing this imbalance in recent years, and one of the principal phyla in the vanguard of this endeavour is the Annelida. Results: In the context of this effort we here report our characterisation of an Expressed Sequence Tag (EST) screen in the serpulid annelid, Pomatoceros lamarckii. We have sequenced over 5,000 ESTs which consolidate into over 2,000 sequences (clusters and singletons). These sequences are used to build phylogenetic trees to estimate relative branch lengths amongst different taxa and, by comparison to genomic data from other animals, patterns of gene retention and loss are deduced. Conclusion: The molecular phylogenetic trees including the P. lamarckii sequences extend early observations that polychaetes tend to have relatively short branches in such trees, and hence are useful taxa with which to reconstruct gene family evolution. Also, with the availability of lophotrochozoan data such as that of P. lamarckii, it is now possible to make much more accurate reconstructions of the gene complement of the ancestor of the bilaterians than was previously possible from comparisons of ecdysozoan and deuterostome genomes to non-bilaterian outgroups. It is clear that the traditional molecular model systems for protostomes (e.g. Drosophila melanogaster and Caenorhabditis elegans), which are restricted to the Ecdysozoa, have undergone extensive gene loss during evolution. These ecdysozoan systems, in terms of gene content, are thus more derived from the bilaterian ancestral condition than lophotrochozoan systems like the polychaetes.

  1. Sequence-Based Mapping and Genome Editing Reveal Mutations in Stickleback Hps5 Cause Oculocutaneous Albinism and the casper Phenotype

    Directory of Open Access Journals (Sweden)

    James C. Hart

    2017-09-01

    Here, we present and characterize the spontaneous X-linked recessive mutation casper, which causes oculocutaneous albinism in threespine sticklebacks (Gasterosteus aculeatus). In humans, Hermansky-Pudlak syndrome results in pigmentation defects due to disrupted formation of the melanin-containing lysosomal-related organelle (LRO), the melanosome. casper mutants display not only reduced pigmentation of melanosomes in melanophores, but also reductions in the iridescent silver color from iridophores, while the yellow pigmentation from xanthophores appears unaffected. We mapped casper using high-throughput sequencing of genomic DNA from bulked casper mutants to a region of the stickleback X chromosome (chromosome 19) near the stickleback ortholog of Hermansky-Pudlak syndrome 5 (Hps5). casper mutants have an insertion of a single nucleotide in the sixth exon of Hps5, predicted to generate an early frameshift. Genome editing using CRISPR/Cas9 induced lesions in Hps5 and phenocopied the casper mutation. Injecting single or paired Hps5 guide RNAs revealed higher incidences of genomic deletions from paired guide RNAs compared to single gRNAs. Stickleback Hps5 provides a genetic system where a hemizygous locus in XY males and a diploid locus in XX females can be used to generate an easily scored visible phenotype, facilitating quantitative studies of different genome editing approaches. Lastly, we show the ability to better visualize patterns of fluorescent transgenic reporters in Hps5 mutant fish. Thus, Hps5 mutations present an opportunity to study pigmented LROs in the emerging stickleback model system, as well as a tool to aid in assaying genome editing and visualizing enhancer activity in transgenic fish.

  2. Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes.

    Science.gov (United States)

    Kumar, Vikas; Kutschera, Verena E; Nilsson, Maria A; Janke, Axel

    2015-08-07

    The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species. The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago. Transcriptome data are an economic way to generate genomic resources for evolutionary studies. Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic

  3. Sequence analysis-based characterization and identification of neurovirulence-associated variants of 36 EV71 strains from China.

    Science.gov (United States)

    Xu, Jun; Wang, Fang; Zhao, Desheng; Liu, Jiang; Su, Hong; Wang, Baolong

    2018-03-30

    Enterovirus 71 (EV71) is the main pathogen of hand-foot-mouth disease (HFMD) and causes several neurological complications. As new strains of EV71 are constantly discovered, it is important to understand the genomic characteristics of the viruses and the mechanism of virulence. Herein, we isolated five strains of EV71 from HFMD patients with or without neurovirulence and sequenced their whole genomes. We then performed whole genome sequence analysis of a total of 36 EV71 strains. The phylogenetic analysis of the VP1 region revealed that all five isolated strains clustered into C4a of the C4 subgenotype. In addition, by comparing the complete genome sequences of 36 strains, 253 variable amino acid positions were found, 14 of which were identified to be associated with neurovirulence (P < 0.05). Moreover, a similar pattern of amino acid variant combinations was identified in four strains without neurovirulence, indicating that this type of variant pattern might be associated with avirulence. The strains with neurovirulence appeared to be distinguished from those without neurovirulence by the variants in the VP1 and P2 regions, implying that VP1 and P2 are the important regions associated with neurovirulence. Indeed, 3-D modeling of the VP1 and P2 regions of non-neurovirulent and neurovirulent strains revealed that the different variants resulted in different protein structures and amino acid composition of the ligand binding site, which might account for their difference in neurovirulence. In summary, our study reveals that 14 variable amino acid positions in the VP1, P2 and P3 regions are related to virulence and that mutations in the capsid proteins of EV71 might contribute to neurovirulence. © 2018 Wiley Periodicals, Inc.

  4. Spectro-temporal modulation masking patterns reveal frequency selectivity.

    Science.gov (United States)

    Oetjen, Arne; Verhey, Jesko L

    2015-02-01

    The present study investigated the possibility that the human auditory system demonstrates frequency selectivity to spectro-temporal amplitude modulations. Threshold modulation depth for detecting sinusoidal spectro-temporal modulations was measured using a generalized masked threshold pattern paradigm with narrowband masker modulations. Four target spectro-temporal modulations were examined, differing in their temporal and spectral modulation frequencies: a temporal modulation of -8, 8, or 16 Hz combined with a spectral modulation of 1 cycle/octave and a temporal modulation of 4 Hz combined with a spectral modulation of 0.5 cycles/octave. The temporal center frequencies of the masker modulation ranged from 0.25 to 4 times the target temporal modulation. The spectral masker-modulation center-frequencies were 0, 0.5, 1, 1.5, and 2 times the target spectral modulation. For all target modulations, the pattern of average thresholds for the eight normal-hearing listeners was consistent with the hypothesis of a spectro-temporal modulation filter. Such a pattern of modulation-frequency sensitivity was predicted on the basis of psychoacoustical data for purely temporal amplitude modulations and purely spectral amplitude modulations. An analysis of separability indicates that, for the present data set, selectivity in the spectro-temporal modulation domain can be described by a combination of a purely spectral and a purely temporal modulation filter function.

  5. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  6. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    Science.gov (United States)

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  7. Genetic diversity and biogeographical patterns of Caulerpa prolifera across the Mediterranean and Mediterranean/Atlantic transition zone

    KAUST Repository

    Varela-Álvarez, Elena; Balau, Ana C.; Marbà, Núrià N.; Afonso-Carrillo, Julio; Duarte, Carlos M.; Serrão, Ester Álvares

    2015-01-01

    Knowledge of spatial patterns of genetic differentiation between populations is key to understanding processes in the evolutionary history of biological species. Caulerpa is a genus of marine green algae, which has attracted much public attention, mainly because of the impacts of invasive species in the Mediterranean. However, very little is known about the ecological and evolutionary history of the Mediterranean native Caulerpa prolifera, a species which is currently found at sites distributed worldwide. C. prolifera provides a good model to explore the patterns of genetic diversity at different scales across the Mediterranean and Atlantic area. This study aims to investigate the biogeographical patterns of diversity and differentiation of C. prolifera in the Mediterranean, with special focus on the Mediterranean/Atlantic transition zone. We used two nuclear (ITS rDNA and the hypervariable microsatellite locus CaPr_J2) and one chloroplast (tufA) DNA markers on samples of C. prolifera from its entire range. Analyses of 51 sequences of the cpDNA tufA of C. prolifera, 87 ITS2 sequences and genotypes of 788 ramets of C. prolifera for the locus CaPr_J2 revealed three different biogeographical areas: West Atlantic, East Atlantic and a larger area representing the Mediterranean, the Mediterranean/Atlantic transition zone and a Pacific site (Bali). The Mediterranean/Atlantic transition zone was found to be a biogeographical boundary for C. prolifera. A lack of connectivity was revealed between Atlantic and Mediterranean types, and identical sequences found in the Mediterranean and Indo-Pacific suggest either recent gene flow along the Red Sea connection or a possible ancient Indo-Pacific origin.

  8. Genetic diversity and biogeographical patterns of Caulerpa prolifera across the Mediterranean and Mediterranean/Atlantic transition zone

    KAUST Repository

    Varela-Álvarez, Elena

    2015-01-11

    Knowledge of spatial patterns of genetic differentiation between populations is key to understanding processes in the evolutionary history of biological species. Caulerpa is a genus of marine green algae, which has attracted much public attention, mainly because of the impacts of invasive species in the Mediterranean. However, very little is known about the ecological and evolutionary history of the Mediterranean native Caulerpa prolifera, a species which is currently found at sites distributed worldwide. C. prolifera provides a good model to explore the patterns of genetic diversity at different scales across the Mediterranean and Atlantic area. This study aims to investigate the biogeographical patterns of diversity and differentiation of C. prolifera in the Mediterranean, with special focus on the Mediterranean/Atlantic transition zone. We used two nuclear (ITS rDNA and the hypervariable microsatellite locus CaPr_J2) and one chloroplast (tufA) DNA markers on samples of C. prolifera from its entire range. Analyses of 51 sequences of the cpDNA tufA of C. prolifera, 87 ITS2 sequences and genotypes of 788 ramets of C. prolifera for the locus CaPr_J2 revealed three different biogeographical areas: West Atlantic, East Atlantic and a larger area representing the Mediterranean, the Mediterranean/Atlantic transition zone and a Pacific site (Bali). The Mediterranean/Atlantic transition zone was found to be a biogeographical boundary for C. prolifera. A lack of connectivity was revealed between Atlantic and Mediterranean types, and identical sequences found in the Mediterranean and Indo-Pacific suggest either recent gene flow along the Red Sea connection or a possible ancient Indo-Pacific origin.

  9. Deep sequencing of the Camellia chekiangoleosa transcriptome revealed candidate genes for anthocyanin biosynthesis.

    Science.gov (United States)

    Wang, Zhong-Wei; Jiang, Cong; Wen, Qiang; Wang, Na; Tao, Yuan-Yuan; Xu, Li-An

    2014-03-15

    Camellia chekiangoleosa is an important species of genus Camellia. It provides high-quality edible oil and has great ornamental value. The flowers are big and red which bloom between February and March. Flower pigmentation is closely related to the accumulation of anthocyanin. Although anthocyanin biosynthesis has been studied extensively in herbaceous plants, little molecular information on the anthocyanin biosynthesis pathway of C. chekiangoleosa is yet known. In the present study, a cDNA library was constructed to obtain detailed and general data from the flowers of C. chekiangoleosa. To explore the transcriptome of C. chekiangoleosa and investigate genes involved in anthocyanin biosynthesis, a 454 GS FLX Titanium platform was used to generate an EST dataset. About 46,279 sequences were obtained, and 24,593 (53.1%) were annotated. Using Blast search against the AGRIS, 1740 unigenes were found homologous to 599 Arabidopsis transcription factor genes. Based on the transcriptome dataset, nine anthocyanin biosynthesis pathway genes (PAL, CHS1, CHS2, CHS3, CHI, F3H, DFR, ANS, and UFGT) were identified and cloned. The spatio-temporal expression patterns of these genes were also analyzed using quantitative real-time polymerase chain reaction. The study results not only enrich the gene resource but also provide valuable information for further studies concerning anthocyanin biosynthesis. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Response of heat shock protein genes of the oriental fruit moth under diapause and thermal stress reveals multiple patterns dependent on the nature of stress exposure.

    Science.gov (United States)

    Zhang, Bo; Peng, Yu; Zheng, Jincheng; Liang, Lina; Hoffmann, Ary A; Ma, Chun-Sen

    2016-07-01

    Heat shock protein gene (Hsp) families are thought to be important in thermal adaptation, but their expression patterns under various thermal stresses have still been poorly characterized outside of model systems. We have therefore characterized Hsp genes and their stress responses in the oriental fruit moth (OFM), Grapholita molesta, a widespread global orchard pest, and compared patterns of expression in this species to that of other insects. Genes from four Hsp families showed variable expression levels among tissues and developmental stages. Members of the Hsp40, 70, and 90 families were highly expressed under short exposures to heat and cold. Expression of Hsp40, 70, and Hsc70 family members increased in OFM undergoing diapause, while Hsp90 was downregulated. We found that there was strong sequence conservation of members of large Hsp families (Hsp40, Hsp60, Hsp70, Hsc70) across taxa, but this was not always matched by conservation of expression patterns. When the large Hsps as well as small Hsps from OFM were compared under acute and ramping heat stress, two groups of sHsps expression patterns were apparent, depending on whether expression increased or decreased immediately after stress exposure. These results highlight potential differences in conservation of function as opposed to sequence in this gene family and also point to Hsp genes potentially useful as bioindicators of diapause and thermal stress in OFM.

  11. Genetic diversity in Breonadia salicina based on intra-species sequence variation of chloroplast DNA spacer sequence

    International Nuclear Information System (INIS)

    Qurainy, F.A.; Gaafar, A.R.Z.

    2014-01-01

    Assessment and knowledge of the genetic diversity and variation within and between populations of rare and endangered plants are very important for effective conservation. Variation in intergenic spacer sequences of the psbA-trnH locus of the chloroplast genome was assessed within Breonadia salicina (Rubiaceae), a critically endangered plant species endemic to the southwestern part of the Kingdom of Saudi Arabia. The sequence data obtained from 19 individuals in three populations revealed nine haplotypes. The aligned sequences obtained from the overall Saudi accessions extended to 355 bp, revealing nine haplotypes. A high level of haplotype diversity (Hd = 0.842) and a low level of nucleotide diversity (Pi = 0.0058) were detected. Consistently, both hierarchical analysis of molecular variance (AMOVA) and the constructed neighbor-joining tree indicated null genetic differentiation among populations. This level of differentiation between populations or between regions in psbA-trnH sequences may be due to the effects of abundant sharing of ancestral haplotypes and the presence of private haplotypes fixed for each population. Furthermore, the results revealed almost the same level of genetic diversity as in Yemeni accessions, with Saudi accessions sharing three of the four haplotypes found in Yemeni accessions. (author)
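The two summary statistics quoted above (Hd and Pi) can be computed from aligned sequences in a few lines. The sketch below assumes the standard estimators, haplotype diversity Hd = n/(n-1) * (1 - sum of squared haplotype frequencies) and nucleotide diversity as the mean number of pairwise differences per site; the abstract does not say which estimator or software was used, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Hd = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over equal-length aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment (made up, not the study's data): four individuals, three haplotypes.
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

A high Hd together with a low Pi, as reported for B. salicina, arises when many distinct haplotypes exist but each differs from the others at only a few sites.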

  12. Survey of transposable elements in sugarcane expressed sequence tags (ESTs)

    Directory of Open Access Journals (Sweden)

    Rossi Magdalena

    2001-01-01

    The sugarcane expressed sequence tag (SUCEST) project has produced a large number of cDNA sequences from several plant tissues submitted or not to different conditions of stress. In this paper we report the result of a search for transposable elements (TEs), revealing a surprising amount of expressed TE homologues. Of the 260,781 sequences grouped in 81,223 fragment assembly program (Phrap) clusters, a total of 276 clones showed homology to previously reported TEs using a stringent cut-off value of e-50 or better. Clones homologous to the Copia/Ty1 and Gypsy/Ty3 groups of long terminal repeat (LTR) retrotransposons were found, but no non-LTR retroelements were identified. All major transposon families were represented in sugarcane, including Activator (Ac), Mutator (MuDR), Suppressor-mutator (En/Spm) and Mariner. In order to compare the TE diversity in grass genomes, we carried out a search for TEs described in the sugarcane-related species O. sativa, Z. mays and S. bicolor. We also present preliminary results showing the potential use of TE insertion pattern polymorphism as molecular markers for cultivar identification.

  13. Craniopharyngioma: identification of different semiological patterns by magnetic resonance

    International Nuclear Information System (INIS)

    Molla, E.; Marti-Bonmati, L.; Casillas, C.; Poyatos, C.; Menor, F.; Arana, E.

    1998-01-01

    To study craniopharyngiomas using different MR sequences to detect semiological patterns that aid in the characterization of the different components. We performed a retrospective MR study of 17 patients with confirmed craniopharyngioma. T1-weighted spin-echo, proton density-weighted, T2-weighted gradient-echo, T1-weighted (after administration of gadolinium), T1-weighted inversion recovery and phase and opposed-phase gradient echo sequences were employed to distinguish the different patterns. The semiologic patterns considered in MR were: solid-tissue, blood, protein, fat and fluid. A solid pole was detected in all the patients. There was a cystic component in 88.2% of cases; the protein pattern was observed in 52.9%, blood in 29.4%, fluid in 23.5% and fat in 11.7%. The coexistence of three patterns was detected in 29.4% and of two patterns in 58.8%. The calcium pattern was viewed in 75% of the patients studied with CT, with four patterns coexisting in 25%, three patterns in 41.6% and two patterns in 25%. MR detects different semiologic components in craniopharyngiomas, although it is necessary to employ certain unusual sequences in order to distinguish some patterns from others. (Author) 22 refs

  14. Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing.

    Science.gov (United States)

    Hochgerner, Hannah; Zeisel, Amit; Lönnerberg, Peter; Linnarsson, Sten

    2018-02-01

    The dentate gyrus of the hippocampus is a brain region in which neurogenesis persists into adulthood; however, the relationship between developmental and adult dentate gyrus neurogenesis has not been examined in detail. Here we used single-cell RNA sequencing to reveal the molecular dynamics and diversity of dentate gyrus cell types in perinatal, juvenile, and adult mice. We found distinct quiescent and proliferating progenitor cell types, linked by transient intermediate states to neuroblast stages and fully mature granule cells. We observed shifts in the molecular identity of quiescent and proliferating radial glia and granule cells during the postnatal period that were then maintained through adult stages. In contrast, intermediate progenitor cells, neuroblasts, and immature granule cells were nearly indistinguishable at all ages. These findings demonstrate the fundamental similarity of postnatal and adult neurogenesis in the hippocampus and pinpoint the early postnatal transformation of radial glia from embryonic progenitors to adult quiescent stem cells.

  15. Multilocus sequence data reveal dozens of putative cryptic species in a radiation of endemic Californian mygalomorph spiders (Araneae, Mygalomorphae, Nemesiidae).

    Science.gov (United States)

    Leavitt, Dean H; Starrett, James; Westphal, Michael F; Hedin, Marshal

    2015-10-01

    We use mitochondrial and multi-locus nuclear DNA sequence data to infer both species boundaries and species relationships within California nemesiid spiders. Higher-level phylogenetic data show that the California radiation is monophyletic and distantly related to European members of the genus Brachythele. As such, we consider all California nemesiid taxa to belong to the genus Calisoga Chamberlin, 1937. Rather than find support for one or two taxa as previously hypothesized, genetic data reveal Calisoga to be a species-rich radiation of spiders, including perhaps dozens of species. This conclusion is supported by multiple mitochondrial barcoding analyses, and also independent analyses of nuclear data that reveal general genealogical congruence. We discovered three instances of sympatry, and genetic data indicate reproductive isolation when in sympatry. An examination of female reproductive morphology does not reveal species-specific characters, and observed male morphological differences for a subset of putative species are subtle. Our coalescent species tree analysis of putative species lays the groundwork for future research on the taxonomy and biogeographic history of this remarkable endemic radiation. Copyright © 2015 Elsevier Inc. All rights reserved.

  16. Heuristics Miner for E-Commerce Visitor Access Pattern Representation

    OpenAIRE

    Kartina Diah Kesuma Wardhani; Wawan Yunanto

    2017-01-01

    E-commerce clickstream data can form patterns that describe visitor behavior while surfing an e-commerce website. Such a pattern can be used to initiate a design that determines alternative access sequences on the website. This research uses the Heuristic Miner algorithm to determine the pattern. σ-Algorithm and Genetic Mining are methods used for pattern recognition with a frequent sequence item set approach. Heuristic Miner is an evolved form of those methods. σ-Algorithm assumes that an activ...

  17. Learning multiple variable-speed sequences in striatum via cortical tutoring.

    Science.gov (United States)

    Murray, James M; Escola, G Sean

    2017-05-08

    Sparse, sequential patterns of neural activity have been observed in numerous brain areas during timekeeping and motor sequence tasks. Inspired by such observations, we construct a model of the striatum, an all-inhibitory circuit where sequential activity patterns are prominent, addressing the following key challenges: (i) obtaining control over temporal rescaling of the sequence speed, with the ability to generalize to new speeds; (ii) facilitating flexible expression of distinct sequences via selective activation, concatenation, and recycling of specific subsequences; and (iii) enabling the biologically plausible learning of sequences, consistent with the decoupling of learning and execution suggested by lesion studies showing that cortical circuits are necessary for learning, but that subcortical circuits are sufficient to drive learned behaviors. The same mechanisms that we describe can also be applied to circuits with both excitatory and inhibitory populations, and hence may underlie general features of sequential neural activity pattern generation in the brain.

  18. Potential for tree rings to reveal spatial patterns of past drought variability across western Australia

    Science.gov (United States)

    O'Donnell, Alison J.; Cook, Edward R.; Palmer, Jonathan G.; Turney, Chris S. M.; Grierson, Pauline F.

    2018-02-01

    Proxy records have provided major insights into the variability of past climates over long timescales. However, for much of the Southern Hemisphere, the ability to identify spatial patterns of past climatic variability is constrained by the sparse distribution of proxy records. This is particularly true for mainland Australia, where relatively few proxy records are located. Here, we (1) assess the potential to use existing proxy records in the Australasian region—starting with the only two multi-century tree-ring proxies from mainland Australia—to reveal spatial patterns of past hydroclimatic variability across the western third of the continent, and (2) identify strategic locations to target for the development of new proxy records. We show that the two existing tree-ring records allow robust reconstructions of past hydroclimatic variability over spatially broad areas (i.e. > 3° × 3°) in inland north- and south-western Australia. Our results reveal synchronous periods of drought and wet conditions between the inland northern and southern regions of western Australia as well as a generally anti-phase relationship with hydroclimate in eastern Australia over the last two centuries. The inclusion of 174 tree-ring proxy records from Tasmania, New Zealand and Indonesia and a coral record from Queensland did not improve the reconstruction potential over western Australia. However, our findings suggest that the addition of relatively few new proxy records from key locations in western Australia that currently have low reconstruction skill will enable the development of a comprehensive drought atlas for the region, and provide a critical link to the drought atlases of monsoonal Asia and eastern Australia and New Zealand.

  19. Whole-exome sequencing reveals genetic variants associated with chronic kidney disease characterized by tubulointerstitial damages in North Central Region, Sri Lanka.

    Science.gov (United States)

    Nanayakkara, Shanika; Senevirathna, S T M L D; Parahitiyawa, Nipuna B; Abeysekera, Tilak; Chandrajith, Rohana; Ratnatunga, Neelakanthi; Hitomi, Toshiaki; Kobayashi, Hatasu; Harada, Kouji H; Koizumi, Akio

    2015-09-01

    The familial clustering observed in chronic kidney disease of uncertain etiology (CKDu) characterized by tubulointerstitial damages in the North Central Region of Sri Lanka strongly suggests the involvement of genetic factors in its pathogenesis. The objective of the present study is to use whole-exome sequencing to identify the genetic variants associated with CKDu. Whole-exome sequencing of eight CKDu cases and eight controls was performed, followed by direct sequencing of candidate loci in 301 CKDu cases and 276 controls. Association study revealed rs34970857 (c.658G > A/p.V220M) located in the KCNA10 gene encoding a voltage-gated K channel as the most promising SNP with the highest odds ratio of 1.74. Four rare variants were identified in gene encoding Laminin beta2 (LAMB2) which is known to cause congenital nephrotic syndrome. Three out of four variants in LAMB2 were novel variants found exclusively in cases. Genetic investigations provide strong evidence on the presence of genetic susceptibility for CKDu. Possibility of presence of several rare variants associated with CKDu in this population is also suggested.

  20. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Anderson, Iain; Rodriguez, Jason; Susanti, Dwi; Porat, Iris; Reich, Claudia; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Lykidis, Athanasios; Kim, Edwin; Thompson, Linda S.; Nolan, Matt; Land, Miriam; Copeland, Alex; Lapidus, Alla; Lucas, Susan; Detter, Chris; Zhulin, Igor B.; Olsen, Gary J.; Whitman, William; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. In fact T. pendens has fewer biosynthetic enzymes than obligate intracellular parasites, although it does not display other features common among obligate parasites and thus does not appear to be in the process of becoming a parasite. It appears that T. pendens has adapted to life in an environment rich in nutrients. T. pendens was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first crenarchaeote and only the second archaeon found to have a transporter of the phosphotransferase system. In addition to fermentation, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein. Predicted highly expressed proteins do not include housekeeping genes, and instead include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins.

  1. Undesirable Choice Biases with Small Differences in the Spatial Structure of Chance Stimulus Sequences.

    Directory of Open Access Journals (Sweden)

    David Herrera

    Full Text Available In two-alternative discrimination tasks, experimenters usually randomize the location of the rewarded stimulus so that systematic behavior with respect to irrelevant stimuli can only produce chance performance on the learning curves. One way to achieve this is to use random numbers derived from a discrete binomial distribution to create a 'full random training schedule' (FRS). When using FRS, however, sporadic but long laterally-biased training sequences occur by chance and such 'input biases' are thought to promote the generation of laterally-biased choices (i.e., 'output biases'). As an alternative, a 'Gellerman-like training schedule' (GLS) can be used. It removes most input biases by prohibiting the reward from appearing on the same location for more than three consecutive trials. The sequence of past rewards obtained from choosing a particular discriminative stimulus influences the probability of choosing that same stimulus on subsequent trials. Assuming that the long-term average ratio of choices matches the long-term average ratio of reinforcers, we hypothesized that a reduced amount of input biases in GLS compared to FRS should lead to a reduced production of output biases. We compared the choice patterns produced by a 'Rational Decision Maker' (RDM) in response to computer-generated FRS and GLS training sequences. To create a virtual RDM, we implemented an algorithm that generated choices based on past rewards. Our simulations revealed that, although the GLS presented fewer input biases than the FRS, the virtual RDM produced more output biases with GLS than with FRS under a variety of test conditions. Our results reveal that the statistical and temporal properties of training sequences interacted with the RDM to influence the production of output biases. Thus, discrete changes in the training paradigms did not translate linearly into modifications in the pattern of choices generated by a RDM. 
Virtual RDMs could be further employed to guide

  2. Undesirable Choice Biases with Small Differences in the Spatial Structure of Chance Stimulus Sequences.

    Science.gov (United States)

    Herrera, David; Treviño, Mario

    2015-01-01

    In two-alternative discrimination tasks, experimenters usually randomize the location of the rewarded stimulus so that systematic behavior with respect to irrelevant stimuli can only produce chance performance on the learning curves. One way to achieve this is to use random numbers derived from a discrete binomial distribution to create a 'full random training schedule' (FRS). When using FRS, however, sporadic but long laterally-biased training sequences occur by chance and such 'input biases' are thought to promote the generation of laterally-biased choices (i.e., 'output biases'). As an alternative, a 'Gellerman-like training schedule' (GLS) can be used. It removes most input biases by prohibiting the reward from appearing on the same location for more than three consecutive trials. The sequence of past rewards obtained from choosing a particular discriminative stimulus influences the probability of choosing that same stimulus on subsequent trials. Assuming that the long-term average ratio of choices matches the long-term average ratio of reinforcers, we hypothesized that a reduced amount of input biases in GLS compared to FRS should lead to a reduced production of output biases. We compared the choice patterns produced by a 'Rational Decision Maker' (RDM) in response to computer-generated FRS and GLS training sequences. To create a virtual RDM, we implemented an algorithm that generated choices based on past rewards. Our simulations revealed that, although the GLS presented fewer input biases than the FRS, the virtual RDM produced more output biases with GLS than with FRS under a variety of test conditions. Our results reveal that the statistical and temporal properties of training sequences interacted with the RDM to influence the production of output biases. Thus, discrete changes in the training paradigms did not translate linearly into modifications in the pattern of choices generated by a RDM. 
Virtual RDMs could be further employed to guide the selection of
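
    The two training schedules described in this abstract can be illustrated with a minimal sketch. It assumes a fair two-sided choice ("L"/"R") and implements only the run-length rule stated above (no more than three consecutive rewards on the same side); the function names are hypothetical and this is not the authors' simulation code.

```python
import random

def full_random_schedule(n_trials, rng):
    """FRS: the rewarded side of each trial is an independent fair coin flip."""
    return [rng.choice("LR") for _ in range(n_trials)]

def gellermann_like_schedule(n_trials, rng, max_run=3):
    """GLS-like: coin flips, but never more than max_run identical sides in a row."""
    schedule = []
    for _ in range(n_trials):
        side = rng.choice("LR")
        # If the last max_run trials all used this side, force the other side.
        if len(schedule) >= max_run and all(s == side for s in schedule[-max_run:]):
            side = "R" if side == "L" else "L"
        schedule.append(side)
    return schedule

def longest_run(schedule):
    """Length of the longest stretch of identical consecutive entries."""
    best = current = 0
    previous = None
    for side in schedule:
        current = current + 1 if side == previous else 1
        best = max(best, current)
        previous = side
    return best

rng = random.Random(0)  # fixed seed so the sketch is reproducible
frs = full_random_schedule(1000, rng)
gls = gellermann_like_schedule(1000, rng)
print("longest FRS run:", longest_run(frs))
print("longest GLS run:", longest_run(gls))
```

    Comparing the longest runs in the two schedules gives a rough view of the 'input biases' that the GLS is meant to remove.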

  3. Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.

    Science.gov (United States)

    Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun

    2017-10-01

    Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  4. Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

    Science.gov (United States)

    Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

    2013-08-01

    To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattering across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded those of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from humans to humans in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.

  5. Molecular recognition of AT-DNA sequences by the induced CD pattern of dibenzotetraaza[14]annulene (DBTAA)–adenine derivatives

    Science.gov (United States)

    Stojković, Marijana Radić; Škugor, Marko; Dudek, Łukasz; Grolik, Jarosław; Eilmes, Julita

    2014-01-01

    Summary An investigation of the interactions of two novel and several known DBTAA–adenine conjugates with double-stranded DNA and RNA has revealed the DNA/RNA groove as the dominant binding site, which is in contrast to the majority of previously studied DBTAA analogues (DNA/RNA intercalators). Only DBTAA–propyladenine conjugates revealed the molecular recognition of AT-DNA by an ICD band pattern > 300 nm, whereas significant ICD bands did not appear for other ds-DNA/RNA. A structure–activity relation for the studied series of compounds showed that the essential structural features for the ICD recognition are a) the presence of DNA-binding appendages (adenine side chain and positively charged side chain) on both DBTAA side chains, and b) the presence of a short propyl linker, which does not support intramolecular aromatic stacking between DBTAA and adenine. The observed AT-DNA-ICD pattern differs from previously reported ss-DNA (poly dT) ICD recognition by a strong negative ICD band at 350 nm, which allows for the dynamic differentiation between ss-DNA (poly dT) and coupled ds-AT-DNA. PMID:25246976

  6. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  7. Trace metal depositional patterns from an open pit mining activity as revealed by archived avian gizzard contents.

    Science.gov (United States)

    Bendell, L I

    2011-02-15

    Archived samples of blue grouse (Dendragapus obscurus) gizzard contents, inclusive of grit, collected yearly between 1959 and 1970 were analyzed for cadmium, lead, zinc, and copper content. Approximately halfway through the 12-year sampling period, an open-pit copper mine began activities, then ceased operations 2 years later. Thus the archived samples provided a unique opportunity to determine if avian gizzard contents, inclusive of grit, could reveal patterns in the anthropogenic deposition of trace metals associated with mining activities. Gizzard concentrations of cadmium and copper strongly coincided with the onset of opening and the closing of the pit mining activity. Gizzard zinc and lead demonstrated significant among year variation; however, maximum concentrations did not correlate to mining activity. The archived gizzard contents did provide a useful tool for documenting trends in metal depositional patterns related to an anthropogenic activity. Further, blue grouse ingesting grit particles during the time of active mining activity would have been exposed to toxicologically significant levels of cadmium. Gizzard lead concentrations were also of toxicological significance but not related to mining activity. This type of "pulse" toxic metal exposure as a consequence of open-pit mining activity would not necessarily have been revealed through a "snap-shot" of soil, plant or avian tissue trace metal analysis post-mining activity. Copyright © 2010 Elsevier B.V. All rights reserved.

  8. A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    Directory of Open Access Journals (Sweden)

    Glass John I

    2010-07-01

    Full Text Available Abstract Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the

  9. Detection of rifampin resistance patterns in Mycobacterium tuberculosis strains isolated in Iran by polymerase chain reaction-single-strand conformation polymorphism and direct sequencing methods

    Directory of Open Access Journals (Sweden)

    Bahram Nasr Isfahani

    2006-09-01

    Full Text Available Mutations in the rpoB locus confer conformational changes leading to defective binding of rifampin (RIF) to rpoB and consequently resistance in Mycobacterium tuberculosis. Polymerase chain reaction-single-strand conformation polymorphism (PCR-SSCP) was established as a rapid screening test for the detection of mutations in the rpoB gene, and direct sequencing has been unambiguously applied to characterize mutations. A total of 37 Iranian isolates of M. tuberculosis, 16 sensitive and 21 resistant to RIF, were used in this study. A 193-bp region of the rpoB gene was amplified and PCR-SSCP patterns were determined by electrophoresis in 10% acrylamide gel and silver staining. Also, 21 samples of 193-bp rpoB amplicons with different PCR-SSCP patterns from RIFr and 10 from RIFs were sequenced. Seven distinguishable PCR-SSCP patterns were recognized in the 21 Iranian RIFr strains, while 15 out of 16 RIFs isolates demonstrated PCR-SSCP banding patterns similar to that of sensitive standard strain H37Rv. However, one of the sensitive isolates demonstrated a different pattern. Six different mutations were seen in the amplified region of the rpoB gene: codons 516 (GAC/GTC), 523 (GGG/GGT), 526 (CAC/TAC), 531 (TCG/TTG), 511 (CTG/TTG), and 512 (AGC/TCG). This study demonstrated the high specificity (93.8%) and sensitivity (95.2%) of the PCR-SSCP method for detection of mutations in the rpoB gene; 85.7% of RIFr strains showed a single mutation and 14.3% had no mutations. Three strains showed mutations that caused polymorphism. Our data support the common notion that rifampin resistance genotypes generally carry mutations in codons 531 and 526, which are the most frequently found in M. tuberculosis populations regardless of geographic origin.
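
    The specificity and sensitivity figures quoted above can be checked with a short calculation. The specificity counts (15 of 16 RIF-sensitive isolates showing the reference pattern) come from the abstract; the sensitivity counts (20 of 21 resistant isolates flagged) are an assumption inferred from the reported 95.2% and are not stated in the record.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly resistant isolates that the test flags as resistant."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of truly sensitive isolates that the test reports as sensitive."""
    return true_neg / (true_neg + false_pos)

# From the abstract: 15 of 16 RIF-sensitive isolates matched the H37Rv pattern.
spec = specificity(true_neg=15, false_pos=1)

# ASSUMPTION: 20 of 21 resistant isolates detected, inferred from the 95.2% figure.
sens = sensitivity(true_pos=20, false_neg=1)

print(f"specificity = {spec:.3f}")
print(f"sensitivity = {sens:.3f}")
```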

  10. Elucidation of hepatitis C virus transmission and early diversification by single genome sequencing.

    Science.gov (United States)

    Li, Hui; Stoddard, Mark B; Wang, Shuyi; Blair, Lily M; Giorgi, Elena E; Parrish, Erica H; Learn, Gerald H; Hraber, Peter; Goepfert, Paul A; Saag, Michael S; Denny, Thomas N; Haynes, Barton F; Hahn, Beatrice H; Ribeiro, Ruy M; Perelson, Alan S; Korber, Bette T; Bhattacharya, Tanmoy; Shaw, George M

    2012-01-01

    A precise molecular identification of transmitted hepatitis C virus (HCV) genomes could illuminate key aspects of transmission biology, immunopathogenesis and natural history. We used single genome sequencing of 2,922 half or quarter genomes from plasma viral RNA to identify transmitted/founder (T/F) viruses in 17 subjects with acute community-acquired HCV infection. Sequences from 13 of 17 acute subjects, but none of 14 chronic controls, exhibited one or more discrete low diversity viral lineages. Sequences within each lineage generally revealed a star-like phylogeny of mutations that coalesced to unambiguous T/F viral genomes. Numbers of transmitted viruses leading to productive clinical infection were estimated to range from 1 to 37 or more (median = 4). Four acutely infected subjects showed a distinctly different pattern of virus diversity that deviated from a star-like phylogeny. In these cases, empirical analysis and mathematical modeling suggested high multiplicity virus transmission from individuals who themselves were acutely infected or had experienced a virus population bottleneck due to antiviral drug therapy. These results provide new quantitative and qualitative insights into HCV transmission, revealing for the first time virus-host interactions that successful vaccines or treatment interventions will need to overcome. Our findings further suggest a novel experimental strategy for identifying full-length T/F genomes for proteome-wide analyses of HCV biology and adaptation to antiviral drug or immune pressures.

  11. GEITLERINEMA SPECIES (OSCILLATORIALES, CYANOBACTERIA) REVEALED BY CELLULAR MORPHOLOGY, ULTRASTRUCTURE, AND DNA SEQUENCING.

    Science.gov (United States)

    Do Carmo Bittencourt-Oliveira, Maria; Do Nascimento Moura, Ariadne; De Oliveira, Mariana Cabral; Sidnei Massola, Nelson

    2009-06-01

    Geitlerinema amphibium (C. Agardh ex Gomont) Anagn. and G. unigranulatum (Rama N. Singh) Komárek et M. T. P. Azevedo are morphologically close species with characteristics frequently overlapping. Ten strains of Geitlerinema (six of G. amphibium and four of G. unigranulatum) were analyzed by DNA sequencing and transmission electron and optical microscopy. Among the investigated strains, the two species were not separated with respect to cellular dimensions, and cellular width was the most varying characteristic. The number and localization of granules, as well as other ultrastructural characteristics, did not provide a means to discriminate between the two species. The two species were not separated either by geography or environment. These results were further corroborated by the analysis of the cpcB-cpcA intergenic spacer (PC-IGS) sequences. Given the fact that morphology is very uniform, plus the coexistence of these populations in the same habitat, it would be nearly impossible to distinguish between them in nature. On the other hand, two of the analyzed strains were distinct from all others based on the PC-IGS sequences, in spite of their morphological similarity. PC-IGS sequences indicate that these two strains could be a different species of Geitlerinema. Using morphology, cell ultrastructure, and PC-IGS sequences, it is not possible to distinguish G. amphibium and G. unigranulatum. Therefore, they should be treated as one species, G. unigranulatum as a synonym of G. amphibium. © 2009 Phycological Society of America.

  12. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  13. A Scalable Permutation Approach Reveals Replication and Preservation Patterns of Network Modules in Large Datasets.

    Science.gov (United States)

    Ritchie, Scott C; Watts, Stephen; Fearnley, Liam G; Holt, Kathryn E; Abraham, Gad; Inouye, Michael

    2016-07-01

    Network modules-topologically distinct groups of edges and nodes-that are preserved across datasets can reveal common features of organisms, tissues, cell types, and molecules. Many statistics to identify such modules have been developed, but testing their significance requires heuristics. Here, we demonstrate that current methods for assessing module preservation are systematically biased and produce skewed p values. We introduce NetRep, a rapid and computationally efficient method that uses a permutation approach to score module preservation without assuming data are normally distributed. NetRep produces unbiased p values and can distinguish between true and false positives during multiple hypothesis testing. We use NetRep to quantify preservation of gene coexpression modules across murine brain, liver, adipose, and muscle tissues. Complex patterns of multi-tissue preservation were revealed, including a liver-derived housekeeping module that displayed adipose- and muscle-specific association with body weight. Finally, we demonstrate the broader applicability of NetRep by quantifying preservation of bacterial networks in gut microbiota between men and women. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
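
    The general idea of a permutation test for module preservation can be sketched as follows. This is a generic illustration, not NetRep's implementation or API: the statistic (mean pairwise correlation within the module), the null model (random node sets of the same size) and all function names are assumptions made for the example.

```python
import random

def mean_within_module(corr, nodes):
    """Average correlation over all distinct node pairs in `nodes`."""
    pairs = [(i, j) for k, i in enumerate(nodes) for j in nodes[k + 1:]]
    return sum(corr[i][j] for i, j in pairs) / len(pairs)

def permutation_p_value(corr, module_nodes, n_perm=1000, seed=0):
    """Compare the module's statistic with that of random same-size node sets."""
    rng = random.Random(seed)
    all_nodes = list(range(len(corr)))
    observed = mean_within_module(corr, module_nodes)
    exceed = 0
    for _ in range(n_perm):
        random_nodes = rng.sample(all_nodes, len(module_nodes))
        if mean_within_module(corr, random_nodes) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p value strictly above zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Toy test dataset (hypothetical): nodes 0-3 are tightly correlated, 4-7 are not.
n = 8
corr = [[0.9 if (i < 4 and j < 4 and i != j) else 0.0 for j in range(n)]
        for i in range(n)]
observed, p = permutation_p_value(corr, [0, 1, 2, 3])
print("observed statistic:", observed, "permutation p value:", p)
```

    Because the null distribution is built by resampling the data themselves, no assumption of normality is needed, which is the property the abstract emphasizes.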

  14. Research on parallel algorithm for sequential pattern mining

    Science.gov (United States)

    Zhou, Lijuan; Qin, Bai; Wang, Yu; Hao, Zhongxiao

    2008-03-01

    Sequential pattern mining is the mining of frequent sequences related to time or other orders from a sequence database. Its original motivation was to discover regularities in customer purchasing over a period of time by finding frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its applications are no longer confined to business databases but have extended to new data sources such as the Web and to advanced scientific fields such as DNA analysis. The data used in sequential pattern mining have two characteristics: large volume and distributed storage. Most existing sequential pattern mining algorithms do not consider these two characteristics together. Based on these characteristics and on parallel computing theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm follows the principle of pattern reduction and uses a divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent item sets by applying the frequent concept and search-space partition theory, and the second is to construct frequent sequences using depth-first search at each processor. The algorithm needs to access the database only twice and does not generate candidate sequences, which reduces access time and improves mining efficiency. Using a random data generation procedure and the information structures designed for it, this paper simulated the SPP algorithm in a concrete parallel environment and also implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm achieved an excellent speedup factor and efficiency.
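
As a rough illustration of depth-first mining without candidate generation, the following single-process sketch grows patterns by projecting the database onto each frequent prefix. It is a generic prefix-projection miner, not the SPP algorithm described above: it has no parallel partitioning, it handles only single-item events, and the function name and toy database are assumptions.

```python
from collections import defaultdict


def frequent_sequences(database, min_support):
    """Depth-first mining of frequent subsequences (single-item events).

    database:    list of sequences, each a list of hashable, sortable items
    min_support: minimum number of sequences that must contain a pattern
    Returns a dict mapping each frequent pattern (tuple) to its support.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start position) pairs: the part of
        # each sequence that follows the earliest match of the prefix.
        support = defaultdict(set)
        first_pos = {}
        for seq_id, start in projected:
            seq = database[seq_id]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if seq_id not in support[item]:
                    support[item].add(seq_id)
                    first_pos[(item, seq_id)] = pos
        for item, seq_ids in sorted(support.items()):
            if len(seq_ids) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(seq_ids)
                # Recurse only into sequences that contain the longer pattern.
                next_projected = [(s, first_pos[(item, s)] + 1) for s in sorted(seq_ids)]
                grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(database))])
    return results


# Toy database (invented data): four short event sequences.
db = [list("abcd"), list("acd"), list("abd"), list("bd")]
patterns = frequent_sequences(db, min_support=3)
for pattern, count in sorted(patterns.items()):
    print("".join(pattern), count)
```

Because each recursive call scans only the projected suffixes, infrequent extensions are pruned as soon as they are counted rather than being generated and tested as candidates.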

  15. Translational database selection and multiplexed sequence capture for up front filtering of reliable breast cancer biomarker candidates.

    Directory of Open Access Journals (Sweden)

    Patrik L Ståhl

    Full Text Available Biomarker identification is of utmost importance for the development of novel diagnostics and therapeutics. Here we make use of a translational database selection strategy, utilizing data from the Human Protein Atlas (HPA) on differentially expressed protein patterns in healthy and breast cancer tissues as a means to filter out potential biomarkers for underlying genetic causatives of the disease. DNA was isolated from ten breast cancer biopsies, and the protein coding and flanking non-coding genomic regions corresponding to the selected proteins were extracted in a multiplexed format from the samples using a single DNA sequence capture array. Deep sequencing revealed an even enrichment of the multiplexed samples and a great variation of genetic alterations in the tumors of the sampled individuals. Benefiting from the upstream filtering method, the final set of biomarker candidates could be completely verified through bidirectional Sanger sequencing, revealing a 40 percent false positive rate despite high read coverage. Of the variants encountered in translated regions, nine novel non-synonymous variations were identified and verified, two of which were present in more than one of the ten tumor samples.

  16. Appearances can be deceptive: revealing a hidden viral infection with deep sequencing in a plant quarantine context.

    Science.gov (United States)

    Candresse, Thierry; Filloux, Denis; Muhire, Brejnev; Julian, Charlotte; Galzi, Serge; Fort, Guillaume; Bernardo, Pauline; Daugrois, Jean-Heindrich; Fernandez, Emmanuel; Martin, Darren P; Varsani, Arvind; Roumagnac, Philippe

    2014-01-01

    Comprehensive inventories of plant viral diversity are essential for effective quarantine and sanitation efforts. The safety of regulated plant material exchanges presently relies heavily on techniques such as PCR or nucleic acid hybridisation, which are only suited to the detection and characterisation of specific, well characterised pathogens. Here, we demonstrate the utility of sequence-independent next generation sequencing (NGS) of both virus-derived small interfering RNAs (siRNAs) and virion-associated nucleic acids (VANA) for the detailed identification and characterisation of viruses infecting two quarantined sugarcane plants. Both plants originated from Egypt and were known to be infected with Sugarcane streak Egypt Virus (SSEV; Genus Mastrevirus, Family Geminiviridae), but were revealed by the NGS approaches to also be infected by a second highly divergent mastrevirus, here named Sugarcane white streak Virus (SWSV). This novel virus had escaped detection by all routine quarantine detection assays and was found to also be present in sugarcane plants originating from Sudan. Complete SWSV genomes were cloned and sequenced from six plants and all were found to share >91% genome-wide identity. With the exception of two SWSV variants, which potentially express unusually large RepA proteins, the SWSV isolates display genome characteristics very typical to those of all other previously described mastreviruses. An analysis of virus-derived siRNAs for SWSV and SSEV showed them to be strongly influenced by secondary structures within both genomic single stranded DNA and mRNA transcripts. In addition, the distribution of siRNA size frequencies indicates that these mastreviruses are likely subject to both transcriptional and post-transcriptional gene silencing. Our study stresses the potential advantages of NGS-based virus metagenomic screening in a plant quarantine setting and indicates that such techniques could dramatically reduce the numbers of non

  17. Appearances can be deceptive: revealing a hidden viral infection with deep sequencing in a plant quarantine context.

    Directory of Open Access Journals (Sweden)

    Thierry Candresse

    Full Text Available Comprehensive inventories of plant viral diversity are essential for effective quarantine and sanitation efforts. The safety of regulated plant material exchanges presently relies heavily on techniques such as PCR or nucleic acid hybridisation, which are only suited to the detection and characterisation of specific, well characterised pathogens. Here, we demonstrate the utility of sequence-independent next generation sequencing (NGS) of both virus-derived small interfering RNAs (siRNAs) and virion-associated nucleic acids (VANA) for the detailed identification and characterisation of viruses infecting two quarantined sugarcane plants. Both plants originated from Egypt and were known to be infected with Sugarcane streak Egypt Virus (SSEV; Genus Mastrevirus, Family Geminiviridae), but were revealed by the NGS approaches to also be infected by a second highly divergent mastrevirus, here named Sugarcane white streak Virus (SWSV). This novel virus had escaped detection by all routine quarantine detection assays and was found to also be present in sugarcane plants originating from Sudan. Complete SWSV genomes were cloned and sequenced from six plants and all were found to share >91% genome-wide identity. With the exception of two SWSV variants, which potentially express unusually large RepA proteins, the SWSV isolates display genome characteristics very typical to those of all other previously described mastreviruses. An analysis of virus-derived siRNAs for SWSV and SSEV showed them to be strongly influenced by secondary structures within both genomic single stranded DNA and mRNA transcripts. In addition, the distribution of siRNA size frequencies indicates that these mastreviruses are likely subject to both transcriptional and post-transcriptional gene silencing. Our study stresses the potential advantages of NGS-based virus metagenomic screening in a plant quarantine setting and indicates that such techniques could dramatically reduce the numbers of non

  18. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Directory of Open Access Journals (Sweden)

    Jason D Thompson

    Full Text Available Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  19. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    Science.gov (United States)

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  20. Large-Scale Constraint-Based Pattern Mining

    Science.gov (United States)

    Zhu, Feida

    2009-01-01

    We studied the problem of constraint-based pattern mining for three different data formats, item-set, sequence and graph, and focused on mining patterns of large sizes. Colossal patterns in each data formats are studied to discover pruning properties that are useful for direct mining of these patterns. For item-set data, we observed robustness of…

  1. Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system

    Directory of Open Access Journals (Sweden)

    Sandeep Ghatak

    2017-03-01

    Full Text Available Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS) is a powerful technology that provides comprehensive genetic information about bacteria and is increasingly being applied to study foodborne pathogens: e.g., evolution, epidemiology/outbreak investigation, and detection. Herein we report the complete genome sequence of Campylobacter coli strain YH502 isolated from retail chicken in the United States. WGS, de novo assembly, and annotation of the genome revealed a chromosome of 1,718,974 bp and a mega-plasmid (pCOS502) of 125,964 bp. GC content of the genome was 31.2% with 1931 coding sequences and 53 non-coding RNAs. Multiple virulence factors including a plasmid-borne type VI secretion system and antimicrobial resistance genes (beta-lactams, fluoroquinolones, and aminoglycosides) were found. The presence of T6SS in a mobile genetic element (plasmid) suggests plausible horizontal transfer of these virulence genes to other organisms. The C. coli YH502 genome also harbors CRISPR sequences and associated proteins. Phylogenetic analysis based on average nucleotide identity and single nucleotide polymorphisms identified closely related C. coli genomes available in the NCBI database. Taken together, the analyzed genomic data of this potentially virulent strain of C. coli will facilitate further understanding of this important foodborne pathogen, most likely leading to better control strategies. The chromosome and plasmid sequences of C. coli YH502 have been deposited in GenBank under the accession numbers CP018900.1 and CP018901.1, respectively.

  2. The complete chloroplast genome sequence of Abies nephrolepis (Pinaceae: Abietoideae)

    Directory of Open Access Journals (Sweden)

    Dong-Keun Yi

    2016-06-01

    Full Text Available The plant chloroplast (cp) genome has maintained a relatively conserved structure and gene content throughout evolution. Cp genome sequences have been used widely for resolving evolutionary and phylogenetic issues at various taxonomic levels of plants. Here, we report the complete cp genome of Abies nephrolepis. The A. nephrolepis cp genome is 121,336 base pairs (bp) in length including a pair of short inverted repeat regions (IRa and IRb) of 139 bp each separated by a small single copy (SSC) region of 54,323 bp and a large single copy (LSC) region of 66,735 bp. It contains 114 genes, 68 of which are protein coding genes, 35 tRNA and four rRNA genes, six open reading frames, and one pseudogene. Seventeen repeat units and 64 simple sequence repeats (SSRs) have been detected in the A. nephrolepis cp genome. Large IR sequences locate in 42-kb inversion points (1186 bp). The A. nephrolepis cp genome is identical to Abies koreana’s which is closely related to taxa. Pairwise comparison between two cp genomes revealed 140 polymorphic sites in each. The complete cp genome sequence of A. nephrolepis has a significant potential to provide information on the evolutionary pattern of Abietoideae and valuable data for development of DNA markers for easy identification and classification.
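
The region lengths quoted in this abstract can be cross-checked with a short calculation. The sketch below simply sums the four regions of the quadripartite structure and compares the result with the reported genome length. The variable names are illustrative and the only inputs are the numbers given in the abstract.

```python
# Region lengths in bp, as quoted in the abstract.
regions = {"LSC": 66_735, "SSC": 54_323, "IRa": 139, "IRb": 139}
reported_total = 121_336

computed_total = sum(regions.values())
for name, length in regions.items():
    share = 100 * length / computed_total
    print(f"{name}: {length} bp ({share:.2f}% of genome)")
print("sum of regions:", computed_total)
print("matches reported length:", computed_total == reported_total)
```

The two single copy regions plus both inverted repeats account for the full reported length, which also shows how small the inverted repeats are relative to the rest of the genome.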

  3. High resolution sea-level curve for the latest Frasnian and earliest Famennian derived for high frequency sequences in the Appalachian Basin

    Energy Technology Data Exchange (ETDEWEB)

    Filer, J.K. (Washington and Lee Univ., Lexington, VA (United States). Dept. of Geology)

    1992-01-01

    Siliciclastic sequences have been mapped in the subsurface and outcrop of much of the Appalachian basin in facies ranging from shale in the basin plain to shelf sandstone. Eleven transgressive/regressive cycles have been defined in an estimated 1.5 to 2.0 Ma period in the latest Frasnian and earliest Famennian, and range in duration from about 75,000 to 400,000 years. Lithofacies maps, covering most of the basin, were prepared for each sequence. These maps show both the area of basinal black shale deposition, which defines the base of each cycle, and the areal extent of subsequent clinoform siltstone and shelf sandstone deposition in the upper portion of each cycle. The stratigraphic patterns show two stacked sets of progradational basinwide sequences. Geographic scale of the study precludes autocyclic controls of cycles. Sea-level/climate cycles, probably superimposed on longer term tectonic cycles, are the proposed cause of these observed depositional patterns. Removal of the long-term progradational trend of Upper Devonian basin filling results in a proposed eustatic sea-level curve; comparison with the curve of Johnson and others (1985) reveals correspondence of three regressive maxima in both models. The curve presented here reveals that an ongoing process of higher frequency sea-level modification was active at this time. Higher frequency sea-level events, nested within previously interpreted lower frequency global events, are inferred to also be eustatic. Models of the biotic crisis that occurs at this time should consider the implications of these high frequency sea-level cycles. The patterns observed are consistent with latest Frasnian initiation of glaciation in South America. This would be somewhat earlier than has generally been accepted.

  4. Label-free proteome profiling reveals developmental-dependent patterns in young barley grains.

    Science.gov (United States)

    Kaspar-Schoenefeld, Stephanie; Merx, Kathleen; Jozefowicz, Anna Maria; Hartmann, Anja; Seiffert, Udo; Weschke, Winfriede; Matros, Andrea; Mock, Hans-Peter

    2016-06-30

    Due to its importance as a cereal crop worldwide, there is high interest in determining the factors that influence barley grain quality. This study focusses on the elucidation of protein networks affecting early grain developmental processes. NanoLC-based separation coupled to label-free MS detection was applied to gain insights into biochemical processes during five different grain developmental phases (pre-storage until storage phase, 3 days to 16 days after flowering). Multivariate statistics revealed two distinct developmental patterns during the analysed grain developmental phases: proteins showed highest abundance either in the middle phase of development (the transition phase) or at later developmental stages (the storage phase). Verification of developmental patterns observed by proteomic analysis was done by applying hypothesis-driven approaches, namely Western Blot analysis and enzyme assays. High general metabolic activity of the grain with regard to protein synthesis, cell cycle regulation, defence against oxidative stress, and energy production via photosynthesis was observed in the transition phase. Proteins upregulated in the storage phase are related to storage protein accumulation and, interestingly, to the defence of storage reserves against pathogens. A mixed regulatory pattern for most enzymes detected in our study points to regulatory mechanisms at the level of protein isoforms. In-depth understanding of early grain developmental processes of cereal caryopses is of high importance as they influence final grain weight and quality. Our knowledge about these processes is still limited, especially on proteome level. To identify key mechanisms in early barley grain development, a label-free data-independent proteomics acquisition approach has been applied. Our data clearly show that proteins either exhibit highest expression during cellularization and the switch to the storage phase (transition phase, 5-7 DAF), or during storage

  5. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms... ...to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D...

  6. Secondary structure of the rRNA ITS2 region reveals key evolutionary patterns in acroporid corals.

    Science.gov (United States)

    Coleman, Annette W; van Oppen, Madeleine J H

    2008-10-01

    This study investigates the ribosomal RNA transcript secondary structure in corals as confirmed by compensatory base changes in Isopora/Acropora species. These species are unique versus all other corals in the absence of a eukaryote-wide conserved structural component, the helix III in internal transcribed spacer (ITS) 2, and their variability in the 5.8S-LSU helix basal to ITS2, a helix with pairings identical among all other scleractinian corals. Furthermore, Isopora/Acropora individuals display at least two, and as many as three, ITS sequence isotypes in their genome which appear to be capable of function. From consideration of the conserved elements in ITS2 and flanking regions, it appears that there are three major groups within the Isopora/Acropora lineage: the Isopora + Acropora "longi" group, the large group including Caribbean Acropora + the Acropora "carib" types plus the bulk of the Indo-Pacific Acropora species, and the remaining enigmatic "pseudo" group found in the Pacific. Interbreeding is possible among Caribbean A. palmata and A. cervicornis and among some species of Indo-Pacific Acropora. Recombinant ITS sequences are obvious among these latter, such that morphology (as represented by species name) does not correlate with common ITS sequence. The combination of characters revealed by RNA secondary structure analyses suggests a recent past/current history of interbreeding among the Indo-Pacific Acropora species and a shared ancestry of some of these with the Caribbean Acropora. The unusual absence of helix III of ITS2 of Isopora/Acropora species may have some causative role in the equally unusual instability in the 5.8S-LSU helix basal to ITS2 of this species complex.

  7. Selective retrieval of memory and concept sequences through neuro-windows

    OpenAIRE

    Kakeya, Hideki; Okabe, Yoichi

    1999-01-01

    This letter presents a crosscorrelational associative memory model which realizes selective retrieval of pattern sequences. When hierarchically correlated sequences are memorized, sequences of the correlational centers can be defined as the concept sequences. The authors propose a modified neuro-window method which enables selective retrieval of memory sequences and concept sequences. It is also shown that the proposed model realizes capacity expansion of the memory which stores random sequen...

  8. Deep Sequencing of Plant and Animal DNA Contained within Traditional Chinese Medicines Reveals Legality Issues and Health Safety Concerns

    Science.gov (United States)

    Coghlan, Megan L.; Haile, James; Houston, Jayne; Murray, Dáithí C.; White, Nicole E.; Moolhuijzen, Paula; Bellgard, Matthew I.; Bunce, Michael

    2012-01-01

    Traditional Chinese medicine (TCM) has been practiced for thousands of years, but only within the last few decades has its use become more widespread outside of Asia. Concerns continue to be raised about the efficacy, legality, and safety of many popular complementary alternative medicines, including TCMs. Ingredients of some TCMs are known to include derivatives of endangered, trade-restricted species of plants and animals, and therefore contravene the Convention on International Trade in Endangered Species (CITES) legislation. Chromatographic studies have detected the presence of heavy metals and plant toxins within some TCMs, and there are numerous cases of adverse reactions. It is in the interests of both biodiversity conservation and public safety that techniques are developed to screen medicinals like TCMs. Targeting both the p-loop region of the plastid trnL gene and the mitochondrial 16S ribosomal RNA gene, over 49,000 amplicon sequence reads were generated from 15 TCM samples presented in the form of powders, tablets, capsules, bile flakes, and herbal teas. Here we show that second-generation, high-throughput sequencing (HTS) of DNA represents an effective means to genetically audit organic ingredients within complex TCMs. Comparison of DNA sequence data to reference databases revealed the presence of 68 different plant families and included genera, such as Ephedra and Asarum, that are potentially toxic. Similarly, animal families were identified that include genera that are classified as vulnerable, endangered, or critically endangered, including Asiatic black bear (Ursus thibetanus) and Saiga antelope (Saiga tatarica). Bovidae, Cervidae, and Bufonidae DNA were also detected in many of the TCM samples and were rarely declared on the product packaging. This study demonstrates that deep sequencing via HTS is an efficient and cost-effective way to audit highly processed TCM products and will assist in monitoring their legality and safety especially when

  9. Whole-genome sequencing of Bacillus subtilis XF-1 reveals mechanisms for biological control and multiple beneficial properties in plants.

    Science.gov (United States)

    Guo, Shengye; Li, Xingyu; He, Pengfei; Ho, Honhing; Wu, Yixin; He, Yueqiu

    2015-06-01

    Bacillus subtilis XF-1 is a gram-positive, plant-associated bacterium that stimulates plant growth and produces secondary metabolites that suppress soil-borne plant pathogens. In particular, it is highly efficient at controlling the clubroot disease of cruciferous crops. Its 4,061,186-bp genome contains an estimated 3853 protein-coding sequences and the 1155 genes of XF-1 are present in most genome-sequenced Bacillus strains: 3757 genes in B. subtilis 168, and 1164 in B. amyloliquefaciens FZB42. Analysis using the Cluster of Orthologous Groups database of proteins shows that 60 genes control bacterial mobility, 221 genes are related to cell wall and membrane biosynthesis, and more than 112 genes are associated with secondary metabolites. In addition, the genes contributed to the strain's plant colonization, bio-control and stimulation of plant growth. Sequencing of the genome is a fundamental step for developing a desired strain to serve as an efficient biological control agent and plant growth stimulator. Similar to other members of the taxon, XF-1 has a genome that contains giant gene clusters for the non-ribosomal synthesis of antifungal lipopeptides (surfactin and fengycin), the polyketides (macrolactin and bacillaene), the siderophore bacillibactin, and the dipeptide bacilysin. There are two synthesis pathways for volatile growth-promoting compounds. The expression of biosynthesized antibiotic peptides in XF-1 was revealed by matrix-assisted laser desorption/ionization-time of flight mass spectrometry.

  10. Sequence conservation and combinatorial complexity of Drosophila neural precursor cell enhancers

    Directory of Open Access Journals (Sweden)

    Kuzin Alexander

    2008-08-01

    Full Text Available Abstract Background The presence of highly conserved sequences within cis-regulatory regions can serve as a valuable starting point for elucidating the basis of enhancer function. This study focuses on regulation of gene expression during the early events of Drosophila neural development. We describe the use of EvoPrinter and cis-Decoder, a suite of interrelated phylogenetic footprinting and alignment programs, to characterize highly conserved sequences that are shared among co-regulating enhancers. Results Analysis of in vivo characterized enhancers that drive neural precursor gene expression has revealed that they contain clusters of highly conserved sequence blocks (CSBs) made up of shorter shared sequence elements which are present in different combinations and orientations within the different co-regulating enhancers; these elements contain either known consensus transcription factor binding sites or consist of novel sequences that have not been functionally characterized. The CSBs of co-regulated enhancers share a large number of sequence elements, suggesting that a diverse repertoire of transcription factors may interact in a highly combinatorial fashion to coordinately regulate gene expression. We have used information gained from our comparative analysis to discover an enhancer that directs expression of the nervy gene in neural precursor cells of the CNS and PNS. Conclusion The combined use of EvoPrinter and cis-Decoder has yielded important insights into the combinatorial appearance of fundamental sequence elements required for neural enhancer function. Each of the 30 enhancers examined conformed to a pattern of highly conserved blocks of sequences containing shared constituent elements. These data establish a basis for further analysis and understanding of neural enhancer function.

  11. Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

    Science.gov (United States)

    Brozynska, Marta; Copetti, Dario; Furtado, Agnelo; Wing, Rod A; Crayn, Darren; Fox, Glen; Ishikawa, Ryuji; Henry, Robert J

    2017-06-01

    The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon-like population, referred to as Taxon A, and O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  12. Analysis of complete nucleotide sequences of Angolan hepatitis B virus isolates reveals the existence of a separate lineage within genotype E.

    Directory of Open Access Journals (Sweden)

    Barbara V Lago

    Full Text Available Hepatitis B virus genotype E (HBV/E) is highly prevalent in Western Africa. In this work, 30 HBV/E isolates from HBsAg positive Angolans (staff and visitors of a private hospital in Luanda) were genetically characterized: 16 of them were completely sequenced and the pre-S/S sequences of the remaining 14 were determined. A high proportion (12/30, 40%) of subjects tested positive for both HBsAg and anti-HBs markers. Deduced amino acid sequences revealed the existence of specific substitutions and deletions in the B- and T-cell epitopes of the surface antigen (pre-S1 and pre-S2 regions) of the virus isolates derived from 8/12 individuals with concurrent HBsAg/anti-HBs. Phylogenetic analysis performed with 231 HBV/E full-length sequences, including 16 from this study, showed that all isolates from Angola, Namibia and the Democratic Republic of Congo (n = 28) clustered in a separate lineage, divergent from the HBV/E isolates from nine other African countries, namely Cameroon, Central African Republic, Côte d'Ivoire, Ghana, Guinea, Madagascar, Niger, Nigeria and Sudan, with a Bayesian posterior probability of 1. Five specific mutations, namely small S protein T57I, polymerase Q177H, G245W and M612L, and X protein V30L, were observed in 79-96% of the isolates of the separate lineage, compared to a frequency of 0-12% among the other HBV/E African isolates.

  13. Interchangeable Positions in Interaction Sequences in Science Classrooms

    Directory of Open Access Journals (Sweden)

    Carol Rees

    2017-03-01

    Full Text Available Triadic dialogue, the Initiation, Response, Evaluation sequence typical of teacher/student interactions in classrooms, has long been identified as a barrier to students’ access to learning, including science learning. A large body of research on the subject has over the years led to projects and policies aimed at increasing opportunities for students to learn through interactive dialogue in classrooms. However, the triadic dialogue pattern continues to dominate, even when teachers intend to change this. Prior quantitative research on the subject has focused on identifying independent variables such as style of teacher questioning that have an impact, while qualitative researchers have worked to interpret the use of dialogue within the whole context of work in the classroom. A recent paper offers an alternative way to view the triadic dialogue pattern and its origin; the triadic dialogue pattern is an irreducible social phenomenon that arises in a particular situation regardless of the identity of the players who inhabit the roles in the turn-taking sequence (Roth & Gardner, 2012). According to this perspective, alternative patterns of dialogue would exist which are alternative irreducible social phenomena that arise in association with different situations. The aim of this paper is to examine, as precisely as possible, the characteristics of dialogue patterns in a seventh-eighth grade classroom during science inquiry, and the precise situations from which these dialogue patterns emerge, regardless of the staffing (teacher or students) in the turn-taking sequence. Three different patterns were identified, each predominating in a particular situation. This fine-grained analysis could offer valuable insights into ways to support teachers working to alter the kinds of dialogue patterns that arise in their classrooms.

  14. Multilocus Sequence Typing Reveals a New Cluster of Closely Related Candida tropicalis Genotypes in Italian Patients With Neurological Disorders.

    Science.gov (United States)

    Scordino, Fabio; Giuffrè, Letterio; Barberi, Giuseppina; Marino Merlo, Francesca; Orlando, Maria Grazia; Giosa, Domenico; Romeo, Orazio

    2018-01-01

    Candida tropicalis is a pathogenic yeast that has emerged as an important cause of candidemia especially in elderly patients with hematological malignancies. Infections caused by this species are mainly reported from Latin America and Asian-Pacific countries although recent epidemiological data revealed that C. tropicalis accounts for 6-16.4% of the Candida bloodstream infections (BSIs) in Italy by representing a relevant issue especially for patients receiving long-term hospital care. The aim of this study was to describe the genetic diversity of C. tropicalis isolates contaminating the hands of healthcare workers (HCWs) and hospital environments and/or associated with BSIs occurring in patients with different neurological disorders and without hematological disease. A total of 28 C. tropicalis isolates were genotyped using multilocus sequence typing analysis of six housekeeping (ICL1, MDR1, SAPT2, SAPT4, XYR1, and ZWF1) genes and data revealed the presence of only eight diploid sequence types (DSTs) of which 6 (75%) were completely new. Four eBURST clonal complexes (CC2, CC10, CC11, and CC33) contained all DSTs found in this study and the CC33 resulted in an exclusive, well-defined, clonal cluster from Italy. In conclusion, C. tropicalis could represent an important cause of BSIs in long-term hospitalized patients with no underlying hematological disease. The findings of this study also suggest a potential horizontal transmission of a specific C. tropicalis clone through hands of HCWs and expand our understanding of the molecular epidemiology of this pathogen whose population structure is still far from being fully elucidated as its complexity increases as different categories of patients and geographic areas are examined.

  15. Metagenomic sequencing reveals the relationship between microbiota composition and quality of Chinese Rice Wine.

    Science.gov (United States)

    Hong, Xutao; Chen, Jing; Liu, Lin; Wu, Huan; Tan, Haiqin; Xie, Guangfa; Xu, Qian; Zou, Huijun; Yu, Wenjing; Wang, Lan; Qin, Nan

    2016-05-31

    Chinese Rice Wine (CRW) is a common alcoholic beverage in China. To investigate the influence of microbial composition on the quality of CRW, high throughput sequencing was performed for 110 wine samples on bacterial 16S rRNA gene and fungal Internal Transcribed Spacer II (ITS2). Bioinformatic analyses demonstrated that the quality of yeast starter and final wine correlated with microbial taxonomic composition, which was exemplified by our finding that wine spoilage resulted from a high proportion of genus Lactobacillus. Subsequently, based on Lactobacillus abundance of an early stage, a model was constructed to predict final wine quality. In addition, three batches of 20 representative wine samples selected from a pool of 110 samples were further analyzed in metagenomics. The results revealed that wine spoilage was due to rapid growth of Lactobacillus brevis at the early stage of fermentation. Gene functional analysis indicated the importance of some pathways such as synthesis of biotin, malolactic fermentation and production of short-chain fatty acid. These results led to a conclusion that metabolisms of microbes influence the wine quality. Thus, nurturing of beneficial microbes and inhibition of undesired ones are both important for the mechanized brewery.

  16. The expression of the clock gene cycle has rhythmic pattern and is affected by photoperiod in the moth Sesamia nonagrioides.

    Science.gov (United States)

    Kontogiannatos, Dimitrios; Gkouvitsas, Theodoros; Kourti, Anna

    2017-06-01

    To obtain clues to the link between the molecular mechanism of circadian and photoperiod clocks, we have cloned the circadian clock gene cycle (Sncyc) in the corn stalk borer, Sesamia nonagrioides, which undergoes facultative diapause controlled by photoperiod. Sequence analysis revealed a high degree of conservation among insects for this gene. SnCYC consists of 667 amino acids and structural analysis showed that it contains a BCTR domain in its C-terminal in addition to the common domains found in Drosophila CYC, i.e. bHLH, PAS-A, PAS-B domains. The results revealed that the sequence of Sncyc showed a similarity to that of its mammalian orthologue, Bmal1. We also investigated the expression patterns of Sncyc in the brain of larvae growing under long-day 16L: 8D (LD), constant darkness (DD) and short-day 10L: 14D (SD) conditions using qRT-PCR assays. Sncyc mRNA expression was rhythmic in LD, DD and SD cycles. Also, it is remarkable that the photoperiodic conditions affect the expression patterns and/or amplitudes of circadian clock gene Sncyc. This gene is associated with diapause in S. nonagrioides, because under SD (diapause conditions) the photoperiodic signal altered mRNA accumulation. Sequence and expression analysis of cyc in S. nonagrioides shows interesting differences compared to Drosophila where this gene does not oscillate or change in expression patterns in response to photoperiod, suggesting that this species is an interesting new model to study the molecular control of insect circadian and photoperiodic clocks. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Analysis of the 9p21.3 sequence associated with coronary artery disease reveals a tendency for duplication in a CAD patient

    Science.gov (United States)

    Kouprina, Natalay; Noskov, Vladimir N.; Waterfall, Joshua J.; Walker, Robert L.; Meltzer, Paul S.; Topol, Eric J.; Larionov, Vladimir

    2018-01-01

    Tandem segmental duplications (SDs) greater than 10 kb are widespread in complex genomes. They provide material for gene divergence and evolutionary adaptation, while formation of specific de novo SDs is a hallmark of cancer and some human diseases. Most SDs map to distinct genomic regions termed ‘duplication blocks’. SD organization within these blocks is often poorly characterized as they are mosaics of ancestral duplicons juxtaposed with younger duplicons arising from more recent duplication events. Structural and functional analysis of SDs is further hampered as long repetitive DNA structures are underrepresented in existing BAC and YAC libraries. We applied Transformation-Associated Recombination (TAR) cloning, a versatile technique for large DNA manipulation, to selectively isolate the coronary artery disease (CAD) interval sequence within the 9p21.3 chromosome locus from a patient with coronary artery disease and normal individuals. Four tandem head-to-tail duplicons, each ∼50 kb long, were recovered in the patient but not in normal individuals. Sequence analysis revealed that the repeats varied by 10-15 SNPs between each other and by 82 SNPs from the human genome sequence (version hg19). SNP polymorphism within the junctions between repeats allowed two junction types to be distinguished, Type 1 and Type 2, which were found at a 2:1 ratio. The junction sequences contained an Alu element, a sequence previously shown to play a role in duplication. Knowledge of structural variation in the CAD interval from more patients could help link this locus to cardiovascular disease susceptibility, and may be relevant to other cases of regional amplification, including cancer. PMID:29632643

  18. Seqenv: linking sequences to environments through text mining.

    Science.gov (United States)

    Sinclair, Lucas; Ijaz, Umer Z; Jensen, Lars Juhl; Coolen, Marco J L; Gubry-Rangin, Cecile; Chroňáková, Alica; Oulas, Anastasis; Pavloudi, Christina; Schnetzer, Julia; Weimann, Aaron; Ijaz, Ali; Eiler, Alexander; Quince, Christopher; Pafilis, Evangelos

    2016-01-01

    Understanding the distribution of taxa and associated traits across different environments is one of the central questions in microbial ecology. High-throughput sequencing (HTS) studies are presently generating huge volumes of data to address this biogeographical topic. However, these studies are often focused on specific environment types or processes leading to the production of individual, unconnected datasets. The large amounts of legacy sequence data with associated metadata that exist can be harnessed to better place the genetic information found in these surveys into a wider environmental context. Here we introduce a software program, seqenv, to carry out precisely such a task. It automatically performs similarity searches of short sequences against the "nt" nucleotide database provided by NCBI and, out of every hit, extracts (if it is available) the textual metadata field. After collecting all the isolation sources from all the search results, we run a text mining algorithm to identify and parse words that are associated with the Environmental Ontology (EnvO) controlled vocabulary. This, in turn, enables us to determine both in which environments individual sequences or taxa have previously been observed and, by weighted summation of those results, to summarize complete samples. We present two demonstrative applications of seqenv to a survey of ammonia oxidizing archaea as well as to a plankton paleome dataset from the Black Sea. These demonstrate the ability of the tool to reveal novel patterns in HTS and its utility in the fields of environmental source tracking, paleontology, and studies of microbial biogeography. To install seqenv, go to: https://github.com/xapple/seqenv.

  19. Seqenv: linking sequences to environments through text mining

    Directory of Open Access Journals (Sweden)

    Lucas Sinclair

    2016-12-01

    Full Text Available Understanding the distribution of taxa and associated traits across different environments is one of the central questions in microbial ecology. High-throughput sequencing (HTS) studies are presently generating huge volumes of data to address this biogeographical topic. However, these studies are often focused on specific environment types or processes leading to the production of individual, unconnected datasets. The large amounts of legacy sequence data with associated metadata that exist can be harnessed to better place the genetic information found in these surveys into a wider environmental context. Here we introduce a software program, seqenv, to carry out precisely such a task. It automatically performs similarity searches of short sequences against the “nt” nucleotide database provided by NCBI and, out of every hit, extracts (if it is available) the textual metadata field. After collecting all the isolation sources from all the search results, we run a text mining algorithm to identify and parse words that are associated with the Environmental Ontology (EnvO) controlled vocabulary. This, in turn, enables us to determine both in which environments individual sequences or taxa have previously been observed and, by weighted summation of those results, to summarize complete samples. We present two demonstrative applications of seqenv to a survey of ammonia oxidizing archaea as well as to a plankton paleome dataset from the Black Sea. These demonstrate the ability of the tool to reveal novel patterns in HTS and its utility in the fields of environmental source tracking, paleontology, and studies of microbial biogeography. To install seqenv, go to: https://github.com/xapple/seqenv.

  20. Automating the generation of lexical patterns for processing free text in clinical documents.

    Science.gov (United States)

    Meng, Frank; Morioka, Craig

    2015-09-01

    Many tasks in natural language processing utilize lexical pattern-matching techniques, including information extraction (IE), negation identification, and syntactic parsing. However, it is generally difficult to derive patterns that achieve acceptable levels of recall while also remaining highly precise. We present a multiple sequence alignment (MSA)-based technique that automatically generates patterns, thereby leveraging language usage to determine the context of words that influence a given target. MSAs capture the commonalities among word sequences and are able to reveal areas of linguistic stability and variation. In this way, MSAs provide a systemic approach to generating lexical patterns that are generalizable, which will both increase recall levels and maintain high levels of precision. The MSA-generated patterns exhibited consistent F1, F0.5, and F2 scores compared to two baseline techniques for IE across four different tasks. Both baseline techniques performed well for some tasks and less well for others, but MSA was found to consistently perform at a high level for all four tasks. The performance of MSA on the four extraction tasks indicates the method's versatility. The results show that the MSA-based patterns are able to handle the extraction of individual data elements as well as relations between two concepts without the need for large amounts of manual intervention. We presented an MSA-based framework for generating lexical patterns that showed consistently high levels of both performance and recall over four different extraction tasks when compared to baseline methods. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
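
    The three F-scores named in this abstract are all instances of the standard F-beta measure, which weights recall beta times as heavily as precision. The sketch below is only an illustration of that formula; the function name and the precision/recall values are assumptions chosen for the example and are not results from the paper.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only (hypothetical, not taken from the study).
precision, recall = 0.9, 0.6

# beta = 0.5 favours precision, beta = 1 is balanced, beta = 2 favours recall.
scores = {beta: f_beta(precision, recall, beta) for beta in (0.5, 1.0, 2.0)}

for beta, score in scores.items():
    print(f"F{beta:g} = {score:.3f}")
```

    With precision higher than recall, as in this made-up case, the F0.5 score comes out highest and the F2 score lowest, which is why reporting all three gives a fuller picture than F1 alone.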

  1. A Repeated Signal Difference for Recognising Patterns

    Directory of Open Access Journals (Sweden)

    Kieran Greer

    2016-08-01

    Full Text Available This paper describes a new mechanism that might help with defining pattern sequences, by the fact that it can produce an upper bound on the ensemble value that can persistently oscillate with the actual values produced from each pattern. With every firing event, a node also receives an on/off feedback switch. If the node fires then it sends a feedback result depending on the input signal strength. If the input signal is positive or larger, it can store an ‘on’ switch feedback for the next iteration. If the signal is negative or smaller it can store an ‘off’ switch feedback for the next iteration. If the node does not fire, then it does not affect the current feedback situation and receives the switch command produced by the last active pattern event for the same neuron. The upper bound therefore also represents the largest or most enclosing pattern set and the lower value is for the actual set of firing patterns. If the pattern sequence repeats, it will oscillate between the two values, allowing them to be recognised and measured more easily, over time. Tests show that changing the sequence ordering produces different value sets, which can also be measured.

  2. Analysis of synonymous codon usage patterns in the genus Rhizobium.

    Science.gov (United States)

    Wang, Xinxin; Wu, Liang; Zhou, Ping; Zhu, Shengfeng; An, Wei; Chen, Yu; Zhao, Lin

    2013-11-01

    The codon usage patterns of rhizobia have received increasing attention. However, little information is available regarding the conserved features of the codon usage patterns in a typical rhizobial genus. The codon usage patterns of six completely sequenced strains belonging to the genus Rhizobium were analysed as model rhizobia in the present study. The relative neutrality plot showed that selection pressure played a role in codon usage in the genus Rhizobium. Spearman's rank correlation analysis combined with correspondence analysis (COA) showed that the codon adaptation index and the effective number of codons (ENC) had strong correlation with the first axis of the COA, which indicated the important role of gene expression level and the ENC in the codon usage patterns in this genus. The relative synonymous codon usage of Cys codons had the strongest correlation with the second axis of the COA. Accordingly, the usage of Cys codons was another important factor that shaped the codon usage patterns in Rhizobium genomes and was a conserved feature of the genus. Moreover, the comparison of codon usage between highly and lowly expressed genes showed that 20 unique preferred codons were shared among Rhizobium genomes, revealing another conserved feature of the genus. This is the first report of the codon usage patterns in the genus Rhizobium.
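
    Relative synonymous codon usage (RSCU), mentioned above for the Cys codons, is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates that definition only; the function name, the two codon families included, and the toy sequence are assumptions for the example, not the authors' pipeline or data.

```python
from collections import Counter

# Two synonymous codon families from the standard genetic code.
# A full analysis would list every amino acid; only two are shown here.
FAMILIES = {
    "Cys": ["TGT", "TGC"],
    "Lys": ["AAA", "AAG"],
}


def rscu(cds: str) -> dict:
    """Return RSCU values for the codons in FAMILIES found in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent: RSCU is undefined, so skip it
        expected = total / len(family)  # count under equal usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Toy sequence (hypothetical): three TGC, one TGT, one AAA, one AAG.
print(rscu("TGCTGCTGCTGTAAAAAG"))
```

    An RSCU above 1 means a codon is used more often than expected under equal usage, and a value below 1 means it is used less often.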

  3. Molecular Cloning and Sequence Analysis of a Phenylalanine Ammonia-Lyase Gene from Dendrobium

    Science.gov (United States)

    Cai, Yongping; Lin, Yi

    2013-01-01

    In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL genes of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those in the PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum. PMID:23638048

  4. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Directory of Open Access Journals (Sweden)

    Ruben Pérez

    Full Text Available Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  5. Phylogenetic and Genome-Wide Deep-Sequencing Analyses of Canine Parvovirus Reveal Co-Infection with Field Variants and Emergence of a Recent Recombinant Strain

    Science.gov (United States)

    Pérez, Ruben; Calleros, Lucía; Marandino, Ana; Sarute, Nicolás; Iraola, Gregorio; Grecco, Sofia; Blanc, Hervé; Vignuzzi, Marco; Isakov, Ofer; Shomron, Noam; Carrau, Lucía; Hernández, Martín; Francia, Lourdes; Sosa, Katia; Tomás, Gonzalo; Panzera, Yanina

    2014-01-01

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity. PMID:25365348

  6. Fungi Sailing the Arctic Ocean: Speciose Communities in North Atlantic Driftwood as Revealed by High-Throughput Amplicon Sequencing.

    Science.gov (United States)

    Rämä, Teppo; Davey, Marie L; Nordén, Jenni; Halvorsen, Rune; Blaalid, Rakel; Mathiassen, Geir H; Alsos, Inger G; Kauserud, Håvard

    2016-08-01

    High amounts of driftwood sail across the oceans and provide habitat for organisms tolerating the rough and saline environment. Fungi have adapted to the extremely cold and saline conditions which driftwood faces in the high north. For the first time, we applied high-throughput sequencing to fungi residing in driftwood to reveal their taxonomic richness, community composition, and ecology in the North Atlantic. Using pyrosequencing of ITS2 amplicons obtained from 49 marine logs, we found 807 fungal operational taxonomic units (OTUs) based on clustering at 97 % sequence similarity cut-off level. The phylum Ascomycota comprised 74 % of the OTUs and 20 % belonged to Basidiomycota. The richness of basidiomycetes decreased with prolonged submersion in the sea, supporting the general view of ascomycetes being more extremotolerant. However, more than one fourth of the fungal OTUs remained unassigned to any fungal class, emphasising the need for better DNA reference data from the marine habitat. Different fungal communities were detected in coniferous and deciduous logs. Our results highlight that driftwood hosts a considerably higher fungal diversity than currently known. The driftwood fungal community is not a terrestrial relic but a speciose assemblage of fungi adapted to the stressful marine environment and different kinds of wooden substrates found in it.

  7. Genome-wide patterns of copy number variation in the diversified chicken genomes using next-generation sequencing.

    Science.gov (United States)

    Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning

    2014-11-07

    Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.
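
    The summary figures in this abstract can be cross-checked against one another with simple arithmetic. The sketch below uses only numbers quoted above; the variable names are illustrative, and the implied genome size and RefSeq gene total are back-calculated estimates rather than values stated by the authors.

```python
# Figures quoted in the abstract.
n_cnvr = 8840            # number of CNV regions
total_mb = 98.2          # total length covered by CNVRs, in Mb
genome_fraction = 0.094  # CNVRs represent 9.4% of the genome
cnvr_with_genes = 2214   # CNVRs spanning RefSeq genes
genes_in_cnvr = 2216     # RefSeq genes spanned by CNVRs
gene_fraction = 0.364    # reported as 36.4% of RefSeq genes

# Mean CNVR length in kb: total length divided by the number of regions.
mean_kb = total_mb * 1000 / n_cnvr

# Share of CNVRs that span at least one RefSeq gene.
cnvr_gene_share = cnvr_with_genes / n_cnvr

# Back-calculated (not stated in the abstract): assembly size and gene total.
implied_genome_mb = total_mb / genome_fraction
implied_refseq_genes = genes_in_cnvr / gene_fraction

print(f"mean CNVR length: {mean_kb:.1f} kb")
print(f"CNVRs spanning genes: {cnvr_gene_share:.1%}")
print(f"implied genome size: {implied_genome_mb:.0f} Mb")
print(f"implied RefSeq gene count: {implied_refseq_genes:.0f}")
```

    The mean length and the gene-spanning share reproduce the reported 11.1 kb and 25.0%, so the quoted totals are internally consistent.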

  8. Sequencing the extrachromosomal circular mobilome reveals retrotransposon activity in plants

    OpenAIRE

    Lanciano, Sophie; Carpentier, M. C.; Llauro, C.; Jobet, E.; Robakowska-Hyzorek, D.; Lasserre, E.; Ghesquière, Alain; Panaud, O.; Mirouze, Marie

    2017-01-01

    Retrotransposons are mobile genetic elements abundant in plant and animal genomes. While efficiently silenced by the epigenetic machinery, they can be reactivated upon stress or during development. Their level of transcription not reflecting their transposition ability, it is thus difficult to evaluate their contribution to the active mobilome. Here we applied a simple methodology based on the high throughput sequencing of extrachromosomal circular DNA (eccDNA) forms of active retrotransposon...

  9. Oligonucleotide array discovery of polymorphisms in cultivated tomato (Solanum lycopersicum L.) reveals patterns of SNP variation associated with breeding

    Directory of Open Access Journals (Sweden)

    Zhu Tong

    2009-10-01

    Full Text Available Abstract Background Cultivated tomato (Solanum lycopersicum L.) has narrow genetic diversity that makes it difficult to identify polymorphisms between elite germplasm. We explored array-based single feature polymorphism (SFP) discovery as a high-throughput approach for marker development in cultivated tomato. Results Three varieties, FL7600 (fresh-market), OH9242 (processing), and PI114490 (cherry), were used as a source of genomic DNA for hybridization to oligonucleotide arrays. Identification of SFPs was based on outlier detection using regression analysis of normalized hybridization data within a probe set for each gene. A subset of 189 putative SFPs was sequenced for validation. The rate of validation depended on the desired level of significance (α) used to define the confidence interval (CI), and ranged from 76% for polymorphisms identified at α ≤ 10⁻⁶ to 60% for those identified at α ≤ 10⁻². Validation percentage reached a plateau between α ≤ 10⁻⁴ and α ≤ 10⁻⁷, but failure to identify known SFPs (Type II error) increased dramatically at α ≤ 10⁻⁶. Through sequence validation, we identified 279 SNPs and 27 InDels in 111 loci. Sixty loci contained ≥ 2 SNPs per locus. We used a subset of validated SNPs for genetic diversity analysis of 92 tomato varieties and accessions. Pairwise estimation of θ (Fst) suggested significant differentiation between collections of fresh-market, processing, vintage, Latin American (landrace), and S. pimpinellifolium accessions. The fresh-market and processing groups displayed high genetic diversity relative to vintage and landrace groups. Furthermore, the patterns of SNP variation indicated that domestication and early breeding practices have led to progressive genetic bottlenecks while modern breeding practices have reintroduced genetic variation into the crop from wild species. Finally, we examined the ratio of non-synonymous (Ka) to synonymous substitutions (Ks) for 20 loci with multiple SNPs (≥ 4 per

  10. Private selective sweeps identified from next-generation pool-sequencing reveal convergent pathways under selection in two inbred Schistosoma mansoni strains.

    Directory of Open Access Journals (Sweden)

    Julie A J Clément

    Full Text Available BACKGROUND: The trematode flatworms of the genus Schistosoma, the causative agents of schistosomiasis, are among the most prevalent parasites in humans, affecting more than 200 million people worldwide. In this study, we focused on two well-characterized strains of S. mansoni, to explore signatures of selection. Both strains are highly inbred and exhibit differences in life history traits, in particular in their compatibility with the intermediate host Biomphalaria glabrata. METHODOLOGY/PRINCIPAL FINDINGS: We performed high throughput sequencing of DNA from pools of individuals of each strain using Illumina technology and identified single nucleotide polymorphisms (SNP) and copy number variations (CNV). In total, 708,898 SNPs were identified and roughly 2,000 CNVs. The SNPs revealed low nucleotide diversity (π = 2 × 10⁻⁴) within each strain and a high differentiation level (Fst = 0.73) between them. Based on a recently developed in-silico approach, we further detected 12 and 19 private (i.e. specific non-overlapping) selective sweeps among the 121 and 151 sweeps found in total for each strain. CONCLUSIONS/SIGNIFICANCE: Functional annotation of transcripts lying in the private selective sweeps revealed specific selection for functions related to parasitic interaction (e.g. cell-cell adhesion or redox reactions). Despite high differentiation between strains, we identified evolutionary convergence of genes related to proteolysis, known as a key virulence factor and a potential target of drug and vaccine development. Our data show that pool-sequencing can be used for the detection of selective sweeps in parasite populations and enables one to identify biological functions under selection.

  11. Deep sequencing of the Mexican avocado transcriptome, an ancient angiosperm with a high content of fatty acids.

    Science.gov (United States)

    Ibarra-Laclette, Enrique; Méndez-Bravo, Alfonso; Pérez-Torres, Claudia Anahí; Albert, Victor A; Mockaitis, Keithanne; Kilaru, Aruna; López-Gómez, Rodolfo; Cervantes-Luevano, Jacob Israel; Herrera-Estrella, Luis

    2015-08-13

    Avocado (Persea americana) is an economically important tropical fruit considered to be a good source of fatty acids. Despite its importance, the molecular and cellular characterization of biochemical and developmental processes in avocado is limited due to the lack of transcriptome and genomic information. The transcriptomes of seeds, roots, stems, leaves, aerial buds and flowers were determined using different sequencing platforms. Additionally, the transcriptomes of three different stages of fruit ripening (pre-climacteric, climacteric and post-climacteric) were also analyzed. The analysis of the RNA-seq atlas presented here reveals strong differences in gene expression patterns between different organs, especially between root and flower, but also reveals similarities among the gene expression patterns in other organs, such as stem, leaves and aerial buds (vegetative organs) or seed and fruit (storage organs). Important regulators, functional categories, and differentially expressed genes involved in avocado fruit ripening were identified. Additionally, to demonstrate the utility of the avocado gene expression atlas, we investigated the expression patterns of genes implicated in fatty acid metabolism and fruit ripening. A description of transcriptomic changes occurring during fruit ripening was obtained in Mexican avocado, contributing to a dynamic view of the expression patterns of genes involved in fatty acid biosynthesis and the fruit ripening process.

  12. Evolution in the lineament patterns associated to strong earthquakes revealed by satellite observations

    Science.gov (United States)

    Soto-Pinto, C. A.; Arellano-Baeza, A. A.; Ouzounov, D. P.

    2011-12-01

    We study the temporal evolution of the stress patterns in the crust by using high-resolution (10-300 m) satellite images from MODIS and ASTER satellite sensors. We are able to detect some changes in density and orientation of lineaments preceding earthquake events. A lineament is generally defined as a straight or a somewhat curved feature in the landscape visible in a satellite image as an aligned sequence of pixels of a contrasting intensity compared to the background. The system of lineaments extracted from the satellite images is not identical to the geological lineaments; nevertheless, it generally reflects the structure of the faults and fractures in the Earth's crust. Our analysis has shown that the system of lineaments is very dynamic: a significant number of lineaments appeared approximately one month before an earthquake, while one month after the earthquake the lineament configuration returned to its initial state. These features were not observed in the test areas that are free of any seismic activity in that period (null hypothesis). We have designed a computational prototype capable of detecting lineament evolution and of utilizing both ASTER and MODIS satellite L1/L2. We will demonstrate the first successful test results for several Mw > 5 earthquakes in Chile, Peru, China, and California (USA).

  13. A second life for old data: Global patterns in pollution ecology revealed from published observational studies

    Energy Technology Data Exchange (ETDEWEB)

    Kozlov, Mikhail V., E-mail: mikoz@utu.fi [Section of Ecology, University of Turku, 20014 Turku (Finland)]; Zvereva, Elena L. [Section of Ecology, University of Turku, 20014 Turku (Finland)]

    2011-05-15

    A synthesis of research on the responses of terrestrial biota (1095 effect sizes) to industrial pollution (206 point emission sources) was conducted to reveal regional and global patterns from small-scale observational studies. A meta-analysis, in combination with other statistical methods, showed that the effects of pollution depend on characteristics of the specific polluter (type, amount of emission, duration of impact on biota), the affected organism (trophic group, life history), the level at which the response was measured (organism, population, community), and the environment (biome, climate). In spite of high heterogeneity in responses, we have detected several general patterns. We suggest that the development of evolutionary adaptations to pollution is a common phenomenon and that the harmful effects of pollution on terrestrial ecosystems are likely to increase as the climate warms. We argue that community- and ecosystem-level responses to pollution should be explored directly, rather than deduced from organism-level studies. - Research synthesis demonstrated that the harmful effects of pollution on terrestrial ecosystems are likely to increase as the climate warms.

  14. A second life for old data: Global patterns in pollution ecology revealed from published observational studies

    International Nuclear Information System (INIS)

    Kozlov, Mikhail V.; Zvereva, Elena L.

    2011-01-01

    A synthesis of research on the responses of terrestrial biota (1095 effect sizes) to industrial pollution (206 point emission sources) was conducted to reveal regional and global patterns from small-scale observational studies. A meta-analysis, in combination with other statistical methods, showed that the effects of pollution depend on characteristics of the specific polluter (type, amount of emission, duration of impact on biota), the affected organism (trophic group, life history), the level at which the response was measured (organism, population, community), and the environment (biome, climate). In spite of high heterogeneity in responses, we have detected several general patterns. We suggest that the development of evolutionary adaptations to pollution is a common phenomenon and that the harmful effects of pollution on terrestrial ecosystems are likely to increase as the climate warms. We argue that community- and ecosystem-level responses to pollution should be explored directly, rather than deduced from organism-level studies. - Research synthesis demonstrated that the harmful effects of pollution on terrestrial ecosystems are likely to increase as the climate warms.

  15. Sensitivity to structure in action sequences: An infant event-related potential study.

    Science.gov (United States)

    Monroy, Claire D; Gerson, Sarah A; Domínguez-Martínez, Estefanía; Kaduk, Katharina; Hunnius, Sabine; Reid, Vincent

    2017-05-06

    Infants are sensitive to structure and patterns within continuous streams of sensory input. This sensitivity relies on statistical learning, the ability to detect predictable regularities in spatial and temporal sequences. Recent evidence has shown that infants can detect statistical regularities in action sequences they observe, but little is known about the neural processes that give rise to this ability. In the current experiment, we combined electroencephalography (EEG) with eye-tracking to identify electrophysiological markers that indicate whether 8-11-month-old infants detect violations to learned regularities in action sequences, and to relate these markers to behavioral measures of anticipation during learning. In a learning phase, infants observed an actor performing a sequence featuring two deterministic pairs embedded within an otherwise random sequence. Thus, the first action of each pair was predictive of what would occur next. One of the pairs caused an action-effect, whereas the second did not. In a subsequent test phase, infants observed another sequence that included deviant pairs, violating the previously observed action pairs. Event-related potential (ERP) responses were analyzed and compared between the deviant and the original action pairs. Findings reveal that infants demonstrated a greater Negative central (Nc) ERP response to the deviant actions for the pair that caused the action-effect, which was consistent with their visual anticipations during the learning phase. Findings are discussed in terms of the neural and behavioral processes underlying perception and learning of structured action sequences. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Frequent Pattern Mining of Eye-Tracking Records Partitioned into Cognitive Chunks

    Directory of Open Access Journals (Sweden)

    Noriyuki Matsuda

    2014-01-01

    Assuming that scenes would be visually scanned by chunking information, we partitioned fixation sequences of web page viewers into chunks using isolated gaze point(s) as the delimiter. Fixations were coded in terms of the segments in a 5×5 mesh imposed on the screen. The identified chunks were mostly short, consisting of one or two fixations. These were analyzed with respect to the within- and between-chunk distances in the overall records and the patterns (i.e., subsequences) frequently shared among the records. Although the two types of distances were both dominated by zero- and one-block shifts, the primacy of the modal shifts was less prominent between chunks than within them. The lower primacy was compensated by the longer shifts. The patterns frequently extracted at three threshold levels were mostly simple, consisting of one or two chunks. The patterns revealed interesting properties as to segment differentiation and the directionality of the attentional shifts.
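
The mesh coding and block-shift distances described in this abstract can be illustrated with a minimal sketch. The function names, the pixel-based screen coordinates, and the use of Chebyshev distance as the "block shift" metric are all assumptions made for illustration; the abstract does not specify them, and the rule for identifying isolated gaze points is not reproduced here.

```python
def mesh_cell(x, y, width, height, n=5):
    """Map a gaze point (pixels) to a (row, col) segment of an n x n mesh.

    Points on the right or bottom edge are clamped into the last segment.
    """
    col = min(int(x / width * n), n - 1)
    row = min(int(y / height * n), n - 1)
    return row, col


def block_shift(a, b):
    """Distance in mesh blocks between two segments.

    Chebyshev distance is an assumed metric, not one stated in the abstract.
    """
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def shifts(fixations, width, height, n=5):
    """Block shifts between consecutive fixations given as (x, y) pairs."""
    cells = [mesh_cell(x, y, width, height, n) for x, y in fixations]
    return [block_shift(a, b) for a, b in zip(cells, cells[1:])]
```

With such a coding, a zero-block shift means two consecutive fixations fell in the same segment, and a one-block shift means they fell in neighbouring segments.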

  17. Sequence diversity of the leukotoxin (lktA) gene in caprine and ovine strains of Mannheimia haemolytica.

    Science.gov (United States)

    Vougidou, C; Sandalakis, V; Psaroulaki, A; Petridou, E; Ekateriniadou, L

    2013-04-20

    Mannheimia haemolytica is the aetiological agent of pneumonic pasteurellosis in small ruminants. The primary virulence factor of the bacterium is a leukotoxin (LktA), which induces apoptosis in susceptible cells via mitochondrial targeting. It has been previously shown that certain lktA alleles are associated with either cattle or sheep. The objective of the present study was to investigate lktA sequence variation among ovine and caprine M. haemolytica strains isolated from pneumonic lungs, revealing any potential adaptation for the caprine host, for which no data are available. Furthermore, we investigated amino acid variation in the N-terminal part of the sequences and its effect on targeting mitochondria. Data analysis showed that the prevalent caprine genotype differed at a single non-synonymous site from a previously described uncommon bovine allele, whereas the ovine sequences represented new, distinct alleles. N-terminal sequence differences did not affect the mitochondrial targeting ability of the isolates; interestingly, in one case, mitochondrial matrix targeting was indicated rather than membrane association, suggesting an alternative LktA trafficking pattern.

  18. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    Science.gov (United States)

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  19. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    Directory of Open Access Journals (Sweden)

    Kacy L Gordon

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  20. A structural study for the optimisation of functional motifs encoded in protein sequences

    Directory of Open Access Journals (Sweden)

    Helmer-Citterich Manuela

    2004-04-01

    Abstract Background A large number of PROSITE patterns select false positives and/or miss known true positives. It is possible that, at least in some cases, the weak specificity and/or sensitivity of a pattern is due to the fact that one, or maybe more, functional and/or structural key residues are not represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed in a standard pattern construction procedure. Results Here we present a new procedure aimed at improving the sensitivity and/or specificity of poorly-performing patterns. The procedure can be summarised as follows: 1. residues structurally conserved in different proteins, that are true positives for a pattern, are identified by means of a computational technique and by visual inspection. 2. the sequence positions of the structurally conserved residues falling outside the pattern are used to build extended sequence patterns. 3. the extended patterns are optimised on the SWISS-PROT database for their sensitivity and specificity. The method was applied to eight PROSITE patterns. Whenever structurally conserved residues are found in the surface region close to the pattern (seven out of eight cases), the addition of information inferred from structural analysis is shown to improve pattern selectivity and in some cases selectivity and sensitivity as well. In some of the cases considered the procedure allowed the identification of functionally interesting residues, whose biological role is also discussed. Conclusion Our method can be applied to any type of functional motif or pattern (not only PROSITE ones) which is not able to select all and only the true positive hits and for which at least two true positive structures are available. The computational technique for the identification of

  1. Quantifying relief on alluvial fans using airborne lidar to reveal patterns of sediment accumulation

    Science.gov (United States)

    Morelan, A. E., III; Oskin, M. E.

    2017-12-01

    We present a method of quantifying detailed surface relief on alluvial fans from high-resolution topography. Average slope and curvature of the fan are used together to empirically derive an idealized, radially symmetric fan surface, from which we compute residual topography. Maps produced using this technique highlight spatial patterns of fan deposition and avulsion. Regions of high residual topography reveal active and abandoned sediment lobes accumulated from recent depositional events, often with well-defined channels at their apex. Preliminary observations suggest that surface relief is uniform across a collection of fans in a given region and source lithology. Alluvial fans with granitic catchment lithologies in eastern California (n=12), each with varying source catchment size and mean fan slope, all show relief of around 4 meters. A collection of fans from the Carrizo Plain in central California (n=12), with source catchments set within Miocene marine and nonmarine sedimentary rocks, show significantly lower relief values around 2 meters. We hypothesize that particle grain size determines this contrasting relief through its control on the thickness of fan-building debris flows. In both settings we find that sediment lobes tend to extend toward the fan toe. This pattern supports a process, observed in analog experiments, of fan deposition dominated by back-filling and overtopping of distributary channels by debris-flows.
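
As a rough illustration of the residual-topography idea in this abstract, the sketch below subtracts an idealized, radially symmetric surface from observed elevations. The quadratic radial profile, the function names, and the assumption that the apex position, average slope, and average curvature are already known are all hypothetical choices made here; the abstract does not give the authors' actual formulation.

```python
import math


def ideal_fan_elevation(x, y, apex_xy, apex_z, slope, curvature):
    """Elevation of an assumed radially symmetric fan surface at (x, y).

    The profile z(r) = apex_z - slope * r + 0.5 * curvature * r**2 is an
    illustrative assumption, with r the horizontal distance from the apex.
    """
    r = math.hypot(x - apex_xy[0], y - apex_xy[1])
    return apex_z - slope * r + 0.5 * curvature * r * r


def residual_topography(points, apex_xy, apex_z, slope, curvature):
    """Observed minus idealized elevation for each (x, y, z) point."""
    return [
        z - ideal_fan_elevation(x, y, apex_xy, apex_z, slope, curvature)
        for x, y, z in points
    ]
```

Positive residuals would then mark ground standing above the idealized surface, such as sediment lobes, and negative residuals mark ground below it, such as incised channels.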

  2. Assessment of small RNA sorting into different extracellular fractions revealed by high-throughput sequencing of breast cell lines

    Science.gov (United States)

    Tosar, Juan Pablo; Gámbaro, Fabiana; Sanguinetti, Julia; Bonilla, Braulio; Witwer, Kenneth W.; Cayota, Alfonso

    2015-01-01

    Intercellular communication can be mediated by extracellular small regulatory RNAs (sRNAs). Circulating sRNAs are being intensively studied for their promising use as minimally invasive disease biomarkers. To date, most attention is centered on exosomes and microRNAs as the vectors and the secreted species, respectively. However, this field would benefit from an increased understanding of the plethora of sRNAs secreted by different cell types in different extracellular fractions. It is still not clear if specific sRNAs are selected for secretion, or if sRNA secretion is mostly passive. We sequenced the intracellular sRNA content (19–60 nt) of breast epithelial cell lines (MCF-7 and MCF-10A) and compared it with extracellular fractions enriched in microvesicles, exosomes and ribonucleoprotein complexes. Our results are consistent with a non-selective secretion model for most microRNAs, although a few showed secretion patterns consistent with preferential secretion. On the contrary, 5′ tRNA halves and 5′ RNA Y4-derived fragments of 31–33 were greatly and significantly enriched in the extracellular space (even in non-mammary cell lines), where tRNA halves were detected as part of ∼45 kDa ribonucleoprotein complexes. Overall, we show that different sRNA families have characteristic secretion patterns and open the question of the role of these sRNAs in the extracellular space. PMID:25940616

  3. Sequence and expression pattern of a novel human orphan G-protein-coupled receptor, GPRC5B, a family C receptor with a short amino-terminal domain

    DEFF Research Database (Denmark)

    Bräuner-Osborne, Hans; Krogsgaard-Larsen, P

    2000-01-01

    Query of GenBank with the amino acid sequence of human metabotropic glutamate receptor subtype 2 (mGluR2) identified a predicted gene product of unknown function on BAC clone CIT987SK-A-69G12 (located on chromosome band 16p12) as a homologous protein. The transcript, entitled GPRC5B, was cloned f...... from an expressed sequence tag clone that contained the entire open reading frame of the transcript encoding a protein of 395 amino acids. Analysis of the protein sequence reveals that GPRC5B contains a signal peptide and seven transmembrane alpha-helices, which is a hallmark of G...

  4. Physiological pattern of lumbar disc height

    International Nuclear Information System (INIS)

    Biggemann, M.; Frobin, W.; Brinckmann, P.

    1997-01-01

    The purpose of this study is to present a new method of objectively quantifying the height of all discs in lateral radiographs of the lumbar spine and of analysing the normal craniocaudal sequence pattern of lumbar disc heights. Methods: The new parameter is the ventrally measured disc height corrected for the dependence on the angle of lordosis by normalisation to mean angles observed in the erect posture of healthy persons. To eliminate radiographic magnification, the corrected ventral height is related to the mean depth of the cranially adjoining vertebra. In this manner lumbar disc heights were objectively measured in young, mature and healthy persons (146 males and 65 females). The craniocaudal sequence pattern was analysed by mean values from all persons and by height differences of adjoining discs in each individual lumbar spine. Results: Mean normative values demonstrated an increase in disc height between L1/L2 and L4/L5 and a constant or decreasing disc height between L4/L5 and L5/S1. However, this 'physiological sequence of disc height in the statistical mean' was observed in only 36% of normal males and 55% of normal females. Conclusion: The radiological pattern of the 'physiological sequence of lumbar disc height' leads to a relevant portion of false positive pathological results, especially at L4/L5. An increase of disc height from L4/L5 to L5/S1 may be normal. The recognition of decreased disc height should be based on an abrupt change in the heights of adjoining discs and not on a deviation from a craniocaudal sequence pattern. (orig.)
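
The magnification-independent measure described in this abstract amounts to a ratio of two lengths taken from the same radiograph. A minimal sketch follows, with hypothetical function names; the lordosis-angle correction is treated as already applied, because the abstract does not give its formula.

```python
def relative_disc_height(corrected_ventral_height, mean_vertebral_depth):
    """Angle-corrected ventral disc height relative to the mean depth of the
    cranially adjoining vertebra.

    Both lengths come from the same image, so a common radiographic
    magnification factor cancels in the ratio.
    """
    return corrected_ventral_height / mean_vertebral_depth


def adjoining_differences(relative_heights):
    """Height change from each disc to the next caudal disc.

    `relative_heights` is ordered craniocaudally (e.g. L1/L2 first, L5/S1 last).
    """
    return [b - a for a, b in zip(relative_heights, relative_heights[1:])]
```

For example, a disc measured as 10 mm over a 32 mm vertebral depth and the same disc imaged at 1.2× magnification (12 mm over 38.4 mm) give the same relative height, which is what makes values comparable across radiographs.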

  5. Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica

    Energy Technology Data Exchange (ETDEWEB)

    Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar; Singan, Vasanth; Lapidus, Alla; Fenical, William; Jensen, Paul R.; Moore, Bradley S.

    2007-05-01

    Recent fermentation studies have identified actinomycetes of the marine-dwelling genus Salinispora as prolific natural product producers. To further evaluate their biosynthetic potential, we analyzed all identifiable secondary natural product gene clusters from the recently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Our analysis shows that biosynthetic potential meets or exceeds that shown by previous Streptomyces genome sequences as well as other natural product-producing actinomycetes. The S. tropica genome features nine polyketide synthase systems of every known formally classified family, non-ribosomal peptide synthetases and several hybrid clusters. While a few clusters appear to encode molecules previously identified in Streptomyces species, the majority of the 15 biosynthetic loci are novel. Specific chemical information about putative and observed natural product molecules is presented and discussed. In addition, our bioinformatic analysis was critical for the structure elucidation of the novel polyene macrolactam salinilactam A. This study demonstrates the potential for genomic analysis to complement and strengthen traditional natural product isolation studies and firmly establishes the genus Salinispora as a rich source of novel drug-like molecules.

  6. Variation in extragenic repetitive DNA sequences in Pseudomonas syringae and potential use of modified REP primers in the identification of closely related isolates

    Directory of Open Access Journals (Sweden)

    Elif Çepni

    2012-01-01

    In this study, Pseudomonas syringae pathovars isolated from olive, tomato and bean were identified by species-specific PCR and their genetic diversity was assessed by repetitive extragenic palindromic (REP)-PCR. Reverse universal primers for REP-PCR were designed by using the bases of A, T, G or C at the positions of 1, 4 and 11 to identify additional polymorphism in the banding patterns. Binding of the primers to different annealing sites in the genome revealed additional fingerprint patterns in eight isolates of P. savastanoi pv. savastanoi and two isolates of P. syringae pv. tomato. The use of four different bases in the primer sequences did not affect the PCR reproducibility and was very efficient in revealing intra-pathovar diversity, particularly in P. savastanoi pv. savastanoi. At the pathovar level, the primer BOX1AR yielded shared fragments, in addition to five bands that discriminated among the pathovars P. syringae pv. phaseolicola, P. savastanoi pv. savastanoi and P. syringae pv. tomato. REP-PCR with a modified primer containing C produced identical bands among the isolates in a pathovar but separated three pathovars more distinctly than four other primers. Although REP- and BOX-PCRs have been successfully used in the molecular identification of Pseudomonas isolates from Turkish flora, a PCR based on inter-enterobacterial repetitive intergenic consensus (ERIC) sequences failed to produce clear banding patterns in this study.

  7. Characterization of promoter sequence of toll-like receptor genes in Vechur cattle

    Directory of Open Access Journals (Sweden)

    R. Lakshmi

    2016-06-01

    Aim: To analyze the promoter sequences of toll-like receptor (TLR) genes in Vechur cattle, an indigenous breed of Kerala, against the sequence of Bos taurus and assess the differences that could be attributed to innate immune responses against bovine mastitis. Materials and Methods: Blood samples were collected from the jugular vein of Vechur cattle, maintained at the Vechur cattle conservation center of Kerala Veterinary and Animal Sciences University, using an acid-citrate-dextrose anticoagulant. The genomic DNA was extracted, and polymerase chain reaction was carried out to amplify the promoter region of TLRs. The amplified products of the TLR2, 4, and 9 promoter regions were sequenced by the Sanger enzymatic DNA sequencing technique. Results: The promoter region of TLR2 of Vechur cattle showed 98% similarity to the B. taurus sequence present in GenBank and revealed variants for four sequence motifs. The promoter region of TLR4 of Vechur cattle showed 99% similarity to the B. taurus sequence but did not reveal significant variants in motif regions. However, two heterozygous loci were observed from the chromatogram. The promoter sequence of the TLR9 gene also showed 99% similarity to the B. taurus sequence and revealed variants for four sequence motifs. Conclusion: The results of this study indicate significant variation in the promoters of the TLR2 and TLR9 genes in the Vechur cattle breed, which may potentially influence the innate immune response against mastitis.

  8. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

    2013-01-01

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  9. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generates a total of 79,833 genes of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  10. Whole Exome Sequencing for a Patient with Rubinstein-Taybi Syndrome Reveals de Novo Variants besides an Overt CREBBP Mutation

    Directory of Open Access Journals (Sweden)

    Hee Jeong Yoo

    2015-03-01

    Rubinstein-Taybi syndrome (RSTS) is a rare condition with a prevalence of 1 in 125,000–720,000 births and characterized by clinical features that include facial, dental, and limb dysmorphology and growth retardation. Most cases of RSTS occur sporadically and are caused by de novo mutations. Cytogenetic or molecular abnormalities are detected in only 55% of RSTS cases. Previous genetic studies have yielded inconsistent results due to the variety of methods used for genetic analysis. The purpose of this study was to use whole exome sequencing (WES) to evaluate the genetic causes of RSTS in a young girl presenting with an Autism phenotype. We used the Autism diagnostic observation schedule (ADOS) and Autism diagnostic interview revised (ADI-R) to confirm her diagnosis of Autism. In addition, various questionnaires were used to evaluate other psychiatric features. We used WES to analyze the DNA sequences of the patient and her parents and to search for de novo variants. The patient showed all the typical features of Autism. WES revealed a de novo frameshift mutation in CREBBP and de novo sequence variants in TNC and IGFALS genes. Mutations in the CREBBP gene have been extensively reported in RSTS patients, while potential missense mutations in TNC and IGFALS genes have not previously been associated with RSTS. The TNC and IGFALS genes are involved in central nervous system development and growth. It is possible for patients with RSTS to have additional de novo variants that could account for previously unexplained phenotypes.
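
The trio-based search for de novo variants mentioned in this abstract can be pictured as a set difference: variants called in the child but in neither parent. The sketch below is only an illustration under that simplifying assumption; the function name, the tuple layout, and the coordinates in the example are hypothetical, and real pipelines also weigh genotype quality and read depth before calling a variant de novo.

```python
def candidate_de_novo(child, mother, father):
    """Return variants present in the child but absent from both parents.

    Each variant is a (chromosome, position, ref, alt) tuple; the inputs
    are sets of such tuples. The result is sorted for stable output.
    """
    return sorted(child - (mother | father))


# Hypothetical example calls (placeholder coordinates, not real data).
child_calls = {("chr16", 100, "A", "T"), ("chr9", 200, "G", "GA"), ("chr1", 300, "C", "G")}
mother_calls = {("chr1", 300, "C", "G")}
father_calls = {("chr16", 100, "A", "T")}
```

Calling `candidate_de_novo(child_calls, mother_calls, father_calls)` leaves only the variant seen in neither parent.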

  11. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  12. Ecstasy and new patterns of drug use: a normal population study.

    Science.gov (United States)

    Pedersen, W; Skrondal, A

    1999-11-01

    (i) To describe illegal drug use patterns in an adolescent normal population sample with special emphasis on MDMA, ecstasy; (ii) to investigate where ecstasy is introduced in a hypothesized drug use sequence, and (iii) to contrast the predictors of ecstasy use with those of other illegal substances. Special attention was given to the relationship to subcultural music preferences and house-party-going. A school-based survey of the total cohort of adolescents enrolled in the school system in a city. 10,812 adolescents, age 14-17 years, response rate 94.3%. Oslo, the capital and only metropolitan town in Norway. Social class was measured by the occupation standard ISCO 88, questions were posed as regards frequency of alcohol use and alcohol intoxication, cigarette smoking and use of cannabis, amphetamines, ecstasy and heroin. Alcohol problems were measured by a shortened version of Rutgers Alcohol Problem Index (RAPI), conduct problems were measured according to the four categories of acts forming the basis of the diagnosis conduct disorder in DSM-IV, internalizing mental health problems were measured using items from Hopkins Symptoms Checklist (HCL). A number of questions were asked as regards subcultural music preferences and house-party-going. STATISTICAL MODELS: A hypothesized cumulative sequence in drug use was investigated by means of latent class analysis, and the predictors of the various patterns of drug use were estimated and compared by means of multinomial logistic regression analysis. The use of ecstasy was often intermingled with the use of cannabis, amphetamines and heroin, in a pattern of polydrug use. The latent class analysis revealed the following drug use sequence: (1) alcohol, (2) cigarettes, (3) cannabis, (4) amphetamines, (5) ecstasy and (6) heroin. There was no significant association between ecstasy use and parental social class or residential area of the town. All patterns of illegal drug use were highly associated with cigarette smoking

  13. Evidence for Sequence Scrambling and Divergent H/D Exchange Reactions of Doubly-Charged Isobaric b-Type Fragment Ions

    Science.gov (United States)

    Zekavat, Behrooz; Miladi, Mahsan; Al-Fdeilat, Abdullah H.; Somogyi, Arpad; Solouki, Touradj

    2014-02-01

    To date, only a limited number of reports are available on structural variants of multiply-charged b-fragment ions. We report on observed bimodal gas-phase hydrogen/deuterium exchange (HDX) reaction kinetics and patterns for substance P b10 2+ that point to presence of isomeric structures. We also compare HDX reactions, post-ion mobility/collision-induced dissociation (post-IM/CID), and sustained off-resonance irradiation-collision induced dissociation (SORI-CID) of substance P b10 2+ and a cyclic peptide with an identical amino acid (AA) sequence order to substance P b10. The observed HDX patterns and reaction kinetics and SORI-CID pattern for the doubly charged head-to-tail cyclized peptide were different from either of the presumed isomers of substance P b10 2+, suggesting that b10 2+ may not exist exclusively as a head-to-tail cyclized structure. Ultra-high mass measurement accuracy was used to assign identities of the observed SORI-CID fragment ions of substance P b10 2+; over 30 % of the observed SORI-CID fragment ions from substance P b10 2+ had rearranged (scrambled) AA sequences. Moreover, post-IM/CID experiments revealed the presence of two conformer types for substance P b10 2+, whereas only one conformer type was observed for the head-to-tail cyclized peptide. We also show that AA sequence scrambling from CID of doubly-charged b-fragment ions is not unique to substance P b10 2+.

  14. Evidence for sequence scrambling and divergent H/D exchange reactions of doubly-charged isobaric b-type fragment ions.

    Science.gov (United States)

    Zekavat, Behrooz; Miladi, Mahsan; Al-Fdeilat, Abdullah H; Somogyi, Arpad; Solouki, Touradj

    2014-02-01

    To date, only a limited number of reports are available on structural variants of multiply-charged b-fragment ions. We report on observed bimodal gas-phase hydrogen/deuterium exchange (HDX) reaction kinetics and patterns for substance P b10(2+) that point to presence of isomeric structures. We also compare HDX reactions, post-ion mobility/collision-induced dissociation (post-IM/CID), and sustained off-resonance irradiation-collision induced dissociation (SORI-CID) of substance P b10(2+) and a cyclic peptide with an identical amino acid (AA) sequence order to substance P b10. The observed HDX patterns and reaction kinetics and SORI-CID pattern for the doubly charged head-to-tail cyclized peptide were different from either of the presumed isomers of substance P b10(2+), suggesting that b10(2+) may not exist exclusively as a head-to-tail cyclized structure. Ultra-high mass measurement accuracy was used to assign identities of the observed SORI-CID fragment ions of substance P b10(2+); over 30% of the observed SORI-CID fragment ions from substance P b10(2+) had rearranged (scrambled) AA sequences. Moreover, post-IM/CID experiments revealed the presence of two conformer types for substance P b10(2+), whereas only one conformer type was observed for the head-to-tail cyclized peptide. We also show that AA sequence scrambling from CID of doubly-charged b-fragment ions is not unique to substance P b10(2+).

  15. 3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

    Science.gov (United States)

    Goldfarb, Katherine C; Cech, Thomas R

    2013-09-21

    Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.

  16. Permutation Entropy for Random Binary Sequences

    Directory of Open Access Journals (Sweden)

    Lingfeng Liu

    2015-12-01

    In this paper, we generalize the permutation entropy (PE) measure to binary sequences, which is based on Shannon’s entropy, and theoretically analyze this measure for random binary sequences. We deduce the theoretical value of PE for random binary sequences, which can be used to measure the randomness of binary sequences. We also reveal the relationship between this PE measure and other randomness measures, such as Shannon’s entropy and Lempel–Ziv complexity. The results show that PE is consistent with these two measures. Furthermore, we use PE as one of the randomness measures to evaluate the randomness of chaotic binary sequences.
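
    For orientation, the conventional (ordinal-pattern) permutation entropy can be sketched as below. This is a minimal illustration, not the generalized measure of the paper, whose exact definition is not given in this abstract. The function name `permutation_entropy`, the `order` parameter, and the rule of breaking ties by position are assumptions made for the example. With tie-breaking by position, a binary sequence cannot produce every ordinal pattern, which is presumably part of why a dedicated generalization is needed.

```python
import math
from collections import Counter


def permutation_entropy(seq, order=3):
    """Shannon entropy (in bits) of the ordinal patterns of length `order`.

    Hypothetical helper for illustration: ties inside a window are broken
    by position, which is an assumption, not the paper's definition.
    """
    if order < 2 or len(seq) < order:
        raise ValueError("need order >= 2 and len(seq) >= order")

    patterns = Counter()
    for i in range(len(seq) - order + 1):
        window = seq[i:i + order]
        # Ordinal pattern: indices of the window sorted by value.
        # sorted() is stable, so equal values keep their original order.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    total = sum(patterns.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in patterns.values())
    return abs(entropy)  # abs() only turns -0.0 into 0.0


# A constant sequence has a single pattern, hence zero entropy.
print(permutation_entropy([1, 1, 1, 1, 1], order=2))
# An alternating sequence visits two patterns equally often (1 bit).
print(permutation_entropy([0, 1, 0, 1, 0], order=2))
```

    Dividing the result by `math.log2(math.factorial(order))` gives the usual normalized form, although for binary input the maximum of 1 is not attainable under this tie rule.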

  17. Genes but not genomes reveal bacterial domestication of Lactococcus lactis.

    Directory of Open Access Journals (Sweden)

    Delphine Passerini

    BACKGROUND: The population structure and diversity of Lactococcus lactis subsp. lactis, a major industrial bacterium involved in milk fermentation, was determined at both gene and genome level. Seventy-six lactococcal isolates of various origins were studied by different genotyping methods and thirty-six strains displaying unique macrorestriction fingerprints were analyzed by a new multilocus sequence typing (MLST) scheme. This gene-based analysis was compared to genomic characteristics determined by pulsed-field gel electrophoresis (PFGE). METHODOLOGY/PRINCIPAL FINDINGS: The MLST analysis revealed that L. lactis subsp. lactis is essentially clonal with infrequent intra- and intergenic recombination; also, despite its taxonomical classification as a subspecies, it displays a genetic diversity as substantial as that within several other bacterial species. Genome-based analysis revealed a genome size variability of 20%, a value typical of bacteria inhabiting different ecological niches, and that suggests a large pan-genome for this subspecies. However, the genomic characteristics (macrorestriction pattern, genome or chromosome size, plasmid content) did not correlate to the MLST-based phylogeny, with strains from the same sequence type (ST) differing by up to 230 kb in genome size. CONCLUSION/SIGNIFICANCE: The gene-based phylogeny was not fully consistent with the traditional classification into dairy and non-dairy strains but supported a new classification based on ecological separation between "environmental" strains, the main contributors to the genetic diversity within the subspecies, and "domesticated" strains, subject to recent genetic bottlenecks. Comparison between gene- and genome-based analyses revealed little relationship between core and dispensable genome phylogenies, indicating that clonal diversification and phenotypic variability of the "domesticated" strains essentially arose through substantial genomic flux within the dispensable

  18. Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat.

    KAUST Repository

    Leach, Lindsey J

    2014-04-11

    BACKGROUND: Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution 'nullisomic-tetrasomic' lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. RESULTS: We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. CONCLUSIONS: We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution.

  19. Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat.

    KAUST Repository

    Leach, Lindsey J; Belfield, Eric J; Jiang, Caifu; Brown, Carly; Mithani, Aziz; Harberd, Nicholas P

    2014-01-01

    BACKGROUND: Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution 'nullisomic-tetrasomic' lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. RESULTS: We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. CONCLUSIONS: We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution.

  20. Correlations between human mobility and social interaction reveal general activity patterns.

    Science.gov (United States)

    Mollgaard, Anders; Lehmann, Sune; Mathiesen, Joachim

    2017-01-01

    A day in the life of a person involves a broad range of activities which are common across many people. Going beyond diurnal cycles, a central question is: to what extent do individuals act according to patterns shared across an entire population? Here we investigate the interplay between different activity types, namely communication, motion, and physical proximity by analyzing data collected from smartphones distributed among 638 individuals. We explore two central questions: Which underlying principles govern the formation of the activity patterns? Are the patterns specific to each individual or shared across the entire population? We find that statistics of the entire population allows us to successfully predict 71% of the activity and 85% of the inactivity involved in communication, mobility, and physical proximity. Surprisingly, individual level statistics only result in marginally better predictions, indicating that a majority of activity patterns are shared across our sample population. Finally, we predict short-term activity patterns using a generalized linear model, which suggests that a simple linear description might be sufficient to explain a wide range of actions, whether they be of social or of physical character.

  1. Musical Scales in Tone Sequences Improve Temporal Accuracy.

    Science.gov (United States)

    Li, Min S; Di Luca, Massimiliano

    2018-01-01

    Predicting the time of stimulus onset is a key component in perception. Previous investigations of perceived timing have focused on the effect of stimulus properties such as rhythm and temporal irregularity, but the influence of non-temporal properties and their role in predicting stimulus timing has not been exhaustively considered. The present study aims to understand how a non-temporal pattern in a sequence of regularly timed stimuli could improve or bias the detection of temporal deviations. We presented interspersed sequences of 3, 4, 5, and 6 auditory tones where only the timing of the last stimulus could slightly deviate from isochrony. Participants reported whether the last tone was 'earlier' or 'later' relative to the expected regular timing. In two conditions, the tones composing the sequence were either organized into musical scales or they were random tones. In one experiment, all sequences ended with the same tone; in the other experiment, each sequence ended with a different tone. Results indicate higher discriminability of anisochrony with musical scales and with longer sequences, irrespective of the knowledge of the final tone. Such an outcome suggests that the predictability of non-temporal properties, as enabled by the musical scale pattern, can be a factor in determining the sensitivity of time judgments.

  2. An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

    Science.gov (United States)

    Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

    2004-01-01

    Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051

  3. Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population.

    Science.gov (United States)

    Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-Suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B; Nauck, Markus; Kaminski, Wolfgang E

    2017-01-01

    The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its "a" determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the "a" determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of "a" determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly underappreciated.

  4. Retirement Patterns and Income Inequality

    Science.gov (United States)

    Fasang, Anette Eva

    2012-01-01

    How do social policies shape life courses, and which consequences do different life course patterns hold for individuals? This article engages the example of retirement in Germany and Britain to analyze life course patterns and their consequences for income inequality. Sequence analysis is used to measure retirement trajectories. The liberal…

  5. Targeted cancer exome sequencing reveals recurrent mutations in myeloproliferative neoplasms

    Science.gov (United States)

    Tenedini, E; Bernardis, I; Artusi, V; Artuso, L; Roncaglia, E; Guglielmelli, P; Pieri, L; Bogani, C; Biamonte, F; Rotunno, G; Mannarelli, C; Bianchi, E; Pancrazzi, A; Fanelli, T; Malagoli Tagliazucchi, G; Ferrari, S; Manfredini, R; Vannucchi, A M; Tagliafico, E

    2014-01-01

    With the intent of dissecting the molecular complexity of Philadelphia-negative myeloproliferative neoplasms (MPN), we designed a target enrichment panel to explore, using next-generation sequencing (NGS), the mutational status of an extensive list of 2000 cancer-associated genes and microRNAs. The genomic DNA of granulocytes and in vitro-expanded CD3+T-lymphocytes, as a germline control, was target-enriched and sequenced in a learning cohort of 20 MPN patients using Roche 454 technology. We identified 141 genuine somatic mutations, most of which were not previously described. To test the frequency of the identified variants, a larger validation cohort of 189 MPN patients was additionally screened for these mutations using Ion Torrent AmpliSeq NGS. Excluding the genes already described in MPN, for 8 genes (SCRIB, MIR662, BARD1, TCF12, FAT4, DAP3, POLG and NRAS), we demonstrated a mutation frequency between 3 and 8%. We also found that mutations at codon 12 of NRAS (NRASG12V and NRASG12D) were significantly associated, for primary myelofibrosis (PMF), with highest dynamic international prognostic scoring system (DIPSS)-plus score categories. This association was then confirmed in 66 additional PMF patients composing a final dataset of 168 PMF showing a NRAS mutation frequency of 4.7%, which was associated with a worse outcome, as defined by the DIPSS plus score. PMID:24150215

  6. Evolutionary divergence in the fungal response to fluconazole revealed by soft clustering

    KAUST Repository

    Kuo, Dwight

    2010-07-23

    Background: Fungal infections are an emerging health risk, especially those involving yeast that are resistant to antifungal agents. To understand the range of mechanisms by which yeasts can respond to anti-fungals, we compared gene expression patterns across three evolutionarily distant species - Saccharomyces cerevisiae, Candida glabrata and Kluyveromyces lactis - over time following fluconazole exposure. Results: Conserved and diverged expression patterns were identified using a novel soft clustering algorithm that concurrently clusters data from all species while incorporating sequence orthology. The analysis suggests complementary strategies for coping with ergosterol depletion by azoles - Saccharomyces imports exogenous ergosterol, Candida exports fluconazole, while Kluyveromyces does neither, leading to extreme sensitivity. In support of this hypothesis we find that only Saccharomyces becomes more azole resistant in ergosterol-supplemented media; that this depends on sterol importers Aus1 and Pdr11; and that transgenic expression of sterol importers in Kluyveromyces alleviates its drug sensitivity. Conclusions: We have compared the dynamic transcriptional responses of three diverse yeast species to fluconazole treatment using a novel clustering algorithm. This approach revealed significant divergence among regulatory programs associated with fluconazole sensitivity. In future, such approaches might be used to survey a wider range of species, drug concentrations and stimuli to reveal conserved and divergent molecular response pathways.

  7. Evolutionary divergence in the fungal response to fluconazole revealed by soft clustering

    KAUST Repository

    Kuo, Dwight; Tan, Kai; Zinman, Guy; Ravasi, Timothy; Bar-Joseph, Ziv; Ideker, Trey

    2010-01-01

    Background: Fungal infections are an emerging health risk, especially those involving yeast that are resistant to antifungal agents. To understand the range of mechanisms by which yeasts can respond to anti-fungals, we compared gene expression patterns across three evolutionarily distant species - Saccharomyces cerevisiae, Candida glabrata and Kluyveromyces lactis - over time following fluconazole exposure. Results: Conserved and diverged expression patterns were identified using a novel soft clustering algorithm that concurrently clusters data from all species while incorporating sequence orthology. The analysis suggests complementary strategies for coping with ergosterol depletion by azoles - Saccharomyces imports exogenous ergosterol, Candida exports fluconazole, while Kluyveromyces does neither, leading to extreme sensitivity. In support of this hypothesis we find that only Saccharomyces becomes more azole resistant in ergosterol-supplemented media; that this depends on sterol importers Aus1 and Pdr11; and that transgenic expression of sterol importers in Kluyveromyces alleviates its drug sensitivity. Conclusions: We have compared the dynamic transcriptional responses of three diverse yeast species to fluconazole treatment using a novel clustering algorithm. This approach revealed significant divergence among regulatory programs associated with fluconazole sensitivity. In future, such approaches might be used to survey a wider range of species, drug concentrations and stimuli to reveal conserved and divergent molecular response pathways.

  8. Isolation of a candidate human telomerase catalytic subunit gene, which reveals complex splicing patterns in different cell types.

    Science.gov (United States)

    Kilian, A; Bowtell, D D; Abud, H E; Hime, G R; Venter, D J; Keese, P K; Duncan, E L; Reddel, R R; Jefferson, R A

    1997-11-01

    Telomerase is a multicomponent reverse transcriptase enzyme that adds DNA repeats to the ends of chromosomes using its RNA component as a template for synthesis. Telomerase activity is detected in the germline as well as the majority of tumors and immortal cell lines, and at low levels in several types of normal cells. We have cloned a human gene homologous to a protein from Saccharomyces cerevisiae and Euplotes aediculatus that has reverse transcriptase motifs and is thought to be the catalytic subunit of telomerase in those species. This gene is present in the human genome as a single copy sequence with a dominant transcript of approximately 4 kb in a human colon cancer cell line, LIM1215. The cDNA sequence was determined using clones from a LIM1215 cDNA library and by RT-PCR, cRACE and 3'RACE on mRNA from the same source. We show that the gene is expressed in several normal tissues, telomerase-positive post-crisis (immortal) cell lines and various tumors but is not expressed in the majority of normal tissues analyzed, pre-crisis (non-immortal) cells and telomerase-negative immortal (ALT) cell lines. Multiple products were identified by RT-PCR using primers within the reverse transcriptase domain. Sequencing of these products suggests that they arise by alternative splicing. Strikingly, various tumors, cell lines and even normal tissues (colonic crypt and testis) showed considerable differences in the splicing patterns. Alternative splicing of the telomerase catalytic subunit transcript may be important for the regulation of telomerase activity and may give rise to proteins with different biochemical functions.

  9. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  10. Evidence for Long-Timescale Patterns of Synaptic Inputs in CA1 of Awake Behaving Mice.

    Science.gov (United States)

    Kolb, Ilya; Talei Franzesi, Giovanni; Wang, Michael; Kodandaramaiah, Suhasa B; Forest, Craig R; Boyden, Edward S; Singer, Annabelle C

    2018-02-14

    Repeated sequences of neural activity are a pervasive feature of neural networks in vivo and in vitro. In the hippocampus, sequential firing of many neurons over periods of 100-300 ms reoccurs during behavior and during periods of quiescence. However, it is not known whether the hippocampus produces longer sequences of activity or whether such sequences are restricted to specific network states. Furthermore, whether long repeated patterns of activity are transmitted to single cells downstream is unclear. To answer these questions, we recorded intracellularly from hippocampal CA1 of awake, behaving male mice to examine both subthreshold activity and spiking output in single neurons. In eight of nine recordings, we discovered long (900 ms) reoccurring subthreshold fluctuations or "repeats." Repeats generally were high-amplitude, nonoscillatory events reoccurring with 10 ms precision. Using statistical controls, we determined that repeats occurred more often than would be expected from unstructured network activity (e.g., by chance). Most spikes occurred during a repeat, and when a repeat contained a spike, the spike reoccurred with precision on the order of ≤20 ms, showing that long repeated patterns of subthreshold activity are strongly connected to spike output. Unexpectedly, we found that repeats occurred independently of classic hippocampal network states like theta oscillations or sharp-wave ripples. Together, these results reveal surprisingly long patterns of repeated activity in the hippocampal network that occur nonstochastically, are transmitted to single downstream neurons, and strongly shape their output. This suggests that the timescale of information transmission in the hippocampal network is much longer than previously thought. SIGNIFICANCE STATEMENT We found long (≥900 ms), repeated, subthreshold patterns of activity in CA1 of awake, behaving mice. These repeated patterns ("repeats") occurred more often than expected by chance and with 10 ms

  11. Temporal patterns of fire sequences observed in Canton of Ticino (southern Switzerland)

    Directory of Open Access Journals (Sweden)

    L. Telesca

    2010-04-01

    Temporal dynamical analysis in fire sequences recorded from 1969 to 2008 in Canton Ticino (Switzerland) was carried out by using the Allan Factor statistics. The obtained results show the presence of daily periodicities, superimposed on two time-scaling regimes. The daily cycle vanishes for sequences of higher altitude fires, for which a single scaling behaviour is observed.
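
    As a reminder of what the Allan Factor measures, the sketch below uses its standard definition for a point process: the observation period is split into contiguous windows of length T, events are counted per window, and AF(T) is the mean squared difference of counts in adjacent windows divided by twice the mean count. The function name `allan_factor` and its parameters are hypothetical, and nothing here reproduces the authors' data or code. For a homogeneous Poisson process AF(T) stays close to 1 at every T, whereas power-law growth of AF with T indicates the kind of time-scaling regime the abstract mentions.

```python
def allan_factor(event_times, window, start, end):
    """Allan Factor of a point process for one counting-window length.

    Illustrative helper: `event_times` are occurrence times (e.g. fire
    ignition times in days), `window` is the counting time T, and
    [start, end) is the observation period.
    """
    n_windows = int((end - start) // window)
    if n_windows < 2:
        raise ValueError("need at least two full windows")

    # Count events falling in each contiguous window of length `window`.
    counts = [0] * n_windows
    for t in event_times:
        k = int((t - start) // window)
        if 0 <= k < n_windows:
            counts[k] += 1

    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the observation period")

    # Mean squared difference between counts in adjacent windows.
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2 * mean_count)


# Perfectly regular events (one per window) give AF = 0.
regular = [k + 0.5 for k in range(10)]
print(allan_factor(regular, window=1.0, start=0.0, end=10.0))

# Clustered events (counts 2, 0, 2, 0) give AF = 2, above the Poisson value.
clustered = [0.1, 0.2, 2.1, 2.2]
print(allan_factor(clustered, window=1.0, start=0.0, end=4.0))
```

    Evaluating the function over a logarithmic range of `window` values and plotting AF against T on log-log axes is the usual way to look for scaling regimes and periodicities.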

  12. Bioassessment of a Drinking Water Reservoir Using Plankton: High Throughput Sequencing vs. Traditional Morphological Method

    Directory of Open Access Journals (Sweden)

    Wanli Gao

    2018-01-01

    Drinking water safety is increasingly perceived as one of the top global environmental issues. Plankton has been commonly used as a bioindicator for water quality in lakes and reservoirs. Recently, DNA sequencing technology has been applied to bioassessment. In this study, we compared the effectiveness of the 16S and 18S rRNA high throughput sequencing method (HTS) and the traditional optical microscopy method (TOM) in the bioassessment of drinking water quality. Five stations reflecting different habitats and hydrological conditions in Danjiangkou Reservoir, one of the largest drinking water reservoirs in Asia, were sampled in May 2016. Non-metric multi-dimensional scaling (NMDS) analysis showed that plankton assemblages varied among the stations and the spatial patterns revealed by the two methods were consistent. The correlation between TOM and HTS in a symmetric Procrustes analysis was 0.61, revealing overall good concordance between the two methods. Procrustes analysis also showed that site-specific differences between the two methods varied among the stations. Station Heijizui (H), a site heavily influenced by two tributaries, had the largest difference while station Qushou (Q), a confluence site close to the outlet dam, had the smallest difference between the two methods. Our results show that DNA sequencing has the potential to provide consistent identification of taxa, and reliable bioassessment in a long-term biomonitoring and assessment program for drinking water reservoirs.

  13. The complete chloroplast genome sequence of Mahonia bealei (Berberidaceae) reveals a significant expansion of the inverted repeat and phylogenetic relationship with other angiosperms.

    Science.gov (United States)

    Ma, Ji; Yang, Bingxian; Zhu, Wei; Sun, Lianli; Tian, Jingkui; Wang, Xumin

    2013-10-10

    Mahonia bealei (Berberidaceae) is a frequently-used traditional Chinese medicinal plant with efficient anti-inflammatory ability. This plant is one of the sources of berberine, a new cholesterol-lowering drug with anti-diabetic activity. We have sequenced the complete nucleotide sequence of the chloroplast (cp) genome of M. bealei. The complete cp genome of M. bealei is 164,792 bp in length, and has a typical structure with large (LSC 73,052 bp) and small (SSC 18,591 bp) single-copy regions separated by a pair of inverted repeats (IRs 36,501 bp) of large size. The Mahonia cp genome contains 111 unique genes and 39 genes are duplicated in the IR regions. The gene order and content of M. bealei are almost unarranged, which is consistent with the hypothesis that large IRs stabilize the cp genome and reduce gene loss-and-gain probabilities during the evolutionary process. A large IR expansion of over 12 kb has occurred in M. bealei; 15 genes (rps19, rpl22, rps3, rpl16, rpl14, rps8, infA, rpl36, rps11, petD, petB, psbH, psbN, psbT and psbB) have expanded to have an additional copy in the IRs. The IR expansion rearrangement occurred via a double-strand DNA break and subsequent repair, which is different from the ordinary gene conversion mechanism. Repeat analysis identified 39 direct/inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Analysis also revealed 75 simple sequence repeat (SSR) loci and almost all are composed of A or T, contributing to a distinct bias in base composition. Comparison of protein-coding sequences with ESTs reveals 9 putative RNA edits and 5 of them resulted in non-synonymous modifications in rpoC1, rps2, rps19 and ycf1. Phylogenetic analysis using maximum parsimony (MP) and maximum likelihood (ML) was performed on a dataset composed of 65 protein-coding genes from 25 taxa, which yields an identical tree topology as previous plastid-based trees, and provides strong support for the sister relationship between Ranunculaceae and Berberidaceae.

  14. Population Genetic Structure of Rock Bream (Oplegnathus fasciatus Temminck & Schlegel, 1884) Revealed by mtDNA COI Sequence in Korea and China

    Science.gov (United States)

    Park, Hyun Suk; Kim, Choong-Gon; Kim, Sung; Park, Yong-Joo; Choi, Hee-Jung; Xiao, Zhizhong; Li, Jun; Xiao, Yongshuang; Lee, Youn-Ho

    2018-04-01

    The rock bream, Oplegnathus fasciatus, is a common rocky reef game fish in East Asia and recently has become an aquaculture species. Despite its commercial importance, the population genetic structure of this fish species remains poorly understood. In this study, 163 specimens were collected from 6 localities along the coastal waters of Korea and China and their genetic variation was analyzed with mtDNA COI sequences. A total of 34 polymorphic sites were detected which determined 30 haplotypes. The genetic pattern reveals a low level of nucleotide diversity (0.04 ± 0.003) but a high level of haplotype diversity (0.83 ± 0.02). The 30 haplotypes are divided into two major genealogical clades: one that consists of only Zhoushan (ZS, East China Sea) specific haplotypes from the southern East China Sea and the other that consists of the remaining haplotypes from the northern East China Sea, Yellow Sea, Korea Strait, and East Sea/Sea of Japan. The two clades are separated by approximately 330-435 kyBP. Analyses of AMOVA and Fst show a significant population differentiation between the ZS sample and the other ones, corroborating separation of the two genealogical clades. Larval dispersal and the fresh Yangtze River plume are invoked as the main determining factors for this population genetic structure of O. fasciatus. Neutrality tests and mismatch distribution analyses indicate late Pleistocene population expansion along the coastal waters of Korea and China approximately 133-183 kyBP during which there were periodic cycles of glaciations and deglaciations. Such population information needs to be taken into account when stock enhancement and conservation measures are implemented for this fisheries species.
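
    The two summary statistics quoted in this abstract can be sketched as follows, assuming the commonly used definitions: haplotype diversity h = n/(n-1) * (1 - sum of squared haplotype frequencies), and nucleotide diversity as the mean number of pairwise differences per site. The abstract does not say which estimators or software the authors used, so the function names and the toy inputs below are purely illustrative.

```python
from collections import Counter
from itertools import combinations
from typing import Sequence


def haplotype_diversity(haplotypes: Sequence[str]) -> float:
    """h = n/(n-1) * (1 - sum(p_i^2)) over the observed haplotype frequencies."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled individuals")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(sequences: Sequence[str]) -> float:
    """Mean pairwise differences per site for aligned, equal-length sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    pairs = list(combinations(sequences, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)
```

    Many distinct haplotypes that differ from each other at only one or two sites give a high haplotype diversity together with a low nucleotide diversity, which is the combination the abstract reports.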

  15. Genome-Wide Comparison of Magnaporthe Species Reveals a Host-Specific Pattern of Secretory Proteins and Transposable Elements.

    Directory of Open Access Journals (Sweden)

    Meghana Deepak Shirke

    Full Text Available Blast disease caused by the Magnaporthe species is a major factor affecting the productivity of rice, wheat and millets. This study was aimed at generating genomic information for rice and non-rice Magnaporthe isolates to understand the extent of genetic variation. We have sequenced the whole genome of the Magnaporthe isolates infecting rice (leaf and neck), finger millet (leaf and neck), foxtail millet (leaf) and buffel grass (leaf). Rice and finger millet isolates infecting both leaf and neck tissues were sequenced, since the damage and yield loss caused by neck blast is much higher as compared to leaf blast. The genome-wide comparison was carried out to study the variability in gene content, candidate effectors, repeat element distribution, genes involved in carbohydrate metabolism and SNPs. The analysis of repeat element footprints revealed some genes such as naringenin, 2-oxoglutarate 3-dioxygenase being targeted by Pot2 and Occan, in isolates from different host species. Some repeat insertions were host-specific while other insertions were randomly shared between isolates. The distributions of repeat elements, secretory proteins, CAZymes and SNPs showed significant variation across host-specific lineages of Magnaporthe indicating an independent genome evolution orchestrated by multiple genomic factors.

  16. Expression pattern of glycoside hydrolase genes in Lutzomyia longipalpis reveals key enzymes involved in larval digestion

    Directory of Open Access Journals (Sweden)

    Caroline da Silva Moraes

    2014-08-01

    Full Text Available The sand fly Lutzomyia longipalpis is the most important vector of American Visceral Leishmaniasis. Adults are phytophagous (males and females) or blood feeders (females only), and larvae feed on solid detritus. Digestion in sand fly larvae has scarcely been studied, but some glycosidase activities putatively involved in microorganism digestion were already described. Nevertheless, the molecular nature of these enzymes, as the corresponding genes and transcripts, were not explored yet. Catabolism of microbial carbohydrates in insects generally involves β-1,3-glucanases, chitinases and digestive lysozymes. In this work, the transcripts of digestive β-1,3-glucanase and chitinases were identified in the L. longipalpis larvae throughout analysis of sequences and expression patterns of glycoside hydrolases families 16, 18 and 22. The activity of one i-type lysozyme was also registered. Interestingly, this lysozyme seems to play a role in immunity, rather than digestion. This is the first attempt to identify the molecular nature of sand fly larval digestive enzymes.

  17. Expression pattern of glycoside hydrolase genes in Lutzomyia longipalpis reveals key enzymes involved in larval digestion

    Science.gov (United States)

    Moraes, Caroline da Silva; Diaz-Albiter, Hector M.; Faria, Maiara do Valle; Sant'Anna, Maurício R. V.; Dillon, Rod J.; Genta, Fernando A.

    2014-01-01

    The sand fly Lutzomyia longipalpis is the most important vector of American Visceral Leishmaniasis. Adults are phytophagous (males and females) or blood feeders (females only), and larvae feed on solid detritus. Digestion in sand fly larvae has scarcely been studied, but some glycosidase activities putatively involved in microorganism digestion were already described. Nevertheless, the molecular nature of these enzymes, as the corresponding genes and transcripts, were not explored yet. Catabolism of microbial carbohydrates in insects generally involves β-1,3-glucanases, chitinases, and digestive lysozymes. In this work, the transcripts of digestive β-1,3-glucanase and chitinases were identified in the L. longipalpis larvae throughout analysis of sequences and expression patterns of glycoside hydrolases families 16, 18, and 22. The activity of one i-type lysozyme was also registered. Interestingly, this lysozyme seems to play a role in immunity, rather than digestion. This is the first attempt to identify the molecular nature of sand fly larval digestive enzymes. PMID:25140153

  18. Ocean time-series reveals recurring seasonal patterns of virioplankton dynamics in the northwestern Sargasso Sea.

    Science.gov (United States)

    Parsons, Rachel J; Breitbart, Mya; Lomas, Michael W; Carlson, Craig A

    2012-02-01

    There are an estimated 10^30 virioplankton in the world oceans, the majority of which are phages (viruses that infect bacteria). Marine phages encompass enormous genetic diversity, affect biogeochemical cycling of elements, and partially control aspects of prokaryotic production and diversity. Despite their importance, there is a paucity of data describing virioplankton distributions over time and depth in oceanic systems. A decade of high-resolution time-series data collected from the upper 300 m in the northwestern Sargasso Sea revealed recurring temporal and vertical patterns of virioplankton abundance in unprecedented detail. An annual virioplankton maximum developed between 60 and 100 m during periods of summer stratification and eroded during winter convective mixing. The timing and vertical positioning of this seasonal pattern was related to variability in water column stability and the dynamics of specific picophytoplankton and heterotrophic bacterioplankton lineages. Between 60 and 100 m, virioplankton abundance was negatively correlated to the dominant heterotrophic bacterioplankton lineage SAR11, as well as the less abundant picophytoplankton, Synechococcus. In contrast, virioplankton abundance was positively correlated to the dominant picophytoplankton lineage Prochlorococcus, and the less abundant alpha-proteobacteria, Rhodobacteraceae. Seasonally, virioplankton abundances were highly synchronous with Prochlorococcus distributions and the virioplankton to Prochlorococcus ratio remained remarkably constant during periods of water column stratification. The data suggest that a significant fraction of viruses in the mid-euphotic zone of the subtropical gyres may be cyanophages and patterns in their abundance are largely determined by Prochlorococcus dynamics in response to water column stability. This high-resolution, decadal survey of virioplankton abundance provides insight into the possible controls of virioplankton dynamics in the open ocean.

  19. Synthesis of compact patterns for NMR relaxation decay in intelligent "electronic tongue" for analyzing heavy oil composition

    Science.gov (United States)

    Lapshenkov, E. M.; Volkov, V. Y.; Kulagin, V. P.

    2018-05-01

    The article is devoted to the problem of pattern creation of the NMR sensor signal for subsequent recognition by the artificial neural network in the intelligent device "the electronic tongue". The specific problem of removing redundant data from the spin-spin relaxation signal pattern that is used as a source of information in analyzing the composition of oil and petroleum products is considered. A method is proposed that makes it possible to remove redundant data from the relaxation decay pattern without introducing additional distortion. This method is based on combining relaxation decay curve intervals whose increments are below the noise level such that the increment of the combined interval is above the noise level. In this case, the relaxation decay curve samples that are located inside the combined intervals are removed from the pattern. This method was tested on the heavy-oil NMR signal patterns that were created by using the Carr-Purcell-Meiboom-Gill (CPMG) sequence for recording the relaxation process. Parameters of the CPMG sequence are: 100 μs - time interval between 180° pulses, 0.4 s - duration of measurement. As a result, it was revealed that the proposed method allowed one to reduce the number of samples 15 times (from 4000 to 270), and the maximum detected root mean square error (RMS error) equals 0.00239 (equivalent to signal-to-noise ratio 418).
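
    One way to read the interval-combining idea described in this abstract is sketched below: starting from the first sample, consecutive samples are skipped until the change relative to the last retained sample reaches the noise level, and only then is a new sample retained. This is an interpretation for illustration only; the abstract does not give the authors' exact merging rule, and the function name and the threshold comparison are assumptions.

```python
from typing import List, Sequence


def compact_decay(samples: Sequence[float], noise_level: float) -> List[int]:
    """Return indices of samples to keep from a relaxation decay curve.

    A sample is kept only when it differs from the last kept sample by at
    least `noise_level`; samples inside such a merged interval are dropped.
    """
    if noise_level <= 0:
        raise ValueError("noise_level must be positive")
    if not samples:
        return []
    kept = [0]  # always keep the first sample as the reference point
    for i in range(1, len(samples)):
        if abs(samples[i] - samples[kept[-1]]) >= noise_level:
            kept.append(i)
    return kept
```

    On a decaying curve this keeps samples densely where the signal changes quickly and sparsely in the slowly varying tail, which is where most of the redundancy in a long CPMG record would be expected.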

  20. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  1. CO I barcoding reveals new clades and radiation patterns of Indo-Pacific sponges of the family Irciniidae (Demospongiae: Dictyoceratida).

    Directory of Open Access Journals (Sweden)

    Judith Pöppe

    2010-04-01

    Full Text Available DNA barcoding is a promising tool to facilitate a rapid and unambiguous identification of sponge species. Demosponges of the order Dictyoceratida are particularly challenging to identify, but are of ecological as well as biochemical importance. Here we apply DNA barcoding with the standard CO1-barcoding marker on selected Indo-Pacific specimens of two genera, Ircinia and Psammocinia of the family Irciniidae. We show that the CO1 marker identifies several species new to science, reveals separate radiation patterns of deep-sea Ircinia sponges and indicates dispersal patterns of Psammocinia species. However, some species cannot be unambiguously barcoded by solely this marker due to low evolutionary rates. We support previous suggestions for a combination of the standard CO1 fragment with an additional fragment for sponge DNA barcoding.

  2. Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization.

    Science.gov (United States)

    Gcebe, Nomakorinte; Rutten, Victor; Pittius, Nicolaas Gey van; Naicker, Brendon; Michel, Anita

    2017-04-01

    Non-tuberculous mycobacteria (NTM) are ubiquitous in the environment, and an increasing number of NTM species have been isolated and characterized from both humans and animals, highlighting the zoonotic potential of these bacteria. Host exposure to NTM may impact on cross-reactive immune responsiveness, which may affect diagnosis of bovine tuberculosis and may also play a role in the variability of the efficacy of Mycobacterium bovis BCG vaccination against tuberculosis. In this study we characterized 10 NTM isolates originating from water, soil, nasal swabs of cattle and African buffalo as well as bovine tissue samples. These isolates were previously identified during an NTM survey and were all found, using 16S rRNA gene sequence analysis, to be closely related to Mycobacterium moriokaense. A polyphasic approach that included phenotypic characterization, antibiotic susceptibility profiling, mycolic acid profiling and phylogenetic analysis of four gene loci, 16S rRNA, hsp65, sodA and rpoB, was employed to characterize these isolates. Sequence data analysis of the four gene loci revealed that these isolates belong to a unique species of the genus Mycobacterium. This evidence was further supported by several differences in phenotypic characteristics between the isolates and the closely related species. We propose the name Mycobacterium malmesburyense sp. nov. for this novel species. The type strain is WCM 7299T (=ATCC BAA-2759T=CIP 110822T).

  3. Frequent mutations in EGFR, KRAS and TP53 genes in human lung cancer tumors detected by ion torrent DNA sequencing.

    Directory of Open Access Journals (Sweden)

    Xin Cai

    Full Text Available Lung cancer is the most common malignancy and the leading cause of cancer deaths worldwide. While smoking is by far the leading cause of lung cancer, other environmental and genetic factors influence the development and progression of the cancer. Since unique mutation patterns have been observed in individual cancer samples, identification and characterization of the distinctive lung cancer molecular profile is essential for developing more effective, tailored therapies. Until recently, personalized DNA sequencing to identify genetic mutations in cancer was impractical and expensive. The recent technological advancements in next-generation DNA sequencing, such as the semiconductor-based Ion Torrent sequencing platform, have made DNA sequencing cost and time effective with more reliable results. Using the Ion Torrent Ampliseq Cancer Panel, we sequenced 737 loci from 45 cancer-related genes to identify genetic mutations in 76 human lung cancer samples. The sequencing analysis revealed missense mutations in KRAS, EGFR, and TP53 genes in the lung cancer samples of various histologic types. Thus, this study demonstrates the necessity of sequencing individual human cancers in order to develop personalized drugs or combination therapies to effectively target individual, lung cancer-specific mutations.

  4. Information decomposition method to analyze symbolical sequences

    International Nuclear Information System (INIS)

    Korotkov, E.V.; Korotkova, M.A.; Kudryashov, N.A.

    2003-01-01

    The information decomposition (ID) method to analyze symbolical sequences is presented. This method allows us to reveal a latent periodicity of any symbolical sequence. The ID method is shown to have advantages in comparison with application of the Fourier transformation, the wavelet transform and the dynamic programming method to look for latent periodicity. Examples of the latent periods for poetic texts, DNA sequences and amino acids are presented. Possible origin of a latent periodicity for different symbolical sequences is discussed.
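
    The abstract does not spell out how the ID method is computed. A simplified sketch in the same spirit, offered only as an assumption-laden illustration, is to measure the mutual information between each symbol and its position modulo a candidate period; a period at which this information peaks is a candidate latent period. The function name is hypothetical, and the significance testing that a real analysis would need (for example against shuffled sequences, since the raw value tends to grow with the period length) is omitted.

```python
import math
from collections import Counter


def periodicity_information(sequence: str, period: int) -> float:
    """Mutual information (in nats) between symbol and position mod `period`."""
    if period < 1:
        raise ValueError("period must be a positive integer")
    n = len(sequence)
    if n == 0:
        raise ValueError("sequence must not be empty")
    joint = Counter((i % period, s) for i, s in enumerate(sequence))
    position_counts = Counter(i % period for i in range(n))
    symbol_counts = Counter(sequence)

    info = 0.0
    for (pos, sym), c in joint.items():
        # p(pos, sym) * log( p(pos, sym) / (p(pos) * p(sym)) )
        info += (c / n) * math.log(c * n / (position_counts[pos] * symbol_counts[sym]))
    return info
```

    For example, scanning `period` from 2 up to some limit and comparing the values for a DNA string gives a crude information spectrum in which a strongly repeating unit stands out.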

  5. The chemical structure of DNA sequence signals for RNA transcription

    Science.gov (United States)

    George, D. G.; Dayhoff, M. O.

    1982-01-01

    The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

  6. Deep Sequence Analysis of AgoshRNA Processing Reveals 3' A Addition and Trimming.

    Science.gov (United States)

    Harwig, Alex; Herrera-Carrillo, Elena; Jongejan, Aldo; van Kampen, Antonius Hubertus; Berkhout, Ben

    2015-07-14

    The RNA interference (RNAi) pathway, in which microprocessor and Dicer collaborate to process microRNAs (miRNA), was recently expanded by the description of alternative processing routes. In one of these noncanonical pathways, Dicer action is replaced by the Argonaute2 (Ago2) slicer function. It was recently shown that the stem-length of precursor-miRNA or short hairpin RNA (shRNA) molecules is a major determinant for Dicer versus Ago2 processing. Here we present the results of a deep sequence study on the processing of shRNAs with different stem lengths and a top G·U wobble base pair (bp). This analysis revealed some unexpected properties of these so-called AgoshRNA molecules that are processed by Ago2 instead of Dicer. First, we confirmed the gradual shift from Dicer to Ago2 processing upon shortening of the hairpin length. Second, hairpins with a stem larger than 19 base pairs are inefficiently cleaved by Ago2 and we noticed a shift in the cleavage site. Third, the introduction of a top G·U bp in a regular shRNA can promote Ago2-cleavage, which coincides with a loss of Ago2-loading of the Dicer-cleaved 3' strand. Fourth, the Ago2-processed AgoshRNAs acquire a short 3' tail of 1-3 A-nucleotides (nt) and we present evidence that this product is subsequently trimmed by the poly(A)-specific ribonuclease (PARN).

  7. Complex Codon Usage Pattern and Compositional Features of Retroviruses

    Directory of Open Access Journals (Sweden)

    Sourav RoyChoudhury

    2013-01-01

    Full Text Available Retroviruses infect a wide range of organisms including humans. Among them, HIV-1, which causes AIDS, has now become a major threat for world health. Some of these viruses are also potential gene transfer vectors. In this study, the patterns of synonymous codon usage in retroviruses have been studied through multivariate statistical methods on ORF sequences from the available 56 retroviruses. The principal determinant for evolution of the codon usage pattern in retroviruses seemed to be the compositional constraints, while selection for translation of the viral genes plays a secondary role. This was further supported by multivariate analysis on relative synonymous codon usage. Thus, it seems that mutational bias might have had a dominant role over translational selection in shaping the codon usage of retroviruses. Codon adaptation index was used to identify translationally optimal codons among genes from retroviruses. The comparative analysis of the preferred and optimal codons among different retroviral groups revealed that four codons GAA, AAA, AGA, and GGA were significantly more frequent in most of the retroviral genes in spite of some differences. Cluster analysis also revealed that phylogenetically related groups of retroviruses have probably evolved their codon usage in a concerted manner under the influence of their nucleotide composition.
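
    Relative synonymous codon usage (RSCU), the quantity this abstract analyses, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies that definition to a single in-frame ORF using the standard genetic code; the function name and the handling of partial or ambiguous codons are choices made for this example and are not taken from the study.

```python
from collections import Counter
from itertools import product
from typing import Dict

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(orf: str) -> Dict[str, float]:
    """RSCU per codon: count * (number of synonymous codons) / family total."""
    orf = orf.upper().replace("U", "T")
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)  # skips ambiguous codons

    families: Dict[str, list] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values: Dict[str, float] = {}
    for aa, family in families.items():
        if aa == "*" or len(family) < 2:
            continue  # stop codons and single-codon amino acids (Met, Trp) carry no signal
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this ORF
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values
```

    A value above 1 means the codon is used more often than expected under equal usage; the A-ending codons highlighted in the abstract (GAA, AAA, AGA, GGA) would show up this way in an A-rich genome.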

  8. Sequence periodicity in nucleosomal DNA and intrinsic curvature.

    Science.gov (United States)

    Nair, T Murlidharan

    2010-05-17

    Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA.

  9. Whole transcriptome analysis of Acinetobacter baumannii assessed by RNA-sequencing reveals different mRNA expression profiles in biofilm compared to planktonic cells.

    Directory of Open Access Journals (Sweden)

    Soraya Rumbo-Feal

    Full Text Available Acinetobacter baumannii has emerged as a dangerous opportunistic pathogen, with many strains able to form biofilms and thus cause persistent infections. The aim of the present study was to use high-throughput sequencing techniques to establish complete transcriptome profiles of planktonic (free-living) and sessile (biofilm) forms of A. baumannii ATCC 17978 and thereby identify differences in their gene expression patterns. Collections of mRNA from planktonic (both exponential and stationary phase cultures) and sessile (biofilm) cells were sequenced. Six mRNA libraries were prepared following the mRNA-Seq protocols from Illumina. Reads were obtained in a HiScanSQ platform and mapped against the complete genome to describe the complete mRNA transcriptomes of planktonic and sessile cells. The results showed that the gene expression pattern of A. baumannii biofilm cells was distinct from that of planktonic cells, including 1621 genes over-expressed in biofilms relative to stationary phase cells and 55 genes expressed only in biofilms. These differences suggested important changes in amino acid and fatty acid metabolism, motility, active transport, DNA-methylation, iron acquisition, transcriptional regulation, and quorum sensing, among other processes. Disruption or deletion of five of these genes caused a significant decrease in biofilm formation ability in the corresponding mutant strains. Among the genes over-expressed in biofilm cells were those in an operon involved in quorum sensing. One of them, encoding an acyl carrier protein, was shown to be involved in biofilm formation as demonstrated by the significant decrease in biofilm formation by the corresponding knockout strain. The present work serves as a basis for future studies examining the complex network systems that regulate bacterial biofilm formation and maintenance.

  10. Genetic patterns in European geometrid moths revealed by the Barcode Index Number (BIN) system.

    Directory of Open Access Journals (Sweden)

    Axel Hausmann

    Full Text Available BACKGROUND: The geometrid moths of Europe are one of the best investigated insect groups in traditional taxonomy making them an ideal model group to test the accuracy of the Barcode Index Number (BIN) system of BOLD (Barcode of Life Datasystems), a method that supports automated, rapid species delineation and identification. METHODOLOGY/PRINCIPAL FINDINGS: This study provides a DNA barcode library for 219 of the 249 European geometrid moth species (88%) in five selected subfamilies. The data set includes COI sequences for 2130 specimens. Most species (93%) were found to possess diagnostic barcode sequences at the European level while only three species pairs (3%) were genetically indistinguishable in areas of sympatry. As a consequence, 97% of the European species we examined were unequivocally discriminated by barcodes within their natural areas of distribution. We found a 1:1 correspondence between BINs and traditionally recognized species for 67% of these species. Another 17% of the species (15 pairs, three triads) shared BINs, while specimens from the remaining species (18%) were divided among two or more BINs. Five of these species are mixtures, both sharing and splitting BINs. For 82% of the species with two or more BINs, the genetic splits involved allopatric populations, many of which have previously been hypothesized to represent distinct species or subspecies. CONCLUSIONS/SIGNIFICANCE: This study confirms the effectiveness of DNA barcoding as a tool for species identification and illustrates the potential of the BIN system to characterize formal genetic units independently of an existing classification. This suggests the system can be used to efficiently assess the biodiversity of large, poorly known assemblages of organisms. For the moths examined in this study, cases of discordance between traditionally recognized species and BINs arose from several causes including overlooked species, synonymy, and cases where DNA barcodes revealed

  11. Hits to the left, flops to the right: different emotions during listening to music are reflected in cortical lateralisation patterns.

    Science.gov (United States)

    Altenmüller, Eckart; Schürmann, Kristian; Lim, Vanessa K; Parlitz, Dietrich

    2002-01-01

    In order to investigate the neurobiological mechanisms accompanying emotional valence judgements during listening to complex auditory stimuli, cortical direct current (dc)-electroencephalography (EEG) activation patterns were recorded from 16 right-handed students. Students listened to 160 short sequences taken from the repertoires of jazz, rock-pop, classical music and environmental sounds (each n=40). Emotional valence of the perceived stimuli was rated on a 5-step scale after each sequence. Brain activation patterns during listening revealed widespread bilateral fronto-temporal activation, but a highly significant lateralisation effect: positive emotional attributions were accompanied by an increase in left temporal activation, negative by a more bilateral pattern with preponderance of the right fronto-temporal cortex. Female participants demonstrated greater valence-related differences than males. No differences related to the four stimulus categories could be detected, suggesting that the actual auditory brain activation patterns were more determined by their affective emotional valence than by differences in acoustical "fine" structure. The results are consistent with a model of hemispheric specialisation concerning perceived positive or negative emotions proposed by Heilman [Journal of Neuropsychiatry and Clinical Neuroscience 9 (1997) 439].

  12. Rapid-Sequence Serial Sexual Homicides.

    Science.gov (United States)

    Schlesinger, Louis B; Ramirez, Stephanie; Tusa, Brittany; Jarvis, John P; Erdberg, Philip

    2017-03-01

    Serial sexual murderers have been described as committing homicides in a methodical manner, taking substantial time between offenses to elude the authorities. The results of our study of the temporal patterns (i.e., the length of time between homicides) of a nonrandom national sample of 44 serial sexual murderers and their 201 victims indicate that this representation may not always be accurate. Although 25 offenders (56.8%) killed with longer than a 14-day period between homicides, a sizeable subgroup was identified: 19 offenders (43.2%) who committed homicides in rapid-sequence fashion, with fewer than 14 days between all or some of the murders. Six offenders (13.6%) killed all their victims in one rapid-sequence, spree-like episode, with homicides just days apart or sometimes two murders in the same day. Thirteen offenders (29.5%) killed in one or two rapid-sequence clusters (i.e., more than one murder within a 14-day period, as well as additional homicides with greater than 14 days between each). The purpose of our study was to describe this subgroup of rapid-sequence offenders who have not been identified until now. These findings argue for accelerated forensic assessments of dangerousness and public safety when a sexual murder is detected. Psychiatric disorders with rapidly occurring symptom patterns, or even atypical mania or mood dysregulation, may serve as exemplars for understanding this extraordinary group of offenders. © 2017 American Academy of Psychiatry and the Law.

  13. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    Directory of Open Access Journals (Sweden)

    Ceiridwen J Edwards

    Full Text Available BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously

  14. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  15. The mitochondrial genome sequence of the ciliate Paramecium caudatum reveals a shift in nucleotide composition and codon usage within the genus Paramecium

    Directory of Open Access Journals (Sweden)

    Berendonk Thomas U

    2011-05-01

    Full Text Available Abstract Background Although the organization of the ciliate mitochondrial genome is exceptional, only a few ciliate mitochondrial genomes have been sequenced to date. All ciliate mitochondrial genomes are linear. They are 40 kb to 47 kb long and contain some 50 tightly packed genes without introns. Earlier studies documented that the mitochondrial guanine + cytosine contents are very different between Paramecium tetraurelia and all studied Tetrahymena species. This raises the question of whether the high mitochondrial G+C content observed in P. tetraurelia is a characteristic property of Paramecium mtDNA, or whether it is an exception among the ciliate mitochondrial genomes known so far. To address this question, we determined the mitochondrial genome sequence of Paramecium caudatum and compared the gene content and sequence properties to the closely related P. tetraurelia. Results The guanine + cytosine content of the P. caudatum mitochondrial genome was significantly lower than that of P. tetraurelia (22.4% vs. 41.2%). This difference in the mitochondrial nucleotide composition was accompanied by significantly different codon usage patterns in the two species, i.e. within P. caudatum A/T ending codons clearly dominated, whereas for P. tetraurelia the synonymous codons were more balanced with a higher number of G/C ending codons. Further analyses indicated that the nucleotide composition of most members of the genus Paramecium resembles that of P. caudatum and that the shift observed in P. tetraurelia is restricted to the P. aurelia species complex. Conclusions Surprisingly, the codon usage bias in the P. caudatum mitochondrial genome, exemplified by the effective number of codons, is more similar to the distantly related T. pyriformis and other single-celled eukaryotes such as Chlamydomonas, than to the closely related P. tetraurelia. These differences in base composition and codon usage bias were, however, not reflected in the amino
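The two quantities compared in the abstract above, overall G+C content and the share of codons ending in G or C, are simple to compute from a nucleotide sequence. The following is a minimal sketch, not the authors' pipeline; the function names and the toy sequences in the usage note are made up for illustration.

```python
def gc_content(sequence):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / counted


def gc_ending_codon_fraction(coding_sequence):
    """Fraction of complete codons whose third position is G or C.

    Assumes the sequence starts in frame; a trailing partial codon is ignored.
    """
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    return sum(codon[2] in "GC" for codon in codons) / len(codons)
```

For example, `gc_content("ATGC")` gives 0.5, and `gc_ending_codon_fraction("ATGAAACCC")` gives 2/3 because the codons ATG and CCC end in G or C while AAA does not.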

  16. In Silico Characterization of Pectate Lyase Protein Sequences from Different Source Organisms

    Directory of Open Access Journals (Sweden)

    Amit Kumar Dubey

    2010-01-01

    Full Text Available A total of 121 protein sequences of pectate lyases were subjected to homology search, multiple sequence alignment, phylogenetic tree construction, and motif analysis. The phylogenetic tree constructed revealed different clusters based on different source organisms representing bacterial, fungal, plant, and nematode pectate lyases. The multiple accessions of bacterial, fungal, nematode, and plant pectate lyase protein sequences were placed closely revealing a sequence level similarity. The multiple sequence alignment of these pectate lyase protein sequences from different source organisms showed conserved regions at different stretches with maximum homology from amino acid residues 439–467, 715–816, and 829–910 which could be used for designing degenerate primers or probes specific for pectate lyases. The motif analysis revealed a conserved Pec_Lyase_C domain uniformly observed in all pectate lyases irrespective of variable sources suggesting its possible role in structural and enzymatic functions.

  17. Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing

    DEFF Research Database (Denmark)

    Gerlinger, Marco; Rowan, Andrew J.; Horswell, Stuart

    2012-01-01

    RESULTS: Phylogenetic reconstruction revealed branched evolutionary tumor growth, with 63 to 69% of all somatic mutations not detectable across every tumor region. Intratumor heterogeneity was observed for a mutation within an autoinhibitory domain of the mammalian target of rapamycin (mTOR) kinase, correlating with S6...

  18. Sequence and phylogenetic analysis of virulent Newcastle disease virus isolates from Pakistan during 2009–2013 reveals circulation of new sub genotype

    International Nuclear Information System (INIS)

    Siddique, Naila; Naeem, Khalid; Abbas, Muhammad Athar; Ali Malik, Akbar; Rashid, Farooq; Rafique, Saba; Ghafar, Abdul; Rehman, Abdul

    2013-01-01

    Despite observance of standard bio-security measures at commercial poultry farms and extensive use of Newcastle disease vaccines, a new genotype VII-f of Newcastle disease virus (NDV) was introduced in Pakistan during 2011. The 300 ND outbreaks recorded so far have resulted in huge losses of approximately USD 200 million during 2011–2013. A total of 33 NDV isolates recovered during 2009–2013 throughout Pakistan were characterized biologically and phylogenetically. The phylogenetic analysis revealed a new velogenic sub genotype VII-f circulating in commercial and domestic poultry along with the earlier reported sub genotype VII-b. Partial sequencing of the Fusion gene revealed two types of cleavage site motifs, lentogenic 112 GRQGRL 117 and velogenic 112 RRQKRF 117, along with some point mutations indicative of genetic diversity. We report here a new sub genotype of virulent NDV circulating in commercial and backyard poultry in Pakistan and provide evidence for the possible genetic diversity which may be causing new NDV outbreaks. - Highlights: • The first report of isolation of new genotype VII-f of virulent Newcastle disease virus (NDV) in Pakistan. • We report the partial Fusion gene sequences of new genotype VII-f of virulent NDV from Pakistan. • We report the phylogenetic relationship of new NDV strains with reported NDV strains. • Provide outbreak history of new virulent NDV strain in commercial and backyard poultry in Pakistan. • We provide possible evidence for the role of backyard poultry in NDV outbreaks

  19. Genetic Diversity of Pinus nigra Arn. Populations in Southern Spain and Northern Morocco Revealed By Inter-Simple Sequence Repeat Profiles

    Directory of Open Access Journals (Sweden)

    Oussama Ahrazem

    2012-05-01

    Full Text Available Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA Sierra Mágina populations. Analysis of molecular variance (AMOVA) and Nei’s genetic diversity analyses revealed higher genetic variation within the same population than among different populations. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei’s genetic diversity followed by the Moroccan region, Sierra Mágina, and Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups (Group 1 contained all populations from Cuenca, while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco), whereas Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. The data indicate that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra.
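For readers unfamiliar with the statistic reported above, Nei's coefficient of gene differentiation is conventionally defined as Gst = (Ht - Hs) / Ht, where Hs is the mean within-population gene diversity and Ht is the diversity of the pooled populations. The sketch below illustrates that definition for a single locus with known allele frequencies and equally weighted populations. It is an illustration only: ISSR markers are dominant, so the study would have had to estimate allele frequencies from band presence or absence, a step not reproduced here, and the function names are hypothetical.

```python
from statistics import mean


def nei_gene_diversity(allele_frequencies):
    """Nei's gene diversity at one locus: H = 1 - sum(p_i ** 2)."""
    return 1.0 - sum(p * p for p in allele_frequencies)


def gst(populations):
    """Gst = (Ht - Hs) / Ht for one locus.

    populations: one list of allele frequencies per population, with the
    alleles in the same order. Populations are weighted equally (an assumption).
    """
    hs = mean(nei_gene_diversity(freqs) for freqs in populations)
    pooled = [mean(column) for column in zip(*populations)]
    ht = nei_gene_diversity(pooled)
    if ht == 0:
        raise ValueError("locus is monomorphic across all populations")
    return (ht - hs) / ht
```

Two populations with identical frequencies give Gst = 0, while two populations fixed for different alleles give Gst = 1. A value of 0.233, as reported above, means that roughly a quarter of the total diversity lies among populations and the rest within them.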

  20. Development of next generation sequencing panel for UMOD and association with kidney disease.

    Directory of Open Access Journals (Sweden)

    Caitlin Bailie

    Full Text Available Chronic kidney disease (CKD) has a prevalence of approximately 10% in adult populations. CKD can progress to end-stage renal disease (ESRD) and this is usually fatal unless some form of renal replacement therapy (chronic dialysis or renal transplantation) is provided. There is an inherited predisposition to CKD with several genetic risk markers now identified. The UMOD gene has been associated with CKD of varying aetiologies. An AmpliSeq next generation sequencing panel was developed to facilitate comprehensive sequencing of the UMOD gene, covering exonic and regulatory regions. SNPs and CpG sites in the genomic region encompassing UMOD were evaluated for association with CKD in two studies: the UK Wellcome Trust Case-Control 3 Renal Transplant Dysfunction Study (n = 1088) and UK-ROI GENIE GWAS (n = 1726). A technological comparison of two Ion Torrent machines revealed 100% allele call concordance between S5 XL™ and PGM™ machines. One SNP (rs183962941), located in a non-coding region of UMOD, was nominally associated with ESRD (p = 0.008). No association was identified between UMOD variants and estimated glomerular filtration rate. Analysis of methylation data for over 480,000 CpG sites revealed differential methylation patterns within UMOD, the most significant of which was cg03140788 (p = 3.7 × 10⁻¹⁰).

  1. Distribution and interaction patterns of bacterial communities in an ornithogenic soil of Seymour Island, Antarctica.

    Science.gov (United States)

    Rampelotto, Pabulo Henrique; Barboza, Anthony Diego Muller; Pereira, Antônio Batista; Triplett, Eric W; Schaefer, Carlos Ernesto G R; de Oliveira Camargo, Flávio Anastácio; Roesch, Luiz Fernando Wurdig

    2015-04-01

    Next-generation, culture-independent sequencing offers an excellent opportunity to examine network interactions among different microbial species. In this study, soil bacterial communities from a penguin rookery site at Seymour Island were analyzed for abundance, structure, diversity, and interaction networks to identify interaction patterns among the various taxa at three soil depths. The analysis revealed the presence of eight phyla distributed in different proportions among the surface layer (0-8 cm), middle layer (20-25 cm), and bottom (35-40 cm). The bottom layer presented the highest values of bacterial richness, diversity, and evenness when compared to surface and middle layers. The network analysis revealed the existence of a unique pattern of interactions in which the soil microbial network formed a clustered topology, rather than a modular structure as is usually found in biological communities. In addition, specific taxa were identified as important players in microbial community structure. Furthermore, simulation analyses indicated that the loss of potential keystone groups of microorganisms might alter the patterns of interactions within the microbial community. These findings provide new insights for assessing the consequences of environmental disturbances at the whole-community level in Antarctica.

  2. Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations.

    Directory of Open Access Journals (Sweden)

    Brian B Tuch

    Full Text Available Due to growing throughput and shrinking cost, massively parallel sequencing is rapidly becoming an attractive alternative to microarrays for the genome-wide study of gene expression and copy number alterations in primary tumors. The sequencing of transcripts (RNA-Seq) should offer several advantages over microarray-based methods, including the ability to detect somatic mutations and accurately measure allele-specific expression. To investigate these advantages we have applied a novel, strand-specific RNA-Seq method to tumors and matched normal tissue from three patients with oral squamous cell carcinomas. Additionally, to better understand the genomic determinants of the gene expression changes observed, we have sequenced the tumor and normal genomes of one of these patients. We demonstrate here that our RNA-Seq method accurately measures allelic imbalance and that measurement on the genome-wide scale yields novel insights into cancer etiology. As expected, the set of genes differentially expressed in the tumors is enriched for cell adhesion and differentiation functions, but, unexpectedly, the set of allelically imbalanced genes is also enriched for these same cancer-related functions. By comparing the transcriptomic perturbations observed in one patient to his underlying normal and tumor genomes, we find that allelic imbalance in the tumor is associated with copy number mutations and that copy number mutations are, in turn, strongly associated with changes in transcript abundance. These results support a model in which allele-specific deletions and duplications drive allele-specific changes in gene expression in the developing tumor.

  3. Sequencing of bovine herpesvirus 4 v.test strain reveals important genome features

    Directory of Open Access Journals (Sweden)

    Gillet Laurent

    2011-08-01

    Full Text Available Abstract Background Bovine herpesvirus 4 (BoHV-4) is a useful model for the human pathogenic gammaherpesviruses Epstein-Barr virus and Kaposi's Sarcoma-associated Herpesvirus. Although genome manipulations of this virus have been greatly facilitated by the cloning of the BoHV-4 V.test strain as a Bacterial Artificial Chromosome (BAC), the lack of a complete genome sequence for this strain limits its experimental use. Methods In this study, we have determined the complete sequence of BoHV-4 V.test strain by a pyrosequencing approach. Results The long unique coding region (LUR) consists of 108,241 bp encoding at least 79 open reading frames and is flanked by several polyrepetitive DNA units (prDNA). As previously suggested, we showed that the prDNA unit located at the left prDNA-LUR junction (prDNA-G) differs from the other prDNA units (prDNA-inner). Namely, the prDNA-G unit lacks the conserved pac-2 cleavage and packaging signal in its right terminal region. Based on the mechanisms of cleavage and packaging of herpesvirus genomes, this feature implies that only genomes bearing left and right end prDNA units are encapsulated into virions. Conclusions In this study, we have determined the complete genome sequence of the BAC-cloned BoHV-4 V.test strain and identified genome organization features that could be important in other herpesviruses.

  4. Parallel Algorithms and Patterns

    Energy Technology Data Exchange (ETDEWEB)

    Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-16

    This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
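Two of the parallel patterns named above, reductions and prefix scans, can be illustrated with a short sequential sketch that follows the same data-flow a parallel implementation would use. This is not taken from the presentation: the pairwise tree shape and the doubling-stride scan (often described as Hillis-Steele style) are standard textbook formulations, and the function names are chosen here for illustration. Within each round every combination is independent of the others, which is what makes the pattern parallelizable.

```python
import operator


def tree_reduce(values, op):
    """Reduce values with op by combining neighbouring pairs round by round.

    All pairs within one round are independent, so a parallel runtime could
    evaluate them concurrently; here they are evaluated one after another.
    """
    items = list(values)
    if not items:
        raise ValueError("cannot reduce an empty sequence")
    while len(items) > 1:
        combined = [op(items[i], items[i + 1]) for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:
            combined.append(items[-1])  # odd element carries over to the next round
        items = combined
    return items[0]


def inclusive_scan(values, op):
    """Inclusive prefix scan using a stride that doubles every round.

    Each round reads only the previous round's results, so all positions
    could be updated in parallel.
    """
    out = list(values)
    stride = 1
    while stride < len(out):
        out = [
            out[i] if i < stride else op(out[i - stride], out[i])
            for i in range(len(out))
        ]
        stride *= 2
    return out


print(tree_reduce([1, 2, 3, 4, 5], operator.add))     # 15
print(inclusive_scan([1, 2, 3, 4, 5], operator.add))  # [1, 3, 6, 10, 15]
```

The scan performs more total operations than a sequential running sum, a trade-off that is accepted in exchange for needing only about log2(n) rounds.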

  5. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung; Ryu, Tae Woo; Heo, Hyoungsam; Seo, Seungwon; Lee, Doheon; Hur, Cheolgoo

    2011-01-01

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach via generated sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcriptional factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns considering the biological intuition; TFBSs regulate gene expression, and the domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  6. Predicting tissue-specific expressions based on sequence characteristics

    KAUST Repository

    Paik, Hyojung

    2011-04-30

    In multicellular organisms, including humans, understanding expression specificity at the tissue level is essential for interpreting protein function, such as tissue differentiation. We developed a prediction approach via generated sequence features from overrepresented patterns in housekeeping (HK) and tissue-specific (TS) genes to classify TS expression in humans. Using TS domains and transcriptional factor binding sites (TFBSs), sequence characteristics were used as indices of expressed tissues in a Random Forest algorithm by scoring exclusive patterns considering the biological intuition; TFBSs regulate gene expression, and the domains reflect the functional specificity of a TS gene. Our proposed approach displayed better performance than previous attempts and was validated using computational and experimental methods.

  7. Software architecture design patterns in Java

    CERN Document Server

    Kuchana, Partha

    2004-01-01

    AN INTRODUCTION TO DESIGN PATTERNS; Design Patterns: Origin and History; Architectural to Software Design Patterns; What is a Design Pattern?; More about Design Patterns; About This Book; UNIFIED MODELING LANGUAGE (UML); UML: A Quick Reference; Class Diagrams; Sequence diagrams; BASIC PATTERNS; Interface; Description; Example; Practice Questions; Abstract Parent Class; Description; Example; Practice Questions; Private Methods; Description; Example; Practice Questions; Accessor Methods; Description; Accessor Method Nomenclature; Example; Direct Reference versus Accessor Methods; Practice Questions; Constant Data Manager; Description; Example; Practice Quest

  8. Deep sequencing of the viral phoH gene reveals temporal variation, depth-specific composition, and persistent dominance of the same viral phoH genes in the Sargasso Sea

    Directory of Open Access Journals (Sweden)

    Dawn B. Goldsmith

    2015-06-01

    Full Text Available Deep sequencing of the viral phoH gene, a host-derived auxiliary metabolic gene, was used to track viral diversity throughout the water column at the Bermuda Atlantic Time-series Study (BATS) site in the summer (September) and winter (March) of three years. Viral phoH sequences reveal differences in the viral communities throughout a depth profile and between seasons in the same year. Variation was also detected between the same seasons in subsequent years, though these differences were not as great as the summer/winter distinctions. Over 3,600 phoH operational taxonomic units (OTUs; 97% sequence identity) were identified. Despite high richness, most phoH sequences belong to a few large, common OTUs whereas the majority of the OTUs are small and rare. While many OTUs make sporadic appearances at just a few times or depths, a small number of OTUs dominate the community throughout the seasons, depths, and years.
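The idea of grouping sequences into OTUs at a 97% identity threshold can be sketched with a simple greedy procedure. The abstract does not say which clustering tool was used, so the code below is only an illustration of the concept: it assumes pre-aligned, equal-length sequences, uses the first member of each OTU as its representative, and all function names are hypothetical.

```python
def identity(seq_a, seq_b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def greedy_otus(sequences, threshold=0.97):
    """Assign each sequence to the first OTU whose representative it matches.

    Returns a list of OTUs, each a list of member sequences; the first member
    of each list is the representative. The result depends on input order.
    """
    otus = []
    for seq in sequences:
        for members in otus:
            if identity(seq, members[0]) >= threshold:
                members.append(seq)
                break
        else:
            otus.append([seq])  # no representative was close enough: new OTU
    return otus


def otu_sizes(otus):
    """OTU sizes sorted from largest to smallest (a rank-abundance list)."""
    return sorted((len(members) for members in otus), reverse=True)
```

Sorting the OTU sizes in this way makes the pattern described in the abstract visible: a handful of large OTUs at the head of the list and a long tail of OTUs with one or a few members.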

  9. Three genetic stocks of frigate tuna Auxis thazard thazard (Lacepede, 1800) along the Indian coast revealed from sequence analyses of mitochondrial DNA D-loop region

    Digital Repository Service at National Institute of Oceanography (India)

    GirishKumar; Kunal, S.P.; Menezes, M.R.; Meena, R.M.

    revealed from sequence analyses of mitochondrial DNA D-loop region Name of authors: 1. Girish Kumar* Biological Oceanography Division (BOD) National Institute of Oceanography (NIO) Dona Paula, Goa 403004, India. Email: girishkumar....nio@gmail.com Tel: +919766548060 2. Swaraj Priyaranjan Kunal Biological Oceanography Division (BOD) National Institute of Oceanography (NIO) Dona Paula, Goa 403004, India. Email: swar.mbt@gmail.com 3. Maria Rosalia Menezes Biological Oceanography...

  10. Molecular cloning and sequence analysis of a phenylalanine ammonia-lyase gene from Dendrobium.

    Directory of Open Access Journals (Sweden)

    Qing Jin

    Full Text Available In this study, a phenylalanine ammonia-lyase (PAL) gene was cloned from Dendrobium candidum using homology cloning and RACE. The full-length PAL cDNA of D. candidum (designated Dc-PAL1, GenBank No. JQ765748) has 2,458 bp and contains a complete open reading frame (ORF) of 2,142 bp, which encodes 713 amino acid residues. The amino acid sequence of DcPAL1 has more than 80% sequence identity with the PAL genes of other plants, as indicated by multiple alignments. The dominant sites and catalytic active sites, which are similar to those found in PAL proteins of Arabidopsis thaliana and Nicotiana tabacum, are also found in DcPAL1. Phylogenetic tree analysis revealed that DcPAL is more closely related to PALs from Orchidaceae plants than to those of other plants. The differential expression patterns of PAL in protocorm-like body, leaf, stem, and root suggest that the PAL gene performs multiple physiological functions in Dendrobium candidum.

  11. Sequencing Chromosomal Abnormalities Reveals Neurodevelopmental Loci that Confer Risk across Diagnostic Boundaries

    DEFF Research Database (Denmark)

    Talkowski, Michael E.; Rosenfeld, Jill A.; Blumenthal, Ian

    2012-01-01

    Sequencing of balanced chromosomal abnormalities, combined with convergent genomic studies of gene expression, copy-number variation, and genome-wide association, identifies 22 new loci that contribute to autism and related neurodevelopmental disorders. These data support a polygenic risk model...

  12. Messenger RNA biomarker signatures for forensic body fluid identification revealed by targeted RNA sequencing.

    Science.gov (United States)

    Hanson, E; Ingold, S; Haas, C; Ballantyne, J

    2018-05-01

    The recovery of a DNA profile from the perpetrator or victim in criminal investigations can provide valuable 'source level' information for investigators. However, a DNA profile does not reveal the circumstances by which biological material was transferred. Some contextual information can be obtained by a determination of the tissue or fluid source of origin of the biological material as it is potentially indicative of some behavioral activity on behalf of the individual that resulted in its transfer from the body. Here, we sought to improve upon established RNA based methods for body fluid identification by developing a targeted multiplexed next generation mRNA sequencing assay comprising a panel of approximately equal sized gene amplicons. The multiplexed biomarker panel includes several highly specific gene targets with the necessary specificity to definitively identify most forensically relevant biological fluids and tissues (blood, semen, saliva, vaginal secretions, menstrual blood and skin). In developing the biomarker panel we evaluated 66 gene targets, with a progressive iteration of testing target combinations that exhibited optimal sensitivity and specificity using a training set of forensically relevant body fluid samples. The current assay comprises 33 targets: 6 blood, 6 semen, 6 saliva, 4 vaginal secretions, 5 menstrual blood and 6 skin markers. We demonstrate the sensitivity and specificity of the assay and the ability to identify body fluids in single source and admixed stains. A 16 sample blind test was carried out by one lab with samples provided by the other participating lab. The blinded lab correctly identified the body fluids present in 15 of the samples with the major component identified in the 16th. Various classification methods are being investigated to permit inference of the body fluid/tissue in dried physiological stains. These include the percentage of reads in a sample that are due to each of the 6 tissues/body fluids tested and

  13. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    Science.gov (United States)

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  14. Classification, expression pattern and comparative analysis of sugarcane expressed sequences tags (ESTs) encoding glycine-rich proteins (GRPs)

    Directory of Open Access Journals (Sweden)

    Fusaro Adriana

    2001-01-01

    Full Text Available Since the isolation of the first glycine-rich proteins (GRPs) in plants, a wealth of new GRPs have been identified. The highly specific but diverse expression pattern of grp genes, taken together with the distinct sub-cellular localization of some GRP groups, clearly indicates that these proteins are involved in several independent physiological processes. Notwithstanding the absence of a clear definition of the role of GRPs in plant cells, studies conducted with these proteins have provided new and interesting insights into the molecular biology and cell biology of plants. Complexly regulated promoters and distinct mechanisms for the regulation of gene expression have been demonstrated and new protein targeting pathways, as well as the exportation of GRPs from different cell types, have been discovered. These data show that GRPs can be useful as markers and/or models to understand distinct aspects of plant biology. In this paper, the structural and functional features of these proteins in sugarcane (Saccharum officinarum L.) are summarized. Since this is the first description of GRPs in sugarcane, special emphasis has been given to the expression pattern of these GRP genes by studying their abundance and prevalence in the different cDNA-libraries of the Sugarcane Expressed Sequence Tag (SUCEST) project. The comparison of sugarcane GRPs with GRPs from other species is also discussed.

  15. Single-strand conformation polymorphism (SSCP)-based mutation scanning approaches to fingerprint sequence variation in ribosomal DNA of ascaridoid nematodes.

    Science.gov (United States)

    Zhu, X Q; Gasser, R B

    1998-06-01

    In this study, we assessed single-strand conformation polymorphism (SSCP)-based approaches for their capacity to fingerprint sequence variation in ribosomal DNA (rDNA) of ascaridoid nematodes of veterinary and/or human health significance. The second internal transcribed spacer region (ITS-2) of rDNA was utilised as the target region because it is known to provide species-specific markers for this group of parasites. ITS-2 was amplified by PCR from genomic DNA derived from individual parasites and subjected to analysis. Direct SSCP analysis of amplicons from seven taxa (Toxocara vitulorum, Toxocara cati, Toxocara canis, Toxascaris leonina, Baylisascaris procyonis, Ascaris suum and Parascaris equorum) showed that the single-strand (ss) ITS-2 patterns produced allowed their unequivocal identification to species. While no variation in SSCP patterns was detected in the ITS-2 within four species for which multiple samples were available, the method allowed the direct display of four distinct sequence types of ITS-2 among individual worms of T. cati. Comparison of SSCP/sequencing with the methods of dideoxy fingerprinting (ddF) and restriction endonuclease fingerprinting (REF) revealed that also ddF allowed the definition of the four sequence types, whereas REF displayed three of four. The findings indicate the usefulness of the SSCP-based approaches for the identification of ascaridoid nematodes to species, the direct display of sequence variation in rDNA and the detection of population variation. The ability to fingerprint microheterogeneity in ITS-2 rDNA using such approaches also has implications for studying fundamental aspects relating to mutational change in rDNA.

  16. Mitochondrial DNA Variation Reveals a Sharp Genetic Break within the Distribution of the Blue Land Crab Cardisoma guanhumi in the Western Central Atlantic

    Directory of Open Access Journals (Sweden)

    Maria Rosimere Xavier Amaral

    2015-08-01

    The blue land crab Cardisoma guanhumi is widely distributed throughout tropical and subtropical estuarine regions in the Western Central Atlantic (WCA). Patterns of population genetic structure and historical demographics of the species were assessed by mtDNA control region sequence analysis to examine the connectivity among five populations (n = 97) within the region for future conservation strategies and decision-making of fishery management. A total of 234 polymorphic nucleotides were revealed within the sequence region, which defined 93 distinct haplotypes. No dominant mtDNA haplotypes were found but instead a distribution of a few low-frequency recurrent haplotypes with a large number of singletons. A NJ-tree and a median-joining haplotype network revealed two distinct clusters, corresponding to individuals from estuaries located along the Caribbean Sea and Brazilian waters, respectively. AMOVA and FST statistics supported the hypothesis that two main geographic regions exist. Phylogeographical discontinuity was further demonstrated by the Bayesian assignment analysis and a significant pattern of isolation-by-distance. Additionally, tests of neutral evolution and analysis of mismatch distribution indicate a complex demographic history in the WCA, which corresponds to a bottleneck and subsequent population growth. Overall, a sharp genetic break between Caribbean and Brazilian populations raised concerns over the conservation status of the blue land crab.

  17. Probabilistic Motor Sequence Yields Greater Offline and Less Online Learning than Fixed Sequence.

    Science.gov (United States)

    Du, Yue; Prashad, Shikha; Schoenbrun, Ilana; Clark, Jane E

    2016-01-01

    It is well acknowledged that motor sequences can be learned quickly through online learning. Subsequently, the initial acquisition of a motor sequence is boosted or consolidated by offline learning. However, little is known about whether offline learning can drive the fast learning of motor sequences (i.e., initial sequence learning in the first training session). To examine offline learning in the fast learning stage, we asked four groups of young adults to perform the serial reaction time (SRT) task with either a fixed or probabilistic sequence and with or without preliminary knowledge (PK) of the presence of a sequence. The sequence and PK were manipulated to emphasize either procedural (probabilistic sequence; no preliminary knowledge (NPK)) or declarative (fixed sequence; with PK) memory that were found to either facilitate or inhibit offline learning. In the SRT task, there were six learning blocks with a 2 min break between each consecutive block. Throughout the session, stimuli followed the same fixed or probabilistic pattern except in Block 5, in which stimuli appeared in a random order. We found that PK facilitated the learning of a fixed sequence, but not a probabilistic sequence. In addition to overall learning measured by the mean reaction time (RT), we examined the progressive changes in RT within and between blocks (i.e., online and offline learning, respectively). It was found that the two groups who performed the fixed sequence, regardless of PK, showed greater online learning than the other two groups who performed the probabilistic sequence. The groups who performed the probabilistic sequence, regardless of PK, did not display online learning, as indicated by a decline in performance within the learning blocks. However, they did demonstrate remarkably greater offline improvement in RT, which suggests that they are learning the probabilistic sequence offline. These results suggest that in the SRT task, the fast acquisition of a motor sequence is driven

  18. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    Directory of Open Access Journals (Sweden)

    Cheryl-Emiliane Tien Chow

    2015-04-01

    Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs), remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10 m) and oxygen-starved basin (200 m) waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia, using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs) predicted across all 34 viral fosmids, 77.6% (n=5010) had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P) waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI’s non-redundant ‘nr’ database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single-cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems.

  19. Mitochondrial genomes reveal recombination in the presumed asexual Fusarium oxysporum species complex.

    Science.gov (United States)

    Brankovics, Balázs; van Dam, Peter; Rep, Martijn; de Hoog, G Sybren; J van der Lee, Theo A; Waalwijk, Cees; van Diepeningen, Anne D

    2017-09-18

    The Fusarium oxysporum species complex (FOSC) contains several phylogenetic lineages. Phylogenetic studies identified two to three major clades within the FOSC. The mitochondrial sequences are highly informative phylogenetic markers, but have been mostly neglected due to technical difficulties. A total of 61 complete mitogenomes of FOSC strains were de novo assembled and annotated. Length variations and intron patterns support the separation of three phylogenetic species. The variable region of the mitogenome that is typical for the genus Fusarium shows two new variants in the FOSC. The variant typical for Fusarium is found in members of all three clades, while variant 2 is found in clades 2 and 3 and variant 3 only in clade 2. The extended set of loci, analyzed using a new implementation of the genealogical concordance species recognition method, supports the identification of three phylogenetic species within the FOSC. Comparative analysis of the mitogenomes in the FOSC revealed ongoing mitochondrial recombination within, but not between, phylogenetic species. The recombination indicates the presence of a parasexual cycle in F. oxysporum. The obstacles hindering the usage of the mitogenomes are resolved by using next-generation sequencing and selective genome assemblers, such as GRAbB. Complete mitogenome sequences offer a stable basis and reference point for phylogenetic and population genetic studies.

  20. Oil palm genome sequence reveals divergence of interfertile species in old and new worlds

    Science.gov (United States)

    Singh, Rajinder; Ong-Abdullah, Meilina; Low, Eng-Ti Leslie; Manaf, Mohamad Arif Abdul; Rosli, Rozana; Nookiah, Rajanaidu; Ooi, Leslie Cheng-Li; Ooi, Siew–Eng; Chan, Kuang-Lim; Halim, Mohd Amin; Azizi, Norazah; Nagappan, Jayanthi; Bacher, Blaire; Lakey, Nathan; Smith, Steven W; He, Dong; Hogan, Michael; Budiman, Muhammad A; Lee, Ernest K; DeSalle, Rob; Kudrna, David; Goicoechea, Jose Louis; Wing, Rod; Wilson, Richard K; Fulton, Robert S; Ordway, Jared M; Martienssen, Robert A; Sambanthamurthi, Ravigadevi

    2013-01-01

    Oil palm is the most productive oil-bearing crop. Planted on only 5% of the total vegetable oil acreage, palm oil accounts for 33% of vegetable oil, and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8 gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators, which are highly expressed in the kernel. We also report the draft sequence of the S. American oil palm Elaeis oleifera, which has the same number of chromosomes (2n=32) and produces fertile interspecific hybrids with E. guineensis, but appears to have diverged in the new world. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations which restrict the use of clones in commercial plantings, and thus helps achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop. PMID:23883927