WorldWideScience

Sample records for sequence conservation identifies

  1. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    Science.gov (United States)

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons have been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion y of evolution.

  2. Spatially conserved regulatory elements identified within human and mouse Cd247 gene using high-throughput sequencing data from the ENCODE project

    DEFF Research Database (Denmark)

    Pundhir, Sachin; Hannibal, Tine Dahlbæk; Bang-Berthelsen, Claus Heiner

    2014-01-01

    . In this study, we have utilized the wealth of high-throughput sequencing data produced during the Encyclopedia of DNA Elements (ENCODE) project to identify spatially conserved regulatory elements within the Cd247 gene from human and mouse. We show the presence of two transcription factor binding sites...

  3. An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

    Science.gov (United States)

    Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

    2016-02-18

    The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the here described PD/VD protocol were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. Finally, we provide a general motif-HMM scanner which is easily accessible through

  4. Sequencing Conservation Actions Through Threat Assessments in the Southeastern United States

    Science.gov (United States)

    Robert D. Sutter; Christopher C. Szell

    2006-01-01

    The identification of conservation priorities is one of the leading issues in conservation biology. We present a project of The Nature Conservancy, called Sequencing Conservation Actions, which prioritizes conservation areas and identifies foci for crosscutting strategies at various geographic scales. We use the term “Sequencing” to mean an ordering of actions over...

  5. The Activity of Escherichia coli Dihydroorotate Dehydrogenase Is Dependent on a Conserved Loop Identified by Sequence Homology, Mutagenesis, and Limited Proteolysis

    DEFF Research Database (Denmark)

    Björnberg, Olof; Grüner, Anne Charlotte; Roepstorff, Peter

    1999-01-01

    of dihydroorotate dehydrogenases, but sedimentation in sucrose gradients suggests a dimeric structure also of the E. coli enzyme. Product inhibition showed that the E. coli enzyme, in contrast to the L. lactis enzyme, has separate binding sites for dihydroorotate and the electron acceptor. Trypsin readily cleaved...... the E. coli enzyme into two fragments of 182 and 154 residues, respectively. Cleavage reduced the activity more than 100-fold but left other molecular properties, including the heat stability, intact. The trypsin cleavage site, at R182, is positioned in a conserved region that, in the L. lactis enzyme......, forms a loop where a cysteine residue is very critical for activity. In the corresponding position, the enzyme from E. coli has a serine residue. Mutagenesis of this residue (S175) to alanine or cysteine reduced the activities 10000- and 500-fold, respectively. The S175C mutant was also defective...

  6. Identifying ambassador species for conservation marketing

    Directory of Open Access Journals (Sweden)

    E.A. Macdonald

    2017-10-01

    Full Text Available Conservation relies heavily on external funding, much of it from a supportive public. Therefore it is important to know which species are most likely to catalyse such funding. Whilst previous work has looked at the physical attributes that contribute to a species' appeal, no previous studies have tried to examine the extent to which a species' sympatriots might contribute to it's potential as flagship for wider conservation. Therefore, here we estimate ‘flexibility’ and ‘appeal’ scores for all terrestrial mammals (n = 4320 and identify which of these might serve as ambassadors (defined as both highly appealing and flexible. Relatively few mammals (between 240 and 331 emerged as ambassadors, with carnivores featuring heavily in this group (representing 5% of terrestrial mammals but 39% of ambassadors. ‘Top ambassadors’ were defined as those with both flexibility and appeal scores greater than 1 standard deviation above the mean. Less than a quarter of the 20 most endangered and evolutionary distinct species in this study were classed as ambassadors, highlighting the need for surrogate species to catalyse conservation effort in areas with such priority species. This is the first global analysis bringing together flexibility and appeal for all terrestrial mammals, and demonstrates an approach for determining how best to market species in order to achieve maximal conservation gain in a world with urgent conservation need but limited resources.

  7. On the relationship between residue structural environment and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao

    2017-09-01

    Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that the sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for residue's structural environment, calculated only based on the C α positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To know whether the C α atomic positions are adequate to show the relationship between residue environment and sequence conservation or not, here we compared C α atoms with other substructures in their contributions to the sequence conservation. Our results show that C α positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between C α atoms and the other substructures are high, yielding similar structure-conservation relationship. Take the WCN as an example, the average overlapping contribution to sequence conservation is 87% between C α and all-atom substructures. These results indicate that only C α atoms of a protein structure could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.

  8. In Vivo Enhancer Analysis Chromosome 16 Conserved NoncodingSequences

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega,Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel,Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificitiesin vertebrate genomes remains a significant challenge that is hampered bya lack of experimentally validated training sets. In this study, weleveraged extreme evolutionary sequence conservation as a filter toidentify putative gene regulatory elements and characterized the in vivoenhancer activity of human-fish conserved and ultraconserved1 noncodingelements on human chromosome 16 as well as such elements from elsewherein the genome. We initially tested 165 of these extremely conservedsequences in a transgenic mouse enhancer assay and observed that 48percent (79/165) functioned reproducibly as tissue-specific enhancers ofgene expression at embryonic day 11.5. While driving expression in abroad range of anatomical structures in the embryo, the majority of the79 enhancers drove expression in various regions of the developingnervous system. Studying a set of DNA elements that specifically droveforebrain expression, we identified DNA signatures specifically enrichedin these elements and used these parameters to rank all ~;3,400human-fugu conserved noncoding elements in the human genome. The testingof the top predictions in transgenic mice resulted in a three-foldenrichment for sequences with forebrain enhancer activity. These datadramatically expand the catalogue of in vivo-characterized human geneenhancers and illustrate the future utility of such training sets for avariety of iological applications including decoding the regulatoryvocabulary of the human genome.

  9. The relationship of protein conservation and sequence length

    Directory of Open Access Journals (Sweden)

    Panchenko Anna R

    2002-11-01

    Full Text Available Abstract Background In general, the length of a protein sequence is determined by its function and the wide variance in the lengths of an organism's proteins reflects the diversity of specific functional roles for these proteins. However, additional evolutionary forces that affect the length of a protein may be revealed by studying the length distributions of proteins evolving under weaker functional constraints. Results We performed sequence comparisons to distinguish highly conserved and poorly conserved proteins from the bacterium Escherichia coli, the archaeon Archaeoglobus fulgidus, and the eukaryotes Saccharomyces cerevisiae, Drosophila melanogaster, and Homo sapiens. For all organisms studied, the conserved and nonconserved proteins have strikingly different length distributions. The conserved proteins are, on average, longer than the poorly conserved ones, and the length distributions for the poorly conserved proteins have a relatively narrow peak, in contrast to the conserved proteins whose lengths spread over a wider range of values. For the two prokaryotes studied, the poorly conserved proteins approximate the minimal length distribution expected for a diverse range of structural folds. Conclusions There is a relationship between protein conservation and sequence length. For all the organisms studied, there seems to be a significant evolutionary trend favoring shorter proteins in the absence of other, more specific functional constraints.

  10. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  11. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    Science.gov (United States)

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. Since the run time is linear and the sequences are converted into a compact clustering model, we are able to

  12. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma

    Energy Technology Data Exchange (ETDEWEB)

    Krauthammer, Michael; Kong, Yong; Ha, Byung Hak; Evans, Perry; Bacchiocchi, Antonella; McCusker, James P.; Cheng, Elaine; Davis, Matthew J.; Goh, Gerald; Choi, Murim; Ariyan, Stephan; Narayan, Deepak; Dutton-Regester, Ken; Capatana, Ana; Holman, Edna C.; Bosenberg, Marcus; Sznol, Mario; Kluger, Harriet M.; Brash, Douglas E.; Stern, David F.; Materin, Miguel A.; Lo, Roger S.; Mane, Shrikant; Ma, Shuangge; Kidd, Kenneth K.; Hayward, Nicholas K.; Lifton, Richard P.; Schlessinger, Joseph; Boggon, Titus J.; Halaban, Ruth (Yale-MED); (UCLA); (Queens)

    2012-10-11

    We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1{sup P29S}) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1{sup P29S} showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.

  13. Close Sequence Comparisons are Sufficient to Identify Humancis-Regulatory Elements

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  14. Evolutionary growth process of highly conserved sequences in vertebrate genomes.

    Science.gov (United States)

    Ishibashi, Minaka; Noda, Akiko Ogura; Sakate, Ryuichi; Imanishi, Tadashi

    2012-08-01

    Genome sequence comparison between evolutionarily distant species revealed ultraconserved elements (UCEs) among mammals under strong purifying selection. Most of them were also conserved among vertebrates. Because they tend to be located in the flanking regions of developmental genes, they would have fundamental roles in creating vertebrate body plans. However, the evolutionary origin and selection mechanism of these UCEs remain unclear. Here we report that UCEs arose in primitive vertebrates, and gradually grew in vertebrate evolution. We searched for UCEs in two teleost fishes, Tetraodon nigroviridis and Oryzias latipes, and found 554 UCEs with 100% identity over 100 bps. Comparison of teleost and mammalian UCEs revealed 43 pairs of common, jawed-vertebrate UCEs (jUCE) with high sequence identities, ranging from 83.1% to 99.2%. Ten of them retain lower similarities to the Petromyzon marinus genome, and the substitution rates of four non-exonic jUCEs were reduced after the teleost-mammal divergence, suggesting that robust conservation had been acquired in the jawed vertebrate lineage. Our results indicate that prototypical UCEs originated before the divergence of jawed and jawless vertebrates and have been frozen as perfect conserved sequences in the jawed vertebrate lineage. In addition, our comparative sequence analyses of UCEs and neighboring regions resulted in a discovery of lineage-specific conserved sequences. They were added progressively to prototypical UCEs, suggesting step-wise acquisition of novel regulatory roles. Our results indicate that conserved non-coding elements (CNEs) consist of blocks with distinct evolutionary history, each having been frozen since different evolutionary era along the vertebrate lineage. Copyright © 2012 Elsevier B.V. All rights reserved.

  15. Highly conserved non-coding sequences are associated with vertebrate development.

    Directory of Open Access Journals (Sweden)

    Adam Woolfe

    2005-01-01

    Full Text Available In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH, in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development

  16. Accelerated Evolution of Conserved Noncoding Sequences in theHuman Genome

    Energy Technology Data Exchange (ETDEWEB)

    Prambhakar, Shyam; Noonan, James P.; Paabo, Svante; Rubin, EdwardM.

    2006-07-06

    Genomic comparisons between human and distant, non-primatemammals are commonly used to identify cis-regulatory elements based onconstrained sequence evolution. However, these methods fail to detect"cryptic" functional elements, which are too weakly conserved amongmammals to distinguish from nonfunctional DNA. To address this problem,we explored the potential of deep intra-primate sequence comparisons. Wesequenced the orthologs of 558 kb of human genomic sequence, coveringmultiple loci involved in cholesterol homeostasis, in 6 nonhumanprimates. Our analysis identified 6 noncoding DNA elements displayingsignificant conservation among primates, but undetectable in more distantcomparisons. In vitro and in vivo tests revealed that at least three ofthese 6 elements have regulatory function. Notably, the mouse orthologsof these three functional human sequences had regulatory activity despitetheir lack of significant sequence conservation, indicating that they arecryptic ancestral cis-regulatory elements. These regulatory elementscould still be detected in a smaller set of three primate speciesincluding human, rhesus and marmoset. Since the human and rhesus genomesequences are already available, and the marmoset genome is activelybeing sequenced, the primate-specific conservation analysis describedhere can be applied in the near future on a whole-genome scale, tocomplement the annotation provided by more distant speciescomparisons.

  17. Detection of Weakly Conserved Ancestral Mammalian RegulatorySequences by Primate Comparisons

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng,Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primatemammals are commonly used to identify cis-regulatory elements based onconstrained sequence evolution. However, these methods fail to detectcryptic functional elements, which are too weakly conserved among mammalsto distinguish from nonfunctional DNA. To address this problem, weexplored the potential of deep intra-primate sequence comparisons. Wesequenced the orthologs of 558 kb of human genomic sequence, coveringmultiple loci involved in cholesterol homeostasis, in 6 nonhumanprimates. Our analysis identified 6 noncoding DNA elements displayingsignificant conservation among primates, but undetectable in more distantcomparisons. In vitro and in vivo tests revealed that at least three ofthese 6 elements have regulatory function. Notably, the mouse orthologsof these three functional human sequences had regulatory activity despitetheir lack of significant sequence conservation, indicating that they arecryptic ancestral cis-regulatory elements. These regulatory elementscould still be detected in a smaller set of three primate speciesincluding human, rhesus and marmoset. Since the human and rhesus genomesequences are already available, and the marmoset genome is activelybeing sequenced, the primate-specific conservation analysis describedhere can be applied in the near future on a whole-genome scale, tocomplement the annotation provided by more distant speciescomparisons.

  18. Identifying species conservation strategies to reduce disease-associated declines

    Science.gov (United States)

    Gerber, Brian D.; Converse, Sarah J.; Muths, Erin L.; Crockett, Harry J.; Mosher, Brittany A.; Bailey, Larissa L.

    2018-01-01

    Emerging infectious diseases (EIDs) are a salient threat to many animal taxa, causing local and global extinctions, altering communities and ecosystem function. The EID chytridiomycosis is a prominent driver of amphibian declines, which is caused by the fungal pathogen Batrachochytrium dendrobatidis (Bd). To guide conservation policy, we developed a predictive decision-analytic model that combines empirical knowledge of host-pathogen metapopulation dynamics with expert judgment regarding effects of management actions, to select from potential conservation strategies. We apply our approach to a boreal toad (Anaxyrus boreas boreas) and Bd system, identifying optimal strategies that balance tradeoffs in maximizing toad population persistence and landscape-level distribution, while considering costs. The most robust strategy is expected to reduce the decline of toad breeding sites from 53% to 21% over 50 years. Our findings are incorporated into management policy to guide conservation planning. Our online modeling application provides a template for managers of other systems challenged by EIDs.

  19. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    Science.gov (United States)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  20. Social Network Analysis Identifies Key Participants in Conservation Development.

    Science.gov (United States)

    Farr, Cooper M; Reed, Sarah E; Pejchar, Liba

    2018-05-01

    Understanding patterns of participation in private lands conservation, which is often implemented voluntarily by individual citizens and private organizations, could improve its effectiveness at combating biodiversity loss. We used social network analysis (SNA) to examine participation in conservation development (CD), a private land conservation strategy that clusters houses in a small portion of a property while preserving the remaining land as protected open space. Using data from public records for six counties in Colorado, USA, we compared CD participation patterns among counties and identified actors that most often work with others to implement CDs. We found that social network characteristics differed among counties. The network density, or proportion of connections in the network, varied from fewer than 2 to nearly 15%, and was higher in counties with smaller populations and fewer CDs. Centralization, or the degree to which connections are held disproportionately by a few key actors, was not correlated strongly with any county characteristics. Network characteristics were not correlated with the prevalence of wildlife-friendly design features in CDs. The most highly connected actors were biological and geological consultants, surveyors, and engineers. Our work demonstrates a new application of SNA to land-use planning, in which CD network patterns are examined and key actors are identified. For better conservation outcomes of CD, we recommend using network patterns to guide strategies for outreach and information dissemination, and engaging with highly connected actor types to encourage widespread adoption of best practices for CD design and stewardship.

  1. Conservation patterns in different functional sequence categoriesof divergent Drosophila species

    Energy Technology Data Exchange (ETDEWEB)

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conservedungapped blocks in genome-wide pairwise alignments of recently completedspecies of Drosophila: D.yakuba, D.ananassae, D.pseudoobscura, D.virilisand D.mojavensis. Based on these distributions we have found that nearlyevery functional sequence category possesses its own distinctiveconservation pattern, sometimes independent of the overall sequenceconservation level. In the coding and regulatory regions, the ungappedblocks were longer than in introns, UTRs and non-functional sequences. Atthe same time, the blocks in the coding regions carried 3N+2 signaturecharacteristic to synonymic substitutions in the 3rd codon positions.Larger block sizes in transcription regulatory regions can be explainedby the presence of conserved arrays of binding sites for transcriptionfactors. We also have shown that the longest ungapped blocks, or'ultraconserved' sequences, are associated with specific gene groups,including those encoding ion channels and components of the cytoskeleton.We discussed how restrained conservation patterns may help in mappingfunctional sequence categories and improving genomeannotation.

  2. Allele frequencies of variants in ultra conserved elements identify selective pressure on transcription factor binding.

    Directory of Open Access Journals (Sweden)

    Toomas Silla

    Full Text Available Ultra-conserved genes or elements (UCGs/UCEs in the human genome are extreme examples of conservation. We characterized natural variations in 2884 UCEs and UCGs in two distinct populations; Singaporean Chinese (n = 280 and Italian (n = 501 by using a pooled sample, targeted capture, sequencing approach. We identify, with high confidence, in these regions the abundance of rare SNVs (MAF5% are more often found in relatively less-conserved nucleotides within UCEs, compared to rare variants. Moreover, prevalent variants are less likely to overlap transcription factor binding site. Using SNPfold we found no significant influence of RNA secondary structure on UCE conservation. All together, these results suggest UCEs are not under selective pressure as a stretch of DNA but are under differential evolutionary pressure on the single nucleotide level.

  3. Allele frequencies of variants in ultra conserved elements identify selective pressure on transcription factor binding.

    Science.gov (United States)

    Silla, Toomas; Kepp, Katrin; Tai, E Shyong; Goh, Liang; Davila, Sonia; Catela Ivkovic, Tina; Calin, George A; Voorhoeve, P Mathijs

    2014-01-01

    Ultra-conserved genes or elements (UCGs/UCEs) in the human genome are extreme examples of conservation. We characterized natural variations in 2884 UCEs and UCGs in two distinct populations; Singaporean Chinese (n = 280) and Italian (n = 501) by using a pooled sample, targeted capture, sequencing approach. We identify, with high confidence, in these regions the abundance of rare SNVs (MAFpower for association studies. By combining our data with 1000 Genome Project data, we show in three independent datasets that prevalent UCE variants (MAF>5%) are more often found in relatively less-conserved nucleotides within UCEs, compared to rare variants. Moreover, prevalent variants are less likely to overlap transcription factor binding site. Using SNPfold we found no significant influence of RNA secondary structure on UCE conservation. All together, these results suggest UCEs are not under selective pressure as a stretch of DNA but are under differential evolutionary pressure on the single nucleotide level.

  4. The BsaHI restriction-modification system: Cloning, sequencing and analysis of conserved motifs

    Directory of Open Access Journals (Sweden)

    Roberts Richard J

    2008-05-01

    Full Text Available Abstract Background Restriction and modification enzymes typically recognise short DNA sequences of between two and eight bases in length. Understanding the mechanism of this recognition represents a significant challenge that we begin to address for the BsaHI restriction-modification system, which recognises the six base sequence GRCGYC. Results The DNA sequences of the genes for the BsaHI methyltransferase, bsaHIM, and restriction endonuclease, bsaHIR, have been determined (GenBank accession #EU386360, cloned and expressed in E. coli. Both the restriction endonuclease and methyltransferase enzymes share significant similarity with a group of 6 other enzymes comprising the restriction-modification systems HgiDI and HgiGI and the putative HindVP, NlaCORFDP, NpuORFC228P and SplZORFNP restriction-modification systems. A sequence alignment of these homologues shows that their amino acid sequences are largely conserved and highlights several motifs of interest. We target one such conserved motif, reading SPERRFD, at the C-terminal end of the bsaHIR gene. A mutational analysis of these amino acids indicates that the motif is crucial for enzymatic activity. Sequence alignment of the methyltransferase gene reveals a short motif within the target recognition domain that is conserved among enzymes recognising the same sequences. Thus, this motif may be used as a diagnostic tool to define the recognition sequences of the cytosine C5 methyltransferases. Conclusion We have cloned and sequenced the BsaHI restriction and modification enzymes. We have identified a region of the R. BsaHI enzyme that is crucial for its activity. Analysis of the amino acid sequence of the BsaHI methyltransferase enzyme led us to propose two new motifs that can be used in the diagnosis of the recognition sequence of the cytosine C5-methyltransferases.

  5. High-throughput sequencing, characterization and detection of new and conserved cucumber miRNAs.

    Directory of Open Access Journals (Sweden)

    Germán Martínez

    Full Text Available Micro RNAS (miRNAs are a class of endogenous small non coding RNAs involved in the post-transcriptional regulation of gene expression. In plants, a great number of conserved and specific miRNAs, mainly arising from model species, have been identified to date. However less is known about the diversity of these regulatory RNAs in vegetal species with agricultural and/or horticultural importance. Here we report a combined approach of bioinformatics prediction, high-throughput sequencing data and molecular methods to analyze miRNAs populations in cucumber (Cucumis sativus plants. A set of 19 conserved and 6 known but non-conserved miRNA families were found in our cucumber small RNA dataset. We also identified 7 (3 with their miRNA* strand not previously described miRNAs, candidates to be cucumber-specific. To validate their description these new C. sativus miRNAs were detected by northern blot hybridization. Additionally, potential targets for most conserved and new miRNAs were identified in cucumber genome.In summary, in this study we have identified, by first time, conserved, known non-conserved and new miRNAs arising from an agronomically important species such as C. sativus. The detection of this complex population of regulatory small RNAs suggests that similarly to that observe in other plant species, cucumber miRNAs may possibly play an important role in diverse biological and metabolic processes.

  6. WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2007-02-01

    Full Text Available Abstract Background This work addresses the problem of detecting conserved transcription factor binding sites and in general regulatory regions through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever increasing amount of genomic data available. Results We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Differently from the most commonly used methods, the approach we present does not need or compute an alignment of the sequences investigated, nor resorts to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation with respect to the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. Conclusion Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performances than other available methods for the same task. We also show examples on how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

  7. smRNAome profiling to identify conserved and novel microRNAs in Stevia rebaudiana Bertoni

    Science.gov (United States)

    2012-01-01

    Background MicroRNAs (miRNAs) constitute a family of small RNA (sRNA) population that regulates the gene expression and plays an important role in plant development, metabolism, signal transduction and stress response. Extensive studies on miRNAs have been performed in different plants such as Arabidopsis thaliana, Oryza sativa etc. and volume of the miRNA database, mirBASE, has been increasing on day to day basis. Stevia rebaudiana Bertoni is an important perennial herb which accumulates high concentrations of diterpene steviol glycosides which contributes to its high indexed sweetening property with no calorific value. Several studies have been carried out for understanding molecular mechanism involved in biosynthesis of these glycosides, however, information about miRNAs has been lacking in S. rebaudiana. Deep sequencing of small RNAs combined with transcriptomic data is a powerful tool for identifying conserved and novel miRNAs irrespective of availability of genome sequence data. Results To identify miRNAs in S. rebaudiana, sRNA library was constructed and sequenced using Illumina genome analyzer II. A total of 30,472,534 reads representing 2,509,190 distinct sequences were obtained from sRNA library. Based on sequence similarity, we identified 100 miRNAs belonging to 34 highly conserved families. Also, we identified 12 novel miRNAs whose precursors were potentially generated from stevia EST and nucleotide sequences. All novel sequences have not been earlier described in other plant species. Putative target genes were predicted for most conserved and novel miRNAs. The predicted targets are mainly mRNA encoding enzymes regulating essential plant metabolic and signaling pathways. Conclusions This study led to the identification of 34 highly conserved miRNA families and 12 novel potential miRNAs indicating that specific miRNAs exist in stevia species. Our results provided information on stevia miRNAs and their targets building a foundation for future studies to

  8. Relationships between residue Voronoi volume and sequence conservation in proteins.

    Science.gov (United States)

    Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung

    2018-02-01

    Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume-van der Waals volume)/Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Extreme sequence divergence but conserved ligand-binding specificity in Streptococcus pyogenes M protein.

    Directory of Open Access Journals (Sweden)

    2006-05-01

    Full Text Available Many pathogenic microorganisms evade host immunity through extensive sequence variability in a protein region targeted by protective antibodies. In spite of the sequence variability, a variable region commonly retains an important ligand-binding function, reflected in the presence of a highly conserved sequence motif. Here, we analyze the limits of sequence divergence in a ligand-binding region by characterizing the hypervariable region (HVR of Streptococcus pyogenes M protein. Our studies were focused on HVRs that bind the human complement regulator C4b-binding protein (C4BP, a ligand that confers phagocytosis resistance. A previous comparison of C4BP-binding HVRs identified residue identities that could be part of a binding motif, but the extended analysis reported here shows that no residue identities remain when additional C4BP-binding HVRs are included. Characterization of the HVR in the M22 protein indicated that two relatively conserved Leu residues are essential for C4BP binding, but these residues are probably core residues in a coiled-coil, implying that they do not directly contribute to binding. In contrast, substitution of either of two relatively conserved Glu residues, predicted to be solvent-exposed, had no effect on C4BP binding, although each of these changes had a major effect on the antigenic properties of the HVR. Together, these findings show that HVRs of M proteins have an extraordinary capacity for sequence divergence and antigenic variability while retaining a specific ligand-binding function.

  10. Identifying structural variants using linked-read sequencing data.

    Science.gov (United States)

    Elyanow, Rebecca; Wu, Hsin-Ta; Raphael, Benjamin J

    2017-11-03

    Structural variation, including large deletions, duplications, inversions, translocations, and other rearrangements, is common in human and cancer genomes. A number of methods have been developed to identify structural variants from Illumina short-read sequencing data. However, reliable identification of structural variants remains challenging because many variants have breakpoints in repetitive regions of the genome and thus are difficult to identify with short reads. The recently developed linked-read sequencing technology from 10X Genomics combines a novel barcoding strategy with Illumina sequencing. This technology labels all reads that originate from a small number (~5-10) DNA molecules ~50Kbp in length with the same molecular barcode. These barcoded reads contain long-range sequence information that is advantageous for identification of structural variants. We present Novel Adjacency Identification with Barcoded Reads (NAIBR), an algorithm to identify structural variants in linked-read sequencing data. NAIBR predicts novel adjacencies in a individual genome resulting from structural variants using a probabilistic model that combines multiple signals in barcoded reads. We show that NAIBR outperforms several existing methods for structural variant identification - including two recent methods that also analyze linked-reads - on simulated sequencing data and 10X whole-genome sequencing data from the NA12878 human genome and the HCC1954 breast cancer cell line. Several of the novel somatic structural variants identified in HCC1954 overlap known cancer genes. Software is available at compbio.cs.brown.edu/software. braphael@princeton.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  11. Metazoan Remaining Genes for Essential Amino Acid Biosynthesis: Sequence Conservation and Evolutionary Analyses

    Directory of Open Access Journals (Sweden)

    Igor R. Costa

    2014-12-01

    Full Text Available Essential amino acids (EAA consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for the de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the metazoan remaining genes for EAA pathways. Initially, the set of all 49 enzymes responsible for the EAA de novo biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS and betaine-homocysteine S-methyltransferase (BHMT diverged from the expected Tree of Life (ToL relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.

  12. Deep sequencing discovery of novel and conserved microRNAs in trifoliate orange (Citrus trifoliata

    Directory of Open Access Journals (Sweden)

    Yu Huaping

    2010-07-01

    Full Text Available Abstract Background MicroRNAs (miRNAs play a critical role in post-transcriptional gene regulation and have been shown to control many genes involved in various biological and metabolic processes. There have been extensive studies to discover miRNAs and analyze their functions in model plant species, such as Arabidopsis and rice. Deep sequencing technologies have facilitated identification of species-specific or lowly expressed as well as conserved or highly expressed miRNAs in plants. Results In this research, we used Solexa sequencing to discover new microRNAs in trifoliate orange (Citrus trifoliata which is an important rootstock of citrus. A total of 13,106,753 reads representing 4,876,395 distinct sequences were obtained from a short RNA library generated from small RNA extracted from C. trifoliata flower and fruit tissues. Based on sequence similarity and hairpin structure prediction, we found that 156,639 reads representing 63 sequences from 42 highly conserved miRNA families, have perfect matches to known miRNAs. We also identified 10 novel miRNA candidates whose precursors were all potentially generated from citrus ESTs. In addition, five miRNA* sequences were also sequenced. These sequences had not been earlier described in other plant species and accumulation of the 10 novel miRNAs were confirmed by qRT-PCR analysis. Potential target genes were predicted for most conserved and novel miRNAs. Moreover, four target genes including one encoding IRX12 copper ion binding/oxidoreductase and three genes encoding NB-LRR disease resistance protein have been experimentally verified by detection of the miRNA-mediated mRNA cleavage in C. trifoliata. Conclusion Deep sequencing of short RNAs from C. trifoliata flowers and fruits identified 10 new potential miRNAs and 42 highly conserved miRNA families, indicating that specific miRNAs exist in C. trifoliata. These results show that regulatory miRNAs exist in agronomically important trifoliate orange

  13. Dipeptide frequency/bias analysis identifies conserved sites of nonrandomness shared by cysteine-rich motifs.

    Science.gov (United States)

    Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N

    2001-08-15

    This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.

  14. Identifying Important Atlantic Areas for the conservation of Balearic shearwaters: Spatial overlap with conservation areas

    Science.gov (United States)

    Pérez-Roda, Amparo; Delord, Karine; Boué, Amélie; Arcos, José Manuel; García, David; Micol, Thierry; Weimerskirch, Henri; Pinaud, David; Louzao, Maite

    2017-07-01

    Marine protected areas (MPAs) are considered one of the main tools in both fisheries and conservation management to protect threatened species and their habitats around the globe. However, MPAs are underrepresented in marine environments compared to terrestrial environments. Within this context, we studied the Atlantic non-breeding distribution of the southern population of Balearic shearwaters (Puffinus mauretanicus) breeding in Eivissa during the 2011-2012 period based on global location sensing (GLS) devices. Our objectives were (1) to identify overall Important Atlantic Areas (IAAs) from a southern population, (2) to describe spatio-temporal patterns of oceanographic habitat use, and (3) to assess whether existing conservation areas (Natura 2000 sites and marine Important Bird Areas (IBAs)) cover the main IAAs of Balearic shearwaters. Our results highlighted that the Atlantic staging (from June to October in 2011) dynamic of the southern population was driven by individual segregation at both spatial and temporal scales. Individuals ranged in the North-East Atlantic over four main IAAs (Bay of Biscay: BoB, Western Iberian shelf: WIS, Gulf of Cadiz: GoC, West of Morocco: WoM). While most individuals spent more time on the WIS or in the GoC, a small number of birds visited IAAs at the extremes of their Atlantic distribution range (i.e., BoB and WoM). The chronology of the arrivals to the IAAs showed a latitudinal gradient with northern areas reached earlier during the Atlantic staging. The IAAs coincided with the most productive areas (higher chlorophyll a values) in the NE Atlantic between July and October. The spatial overlap between IAAs and conservation areas was higher for Natura 2000 sites than marine IBAs (areas with and without legal protection, respectively). Concerning the use of these areas, a slightly higher proportion of estimated positions fell within marine IBAs compared to designated Natura 2000 sites, with Spanish and Portuguese conservation

  15. Exome sequencing identifies ZNF644 mutations in high myopia.

    Directory of Open Access Journals (Sweden)

    Yi Shi

    2011-06-01

    Full Text Available Myopia is the most common ocular disorder worldwide, and high myopia in particular is one of the leading causes of blindness. Genetic factors play a critical role in the development of myopia, especially high myopia. Recently, the exome sequencing approach has been successfully used for the disease gene identification of Mendelian disorders. Here we show a successful application of exome sequencing to identify a gene for an autosomal dominant disorder, and we have identified a gene potentially responsible for high myopia in a monogenic form. We captured exomes of two affected individuals from a Han Chinese family with high myopia and performed sequencing analysis by a second-generation sequencer with a mean coverage of 30× and sufficient depth to call variants at ∼97% of each targeted exome. The shared genetic variants of these two affected individuals in the family being studied were filtered against the 1000 Genomes Project and the dbSNP131 database. A mutation A672G in zinc finger protein 644 isoform 1 (ZNF644 was identified as being related to the phenotype of this family. After we performed sequencing analysis of the exons in the ZNF644 gene in 300 sporadic cases of high myopia, we identified an additional five mutations (I587V, R680G, C699Y, 3'UTR+12 C>G, and 3'UTR+592 G>A in 11 different patients. All these mutations were absent in 600 normal controls. The ZNF644 gene was expressed in human retinal and retinal pigment epithelium (RPE. Given that ZNF644 is predicted to be a transcription factor that may regulate genes involved in eye development, mutation may cause the axial elongation of eyeball found in high myopia patients. Our results suggest that ZNF644 might be a causal gene for high myopia in a monogenic form.

  16. BlockLogo: Visualization of peptide and sequence motif conservation

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Simon, Christian

    2013-01-01

    BlockLogo is a web-server application for the visualization of protein and nucleotide fragments, continuous protein sequence motifs, and discontinuous sequence motifs using calculation of block entropy from multiple sequence alignments. The user input consists of a multiple sequence alignment, se...

  17. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    Science.gov (United States)

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  18. The interplay of sequence conservation and T cell immune recognition

    DEFF Research Database (Denmark)

    Bresciani, Anne Gøther; Sette, Alessandro; Greenbaum, Jason

    2014-01-01

    examined the hypothesis that conservation of a peptide in bacteria that are part of the healthy human microbiome leads to a reduced level of immunogenicity due to tolerization of T cells to the commensal bacteria. This was done by comparing experimentally characterized T cell epitope recognition data from...... the Immune Epitope Database with their conservation in the human microbiome. Indeed, we did see a lower immunogenicity for conserved peptides conserved. While many aspects how this conservation comparison is done require further optimization, this is a first step towards a better understanding T cell...... recognition of peptides in bacterial pathogens is influenced by their conservation in commensal bacteria. If the further work proves that this approach is successful, the degree of overlap of a peptide with the human proteome or microbiome could be added to the arsenal of tools available to assess peptide...

  19. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    Science.gov (United States)

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  20. An Evolutionarily Young Polar Bear (Ursus maritimus Endogenous Retrovirus Identified from Next Generation Sequence Data

    Directory of Open Access Journals (Sweden)

    Kyriakos Tsangaras

    2015-11-01

    Full Text Available Transcriptome analysis of polar bear (Ursus maritimus tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV. Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos and black bear (Ursus americanus but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals.

  1. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data.

    Science.gov (United States)

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E; Greenwood, Alex D

    2015-11-24

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals.

  2. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    Science.gov (United States)

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E.; Greenwood, Alex D.

    2015-01-01

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals. PMID:26610552

  3. Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences.

    LENUS (Irish Health Repository)

    Ivanov, Ivaylo P

    2011-05-01

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5\\' cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized--both for increased coding capacity and potentially also for novel regulatory mechanisms--remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5\\' untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.

  4. Effects of temperature and mass conservation on the typical chemical sequences of hydrogen oxidation

    Science.gov (United States)

    Nicholson, Schuyler B.; Alaghemandi, Mohammad; Green, Jason R.

    2018-01-01

    Macroscopic properties of reacting mixtures are necessary to design synthetic strategies, determine yield, and improve the energy and atom efficiency of many chemical processes. The set of time-ordered sequences of chemical species are one representation of the evolution from reactants to products. However, only a fraction of the possible sequences is typical, having the majority of the joint probability and characterizing the succession of chemical nonequilibrium states. Here, we extend a variational measure of typicality and apply it to atomistic simulations of a model for hydrogen oxidation over a range of temperatures. We demonstrate an information-theoretic methodology to identify typical sequences under the constraints of mass conservation. Including these constraints leads to an improved ability to learn the chemical sequence mechanism from experimentally accessible data. From these typical sequences, we show that two quantities defining the variational typical set of sequences—the joint entropy rate and the topological entropy rate—increase linearly with temperature. These results suggest that, away from explosion limits, data over a narrow range of thermodynamic parameters could be sufficient to extrapolate these typical features of combustion chemistry to other conditions.

  5. Exome sequencing identifies SUCO mutations in mesial temporal lobe epilepsy.

    Science.gov (United States)

    Sha, Zhiqiang; Sha, Longze; Li, Wenting; Dou, Wanchen; Shen, Yan; Wu, Liwen; Xu, Qi

    2015-03-30

    Mesial temporal lobe epilepsy (mTLE) is the main type and most common medically intractable form of epilepsy. Severity of disease-based stratified samples may help identify new disease-associated mutant genes. We analyzed mRNA expression profiles from patient hippocampal tissue. Three of the seven patients had severe mTLE with generalized-onset convulsions and consciousness loss that occurred over many years. We found that compared with other groups, patients with severe mTLE were classified into a distinct group. Whole-exome sequencing and Sanger sequencing validation in all seven patients identified three novel SUN domain-containing ossification factor (SUCO) mutations in severely affected patients. Furthermore, SUCO knock down significantly reduced dendritic length in vitro. Our results indicate that mTLE defects may affect neuronal development, and suggest that neurons have abnormal development due to lack of SUCO, which may be a generalized-onset epilepsy-related gene. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  6. Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts

    Directory of Open Access Journals (Sweden)

    Ouyang Shu

    2005-09-01

    Full Text Available Abstract Background The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. Results All available ESTs and Expressed Transcripts (ETs, 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana, were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. Conclusion Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.

  7. Unique Trichomonas vaginalis gene sequences identified in multinational regions of Northwest China.

    Science.gov (United States)

    Liu, Jun; Feng, Meng; Wang, Xiaolan; Fu, Yongfeng; Ma, Cailing; Cheng, Xunjia

    2017-07-24

    Trichomonas vaginalis (T. vaginalis) is a flagellated protozoan parasite that infects humans worldwide. This study determined the sequence of the 18S ribosomal RNA gene of T. vaginalis infecting both females and males in Xinjiang, China. Samples from 73 females and 28 males were collected and confirmed for infection with T. vaginalis, a total of 110 sequences were identified when the T. vaginalis 18S ribosomal RNA gene was sequenced. These sequences were used to prepare a phylogenetic network. The rooted network comprised three large clades and several independent branches. Most of the Xinjiang sequences were in one group. Preliminary results suggest that Xinjiang T. vaginalis isolates might be genetically unique, as indicated by the sequence of their 18S ribosomal RNA gene. Low migration rate of local people in this province may contribute to a genetic conservativeness of T. vaginalis. The unique genetic feature of our isolates may suggest a different clinical presentation of trichomoniasis, including metronidazole susceptibility, T. vaginalis virus or Mycoplasma co-infection characteristics. The transmission and evolution of Xinjiang T. vaginalis is of interest and should be studied further. More attention should be given to T. vaginalis infection in both females and males in Xinjiang.

  8. Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes

    Directory of Open Access Journals (Sweden)

    Maggi Giorgio P

    2008-06-01

    Full Text Available Abstract Background The accurate detection of genes and the identification of functional regions is still an open issue in the annotation of genomic sequences. This problem affects new genomes but also those of very well studied organisms such as human and mouse where, despite the great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to address this problem. Unfortunately it is limited by the computational requirements needed to perform genome-wide comparisons and by the problem of discriminating between conserved coding and non-coding sequences. This discrimination is often based (thus dependent on the availability of annotated proteins. Results In this paper we present the results of a comprehensive comparison of human and mouse genomes performed with a new high throughput grid-based system which allows the rapid detection of conserved sequences and accurate assessment of their coding potential. By detecting clusters of coding conserved sequences the system is also suitable to accurately identify potential gene loci. Following this analysis we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly we were able to detect several potential gene loci supported by EST sequences but not corresponding to as yet annotated genes. Conclusion Here we present a new system which allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and the identification of potential gene loci. Our system does not require the availability of any annotated sequence thus is suitable for the analysis of new or poorly annotated genomes.

  9. MetaPhinder-Identifying Bacteriophage Sequences in Metagenomic Data Sets

    DEFF Research Database (Denmark)

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole

    2016-01-01

    genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source...... and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metage-nomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accomodate for the mosaic...... code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder....

  10. Sequence conservation between porcine and human LRRK2

    DEFF Research Database (Denmark)

    Larsen, Knud; Madsen, Lone Bruhn

    2009-01-01

     Leucine-rich repeat kinase 2 (LRRK2) is a member of the ROCO protein superfamily (Ras of complex proteins (Roc) with a C-terminal Roc domain). Mutations in the LRRK2 gene lead to autosomal dominant Parkinsonism. We have cloned the porcine LRRK2 cDNA in an attempt to characterize conserved...... and expression patterns are conserved across species. The porcine LRRK2 gene was mapped to chromosome 5q25. The results obtained suggest that the LRRK2 gene might be of particular interest in our attempt to generate a transgenic porcine model for Parkinson's disease...

  11. Identifying missing dictionary entries with frequency-conserving context models

    Science.gov (United States)

    Williams, Jake Ryland; Clark, Eric M.; Bagrow, James P.; Danforth, Christopher M.; Dodds, Peter Sheridan

    2015-10-01

    In an effort to better understand meaning from natural language texts, we explore methods aimed at organizing lexical objects into contexts. A number of these methods for organization fall into a family defined by word ordering. Unlike demographic or spatial partitions of data, these collocation models are of special importance for their universal applicability. While we are interested here in text and have framed our treatment appropriately, our work is potentially applicable to other areas of research (e.g., speech, genomics, and mobility patterns) where one has ordered categorical data (e.g., sounds, genes, and locations). Our approach focuses on the phrase (whether word or larger) as the primary meaning-bearing lexical unit and object of study. To do so, we employ our previously developed framework for generating word-conserving phrase-frequency data. Upon training our model with the Wiktionary, an extensive, online, collaborative, and open-source dictionary that contains over 100 000 phrasal definitions, we develop highly effective filters for the identification of meaningful, missing phrase entries. With our predictions we then engage the editorial community of the Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough, lexical extraction technique and expanding our knowledge of the defined English lexicon of phrases.

  12. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Full Text Available Abstract Background The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results SeqAnt (Sequence Annotator is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data completely ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.

  13. Somatic mutations in histiocytic sarcoma identified by next generation sequencing.

    Science.gov (United States)

    Liu, Qingqing; Tomaszewicz, Keith; Hutchinson, Lloyd; Hornick, Jason L; Woda, Bruce; Yu, Hongbo

    2016-08-01

    Histiocytic sarcoma is a rare malignant neoplasm of presumed hematopoietic origin showing morphologic and immunophenotypic evidence of histiocytic differentiation. Somatic mutation importance in the pathogenesis or disease progression of histiocytic sarcoma was largely unknown. To identify somatic mutations in histiocytic sarcoma, we studied 5 histiocytic sarcomas [3 female and 2 male patients; mean age 54.8 (20-72), anatomic sites include lymph node, uterus, and pleura] and matched normal tissues from each patient as germ line controls. Somatic mutations in 50 "Hotspot" oncogenes and tumor suppressor genes were examined using next generation sequencing. Three (out of five) histiocytic sarcoma cases carried somatic mutations in BRAF. Among them, G464V [variant frequency (VF) of 43.6 %] and G466R (VF of 29.6 %) located at the P loop potentially interfere with the hydrophobic interaction between P and activating loops and ultimately activation of BRAF. Also detected was BRAF somatic mutation N581S (VF of 7.4 %), which was located at the catalytic loop of BRAF kinase domain: its role in modifying kinase activity was unclear. A similar mutational analysis was also performed on nine acute monocytic/monoblastic leukemia cases, which did not identify any BRAF somatic mutations. Our study detected several BRAF mutations in histiocytic sarcomas, which may be important in understanding the tumorigenesis of this rare neoplasm and providing mechanisms for potential therapeutical opportunities.

  14. Novel expressed sequences identified in a model of androgen independent prostate cancer

    Directory of Open Access Journals (Sweden)

    Jones Steven JM

    2007-01-01

    Full Text Available Abstract Background Prostate cancer is the most frequently diagnosed cancer in American men, and few effective treatment options are available to patients who develop hormone-refractory prostate cancer. The molecular changes that occur to allow prostate cells to proliferate in the absence of androgens are not fully understood. Results Subtractive hybridization experiments performed with samples from an in vivo model of hormonal progression identified 25 expressed sequences representing novel human transcripts. Intriguingly, these 25 sequences have small open-reading frames and are not highly conserved through evolution, suggesting many of these novel expressed sequences may be derived from untranslated regions of novel transcripts or from non-coding transcripts. Examination of a large metalibrary of human Serial Analysis of Gene Expression (SAGE tags demonstrated that only three of these novel sequences had been previously detected. RT-PCR experiments confirmed that the 6 sequences tested were expressed in specific human tissues, as well as in clinical samples of prostate cancer. Further RT-PCR experiments for five of these fragments indicated they originated from large untranslated regions of unannotated transcripts. Conclusion This study underlines the value of using complementary techniques in the annotation of the human genome. The tissue-specific expression of 4 of the 6 clones tested indicates the expression of these novel transcripts is tightly regulated, and future work will determine the possible role(s these novel transcripts may play in the progression of prostate cancer.

  15. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3' UTRs and coding sequences.

    Science.gov (United States)

    Šulc, Miroslav; Marín, Ray M; Robins, Harlan S; Vaníček, Jiří

    2015-07-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3' untranslated regions (3' UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3' UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA-mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA-mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3′ UTRs and coding sequences

    Science.gov (United States)

    Šulc, Miroslav; Marín, Ray M.; Robins, Harlan S.; Vaníček, Jiří

    2015-01-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3′ untranslated regions (3′ UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3′ UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA–mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA–mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. PMID:25948580

  17. Position-specific prediction of methylation sites from sequence conservation based on information theory.

    Science.gov (United States)

    Shi, Yinan; Guo, Yanzhi; Hu, Yayun; Li, Menglong

    2015-07-23

    Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation.

  18. Identifying key conservation threats to Alpine birds through expert knowledge

    Science.gov (United States)

    Pedrini, Paolo; Brambilla, Mattia; Rolando, Antonio; Girardello, Marco

    2016-01-01

    Alpine biodiversity is subject to a range of increasing threats, but the scarcity of data for many taxa means that it is difficult to assess the level and likely future impact of a given threat. Expert opinion can be a useful tool to address knowledge gaps in the absence of adequate data. Experts with experience in Alpine ecology were approached to rank threat levels for 69 Alpine bird species over the next 50 years for the whole European Alps in relation to ten categories: land abandonment, climate change, renewable energy, fire, forestry practices, grazing practices, hunting, leisure, mining and urbanization. There was a high degree of concordance in ranking of perceived threats among experts for most threat categories. The major overall perceived threats to Alpine birds identified through expert knowledge were land abandonment, urbanization, leisure and forestry, although other perceived threats were ranked highly for particular species groups (renewable energy and hunting for raptors, hunting for gamebirds). For groups of species defined according to their breeding habitat, open habitat species and treeline species were perceived as the most threatened. A spatial risk assessment tool based on summed scores for the whole community showed threat levels were highest for bird communities of the northern and western Alps. Development of the approaches given in this paper, including addressing biases in the selection of experts and adopting a more detailed ranking procedure, could prove useful in the future in identifying future threats, and in carrying out risk assessments based on levels of threat to the whole bird community. PMID:26966659

  19. Identifying key conservation threats to Alpine birds through expert knowledge

    Directory of Open Access Journals (Sweden)

    Dan E. Chamberlain

    2016-02-01

    Full Text Available Alpine biodiversity is subject to a range of increasing threats, but the scarcity of data for many taxa means that it is difficult to assess the level and likely future impact of a given threat. Expert opinion can be a useful tool to address knowledge gaps in the absence of adequate data. Experts with experience in Alpine ecology were approached to rank threat levels for 69 Alpine bird species over the next 50 years for the whole European Alps in relation to ten categories: land abandonment, climate change, renewable energy, fire, forestry practices, grazing practices, hunting, leisure, mining and urbanization. There was a high degree of concordance in ranking of perceived threats among experts for most threat categories. The major overall perceived threats to Alpine birds identified through expert knowledge were land abandonment, urbanization, leisure and forestry, although other perceived threats were ranked highly for particular species groups (renewable energy and hunting for raptors, hunting for gamebirds. For groups of species defined according to their breeding habitat, open habitat species and treeline species were perceived as the most threatened. A spatial risk assessment tool based on summed scores for the whole community showed threat levels were highest for bird communities of the northern and western Alps. Development of the approaches given in this paper, including addressing biases in the selection of experts and adopting a more detailed ranking procedure, could prove useful in the future in identifying future threats, and in carrying out risk assessments based on levels of threat to the whole bird community.

  20. Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis.

    Science.gov (United States)

    Sharmin, Refat; Islam, Abul B M M K

    2016-01-01

    MERS-CoV is a newly emerged human coronavirus reported closely related with HKU4 and HKU5 Bat coronaviruses. Bat and MERS corona-viruses are structurally related. Therefore, it is of interest to estimate the degree of conserved antigenic sites among them. It is of importance to elucidate the shared antigenic-sites and extent of conservation between them to understand the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), enveloped (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and Bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30 % of its S protein antigenic sites with HKU4 and 70 % with HKU5 bat-CoV. Whereas 100 % of its E, M and N protein's antigenic sites are found to be conserved with those in HKU4 and HKU5. This sharing suggests that in case of pathogenicity MERS-CoV is more closely related to HKU5 bat-CoV than HKU4 bat-CoV. The conserved epitopes indicates their evolutionary relationship and ancestry of pathogenicity.

  1. Evolutionary conservation of nuclear and nucleolar targeting sequences in yeast ribosomal protein S6A

    International Nuclear Information System (INIS)

    Lipsius, Edgar; Walter, Korden; Leicher, Torsten; Phlippen, Wolfgang; Bisotti, Marc-Angelo; Kruppa, Joachim

    2005-01-01

    Over 1 billion years ago, the animal kingdom diverged from the fungi. Nevertheless, a high sequence homology of 62% exists between human ribosomal protein S6 and S6A of Saccharomyces cerevisiae. To investigate whether this similarity in primary structure is mirrored in corresponding functional protein domains, the nuclear and nucleolar targeting signals were delineated in yeast S6A and compared to the known human S6 signals. The complete sequence of S6A and cDNA fragments was fused to the 5'-end of the LacZ gene, the constructs were transiently expressed in COS cells, and the subcellular localization of the fusion proteins was detected by indirect immunofluorescence. One bipartite and two monopartite nuclear localization signals as well as two nucleolar binding domains were identified in yeast S6A, which are located at homologous regions in human S6 protein. Remarkably, the number, nature, and position of these targeting signals have been conserved, albeit their amino acid sequences have presumably undergone a process of co-evolution with their corresponding rRNAs

  2. Mutations in the newly identified RAX regulatory sequence are not a frequent cause of micro/anophthalmia.

    Science.gov (United States)

    Chassaing, Nicolas; Vigouroux, Adeline; Calvas, Patrick

    2009-06-01

    Microphthalmia and anophthalmia are at the severe end of the spectrum of abnormalities in ocular development. A few genes (SOX2, OTX2, RAX, and CHX10) have been implicated in isolated micro/anophthalmia, but causative mutations of these genes explain less than a quarter of these developmental defects. A specifically conserved SOX2/OTX2-mediated RAX expression regulatory sequence has recently been identified. We postulated that mutations in this sequence could lead to micro/anophthalmia, and thus we performed molecular screening of this regulatory element in patients suffering from micro/anophthalmia. Fifty-one patients suffering from nonsyndromic microphthalmia (n = 40) or anophthalmia (n = 11) were included in this study after negative molecular screening for SOX2, OTX2, RAX, and CHX10 mutations. Mutation screening of the RAX regulatory sequence was performed by direct sequencing for these patients. No mutations were identified in the highly conserved RAX regulatory sequence in any of the 51 patients. Mutations in the newly identified RAX regulatory sequence do not represent a frequent cause of nonsyndromic micro/anophthalmia.

  3. Sequence and structural analysis of the chitinase insertion domain reveals two conserved motifs involved in chitin-binding.

    Directory of Open Access Journals (Sweden)

    Hai Li

    2010-01-01

    Full Text Available Chitinases are prevalent in life and are found in species including archaea, bacteria, fungi, plants, and animals. They break down chitin, which is the second most abundant carbohydrate in nature after cellulose. Hence, they are important for maintaining a balance between carbon and nitrogen trapped as insoluble chitin in biomass. Chitinases are classified into two families, 18 and 19 glycoside hydrolases. In addition to a catalytic domain, which is a triosephosphate isomerase barrel, many family 18 chitinases contain another module, i.e., chitinase insertion domain. While numerous studies focus on the biological role of the catalytic domain in chitinase activity, the function of the chitinase insertion domain is not completely understood. Bioinformatics offers an important avenue in which to facilitate understanding the role of residues within the chitinase insertion domain in chitinase function.Twenty-seven chitinase insertion domain sequences, which include four experimentally determined structures and span five kingdoms, were aligned and analyzed using a modified sequence entropy parameter. Thirty-two positions with conserved residues were identified. The role of these conserved residues was explored by conducting a structural analysis of a number of holo-enzymes. Hydrogen bonding and van der Waals calculations revealed a distinct subset of four conserved residues constituting two sequence motifs that interact with oligosaccharides. The other conserved residues may be key to the structure, folding, and stability of this domain.Sequence and structural studies of the chitinase insertion domains conducted within the framework of evolution identified four conserved residues which clearly interact with the substrates. Furthermore, evolutionary studies propose a link between the appearance of the chitinase insertion domain and the function of family 18 chitinases in the subfamily A.

  4. Mitochondrial genome sequences illuminate maternal lineages of conservation concern in a rare carnivore

    Science.gov (United States)

    Brian J. Knaus; Richard Cronn; Aaron Liston; Kristine Pilgrim; Michael K. Schwartz

    2011-01-01

    Science-based wildlife management relies on genetic information to infer population connectivity and identify conservation units. The most commonly used genetic marker for characterizing animal biodiversity and identifying maternal lineages is the mitochondrial genome. Mitochondrial genotyping figures prominently in conservation and management plans, with much of the...

  5. Sequence analysis of the L protein of the Ebola 2014 outbreak: Insight into conserved regions and mutations.

    Science.gov (United States)

    Ayub, Gohar; Waheed, Yasir

    2016-06-01

    The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.

  6. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  7. Peptomics, identification of novel cationic Arabidopsis peptides with conserved sequence motifs

    DEFF Research Database (Denmark)

    Olsen, Addie Nina; Mundy, John; Skriver, Karen

    2002-01-01

    Arabidopsis family of 34 genes. The predicted peptides are characterized by a conserved C-terminal sequence motif and additional primary structure conservation in a core region. The majority of these genes had not previously been annotated. A subset of the predicted peptides show high overall sequence...... similarity to Rapid Alkalinization Factor (RALF), a peptide isolated from tobacco. We therefore refer to this peptide family as RALFL for RALF-Like. RT-PCR analysis confirmed that several of the Arabidopsis genes are expressed and that their expression patterns vary. The identification of a large gene family...

  8. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    Science.gov (United States)

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  9. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    Directory of Open Access Journals (Sweden)

    Kacy L Gordon

    2015-05-01

    Full Text Available Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2 from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  10. Simple sequence repeat (SSR) markers are effective for identifying ...

    African Journals Online (AJOL)

    DNA was extracted from newly formed leaves and amplified using 21 simple sequence repeat (SSR) markers (NH001c, NH002b, NH005b, NH007b, NH008b, NH009b, NH011b, NH013b, NH012a, NH014a, NH015a, NH017a, KA4b, KA5, KA14, KA16, KB16, KU10, BGA35, BGT23b and HGA8b). The data was analyzed by ...

  11. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    Science.gov (United States)

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  12. Sequence conservation and combinatorial complexity of Drosophila neural precursor cell enhancers

    Directory of Open Access Journals (Sweden)

    Kuzin Alexander

    2008-08-01

    Full Text Available Abstract Background The presence of highly conserved sequences within cis-regulatory regions can serve as a valuable starting point for elucidating the basis of enhancer function. This study focuses on regulation of gene expression during the early events of Drosophila neural development. We describe the use of EvoPrinter and cis-Decoder, a suite of interrelated phylogenetic footprinting and alignment programs, to characterize highly conserved sequences that are shared among co-regulating enhancers. Results Analysis of in vivo characterized enhancers that drive neural precursor gene expression has revealed that they contain clusters of highly conserved sequence blocks (CSBs made up of shorter shared sequence elements which are present in different combinations and orientations within the different co-regulating enhancers; these elements contain either known consensus transcription factor binding sites or consist of novel sequences that have not been functionally characterized. The CSBs of co-regulated enhancers share a large number of sequence elements, suggesting that a diverse repertoire of transcription factors may interact in a highly combinatorial fashion to coordinately regulate gene expression. We have used information gained from our comparative analysis to discover an enhancer that directs expression of the nervy gene in neural precursor cells of the CNS and PNS. Conclusion The combined use EvoPrinter and cis-Decoder has yielded important insights into the combinatorial appearance of fundamental sequence elements required for neural enhancer function. Each of the 30 enhancers examined conformed to a pattern of highly conserved blocks of sequences containing shared constituent elements. These data establish a basis for further analysis and understanding of neural enhancer function.

  13. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library.

    Science.gov (United States)

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analysis of complete gene sequences of PLE family by screening from a Rongchang pig BAC library and third-generation PacBio gene sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome used as a template for Polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. It was found that, not only did the PLE exon sequences of the four genes show a high degree of homology, but also that the intron sequences were highly similar. Additionally, the regulatory region of the genes contained two 720bps reverse complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for

  14. Cheap and Nasty? The Potential Perils of Using Management Costs to Identify Global Conservation Priorities

    Science.gov (United States)

    McCreless, Erin; Visconti, Piero; Carwardine, Josie; Wilcox, Chris; Smith, Robert J.

    2013-01-01

    The financial cost of biodiversity conservation varies widely around the world and such costs should be considered when identifying countries to best focus conservation investments. Previous global prioritizations have been based on global models for protected area management costs, but this metric may be related to other factors that negatively influence the effectiveness and social impacts of conservation. Here we investigate such relationships and first show that countries with low predicted costs are less politically stable. Local support and capacity can mitigate the impacts of such instability, but we also found that these countries have less civil society involvement in conservation. Therefore, externally funded projects in these countries must rely on government agencies for implementation. This can be problematic, as our analyses show that governments in countries with low predicted costs score poorly on indices of corruption, bureaucratic quality and human rights. Taken together, our results demonstrate that using national-level estimates for protected area management costs to set global conservation priorities is simplistic, as projects in apparently low-cost countries are less likely to succeed and more likely to have negative impacts on people. We identify the need for an improved approach to develop global conservation cost metrics that better capture the true costs of avoiding or overcoming such problems. Critically, conservation scientists must engage with practitioners to better understand and implement context-specific solutions. This approach assumes that measures of conservation costs, like measures of conservation value, are organization specific, and would bring a much-needed focus on reducing the negative impacts of conservation to develop projects that benefit people and biodiversity. PMID:24260502

  15. Cheap and nasty? The potential perils of using management costs to identify global conservation priorities.

    Directory of Open Access Journals (Sweden)

    Erin McCreless

    Full Text Available The financial cost of biodiversity conservation varies widely around the world and such costs should be considered when identifying countries to best focus conservation investments. Previous global prioritizations have been based on global models for protected area management costs, but this metric may be related to other factors that negatively influence the effectiveness and social impacts of conservation. Here we investigate such relationships and first show that countries with low predicted costs are less politically stable. Local support and capacity can mitigate the impacts of such instability, but we also found that these countries have less civil society involvement in conservation. Therefore, externally funded projects in these countries must rely on government agencies for implementation. This can be problematic, as our analyses show that governments in countries with low predicted costs score poorly on indices of corruption, bureaucratic quality and human rights. Taken together, our results demonstrate that using national-level estimates for protected area management costs to set global conservation priorities is simplistic, as projects in apparently low-cost countries are less likely to succeed and more likely to have negative impacts on people. We identify the need for an improved approach to develop global conservation cost metrics that better capture the true costs of avoiding or overcoming such problems. Critically, conservation scientists must engage with practitioners to better understand and implement context-specific solutions. This approach assumes that measures of conservation costs, like measures of conservation value, are organization specific, and would bring a much-needed focus on reducing the negative impacts of conservation to develop projects that benefit people and biodiversity.

  16. Cheap and nasty? The potential perils of using management costs to identify global conservation priorities.

    Science.gov (United States)

    McCreless, Erin; Visconti, Piero; Carwardine, Josie; Wilcox, Chris; Smith, Robert J

    2013-01-01

    The financial cost of biodiversity conservation varies widely around the world and such costs should be considered when identifying countries to best focus conservation investments. Previous global prioritizations have been based on global models for protected area management costs, but this metric may be related to other factors that negatively influence the effectiveness and social impacts of conservation. Here we investigate such relationships and first show that countries with low predicted costs are less politically stable. Local support and capacity can mitigate the impacts of such instability, but we also found that these countries have less civil society involvement in conservation. Therefore, externally funded projects in these countries must rely on government agencies for implementation. This can be problematic, as our analyses show that governments in countries with low predicted costs score poorly on indices of corruption, bureaucratic quality and human rights. Taken together, our results demonstrate that using national-level estimates for protected area management costs to set global conservation priorities is simplistic, as projects in apparently low-cost countries are less likely to succeed and more likely to have negative impacts on people. We identify the need for an improved approach to develop global conservation cost metrics that better capture the true costs of avoiding or overcoming such problems. Critically, conservation scientists must engage with practitioners to better understand and implement context-specific solutions. This approach assumes that measures of conservation costs, like measures of conservation value, are organization specific, and would bring a much-needed focus on reducing the negative impacts of conservation to develop projects that benefit people and biodiversity.

  17. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    Directory of Open Access Journals (Sweden)

    Claros M Gonzalo

    2010-06-01

    Full Text Available Abstract Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can.even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to designing specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". Conclusions AlignMiner can be used

  18. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    Science.gov (United States)

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contain two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been

  19. Transcriptome sequencing in prostate cancer identifies inter-tumor heterogeneity

    Directory of Open Access Journals (Sweden)

    Janet Mendonca

    2015-06-01

    Full Text Available Given the dearth of gene mutations in prostate cancer, [1] ,[2] it is likely that genomic rearrangements play a significant role in the evolution of prostate cancer. However, in the search for recurrent genomic alterations, "private alterations" have received less attention. Such alterations may provide insights into the evolution, behavior, and clinical outcome of an individual tumor. In a recent report in "Genome Biology" Wyatt et al. [3] defines unique alterations in a cohort of high-risk prostate cancer patient with a lethal phenotype. Utilizing a transcriptome sequencing approach they observe high inter-tumor heterogeneity; however, the genes altered distill into three distinct cancer-relevant pathways. Their analysis reveals the presence of several non-ETS fusions, which may contribute to the phenotype of individual tumors, and have significance for disease progression.

  20. Determination of 5 '-leader sequences from radically disparate strains of porcine reproductive and respiratory syndrome virus reveals the presence of highly conserved sequence motifs

    DEFF Research Database (Denmark)

    Oleksiewicz, M.B.; Bøtner, Anette; Nielsen, Jens

    1999-01-01

    We determined the untranslated 5'-leader sequence for three different isolates of porcine reproductive and respiratory syndrome virus (PRRSV): pathogenic European- and American-types, as well as an American-type vaccine strain. 5'-leader from European- and American-type PRRSV differed in length...... (220 and 190 nt, respectively), and exhibited only approximately 50% nucleotide homology. Nevertheless, highly conserved areas were identified in the leader of all 3 PRRSV isolates, which constitute candidate motifs for binding of protein(s) involved in viral replication. These comparative data provide...

  1. Comparing spatially explicit ecological and social values for natural areas to identify effective conservation strategies.

    Science.gov (United States)

    Bryan, Brett Anthony; Raymond, Christopher Mark; Crossman, Neville David; King, Darran

    2011-02-01

    Consideration of the social values people assign to relatively undisturbed native ecosystems is critical for the success of science-based conservation plans. We used an interview process to identify and map social values assigned to 31 ecosystem services provided by natural areas in an agricultural landscape in southern Australia. We then modeled the spatial distribution of 12 components of ecological value commonly used in setting spatial conservation priorities. We used the analytical hierarchy process to weight these components and used multiattribute utility theory to combine them into a single spatial layer of ecological value. Social values assigned to natural areas were negatively correlated with ecological values overall, but were positively correlated with some components of ecological value. In terms of the spatial distribution of values, people valued protected areas, whereas those natural areas underrepresented in the reserve system were of higher ecological value. The habitats of threatened animal species were assigned both high ecological value and high social value. Only small areas were assigned both high ecological value and high social value in the study area, whereas large areas of high ecological value were of low social value, and vice versa. We used the assigned ecological and social values to identify different conservation strategies (e.g., information sharing, community engagement, incentive payments) that may be effective for specific areas. We suggest that consideration of both ecological and social values in selection of conservation strategies can enhance the success of science-based conservation planning. ©2010 Society for Conservation Biology.

  2. Multi-species sequence comparison reveals conservation of ghrelin gene-derived splice variants encoding a truncated ghrelin peptide.

    Science.gov (United States)

    Seim, Inge; Jeffery, Penny L; Thomas, Patrick B; Walpole, Carina M; Maugham, Michelle; Fung, Jenny N T; Yap, Pei-Yi; O'Keeffe, Angela J; Lai, John; Whiteside, Eliza J; Herington, Adrian C; Chopin, Lisa K

    2016-06-01

    The peptide hormone ghrelin is a potent orexigen produced predominantly in the stomach. It has a number of other biological actions, including roles in appetite stimulation, energy balance, the stimulation of growth hormone release and the regulation of cell proliferation. Recently, several ghrelin gene splice variants have been described. Here, we attempted to identify conserved alternative splicing of the ghrelin gene by cross-species sequence comparisons. We identified a novel human exon 2-deleted variant and provide preliminary evidence that this splice variant and in1-ghrelin encode a C-terminally truncated form of the ghrelin peptide, termed minighrelin. These variants are expressed in humans and mice, demonstrating conservation of alternative splicing spanning 90 million years. Minighrelin appears to have similar actions to full-length ghrelin, as treatment with exogenous minighrelin peptide stimulates appetite and feeding in mice. Forced expression of the exon 2-deleted preproghrelin variant mirrors the effect of the canonical preproghrelin, stimulating cell proliferation and migration in the PC3 prostate cancer cell line. This is the first study to characterise an exon 2-deleted preproghrelin variant and to demonstrate sequence conservation of ghrelin gene-derived splice variants that encode a truncated ghrelin peptide. This adds further impetus for studies into the alternative splicing of the ghrelin gene and the function of novel ghrelin peptides in vertebrates.

  3. Combining endangered plants and animals as surrogates to identify priority conservation areas in Yunnan, China

    Science.gov (United States)

    Yang, Feiling; Hu, Jinming; Wu, Ruidong

    2016-08-01

    Suitable surrogates are critical for identifying optimal priority conservation areas (PCAs) to protect regional biodiversity. This study explored the efficiency of using endangered plants and animals as surrogates for identifying PCAs at the county level in Yunnan, southwest China. We ran the Dobson algorithm under three surrogate scenarios at 75% and 100% conservation levels and identified four types of PCAs. Assessment of the protection efficiencies of the four types of PCAs showed that endangered plants had higher surrogacy values than endangered animals but that the two were not substitutable; coupled endangered plants and animals as surrogates yielded a higher surrogacy value than endangered plants or animals as surrogates; the plant-animal priority areas (PAPAs) was the optimal among the four types of PCAs for conserving both endangered plants and animals in Yunnan. PAPAs could well represent overall species diversity distribution patterns and overlap with critical biogeographical regions in Yunnan. Fourteen priority units in PAPAs should be urgently considered as optimizing Yunnan’s protected area system. The spatial pattern of PAPAs at the 100% conservation level could be conceptualized into three connected conservation belts, providing a valuable reference for optimizing the layout of the in situ protected area system in Yunnan.

  4. Potential of DNA sequences to identify zoanthids (Cnidaria: Zoantharia).

    Science.gov (United States)

    Sinniger, Frederic; Reimer, James D; Pawlowski, Jan

    2008-12-01

    The order Zoantharia is known for its chaotic taxonomy and difficult morphological identification. One method that potentially could help for examining such troublesome taxa is DNA barcoding, which identifies species using standard molecular markers. The mitochondrial cytochrome oxidase subunit I (COI) has been utilized to great success in groups such as birds and insects; however, its applicability in many other groups is controversial. Recently, some studies have suggested that barcoding is not applicable to anthozoans. Here, we examine the use of COI and mitochondrial 16S ribosomal DNA for zoanthid identification. Despite the absence of a clear barcoding gap, our results show that for most of 54 zoanthid samples, both markers could separate samples to the species, or species group, level, particularly when easily accessible ecological or distributional data were included. Additionally, we have used the short V5 region of mt 16S rDNA to identify eight old (13 to 50 years old) museum samples. We discuss advantages and disadvantages of COI and mt 16S rDNA as barcodes for Zoantharia, and recommend that either one or both of these markers be considered for zoanthid identification in the future.

  5. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

    Science.gov (United States)

    Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M

    2010-12-15

    Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  6. Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi

    Directory of Open Access Journals (Sweden)

    Huynen Leon

    2010-12-01

    Full Text Available Abstract Background Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Results Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. Conclusions The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

  7. Remarkable sequence conservation of the last intron in the PKD1 gene.

    Science.gov (United States)

    Rodova, Marianna; Islam, M Rafiq; Peterson, Kenneth R; Calvet, James P

    2003-10-01

    The last intron of the PKD1 gene (intron 45) was found to have exceptionally high sequence conservation across four mammalian species: human, mouse, rat, and dog. This conservation did not extend to the comparable intron in pufferfish. Pairwise comparisons for intron 45 showed 91% identity (human vs. dog) to 100% identity (mouse vs. rat) for an average for all four species of 94% identity. In contrast, introns 43 and 44 of the PKD1 gene had average pairwise identities of 57% and 54%, and exons 43, 44, and 45 and the coding region of exon 46 had average pairwise identities of 80%, 84%, 82%, and 80%. Intron 45 is 90 to 95 bp in length, with the major region of sequence divergence being in a central 4-bp to 9-bp variable region. RNA secondary structure analysis of intron 45 predicts a branching stem-loop structure in which the central variable region lies in one loop and the putative branch point sequence lies in another loop, suggesting that the intron adopts a specific stem-loop structure that may be important for its removal. Although intron 45 appears to conform to the class of small, G-triplet-containing introns that are spliced by a mechanism utilizing intron definition, its high sequence conservation may be a reflection of constraints imposed by a unique mechanism that coordinates splicing of this last PKD1 intron with polyadenylation.

  8. T-cell recognition is shaped by epitope sequence conservation in the host proteome and microbiome

    DEFF Research Database (Denmark)

    Bresciani, Anne Gøther; Paul, Sinu; Schommer, Nina

    2016-01-01

    or allergen with the conservation of its sequence in the human proteome or the healthy human microbiome. Indeed, performing such comparisons on large sets of validated T-cell epitopes, we found that epitopes that are similar with self-antigens above a certain threshold showed lower immunogenicity, presumably...... as a result of negative selection of T cells capable of recognizing such peptides. Moreover, we also found a reduced level of immune recognition for epitopes conserved in the commensal microbiome, presumably as a result of peripheral tolerance. These findings indicate that the existence (and potentially...

  9. Identifying Statistical Dependence in Genomic Sequences via Mutual Information Estimates

    Directory of Open Access Journals (Sweden)

    Wojciech Szpankowski

    2007-12-01

    Full Text Available Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5′ untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's combined DNA index system (CODIS, we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats—an application of importance in genetic profiling.

  10. Using a distribution and conservation status weighted hotspot approach to identify areas in need of conservation action to benefit Idaho bird species

    Science.gov (United States)

    Haines, Aaron M.; Leu, Matthias; Svancara, Leona K.; Wilson, Gina; Scott, J. Michael

    2010-01-01

    Identification of biodiversity hotspots (hereafter, hotspots) has become a common strategy to delineate important areas for wildlife conservation. However, the use of hotspots has not often incorporated important habitat types, ecosystem services, anthropogenic activity, or consistency in identifying important conservation areas. The purpose of this study was to identify hotspots to improve avian conservation efforts for Species of Greatest Conservation Need (SGCN) in the state of Idaho, United States. We evaluated multiple approaches to define hotspots and used a unique approach based on weighting species by their distribution size and conservation status to identify hotspot areas. All hotspot approaches identified bodies of water (Bear Lake, Grays Lake, and American Falls Reservoir) as important hotspots for Idaho avian SGCN, but we found that the weighted approach produced more congruent hotspot areas when compared to other hotspot approaches. To incorporate anthropogenic activity into hotspot analysis, we grouped species based on their sensitivity to specific human threats (i.e., urban development, agriculture, fire suppression, grazing, roads, and logging) and identified ecological sections within Idaho that may require specific conservation actions to address these human threats using the weighted approach. The Snake River Basalts and Overthrust Mountains ecological sections were important areas for potential implementation of conservation actions to conserve biodiversity. Our approach to identifying hotspots may be useful as part of a larger conservation strategy to aid land managers or local governments in applying conservation actions on the ground.

  11. Identifying all moiety conservation laws in genome-scale metabolic networks.

    Science.gov (United States)

    De Martino, Andrea; De Martino, Daniele; Mulet, Roberto; Pagnani, Andrea

    2014-01-01

    The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.

  12. Identifying all moiety conservation laws in genome-scale metabolic networks.

    Directory of Open Access Journals (Sweden)

    Andrea De Martino

    Full Text Available The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.

  13. Identifying and managing conflicts between forest conservation and other human interests in Europe

    NARCIS (Netherlands)

    Niemela, J.; Young, J.; Alard, D.; Askasibar, M.; Henle, K.; Johnson, R.; Kurttila, M.; Larsson, T.B.; Matouch, S.; Nowicki, P.L.; Paiva, R.Q.; Portoghesi, L.; Smulders, M.J.M.; Stevenson, A.; Tartes, U.; Watt, A.

    2005-01-01

    In this paper, circumstances where various human activities and interests clash with the conservation of forest biodiversity are examined, with particular focus on the drivers behind the conflicts. After identifying past and current human-related threats potentially leading to conflicts in forests,

  14. Opuntia in Mexico: identifying priority areas for conserving biodiversity in a multi-use landscape.

    Directory of Open Access Journals (Sweden)

    Patricia Illoldi-Rangel

    Full Text Available BACKGROUND: México is one of the world's centers of species diversity (richness for Opuntia cacti. Yet, in spite of their economic and ecological importance, Opuntia species remain poorly studied and protected in México. Many of the species are sparsely but widely distributed across the landscape and are subject to a variety of human uses, so devising implementable conservation plans for them presents formidable difficulties. Multi-criteria analysis can be used to design a spatially coherent conservation area network while permitting sustainable human usage. METHODS AND FINDINGS: Species distribution models were created for 60 Opuntia species using MaxEnt. Targets of representation within conservation area networks were assigned at 100% for the geographically rarest species and 10% for the most common ones. Three different conservation plans were developed to represent the species within these networks using total area, shape, and connectivity as relevant criteria. Multi-criteria analysis and a metaheuristic adaptive tabu search algorithm were used to search for optimal solutions. The plans were built on the existing protected areas of México and prioritized additional areas for management for the persistence of Opuntia species. All plans required around one-third of México's total area to be prioritized for attention for Opuntia conservation, underscoring the implausibility of Opuntia conservation through traditional land reservation. Tabu search turned out to be both computationally tractable and easily implementable for search problems of this kind. CONCLUSIONS: Opuntia conservation in México require the management of large areas of land for multiple uses. The multi-criteria analyses identified priority areas and organized them in large contiguous blocks that can be effectively managed. A high level of connectivity was established among the prioritized areas resulting in the enhancement of possible modes of plant dispersal as well as

  15. Exome sequencing in 53 sporadic cases of schizophrenia identifies 18 putative candidate genes.

    Directory of Open Access Journals (Sweden)

    Michel Guipponi

    Full Text Available Schizophrenia (SCZ is a severe, debilitating mental illness which has a significant genetic component. The identification of genetic factors related to SCZ has been challenging and these factors remain largely unknown. To evaluate the contribution of de novo variants (DNVs to SCZ, we sequenced the exomes of 53 individuals with sporadic SCZ and of their non-affected parents. We identified 49 DNVs, 18 of which were predicted to alter gene function, including 13 damaging missense mutations, 2 conserved splice site mutations, 2 nonsense mutations, and 1 frameshift deletion. The average number of exonic DNV per proband was 0.88, which corresponds to an exonic point mutation rate of 1.7×10(-8 per nucleotide per generation. The non-synonymous-to-synonymous mutation ratio of 2.06 did not differ from neutral expectations. Overall, this study provides a list of 18 putative candidate genes for sporadic SCZ, and when combined with the results of similar reports, identifies a second proband carrying a non-synonymous DNV in the RGS12 gene.

  16. A Potential Tool for Swift Fox (Vulpes velox) Conservation: Individuality of Long-Range Barking Sequences

    DEFF Research Database (Denmark)

    Darden, Safi-Kirstine Klem; Dabelsteen, Torben; Pedersen, Simon Boel

    2003-01-01

    Vocal individuality has been found in a number canid species. This natural variation can have applications in several aspects of species conservation, from behavioral studies to estimating population density or abundance. The swift fox (Vulpes velox) is a North American canid listed as endangered...... in Canada and extirpated, endangered, or threatened in parts of the United States. The barking sequence is a long-range vocalization in the species' vocal repertoire. It consists of a series of barks and is most common during the mating season. We analyzed barking sequences recorded in a standardized...

  17. Structure-sequence based analysis for identification of conserved regions in proteins

    Science.gov (United States)

    Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth

    2013-05-28

    Disclosed are computational methods, and associated hardware and software products for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures in also disclosed herein.

  18. Detecting remote sequence homology in disordered proteins: discovery of conserved motifs in the N-termini of Mononegavirales phosphoproteins.

    Directory of Open Access Journals (Sweden)

    David Karlin

    Full Text Available Paramyxovirinae are a large group of viruses that includes measles virus and parainfluenza viruses. The viral Phosphoprotein (P plays a central role in viral replication. It is composed of a highly variable, disordered N-terminus and a conserved C-terminus. A second viral protein alternatively expressed, the V protein, also contains the N-terminus of P, fused to a zinc finger. We suspected that, despite their high variability, the N-termini of P/V might all be homologous; however, using standard approaches, we could previously identify sequence conservation only in some Paramyxovirinae. We now compared the N-termini using sensitive sequence similarity search programs, able to detect residual similarities unnoticeable by conventional approaches. We discovered that all Paramyxovirinae share a short sequence motif in their first 40 amino acids, which we called soyuz1. Despite its short length (11-16aa, several arguments allow us to conclude that soyuz1 probably evolved by homologous descent, unlike linear motifs. Conservation across such evolutionary distances suggests that soyuz1 plays a crucial role and experimental data suggest that it binds the viral nucleoprotein to prevent its illegitimate self-assembly. In some Paramyxovirinae, the N-terminus of P/V contains a second motif, soyuz2, which might play a role in blocking interferon signaling. Finally, we discovered that the P of related Mononegavirales contain similarly overlooked motifs in their N-termini, and that their C-termini share a previously unnoticed structural similarity suggesting a common origin. Our results suggest several testable hypotheses regarding the replication of Mononegavirales and suggest that disordered regions with little overall sequence similarity, common in viral and eukaryotic proteins, might contain currently overlooked motifs (intermediate in length between linear motifs and disordered domains that could be detected simply by comparing orthologous proteins.

  19. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations.

    Science.gov (United States)

    Fuentes-Pardo, Angela P; Ruzzante, Daniel E

    2017-10-01

    Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology. © 2017 John Wiley & Sons Ltd.

  20. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    Science.gov (United States)

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which act fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  1. A freshwater biodiversity hotspot under pressure - assessing threats and identifying conservation needs for ancient Lake Ohrid

    Science.gov (United States)

    Kostoski, G.; Albrecht, C.; Trajanovski, S.; Wilke, T.

    2010-12-01

    Immediate conservation measures for world-wide freshwater resources are of eminent importance. This is particularly true for so-called ancient lakes. While these lakes are famous for being evolutionary theatres, often displaying an extraordinarily high degree of biodiversity and endemism, in many cases these biota are also experiencing extreme anthropogenic impact. Lake Ohrid, a major European biodiversity hotspot situated in a trans-frontier setting on the Balkans, is a prime example for a lake with a magnitude of narrow range endemic taxa that are under increasing anthropogenic pressure. Unfortunately, evidence for a "creeping biodiversity crisis" has accumulated over the last decades, and major socio-political changes have gone along with human-mediated environmental changes. Based on field surveys, monitoring data, published records, and expert interviews, we aimed to (1) assess threats to Lake Ohrids' (endemic) biodiversity, (2) summarize existing conservation activities and strategies, and (3) outline future conservation needs for Lake Ohrid. We compiled threats to both specific taxa (and in cases to particular species) as well as to the lake ecosystems itself. Major conservation concerns identified for Lake Ohrid are: (1) watershed impacts, (2) agriculture and forestry, (3) tourism and population growth, (4) non-indigenous species, (5) habitat alteration or loss, (6) unsustainable exploitation of fisheries, and (7) global climate change. Among the major (well-known) threats with high impact are nutrient input (particularly of phosphorus), habitat conversion and silt load. Other threats are potentially of high impact but less well known. Such threats include pollution with hazardous substances (from sources such as mines, former industries, agriculture) or climate change. We review and discuss institutional responsibilities, environmental monitoring and ecosystem management, existing parks and reserves, biodiversity and species measures, international

  2. Multivariate ordination identifies vegetation types associated with spider conservation in brassica crops

    Directory of Open Access Journals (Sweden)

    Hafiz Sohaib Ahmed Saqib

    2017-10-01

    Full Text Available Conservation biological control emphasizes natural and other non-crop vegetation as a source of natural enemies to focal crops. There is an unmet need for better methods to identify the types of vegetation that are optimal to support specific natural enemies that may colonize the crops. Here we explore the commonality of the spider assemblage—considering abundance and diversity (H—in brassica crops with that of adjacent non-crop and non-brassica crop vegetation. We employ spatial-based multivariate ordination approaches, hierarchical clustering and spatial eigenvector analysis. The small-scale mixed cropping and high disturbance frequency of southern Chinese vegetation farming offered a setting to test the role of alternate vegetation for spider conservation. Our findings indicate that spider families differ markedly in occurrence with respect to vegetation type. Grassy field margins, non-crop vegetation, taro and sweetpotato harbour spider morphospecies and functional groups that are also present in brassica crops. In contrast, pumpkin and litchi contain spiders not found in brassicas, and so may have little benefit for conservation biological control services for brassicas. Our findings also illustrate the utility of advanced statistical approaches for identifying spatial relationships between natural enemies and the land uses most likely to offer alternative habitats for conservation biological control efforts that generates testable hypotheses for future studies.

  3. Exome sequencing identifies a novel SMCHD1 mutation in facioscapulohumeral muscular dystrophy 2.

    Science.gov (United States)

    Mitsuhashi, Satomi; Boyden, Steven E; Estrella, Elicia A; Jones, Takako I; Rahimov, Fedik; Yu, Timothy W; Darras, Basil T; Amato, Anthony A; Folkerth, Rebecca D; Jones, Peter L; Kunkel, Louis M; Kang, Peter B

    2013-12-01

    FSHD2 is a rare form of facioscapulohumeral muscular dystrophy (FSHD) characterized by the absence of a contraction in the D4Z4 macrosatellite repeat region on chromosome 4q35 that is the hallmark of FSHD1. However, hypomethylation of this region is common to both subtypes. Recently, mutations in SMCHD1 combined with a permissive 4q35 allele were reported to cause FSHD2. We identified a novel p.Lys275del SMCHD1 mutation in a family affected with FSHD2 using whole-exome sequencing and linkage analysis. This mutation alters a highly conserved amino acid in the ATPase domain of SMCHD1. Subject III-11 is a male who developed asymmetrical muscle weakness characteristic of FSHD at 13 years. Physical examination revealed marked bilateral atrophy at biceps brachii, bilateral scapular winging, some asymmetrical weakness at tibialis anterior and peroneal muscles, and mild lower facial weakness. Biopsy of biceps brachii in subject II-5, the father of III-11, demonstrated lobulated fibers and dystrophic changes. Endomysial and perivascular inflammation was found, which has been reported in FSHD1 but not FSHD2. Given the previous report of SMCHD1 mutations in FSHD2 and the clinical presentations consistent with the FSHD phenotype, we conclude that the SMCHD1 mutation is the likely cause of the disease in this family. Copyright © 2013 Elsevier B.V. All rights reserved.

  4. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    Science.gov (United States)

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation.  Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases.  We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes.  Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.

  5. Optimal packaging of FIV genomic RNA depends upon a conserved long-range interaction and a palindromic sequence within gag.

    Science.gov (United States)

    Rizvi, Tahir A; Kenyon, Julia C; Ali, Jahabar; Aktar, Suriya J; Phillip, Pretty S; Ghazawi, Akela; Mustafa, Farah; Lever, Andrew M L

    2010-10-15

    The feline immunodeficiency virus (FIV) is a lentivirus that is related to human immunodeficiency virus (HIV), causing a similar pathology in cats. It is a potential small animal model for AIDS and the FIV-based vectors are also being pursued for human gene therapy. Previous studies have mapped the FIV packaging signal (ψ) to two or more discontinuous regions within the 5' 511 nt of the genomic RNA and structural analyses have determined its secondary structure. The 5' and 3' sequences within ψ region interact through extensive long-range interactions (LRIs), including a conserved heptanucleotide interaction between R/U5 and gag. Other secondary structural elements identified include a conserved 150 nt stem-loop (SL2) and a small palindromic stem-loop within gag open reading frame that might act as a viral dimerization initiation site. We have performed extensive mutational analysis of these sequences and structures and ascertained their importance in FIV packaging using a trans-complementation assay. Disrupting the conserved heptanucleotide LRI to prevent base pairing between R/U5 and gag reduced packaging by 2.8-5.5 fold. Restoration of pairing using an alternative, non-wild type (wt) LRI sequence restored RNA packaging and propagation to wt levels, suggesting that it is the structure of the LRI, rather than its sequence, that is important for FIV packaging. Disrupting the palindrome within gag reduced packaging by 1.5-3-fold, but substitution with a different palindromic sequence did not restore packaging completely, suggesting that the sequence of this region as well as its palindromic nature is important. Mutation of individual regions of SL2 did not have a pronounced effect on FIV packaging, suggesting that either it is the structure of SL2 as a whole that is necessary for optimal packaging, or that there is redundancy within this structure. The mutational analysis presented here has further validated the previously predicted RNA secondary structure of FIV

  6. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute

    Directory of Open Access Journals (Sweden)

    Md. Tariqul Islam

    2015-01-01

    Full Text Available MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families and 11 targets were predicted for 4 among the 17 identified novel microRNAs. For understanding better the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for randomly selected 20 known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute which will ultimately pave the way for understanding their role in this crop and other crops.

  7. Identifying and prioritizing ungulate migration routes for landscape-level conservation

    Science.gov (United States)

    Sawyer, Hall; Kauffman, Matthew J.; Nielson, Ryan M.; Horne, Jon S.

    2009-01-01

    As habitat loss and fragmentation increase across ungulate ranges, identifying and prioritizing migration routes for conservation has taken on new urgency. Here we present a general framework using the Brownian bridge movement model (BBMM) that: (1) provides a probabilistic estimate of the migration routes of a sampled population, (2) distinguishes between route segments that function as stopover sites vs. those used primarily as movement corridors, and (3) prioritizes routes for conservation based upon the proportion of the sampled population that uses them. We applied this approach to a migratory mule deer (Odocoileus hemionus) population in a pristine area of southwest Wyoming, USA, where 2000 gas wells and 1609 km of pipelines and roads have been proposed for development. Our analysis clearly delineated where migration routes occurred relative to proposed development and provided guidance for on-the-ground conservation efforts. Mule deer migration routes were characterized by a series of stopover sites where deer spent most of their time, connected by movement corridors through which deer moved quickly. Our findings suggest management strategies that differentiate between stopover sites and movement corridors may be warranted. Because some migration routes were used by more mule deer than others, proportional level of use may provide a reasonable metric by which routes can be prioritized for conservation. The methods we outline should be applicable to a wide range of species that inhabit regions where migration routes are threatened or poorly understood.

  8. Identifying Corneal Infections in Formalin-Fixed Specimens Using Next Generation Sequencing.

    Science.gov (United States)

    Li, Zhigang; Breitwieser, Florian P; Lu, Jennifer; Jun, Albert S; Asnaghi, Laura; Salzberg, Steven L; Eberhart, Charles G

    2018-01-01

    We test the ability of next-generation sequencing, combined with computational analysis, to identify a range of organisms causing infectious keratitis. This retrospective study evaluated 16 cases of infectious keratitis and four control corneas in formalin-fixed tissues from the pathology laboratory. Infectious cases also were analyzed in the microbiology laboratory using culture, polymerase chain reaction, and direct staining. Classified sequence reads were analyzed with two different metagenomics classification engines, Kraken and Centrifuge, and visualized using the Pavian software tool. Sequencing generated 20 to 46 million reads per sample. On average, 96% of the reads were classified as human, 0.3% corresponded to known vectors or contaminant sequences, 1.7% represented microbial sequences, and 2.4% could not be classified. The two computational strategies successfully identified the fungal, bacterial, and amoebal pathogens in most patients, including all four bacterial and mycobacterial cases, five of six fungal cases, three of three Acanthamoeba cases, and one of three herpetic keratitis cases. In several cases, additional potential pathogens also were identified. In one case with cytomegalovirus identified by Kraken and Centrifuge, the virus was confirmed by direct testing, while two where Staphylococcus aureus or cytomegalovirus were identified by Centrifuge but not Kraken could not be confirmed. Confirmation was not attempted for an additional three potential pathogens identified by Kraken and 11 identified by Centrifuge. Next generation sequencing combined with computational analysis can identify a wide range of pathogens in formalin-fixed corneal specimens, with potential applications in clinical diagnostics and research.

  9. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.

    Science.gov (United States)

    Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N

    2013-06-03

    Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.

  10. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

    Science.gov (United States)

    2013-01-01

    Background Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. PMID:23731509

  11. A unique genomic sequence in the Wolf-Hirschhorn syndrome [WHS] region of humans is conserved in the great apes.

    Science.gov (United States)

    Tarzami, S T; Kringstein, A M; Conte, R A; Verma, R S

    1996-10-01

    The Wolf-Hirschhorn syndrome (WHS) is caused by a partial deletion in the short arm of chromosome 4 band 16.3 (4p 16.3). A unique-sequence human DNA probe (39 kb) localized within this region has been used to search for sequence homology in the apes' equivalent chromosome 3 by FISH-technique. The WHS loci are conserved in higher primates at the expected position. Nevertheless, a control probe, which detects alphoid sequences of the pericentromeric region of humans, is diverged in chimpanzee, gorilla, and orangutan. The conservation of WHS loci and divergence of DNA alphoid sequences have further added to the controversy concerning human descent.

  12. Compendium of Immune Signatures Identifies Conserved and Species-Specific Biology in Response to Inflammation.

    Science.gov (United States)

    Godec, Jernej; Tan, Yan; Liberzon, Arthur; Tamayo, Pablo; Bhattacharya, Sanchita; Butte, Atul J; Mesirov, Jill P; Haining, W Nicholas

    2016-01-19

    Gene-expression profiling has become a mainstay in immunology, but subtle changes in gene networks related to biological processes are hard to discern when comparing various datasets. For instance, conservation of the transcriptional response to sepsis in mouse models and human disease remains controversial. To improve transcriptional analysis in immunology, we created ImmuneSigDB: a manually annotated compendium of ∼5,000 gene-sets from diverse cell states, experimental manipulations, and genetic perturbations in immunology. Analysis using ImmuneSigDB identified signatures induced in activated myeloid cells and differentiating lymphocytes that were highly conserved between humans and mice. Sepsis triggered conserved patterns of gene expression in humans and mouse models. However, we also identified species-specific biological processes in the sepsis transcriptional response: although both species upregulated phagocytosis-related genes, a mitosis signature was specific to humans. ImmuneSigDB enables granular analysis of transcriptomic data to improve biological understanding of immune processes of the human and mouse immune systems. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    Science.gov (United States)

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  14. PDL1 Signals through Conserved Sequence Motifs to Overcome Interferon-Mediated Cytotoxicity

    Directory of Open Access Journals (Sweden)

    Maria Gato-Cañas

    2017-08-01

    Full Text Available PDL1 blockade produces remarkable clinical responses, thought to occur by T cell reactivation through prevention of PDL1-PD1 T cell inhibitory interactions. Here, we find that PDL1 cell-intrinsic signaling protects cancer cells from interferon (IFN cytotoxicity and accelerates tumor progression. PDL1 inhibited IFN signal transduction through a conserved class of sequence motifs that mediate crosstalk with IFN signaling. Abrogation of PDL1 expression or antibody-mediated PDL1 blockade strongly sensitized cancer cells to IFN cytotoxicity through a STAT3/caspase-7-dependent pathway. Moreover, somatic mutations found in human carcinomas within these PDL1 sequence motifs disrupted motif regulation, resulting in PDL1 molecules with enhanced protective activities from type I and type II IFN cytotoxicity. Overall, our results reveal a mode of action of PDL1 in cancer cells as a first line of defense against IFN cytotoxicity.

  15. Tetrapods on the EDGE: Overcoming data limitations to identify phylogenetic conservation priorities

    Science.gov (United States)

    Gray, Claudia L.; Wearn, Oliver R.; Owen, Nisha R.

    2018-01-01

    The scale of the ongoing biodiversity crisis requires both effective conservation prioritisation and urgent action. As extinction is non-random across the tree of life, it is important to prioritise threatened species which represent large amounts of evolutionary history. The EDGE metric prioritises species based on their Evolutionary Distinctiveness (ED), which measures the relative contribution of a species to the total evolutionary history of their taxonomic group, and Global Endangerment (GE), or extinction risk. EDGE prioritisations rely on adequate phylogenetic and extinction risk data to generate meaningful priorities for conservation. However, comprehensive phylogenetic trees of large taxonomic groups are extremely rare and, even when available, become quickly out-of-date due to the rapid rate of species descriptions and taxonomic revisions. Thus, it is important that conservationists can use the available data to incorporate evolutionary history into conservation prioritisation. We compared published and new methods to estimate missing ED scores for species absent from a phylogenetic tree whilst simultaneously correcting the ED scores of their close taxonomic relatives. We found that following artificial removal of species from a phylogenetic tree, the new method provided the closest estimates of their “true” ED score, differing from the true ED score by an average of less than 1%, compared to the 31% and 38% difference of the previous methods. The previous methods also substantially under- and over-estimated scores as more species were artificially removed from a phylogenetic tree. We therefore used the new method to estimate ED scores for all tetrapods. From these scores we updated EDGE prioritisation rankings for all tetrapod species with IUCN Red List assessments, including the first EDGE prioritisation for reptiles. Further, we identified criteria to identify robust priority species in an effort to further inform conservation action whilst

  16. Sequence recombination and conservation of Varroa destructor virus-1 and deformed wing virus in field collected honey bees (Apis mellifera.

    Directory of Open Access Journals (Sweden)

    Hui Wang

    Full Text Available We sequenced small (s RNAs from field collected honeybees (Apis mellifera and bumblebees (Bombuspascuorum using the Illumina technology. The sRNA reads were assembled and resulting contigs were used to search for virus homologues in GenBank. Matches with Varroadestructor virus-1 (VDV1 and Deformed wing virus (DWV genomic sequences were obtained for A. mellifera but not B. pascuorum. Further analyses suggested that the prevalent virus population was composed of VDV-1 and a chimera of 5'-DWV-VDV1-DWV-3'. The recombination junctions in the chimera genomes were confirmed by using RT-PCR, cDNA cloning and Sanger sequencing. We then focused on conserved short fragments (CSF, size > 25 nt in the virus genomes by using GenBank sequences and the deep sequencing data obtained in this study. The majority of CSF sites confirmed conservation at both between-species (GenBank sequences and within-population (dataset of this study levels. However, conserved nucleotide positions in the GenBank sequences might be variable at the within-population level. High mutation rates (Pi>10% were observed at a number of sites using the deep sequencing data, suggesting that sequence conservation might not always be maintained at the population level. Virus-host interactions and strategies for developing RNAi treatments against VDV1/DWV infections are discussed.

  17. Comparative genomics of 12 strains of Erwinia amylovora identifies a pan-genome with a large conserved core.

    Directory of Open Access Journals (Sweden)

    Rachel A Mann

    Full Text Available The plant pathogen Erwinia amylovora can be divided into two host-specific groupings; strains infecting a broad range of hosts within the Rosaceae subfamily Spiraeoideae (e.g., Malus, Pyrus, Crataegus, Sorbus and strains infecting Rubus (raspberries and blackberries. Comparative genomic analysis of 12 strains representing distinct populations (e.g., geographic, temporal, host origin of E. amylovora was used to describe the pan-genome of this major pathogen. The pan-genome contains 5751 coding sequences and is highly conserved relative to other phytopathogenic bacteria comprising on average 89% conserved, core genes. The chromosomes of Spiraeoideae-infecting strains were highly homogeneous, while greater genetic diversity was observed between Spiraeoideae- and Rubus-infecting strains (and among individual Rubus-infecting strains, the majority of which was attributed to variable genomic islands. Based on genomic distance scores and phylogenetic analysis, the Rubus-infecting strain ATCC BAA-2158 was genetically more closely related to the Spiraeoideae-infecting strains of E. amylovora than it was to the other Rubus-infecting strains. Analysis of the accessory genomes of Spiraeoideae- and Rubus-infecting strains has identified putative host-specific determinants including variation in the effector protein HopX1(Ea and a putative secondary metabolite pathway only present in Rubus-infecting strains.

  18. Identifying conservation priorities and management strategies based on ecosystem services to improve urban sustainability in Harbin, China.

    Science.gov (United States)

    Qu, Yi; Lu, Ming

    2018-01-01

    Rapid urbanization and agricultural development has resulted in the degradation of ecosystems, while also negatively impacting ecosystem services (ES) and urban sustainability. Identifying conservation priorities for ES and applying reasonable management strategies have been found to be effective methods for mitigating this phenomenon. The purpose of this study is to propose a comprehensive framework for identifying ES conservation priorities and associated management strategies for these planning areas. First, we incorporated 10 ES indicators within a systematic conservation planning (SCP) methodology in order to identify ES conservation priorities with high irreplaceability values based on conservation target goals associated with the potential distribution of ES indicators. Next, we assessed the efficiency of the ES conservation priorities for meeting the designated conservation target goals. Finally, ES conservation priorities were clustered into groups using a K-means clustering analysis in an effort to identify the dominant ES per location before formulating management strategies. We effectively identified 12 ES priorities to best represent conservation target goals for the ES indicators. These 12 priorities had a total areal coverage of 13,364 km 2 representing 25.16% of the study area. The 12 priorities were further clustered into five significantly different groups ( p -values between groups urban and agricultural areas, thereby preventing urban and agriculture sprawl and guiding sustainable urban development.

  19. Asymmetrical distribution of non-conserved regulatory sequences at PHOX2B is reflected at the ENCODE loci and illuminates a possible genome-wide trend

    Directory of Open Access Journals (Sweden)

    McCallion Andrew S

    2009-01-01

    Full Text Available Abstract Background Transcriptional regulatory elements are central to development and interspecific phenotypic variation. Current regulatory element prediction tools rely heavily upon conservation for prediction of putative elements. Recent in vitro observations from the ENCODE project combined with in vivo analyses at the zebrafish phox2b locus suggests that a significant fraction of regulatory elements may fall below commonly applied metrics of conservation. We propose to explore these observations in vivo at the human PHOX2B locus, and also evaluate the potential evidence for genome-wide applicability of these observations through a novel analysis of extant data. Results Transposon-based transgenic analysis utilizing a tiling path proximal to human PHOX2B in zebrafish recapitulates the observations at the zebrafish phox2b locus of both conserved and non-conserved regulatory elements. Analysis of human sequences conserved with previously identified zebrafish phox2b regulatory elements demonstrates that the orthologous sequences exhibit overlapping regulatory control. Additionally, analysis of non-conserved sequences scattered over 135 kb 5' to PHOX2B, provides evidence of non-conserved regulatory elements positively biased with close proximity to the gene. Furthermore, we provide a novel analysis of data from the ENCODE project, finding a non-uniform distribution of regulatory elements consistent with our in vivo observations at PHOX2B. These observations remain largely unchanged when one accounts for the sequence repeat content of the assayed intervals, when the intervals are sub-classified by biological role (developmental versus non-developmental, or by gene density (gene desert versus non-gene desert. Conclusion While regulatory elements frequently display evidence of evolutionary conservation, a fraction appears to be undetected by current metrics of conservation. In vivo observations at the PHOX2B locus, supported by our analyses of in

  20. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    Science.gov (United States)

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147

  1. Combining measures of dispersal to identify conservation strategies in fragmented landscapes.

    Science.gov (United States)

    Leidner, Allison K; Haddad, Nick M

    2011-10-01

    Understanding the way in which habitat fragmentation disrupts animal dispersal is key to identifying effective and efficient conservation strategies. To differentiate the potential effectiveness of 2 frequently used strategies for increasing the connectivity of populations in fragmented landscapes-corridors and stepping stones-we combined 3 complimentary methods: behavioral studies at habitat edges, mark-recapture, and genetic analyses. Each of these methods addresses different steps in the dispersal process that a single intensive study could not address. We applied the 3 methods to the case study of Atrytonopsis new species 1, a rare butterfly endemic to a partially urbanized stretch of barrier islands in North Carolina (U.S.A.). Results of behavioral analyses showed the butterfly flew into urban and forested areas, but not over open beach; mark-recapture showed that the butterfly dispersed successfully through short stretches of urban areas (5 km) were a dispersal barrier, but shorter stretches of urban areas (≤5 km) were not. Although results from all 3 methods indicated natural features in the landscape, not urbanization, were barriers to dispersal, when we combined the results we could determine where barriers might arise: forests restricted dispersal for the butterfly only when there were long stretches with no habitat. Therefore, urban areas have the potential to become a dispersal barrier if their extent increases, a finding that may have gone unnoticed if we had used a single approach. Protection of stepping stones should be sufficient to maintain connectivity for Atrytonopsis new species 1 at current levels of urbanization. Our research highlights how the use of complementary approaches for studying animal dispersal in fragmented landscapes can help identify conservation strategies. ©2011 Society for Conservation Biology.

  2. Identify alternative splicing events based on position-specific evolutionary conservation.

    Directory of Open Access Journals (Sweden)

    Liang Chen

    Full Text Available The evolution of eukaryotes is accompanied by the increased complexity of alternative splicing which greatly expands genome information. One of the greatest challenges in the post-genome era is a complete revelation of human transcriptome with consideration of alternative splicing. Here, we introduce a comparative genomics approach to systemically identify alternative splicing events based on the differential evolutionary conservation between exons and introns and the high-quality annotation of the ENCODE regions. Specifically, we focus on exons that are included in some transcripts but are completely spliced out for others and we call them conditional exons. First, we characterize distinguishing features among conditional exons, constitutive exons and introns. One of the most important features is the position-specific conservation score. There are dramatic differences in conservation scores between conditional exons and constitutive exons. More importantly, the differences are position-specific. For flanking intronic regions, the differences between conditional exons and constitutive exons are also position-specific. Using the Random Forests algorithm, we can classify conditional exons with high specificities (97% for the identification of conditional exons from intron regions and 95% for the classification of known exons and fair sensitivities (64% and 32% respectively. We applied the method to the human genome and identified 39,640 introns that actually contain conditional exons and classified 8,813 conditional exons from the current RefSeq exon list. Among those, 31,673 introns containing conditional exons and 5,294 conditional exons classified from known exons cannot be inferred from RefSeq, UCSC or Ensembl annotations. Some of these de novo predictions were experimentally verified.

  3. MicroRNA repertoire for functional genome research in tilapia identified by deep sequencing.

    Science.gov (United States)

    Yan, Biao; Wang, Zhen-Hua; Zhu, Chang-Dong; Guo, Jin-Tao; Zhao, Jin-Liang

    2014-08-01

    The Nile tilapia (Oreochromis niloticus; Cichlidae) is an economically important species in aquaculture and occupies a prominent position in the aquaculture industry. MicroRNAs (miRNAs) are a class of noncoding RNAs that post-transcriptionally regulate gene expression involved in diverse biological and metabolic processes. To increase the repertoire of miRNAs characterized in tilapia, we used the Illumina/Solexa sequencing technology to sequence a small RNA library using pooled RNA sample isolated from the different developmental stages of tilapia. Bioinformatic analyses suggest that 197 conserved and 27 novel miRNAs are expressed in tilapia. Sequence alignments indicate that all tested miRNAs and miRNAs* are highly conserved across many species. In addition, we characterized the tissue expression patterns of five miRNAs using real-time quantitative PCR. We found that miR-1/206, miR-7/9, and miR-122 is abundantly expressed in muscle, brain, and liver, respectively, implying a potential role in the regulation of tissue differentiation or the maintenance of tissue identity. Overall, our results expand the number of tilapia miRNAs, and the discovery of miRNAs in tilapia genome contributes to a better understanding the role of miRNAs in regulating diverse biological processes.

  4. SNPs in Multi-Species Conserved Sequences (MCS as useful markers in association studies: a practical approach

    Directory of Open Access Journals (Sweden)

    Pericak-Vance Margaret A

    2007-08-01

    Full Text Available Abstract Background Although genes play a key role in many complex diseases, the specific genes involved in most complex diseases remain largely unidentified. Their discovery will hinge on the identification of key sequence variants that are conclusively associated with disease. While much attention has been focused on variants in protein-coding DNA, variants in noncoding regions may also play many important roles in complex disease by altering gene regulation. Since the vast majority of noncoding genomic sequence is of unknown function, this increases the challenge of identifying "functional" variants that cause disease. However, evolutionary conservation can be used as a guide to indicate regions of noncoding or coding DNA that are likely to have biological function, and thus may be more likely to harbor SNP variants with functional consequences. To help bias marker selection in favor of such variants, we devised a process that prioritizes annotated SNPs for genotyping studies based on their location within Multi-species Conserved Sequences (MCSs and used this process to select SNPs in a region of linkage to a complex disease. This allowed us to evaluate the utility of the chosen SNPs for further association studies. Previously, a region of chromosome 1q43 was linked to Multiple Sclerosis (MS in a genome-wide screen. We chose annotated SNPs in the region based on location within MCSs (termed MCS-SNPs. We then obtained genotypes for 478 MCS-SNPs in 989 individuals from MS families. Results Analysis of our MCS-SNP genotypes from the 1q43 region and comparison to HapMap data confirmed that annotated SNPs in MCS regions are frequently polymorphic and show subtle signatures of selective pressure, consistent with previous reports of genome-wide variation in conserved regions. We also present an online tool that allows MCS data to be directly exported to the UCSC genome browser so that MCS-SNPs can be easily identified within genomic regions of

  5. An uncertainty inclusive un-mixing model to identify tracer non-conservativeness

    Science.gov (United States)

    Sherriff, Sophie; Rowan, John; Franks, Stewart; Fenton, Owen; Jordan, Phil; hUallacháin, Daire Ó.

    2015-04-01

    sensitive screening technique than assessing target values against source data. Non-conservative behaviour was identified in field data however only at a significant degree of corruption. Whilst further testing is required to determine the impact of individual and combined uncertainty components on synthetic, controlled experiments and field data, this study provides a framework for future assessment of uncertainty in un-mixing models.

  6. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  7. Correlation between sequence conservation and structural thermodynamics of microRNA precursors from human, mouse, and chicken genomes

    Directory of Open Access Journals (Sweden)

    Wang Shengqi

    2010-10-01

    Full Text Available Abstract Background Previous studies have shown that microRNA precursors (pre-miRNAs have considerably more stable secondary structures than other native RNAs (tRNA, rRNA, and mRNA and artificial RNA sequences. However, pre-miRNAs with ultra stable secondary structures have not been investigated. It is not known if there is a tendency in pre-miRNA sequences towards or against ultra stable structures? Furthermore, the relationship between the structural thermodynamic stability of pre-miRNA and their evolution remains unclear. Results We investigated the correlation between pre-miRNA sequence conservation and structural stability as measured by adjusted minimum folding free energies in pre-miRNAs isolated from human, mouse, and chicken. The analysis revealed that conserved and non-conserved pre-miRNA sequences had structures with similar average stabilities. However, the relatively ultra stable and unstable pre-miRNAs were more likely to be non-conserved than pre-miRNAs with moderate stability. Non-conserved pre-miRNAs had more G+C than A+U nucleotides, while conserved pre-miRNAs contained more A+U nucleotides. Notably, the U content of conserved pre-miRNAs was especially higher than that of non-conserved pre-miRNAs. Further investigations showed that conserved and non-conserved pre-miRNAs exhibited different structural element features, even though they had comparable levels of stability. Conclusions We proposed that there is a correlation between structural thermodynamic stability and sequence conservation for pre-miRNAs from human, mouse, and chicken genomes. Our analyses suggested that pre-miRNAs with relatively ultra stable or unstable structures were less favoured by natural selection than those with moderately stable structures. Comparison of nucleotide compositions between non-conserved and conserved pre-miRNAs indicated the importance of U nucleotides in the pre-miRNA evolutionary process. Several characteristic structural elements were

  8. Comparative analysis of function and interaction of transcription factors in nematodes: Extensive conservation of orthology coupled to rapid sequence evolution

    Directory of Open Access Journals (Sweden)

    Singh Rama S

    2008-08-01

    Full Text Available Abstract Background Much of the morphological diversity in eukaryotes results from differential regulation of gene expression in which transcription factors (TFs play a central role. The nematode Caenorhabditis elegans is an established model organism for the study of the roles of TFs in controlling the spatiotemporal pattern of gene expression. Using the fully sequenced genomes of three Caenorhabditid nematode species as well as genome information from additional more distantly related organisms (fruit fly, mouse, and human we sought to identify orthologous TFs and characterized their patterns of evolution. Results We identified 988 TF genes in C. elegans, and inferred corresponding sets in C. briggsae and C. remanei, containing 995 and 1093 TF genes, respectively. Analysis of the three gene sets revealed 652 3-way reciprocal 'best hit' orthologs (nematode TF set, approximately half of which are zinc finger (ZF-C2H2 and ZF-C4/NHR types and HOX family members. Examination of the TF genes in C. elegans and C. briggsae identified the presence of significant tandem clustering on chromosome V, the majority of which belong to ZF-C4/NHR family. We also found evidence for lineage-specific duplications and rapid evolution of many of the TF genes in the two species. A search of the TFs conserved among nematodes in Drosophila melanogaster, Mus musculus and Homo sapiens revealed 150 reciprocal orthologs, many of which are associated with important biological processes and human diseases. Finally, a comparison of the sequence, gene interactions and function indicates that nematode TFs conserved across phyla exhibit significantly more interactions and are enriched in genes with annotated mutant phenotypes compared to those that lack orthologs in other species. Conclusion Our study represents the first comprehensive genome-wide analysis of TFs across three nematode species and other organisms. The findings indicate substantial conservation of transcription

  9. Identifying Priority Areas for Conservation: A Global Assessment for Forest-Dependent Birds

    Science.gov (United States)

    Buchanan, Graeme M.; Donald, Paul F.; Butchart, Stuart H. M.

    2011-01-01

    Limited resources are available to address the world's growing environmental problems, requiring conservationists to identify priority sites for action. Using new distribution maps for all of the world's forest-dependent birds (60.6% of all bird species), we quantify the contribution of remaining forest to conserving global avian biodiversity. For each of the world's partly or wholly forested 5-km cells, we estimated an impact score of its contribution to the distribution of all the forest bird species estimated to occur within it, and so is proportional to the impact on the conservation status of the world's forest-dependent birds were the forest it contains lost. The distribution of scores was highly skewed, a very small proportion of cells having scores several orders of magnitude above the global mean. Ecoregions containing the highest values of this score included relatively species-poor islands such as Hawaii and Palau, the relatively species-rich islands of Indonesia and the Philippines, and the megadiverse Atlantic Forests and northern Andes of South America. Ecoregions with high impact scores and high deforestation rates (2000–2005) included montane forests in Cameroon and the Eastern Arc of Tanzania, although deforestation data were not available for all ecoregions. Ecoregions with high impact scores, high rates of recent deforestation and low coverage by the protected area network included Indonesia's Seram rain forests and the moist forests of Trinidad and Tobago. Key sites in these ecoregions represent some of the most urgent priorities for expansion of the global protected areas network to meet Convention on Biological Diversity targets to increase the proportion of land formally protected to 17% by 2020. Areas with high impact scores, rapid deforestation, low protection and high carbon storage values may represent significant opportunities for both biodiversity conservation and climate change mitigation, for example through Reducing Emissions from

  10. Identifying priority areas for conservation: a global assessment for forest-dependent birds.

    Directory of Open Access Journals (Sweden)

    Graeme M Buchanan

    Full Text Available Limited resources are available to address the world's growing environmental problems, requiring conservationists to identify priority sites for action. Using new distribution maps for all of the world's forest-dependent birds (60.6% of all bird species, we quantify the contribution of remaining forest to conserving global avian biodiversity. For each of the world's partly or wholly forested 5-km cells, we estimated an impact score of its contribution to the distribution of all the forest bird species estimated to occur within it, and so is proportional to the impact on the conservation status of the world's forest-dependent birds were the forest it contains lost. The distribution of scores was highly skewed, a very small proportion of cells having scores several orders of magnitude above the global mean. Ecoregions containing the highest values of this score included relatively species-poor islands such as Hawaii and Palau, the relatively species-rich islands of Indonesia and the Philippines, and the megadiverse Atlantic Forests and northern Andes of South America. Ecoregions with high impact scores and high deforestation rates (2000-2005 included montane forests in Cameroon and the Eastern Arc of Tanzania, although deforestation data were not available for all ecoregions. Ecoregions with high impact scores, high rates of recent deforestation and low coverage by the protected area network included Indonesia's Seram rain forests and the moist forests of Trinidad and Tobago. Key sites in these ecoregions represent some of the most urgent priorities for expansion of the global protected areas network to meet Convention on Biological Diversity targets to increase the proportion of land formally protected to 17% by 2020. Areas with high impact scores, rapid deforestation, low protection and high carbon storage values may represent significant opportunities for both biodiversity conservation and climate change mitigation, for example through Reducing

  11. Identifying priority areas for conservation: a global assessment for forest-dependent birds.

    Science.gov (United States)

    Buchanan, Graeme M; Donald, Paul F; Butchart, Stuart H M

    2011-01-01

    Limited resources are available to address the world's growing environmental problems, requiring conservationists to identify priority sites for action. Using new distribution maps for all of the world's forest-dependent birds (60.6% of all bird species), we quantify the contribution of remaining forest to conserving global avian biodiversity. For each of the world's partly or wholly forested 5-km cells, we estimated an impact score of its contribution to the distribution of all the forest bird species estimated to occur within it, and so is proportional to the impact on the conservation status of the world's forest-dependent birds were the forest it contains lost. The distribution of scores was highly skewed, a very small proportion of cells having scores several orders of magnitude above the global mean. Ecoregions containing the highest values of this score included relatively species-poor islands such as Hawaii and Palau, the relatively species-rich islands of Indonesia and the Philippines, and the megadiverse Atlantic Forests and northern Andes of South America. Ecoregions with high impact scores and high deforestation rates (2000-2005) included montane forests in Cameroon and the Eastern Arc of Tanzania, although deforestation data were not available for all ecoregions. Ecoregions with high impact scores, high rates of recent deforestation and low coverage by the protected area network included Indonesia's Seram rain forests and the moist forests of Trinidad and Tobago. Key sites in these ecoregions represent some of the most urgent priorities for expansion of the global protected areas network to meet Convention on Biological Diversity targets to increase the proportion of land formally protected to 17% by 2020. Areas with high impact scores, rapid deforestation, low protection and high carbon storage values may represent significant opportunities for both biodiversity conservation and climate change mitigation, for example through Reducing Emissions from

  12. Comparative transcriptome analysis within the Lolium/Festuca species complex reveals high sequence conservation

    DEFF Research Database (Denmark)

    Czaban, Adrian; Sharma, Sapna; Byrne, Stephen

    2015-01-01

    species from the Lolium-Festuca complex, ranging from 52,166 to 72,133 transcripts per assembly. We have also predicted a set of proteins and validated it with a high-confidence protein database from three closely related species (H. vulgare, B. distachyon and O. sativa). We have obtained gene family...... clusters for the four species using OrthoMCL and analyzed their inferred phylogenetic relationships. Our results indicate that VRN2 is a candidate gene for differentiating vernalization and non-vernalization types in the Lolium-Festuca complex. Grouping of the gene families based on their BLAST identity...... enabled us to divide ortholog groups into those that are very conserved and those that are more evolutionarily relaxed. The ratio of the non-synonumous to synonymous substitutions enabled us to pinpoint protein sequences evolving in response to positive selection. These proteins may explain some...

  13. Identifying patients at risk of intraoperative and postoperative transfusion in isolated CABG: toward selective conservation strategies.

    Science.gov (United States)

    Arora, Rakesh C; Légaré, Jean-Francois; Buth, Karen J; Sullivan, John A; Hirsch, Gregory M

    2004-11-01

    Allogeneic blood product use during cardiac operation is often reported to exceed 40% despite published guidelines and costly blood conservation strategies. We developed a predictive model, based on eight preoperative risk factors, of allogeneic blood product transfusion rates in patients undergoing a cardiac procedure. All 3,046 consecutive, isolated coronary artery bypass graft (CABG) procedures at a university hospital from 1995 to 1998 were included. A logistic regression model was created to identify independent predictors of allogeneic blood product transfusion. This model was validated using a prospective patient sample. Overall use of allogeneic blood products was 23% with a crude operative mortality of 2.1%. In isolated, elective, first-time CABG cases, 16.9% received allogeneic blood products. Independent predictors of blood product usage in CABG patients were preoperative hemoglobin 12.0 or less, emergent operation, renal failure, female sex, age 70 years or older, left ventricular ejection fraction 0.40 or less, redo procedure, and low body surface area. Prospective validation of this model on 2,117 consecutive isolated CABG patients demonstrated an observed-to-expected allogeneic blood product transfusion rate ratio of 1.06. This internally validated logistic regression risk model is a sensitive and specific predictor of allogeneic blood product use in patients undergoing isolated CABG. Utilization of this model allows for preoperative risk stratification and may allow for more rational resource allocation of costly blood conservation strategies and blood bank resources.

  14. Identifying recombinants in human and primate immunodeficiency virus sequence alignments using quartet scanning

    Directory of Open Access Journals (Sweden)

    Martin Darren P

    2009-04-01

    Full Text Available Abstract Background Recombination has a profound impact on the evolution of viruses, but characterizing recombination patterns in molecular sequences remains a challenging endeavor. Despite its importance in molecular evolutionary studies, identifying the sequences that exhibit such patterns has received comparatively less attention in the recombination detection framework. Here, we extend a quartet-mapping based recombination detection method to enable identification of recombinant sequences without prior specifications of either query and reference sequences. Through simulations we evaluate different recombinant identification statistics and significance tests. We compare the quartet approach with triplet-based methods that employ additional heuristic tests to identify parental and recombinant sequences. Results Analysis of phylogenetic simulations reveal that identifying the descendents of relatively old recombination events is a challenging task for all methods available, and that quartet scanning performs relatively well compared to the triplet based methods. The use of quartet scanning is further demonstrated by analyzing both well-established and putative HIV-1 recombinant strains. In agreement with recent findings, we provide evidence that the presumed circulating recombinant CRF02_AG is a 'pure' lineage, whereas the presumed parental lineage subtype G has a recombinant origin. We also demonstrate HIV-1 intrasubtype recombination, confirm the hybrid origin of SIV in chimpanzees and further disentangle the recombinant history of SIV lineages in a primate immunodeficiency virus data set. Conclusion Quartet scanning makes a valuable addition to triplet-based methods for identifying recombinant sequences without prior specifications of either query and reference sequences. The new method is available in the VisRD v.3.0 package http://www.cmp.uea.ac.uk/~vlm/visrd.

  15. Core genome conservation of Staphylococcus haemolyticus limits sequence based population structure analysis.

    Science.gov (United States)

    Cavanagh, Jorunn Pauline; Klingenberg, Claus; Hanssen, Anne-Merethe; Fredheim, Elizabeth Aarag; Francois, Patrice; Schrenzel, Jacques; Flægstad, Trond; Sollid, Johanna Ericson

    2012-06-01

    The notoriously multi-resistant Staphylococcus haemolyticus is an emerging pathogen causing serious infections in immunocompromised patients. Defining the population structure is important to detect outbreaks and spread of antimicrobial resistant clones. Currently, the standard typing technique is pulsed-field gel electrophoresis (PFGE). In this study we describe novel molecular typing schemes for S. haemolyticus using multi locus sequence typing (MLST) and multi locus variable number of tandem repeats (VNTR) analysis. Seven housekeeping genes (MLST) and five VNTR loci (MLVF) were selected for the novel typing schemes. A panel of 45 human and veterinary S. haemolyticus isolates was investigated. The collection had diverse PFGE patterns (38 PFGE types) and was sampled over a 20 year-period from eight countries. MLST resolved 17 sequence types (Simpsons index of diversity [SID]=0.877) and MLVF resolved 14 repeat types (SID=0.831). We found a low sequence diversity. Phylogenetic analysis clustered the isolates in three (MLST) and one (MLVF) clonal complexes, respectively. Taken together, neither the MLST nor the MLVF scheme was suitable to resolve the population structure of this S. haemolyticus collection. Future MLVF and MLST schemes will benefit from addition of more variable core genome sequences identified by comparing different fully sequenced S. haemolyticus genomes. Copyright © 2012 Elsevier B.V. All rights reserved.

  16. Solexa sequencing identification of conserved and novel microRNAs in backfat of Large White and Chinese Meishan pigs.

    Directory of Open Access Journals (Sweden)

    Chen Chen

    Full Text Available The domestic pig (Sus scrofa, an important species in animal production industry, is a right model for studying adipogenesis and fat deposition. In order to expand the repertoire of porcine miRNAs and further explore potential regulatory miRNAs which have influence on adipogenesis, high-throughput Solexa sequencing approach was adopted to identify miRNAs in backfat of Large White (lean type pig and Meishan pigs (Chinese indigenous fatty pig. We identified 215 unique miRNAs comprising 75 known pre-miRNAs, of which 49 miRNA*s were first identified in our study, 73 miRNAs were overlapped in both libraries, and 140 were novelly predicted miRNAs, and 215 unique miRNAs were collectively corresponding to 235 independent genomic loci. Furthermore, we analyzed the sequence variations, seed edits and phylogenetic development of the miRNAs. 17 miRNAs were widely conserved from vertebrates to invertebrates, suggesting that these miRNAs may serve as potential evolutional biomarkers. 9 conserved miRNAs with significantly differential expressions were determined. The expression of miR-215, miR-135, miR-224 and miR-146b was higher in Large White pigs, opposite to the patterns shown by miR-1a, miR-133a, miR-122, miR-204 and miR-183. Almost all novel miRNAs could be considered pig-specific except ssc-miR-1343, miR-2320, miR-2326, miR-2411 and miR-2483 which had homologs in Bos taurus, among which ssc-miR-1343, miR-2320, miR-2411 and miR-2483 were validated in backfat tissue by stem-loop qPCR. Our results displayed a high level of concordance between the qPCR and Solexa sequencing method in 9 of 10 miRNAs comparisons except for miR-1a. Moreover, we found 2 miRNAs, miR-135 and miR-183, may exert impacts on porcine backfat development through WNT signaling pathway. In conclusion, our research develops porcine miRNAs and should be beneficial to study the adipogenesis and fat deposition of different pig breeds based on miRNAs.

  17. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium

    Directory of Open Access Journals (Sweden)

    Lynch Michael

    2010-05-01

    Full Text Available Abstract Background In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa remains a virtually unexplored issue. Results By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Conclusions Our observations 1 shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2 are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3 reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.

  18. Evolutionary dynamics of a conserved sequence motif in the ribosomal genes of the ciliate Paramecium.

    Science.gov (United States)

    Catania, Francesco; Lynch, Michael

    2010-05-04

    In protozoa, the identification of preserved motifs by comparative genomics is often impeded by difficulties to generate reliable alignments for non-coding sequences. Moreover, the evolutionary dynamics of regulatory elements in 3' untranslated regions (both in protozoa and metazoa) remains a virtually unexplored issue. By screening Paramecium tetraurelia's 3' untranslated regions for 8-mers that were previously found to be preserved in mammalian 3' UTRs, we detect and characterize a motif that is distinctly conserved in the ribosomal genes of this ciliate. The motif appears to be conserved across Paramecium aurelia species but is absent from the ribosomal genes of four additional non-Paramecium species surveyed, including another ciliate, Tetrahymena thermophila. Motif-free ribosomal genes retain fewer paralogs in the genome and appear to be lost more rapidly relative to motif-containing genes. Features associated with the discovered preserved motif are consistent with this 8-mer playing a role in post-transcriptional regulation. Our observations 1) shed light on the evolution of a putative regulatory motif across large phylogenetic distances; 2) are expected to facilitate the understanding of the modulation of ribosomal genes expression in Paramecium; and 3) reveal a largely unexplored--and presumably not restricted to Paramecium--association between the presence/absence of a DNA motif and the evolutionary fate of its host genes.

  19. Conservation

    NARCIS (Netherlands)

    Noteboom, H.P.

    1985-01-01

    The IUCN/WWF Plants Conservation Programme 1984 — 1985. World Wildlife Fund chose plants to be the subject of their fund-raising campaign in the period 1984 — 1985. The objectives were to: 1. Use information techniques to achieve the conservation objectives of the Plants Programme – to save plants;

  20. Conservation.

    Science.gov (United States)

    National Audubon Society, New York, NY.

    This set of teaching aids consists of seven Audubon Nature Bulletins, providing the teacher and student with informational reading on various topics in conservation. The bulletins have these titles: Plants as Makers of Soil, Water Pollution Control, The Ground Water Table, Conservation--To Keep This Earth Habitable, Our Threatened Air Supply,…

  1. Identifying conservation priorities and management strategies based on ecosystem services to improve urban sustainability in Harbin, China

    Directory of Open Access Journals (Sweden)

    Yi Qu

    2018-04-01

    Full Text Available Rapid urbanization and agricultural development has resulted in the degradation of ecosystems, while also negatively impacting ecosystem services (ES and urban sustainability. Identifying conservation priorities for ES and applying reasonable management strategies have been found to be effective methods for mitigating this phenomenon. The purpose of this study is to propose a comprehensive framework for identifying ES conservation priorities and associated management strategies for these planning areas. First, we incorporated 10 ES indicators within a systematic conservation planning (SCP methodology in order to identify ES conservation priorities with high irreplaceability values based on conservation target goals associated with the potential distribution of ES indicators. Next, we assessed the efficiency of the ES conservation priorities for meeting the designated conservation target goals. Finally, ES conservation priorities were clustered into groups using a K-means clustering analysis in an effort to identify the dominant ES per location before formulating management strategies. We effectively identified 12 ES priorities to best represent conservation target goals for the ES indicators. These 12 priorities had a total areal coverage of 13,364 km2 representing 25.16% of the study area. The 12 priorities were further clustered into five significantly different groups (p-values between groups < 0.05, which helped to refine management strategies formulated to best enhance ES across the study area. The proposed method allows conservation and management plans to easily adapt to a wide variety of quantitative ES target goals within urban and agricultural areas, thereby preventing urban and agriculture sprawl and guiding sustainable urban development.

  2. In Silico Analysis of Gene Expression Network Components Underlying Pigmentation Phenotypes in the Python Identified Evolutionarily Conserved Clusters of Transcription Factor Binding Sites

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Full Text Available Color variation provides the opportunity to investigate the genetic basis of evolution and selection. Reptiles are less studied than mammals. Comparative genomics approaches allow for knowledge gained in one species to be leveraged for use in another species. We describe a comparative vertebrate analysis of conserved regulatory modules in pythons aimed at assessing bioinformatics evidence that transcription factors important in mammalian pigmentation phenotypes may also be important in python pigmentation phenotypes. We identified 23 python orthologs of mammalian genes associated with variation in coat color phenotypes for which we assessed the extent of pairwise protein sequence identity between pythons and mouse, dog, horse, cow, chicken, anole lizard, and garter snake. We next identified a set of melanocyte/pigment associated transcription factors (CREB, FOXD3, LEF-1, MITF, POU3F2, and USF-1 that exhibit relatively conserved sequence similarity within their DNA binding regions across species based on orthologous alignments across multiple species. Finally, we identified 27 evolutionarily conserved clusters of transcription factor binding sites within ~200-nucleotide intervals of the 1500-nucleotide upstream regions of AIM1, DCT, MC1R, MITF, MLANA, OA1, PMEL, RAB27A, and TYR from Python bivittatus. Our results provide insight into pigment phenotypes in pythons.

  3. Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data

    Directory of Open Access Journals (Sweden)

    Iris Holmes

    2018-04-01

    Full Text Available Background Reduced representation genomic datasets are increasingly becoming available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1 RADseq datasets can be used for exploratory analysis of tissue-specific metagenomes, and (2 tissue collections house complete metagenomic communities, which can be investigated and quantified by a variety of techniques. Methods We present an exploratory method for mining metagenomic “bycatch” sequences from a range of host tissue types. We use a combination of the pyRAD assembly pipeline, NCBI’s blastn software, and custom R scripts to isolate metagenomic sequences from RADseq type datasets. Results When we focus on sequences that align with existing references in NCBI’s GenBank, we find that between three and five percent of identifiable double-digest restriction site associated DNA (ddRAD sequences from host tissue samples are from phyla to contain known blood parasites. In addition to tissue samples, we examine ddRAD sequences from metagenomic DNA extracted snake and lizard hind-gut samples. We find that the sequences recovered from these samples match with expected bacterial and eukaryotic gut microbiome phyla. Discussion Our results suggest that (1 museum tissue banks originally collected for host DNA archiving are also preserving valuable parasite and microbiome communities, (2 that publicly available RADseq datasets may include metagenomic sequences that could be explored, and (3 that restriction site approaches are a useful exploratory technique to identify microbiome lineages that could be missed by primer-based approaches.

  4. SILAC Proteomics of Planarians Identifies Ncoa5 as a Conserved Component of Pluripotent Stem Cells

    Directory of Open Access Journals (Sweden)

    Alexander Böser

    2013-11-01

    Full Text Available Planarian regeneration depends on the presence of pluripotent stem cells in the adult. We developed an in vivo stable isotope labeling by amino acids in cell culture (SILAC protocol in planarians to identify proteins that are enriched in planarian stem cells. Through a comparison of SILAC proteomes of normal and stem cell-depleted planarians and of a stem cell-enriched population of sorted cells, we identified hundreds of stem cell proteins. One of these is an ortholog of nuclear receptor coactivator-5 (Ncoa5/CIA, which is known to regulate estrogen-receptor-mediated transcription in human cells. We show that Ncoa5 is essential for the maintenance of the pluripotent stem cell population in planarians and that a putative mouse ortholog is expressed in pluripotent cells of the embryo. Our study thus identifies a conserved component of pluripotent stem cells, demonstrating that planarians, in particular, when combined with in vivo SILAC, are a powerful model in stem cell research.

  5. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Science.gov (United States)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  6. Exome Sequencing Fails to Identify the Genetic Cause of Aicardi Syndrome.

    Science.gov (United States)

    Lund, Caroline; Striano, Pasquale; Sorte, Hanne Sørmo; Parisi, Pasquale; Iacomino, Michele; Sheng, Ying; Vigeland, Magnus D; Øye, Anne-Marte; Møller, Rikke Steensbjerre; Selmer, Kaja K; Zara, Federico

    2016-09-01

    Aicardi syndrome (AS) is a well-characterized neurodevelopmental disorder with an unknown etiology. In this study, we performed whole-exome sequencing in 11 female patients with the diagnosis of AS, in order to identify the disease-causing gene. In particular, we focused on detecting variants in the X chromosome, including the analysis of variants with a low number of sequencing reads, in case of somatic mosaicism. For 2 of the patients, we also sequenced the exome of the parents to search for de novo mutations. We did not identify any genetic variants likely to be damaging. Only one single missense variant was identified by the de novo analyses of the 2 trios, and this was considered benign. The failure to identify a disease gene in this study may be due to technical limitations of our study design, including the possibility that the genetic aberration leading to AS is situated in a non-exonic region or that the mutation is somatic and not detectable by our approach. Alternatively, it is possible that AS is genetically heterogeneous and that 11 patients are not sufficient to reveal the causative genes. Future studies of AS should consider designs where also non-exonic regions are explored and apply a sequencing depth so that also low-grade somatic mosaicism can be detected.

  7. Exome Sequencing Fails to Identify the Genetic Cause of Aicardi Syndrome

    DEFF Research Database (Denmark)

    Lund, Caroline; Striano, Pasquale; Sorte, Hanne Sørmo

    2016-01-01

    Aicardi syndrome (AS) is a well-characterized neurodevelopmental disorder with an unknown etiology. In this study, we performed whole-exome sequencing in 11 female patients with the diagnosis of AS, in order to identify the disease-causing gene. In particular, we focused on detecting variants in ...

  8. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    OpenAIRE

    Hu, H.; Haas, S.A.; Chelly, J.; Van Esch, H.; Raynaud, M.; de Brouwer, A.P.M.; Weinert, S.; Froyen, G.; Frints, S.G.M.; Laumonnier, F.; Zemojtel, T.; Love, M.I.; Richard, H.; Emde, A.K.; Bienek, M.

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of ...

  9. A Novel Prosthetic Joint Infection Pathogen, Mycoplasma salivarium, Identified by Metagenomic Shotgun Sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio; Greenwood-Quaintance, Kerryl E; Chia, Nicholas; Abdel, Matthew P; Steckelberg, James M; Osmon, Douglas R; Patel, Robin

    2017-07-15

    Defining the microbial etiology of culture-negative prosthetic joint infection (PJI) can be challenging. Metagenomic shotgun sequencing is a new tool to identify organisms undetected by conventional methods. We present a case where metagenomics was used to identify Mycoplasma salivarium as a novel PJI pathogen in a patient with hypogammaglobulinemia. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.

  10. Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence.

    Science.gov (United States)

    Benko, Sabina; Fantes, Judy A; Amiel, Jeanne; Kleinjan, Dirk-Jan; Thomas, Sophie; Ramsay, Jacqueline; Jamshidi, Negar; Essafi, Abdelkader; Heaney, Simon; Gordon, Christopher T; McBride, David; Golzio, Christelle; Fisher, Malcolm; Perry, Paul; Abadie, Véronique; Ayuso, Carmen; Holder-Espinasse, Muriel; Kilpatrick, Nicky; Lees, Melissa M; Picard, Arnaud; Temple, I Karen; Thomas, Paul; Vazquez, Marie-Paule; Vekemans, Michel; Roest Crollius, Hugues; Hastie, Nicholas D; Munnich, Arnold; Etchevers, Heather C; Pelet, Anna; Farlie, Peter G; Fitzpatrick, David R; Lyonnet, Stanislas

    2009-03-01

    Pierre Robin sequence (PRS) is an important subgroup of cleft palate. We report several lines of evidence for the existence of a 17q24 locus underlying PRS, including linkage analysis results, a clustering of translocation breakpoints 1.06-1.23 Mb upstream of SOX9, and microdeletions both approximately 1.5 Mb centromeric and approximately 1.5 Mb telomeric of SOX9. We have also identified a heterozygous point mutation in an evolutionarily conserved region of DNA with in vitro and in vivo features of a developmental enhancer. This enhancer is centromeric to the breakpoint cluster and maps within one of the microdeletion regions. The mutation abrogates the in vitro enhancer function and alters binding of the transcription factor MSX1 as compared to the wild-type sequence. In the developing mouse mandible, the 3-Mb region bounded by the microdeletions shows a regionally specific chromatin decompaction in cells expressing Sox9. Some cases of PRS may thus result from developmental misexpression of SOX9 due to disruption of very-long-range cis-regulatory elements.

  11. The putative Leishmania telomerase RNA (LeishTER undergoes trans-splicing and contains a conserved template sequence.

    Directory of Open Access Journals (Sweden)

    Elton J R Vasconcelos

    Full Text Available Telomerase RNAs (TERs are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Merging a thorough computational screening combined with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER that contains a 5' spliced leader (SL cap, a putative 3' polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5'SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs and its role in parasite telomere biology.

  12. A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites.

    Directory of Open Access Journals (Sweden)

    Tony Håndstad

    Full Text Available BACKGROUND: Transcription factors are important controllers of gene expression and mapping transcription factor binding sites (TFBS is key to inferring transcription factor regulatory networks. Several methods for predicting TFBS exist, but there are no standard genome-wide datasets on which to assess the performance of these prediction methods. Also, it is believed that information about sequence conservation across different genomes can generally improve accuracy of motif-based predictors, but it is not clear under what circumstances use of conservation is most beneficial. RESULTS: Here we use published ChIP-seq data and an improved peak detection method to create comprehensive benchmark datasets for prediction methods which use known descriptors or binding motifs to detect TFBS in genomic sequences. We use this benchmark to assess the performance of five different prediction methods and find that the methods that use information about sequence conservation generally perform better than simpler motif-scanning methods. The difference is greater on high-affinity peaks and when using short and information-poor motifs. However, if the motifs are specific and information-rich, we find that simple motif-scanning methods can perform better than conservation-based methods. CONCLUSIONS: Our benchmark provides a comprehensive test that can be used to rank the relative performance of transcription factor binding site prediction methods. Moreover, our results show that, contrary to previous reports, sequence conservation is better suited for predicting strong than weak transcription factor binding sites.

  13. SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations.

    Directory of Open Access Journals (Sweden)

    Steven N Hart

    Full Text Available BACKGROUND: Structural variation (SV represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. RESULTS: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are 1 not requiring secondary (or exhaustive primary alignment, 2 portability into established sequencing workflows, and 3 is applicable to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.. SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. CONCLUSIONS: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.

  14. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    Science.gov (United States)

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.

  15. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    Science.gov (United States)

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.

  16. Identifying Potential Recommendation Domains for Conservation Agriculture in Ethiopia, Kenya, and Malawi

    Science.gov (United States)

    Tesfaye, Kindie; Jaleta, Moti; Jena, Pradyot; Mutenje, Munyaradzi

    2015-02-01

    Conservation agriculture (CA) is being promoted as an option for reducing soil degradation, conserving water, enhancing crop productivity, and maintaining yield stability. However, CA is a knowledge- and technology-intensive practice, and may not be feasible or may not perform better than conventional agriculture under all conditions and farming systems. Using high resolution (≈1 km2) biophysical and socioeconomic geospatial data, this study identified potential recommendation domains (RDs) for CA in Ethiopia, Kenya, and Malawi. The biophysical variables used were soil texture, surface slope, and rainfall while the socioeconomic variables were market access and human and livestock population densities. Based on feasibility and comparative performance of CA over conventional agriculture, the biophysical and socioeconomic factors were first used to classify cultivated areas into three biophysical and three socioeconomic potential domains, respectively. Combinations of biophysical and socioeconomic domains were then used to develop potential RDs for CA based on adoption potential within the cultivated areas. About 39, 12, and 5 % of the cultivated areas showed high biophysical and socioeconomic potential while 50, 39, and 21 % of the cultivated areas showed high biophysical and medium socioeconomic potential for CA in Malawi, Kenya, and Ethiopia, respectively. The results indicate considerable acreages of land with high CA adoption potential in the mixed crop-livestock systems of the studied countries. However, there are large differences among countries depending on biophysical and socio-economic conditions. The information generated in this study could be used for targeting CA and prioritizing CA-related agricultural research and investment priorities in the three countries.

  17. A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize.

    Directory of Open Access Journals (Sweden)

    Matt eGeisler

    2015-06-01

    Full Text Available Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where both maize orthologs occurred for an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6,004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait and prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize.

  18. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    Science.gov (United States)

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  19. HIV-1 envelope sequence-based diversity measures for identifying recent infections.

    Directory of Open Access Journals (Sweden)

    Alexis Kafando

    Full Text Available Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC of the receiver operating characteristic (ROC. Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001. Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806, gp120 C2_3 (AUC = 0.805 and gp120 V3 (AUC = 0.812. Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency.

  20. Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrroly-sine containing genes

    DEFF Research Database (Denmark)

    Have, Christian Theil; Zambach, Sine; Christiansen, Henning

    2013-01-01

    for prediction of pyrrolysine incorporating genes in genomes of bacteria and archaea leading to insights about the factors driving pyrrolysine translation and identification of new gene candidates. The method predicts known conserved genes with high recall and predicts several other promising candidates...... for experimental verification. The method is implemented as a computational pipeline which is available on request....

  1. BLAT2DOLite: An Online System for Identifying Significant Relationships between Genetic Sequences and Diseases.

    Directory of Open Access Journals (Sweden)

    Liang Cheng

    Full Text Available The significantly related diseases of sequences could play an important role in understanding the functions of these sequences. In this paper, we introduced BLAT2DOLite, an online system for annotating human genes and diseases and identifying the significant relationships between sequences and diseases. Currently, BLAT2DOLite integrates Entrez Gene database and Disease Ontology Lite (DOLite, which contain loci of gene and relationships between genes and diseases. It utilizes hypergeometric test to calculate P-values between genes and diseases of DOLite. The system can be accessed from: http://123.59.132.21:8080/BLAT2DOLite. The corresponding web service is described in: http://123.59.132.21:8080/BLAT2DOLite/BLAT2DOLiteIDMappingPort?wsdl.

  2. Molecular dissection of a contiguous gene syndrome: Frequent submicroscopic deletions, evolutionarily conserved sequences, and a hypomethylated island in the Miller-Dieker chromosome region

    International Nuclear Information System (INIS)

    Ledbetter, D.H.; Ledbetter, S.A.; vanTuinen, P.

    1989-01-01

    The Miller-Dieker syndrome (MDS), composed of characteristic facial abnormalities and a severe neuronal migration disorder affecting the cerebral cortex, is caused by visible or submicroscopic deletions of chromosome band 17p13. Twelve anonymous DNA markers were tested against a panel of somatic cell hybrids containing 17p deletions from seven MDS patients. All patients, including three with normal karyotypes, are deleted for a variable set of 5-12 markers. Two highly polymorphic VNTR (variable number of tandem repeats) probes, YNZ22 and YNH37, are codeleted in all patients tested and make molecular diagnosis for this disorder feasible. By pulsed-field gel electrophoresis, YNZ22 and YNH37 were shown to be within 30 kilobases (kb) of each other. Cosmid clones containing both VNTR sequences were identified, and restriction mapping showed them to be 100 kb were completely deleted in all patients, providing a minimum estimate of the size of the MDS critical region. A hypomethylated island and evolutionarily conserved sequences were identified within this 100-kb region, indications of the presence of one or more expressed sequences potentially involved in the pathophysiology of this disorder. The conserved sequences were mapped to mouse chromosome 11 by using mouse-rat somatic cell hybrids, extending the remarkable homology between human chromosome 17 and mouse chromosome 11 by 30 centimorgans, into the 17p telomere region

  3. Functional brain activation differences in stuttering identified with a rapid fMRI sequence

    Science.gov (United States)

    Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.

    2011-01-01

    The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech motor and auditory brain activity in children who stutter closer to the age at which recovery from stuttering is documented. Rapid sequences may be preferred for individuals or populations who do not tolerate long scanning sessions. In this report, we document the application of a picture naming and phoneme monitoring task in three minute fMRI sequences with adults who stutter (AWS). If relevant brain differences are found in AWS with these approaches that conform to previous reports, then these approaches can be extended to younger populations. Pairwise contrasts of brain BOLD activity between AWS and normally fluent adults indicated the AWS showed higher BOLD activity in the right inferior frontal gyrus (IFG), right temporal lobe and sensorimotor cortices during picture naming and and higher activity in the right IFG during phoneme monitoring. The right lateralized pattern of BOLD activity together with higher activity in sensorimotor cortices is consistent with previous reports, which indicates rapid fMRI sequences can be considered for investigating stuttering in younger participants. PMID:22133409

  4. Molecular defects identified by whole exome sequencing in a child with Fanconi anemia.

    Science.gov (United States)

    Zheng, Zhaojing; Geng, Juan; Yao, Ru-En; Li, Caihua; Ying, Daming; Shen, Yongnian; Ying, Lei; Yu, Yongguo; Fu, Qihua

    2013-11-10

    Fanconi anemia is a rare genetic disease characterized by bone marrow failure, multiple congenital malformations, and an increased susceptibility to malignancy. At least 15 genes have been identified that are involved in the pathogenesis of Fanconi anemia. However, it is still a challenge to assign the complementation group and to characterize the molecular defects in patients with Fanconi anemia. In the current study, whole exome sequencing was used to identify the affected gene(s) in a boy with Fanconi anemia. A recurring, non-synonymous mutation was found (c.3971C>T, p.P1324L) as well as a novel frameshift mutation (c.989_995del, p.H330LfsX2) in FANCA gene. Our results indicate that whole exome sequencing may be useful in clinical settings for rapid identification of disease-causing mutations in rare genetic disorders such as Fanconi anemia. © 2013 Elsevier B.V. All rights reserved.

  5. Identifying human disease genes through cross-species gene mapping of evolutionary conserved processes.

    Directory of Open Access Journals (Sweden)

    Martin Poot

    2011-05-01

    Full Text Available Understanding complex networks that modulate development in humans is hampered by genetic and phenotypic heterogeneity within and between populations. Here we present a method that exploits natural variation in highly diverse mouse genetic reference panels in which genetic and environmental factors can be tightly controlled. The aim of our study is to test a cross-species genetic mapping strategy, which compares data of gene mapping in human patients with functional data obtained by QTL mapping in recombinant inbred mouse strains in order to prioritize human disease candidate genes.We exploit evolutionary conservation of developmental phenotypes to discover gene variants that influence brain development in humans. We studied corpus callosum volume in a recombinant inbred mouse panel (C57BL/6J×DBA/2J, BXD strains using high-field strength MRI technology. We aligned mouse mapping results for this neuro-anatomical phenotype with genetic data from patients with abnormal corpus callosum (ACC development.From the 61 syndromes which involve an ACC, 51 human candidate genes have been identified. Through interval mapping, we identified a single significant QTL on mouse chromosome 7 for corpus callosum volume with a QTL peak located between 25.5 and 26.7 Mb. Comparing the genes in this mouse QTL region with those associated with human syndromes (involving ACC and those covered by copy number variations (CNV yielded a single overlap, namely HNRPU in humans and Hnrpul1 in mice. Further analysis of corpus callosum volume in BXD strains revealed that the corpus callosum was significantly larger in BXD mice with a B genotype at the Hnrpul1 locus than in BXD mice with a D genotype at Hnrpul1 (F = 22.48, p<9.87*10(-5.This approach that exploits highly diverse mouse strains provides an efficient and effective translational bridge to study the etiology of human developmental disorders, such as autism and schizophrenia.

  6. Cytoplasmic protein binding to highly conserved sequences in the 3' untranslated region of mouse protamine 2 mRNA, a translationally regulated transcript of male germ cells

    International Nuclear Information System (INIS)

    Kwon, Y.K.; Hecht, N.B.

    1991-01-01

    The expression of the protamines, the predominant nuclear proteins of mammalian spermatozoa, is regulated translationally during male germ-cell development. The 3' untranslated region (UTR) of protamine 1 mRNA has been reported to control its time of translation. To understand the mechanisms controlling translation of the protamine mRNAs, we have sought to identify cis elements of the 3' UTR of protamine 2 mRNA that are recognized by cytoplasmic factors. From gel retardation assays, two sequence elements are shown to form specific RNA-protein complexes. Protein binding sites of the two complexes were determined by RNase T1 mapping, by blocking the putative binding sites with antisense oligonucleotides, and by competition assays. The sequences of these elements, located between nucleotides + 537 and + 572 in protamine 2 mRNA, are highly conserved among postmeiotic translationally regulated nuclear proteins of the mammalian testis. Two closely linked protein binding sites were detected. UV-crosslinking studies revealed that a protein of about 18 kDa binds to one of the conserved sequences. These data demonstrate specific protein binding to a highly conserved 3' UTR of translationally regulated testicular mRNA

  7. Mouse transgenesis identifies conserved functional enhancers and cis-regulatory motif in the vertebrate LIM homeobox gene Lhx2 locus.

    Directory of Open Access Journals (Sweden)

    Alison P Lee

    Full Text Available The vertebrate Lhx2 is a member of the LIM homeobox family of transcription factors. It is essential for the normal development of the forebrain, eye, olfactory system and liver as well for the differentiation of lymphoid cells. However, despite the highly restricted spatio-temporal expression pattern of Lhx2, nothing is known about its transcriptional regulation. In mammals and chicken, Crb2, Dennd1a and Lhx2 constitute a conserved linkage block, while the intervening Dennd1a is lost in the fugu Lhx2 locus. To identify functional enhancers of Lhx2, we predicted conserved noncoding elements (CNEs in the human, mouse and fugu Crb2-Lhx2 loci and assayed their function in transgenic mouse at E11.5. Four of the eight CNE constructs tested functioned as tissue-specific enhancers in specific regions of the central nervous system and the dorsal root ganglia (DRG, recapitulating partial and overlapping expression patterns of Lhx2 and Crb2 genes. There was considerable overlap in the expression domains of the CNEs, which suggests that the CNEs are either redundant enhancers or regulating different genes in the locus. Using a large set of CNEs (810 CNEs associated with transcription factor-encoding genes that express predominantly in the central nervous system, we predicted four over-represented 8-mer motifs that are likely to be associated with expression in the central nervous system. Mutation of one of them in a CNE that drove reporter expression in the neural tube and DRG abolished expression in both domains indicating that this motif is essential for expression in these domains. The failure of the four functional enhancers to recapitulate the complete expression pattern of Lhx2 at E11.5 indicates that there must be other Lhx2 enhancers that are either located outside the region investigated or divergent in mammals and fishes. Other approaches such as sequence comparison between multiple mammals are required to identify and characterize such enhancers.

  8. Residential energy consumption and conservation programs: A systematic approach to identify inefficient households, provide meaningful feedback, and prioritize homes for conservation intervention

    Science.gov (United States)

    Macsleyne, Amelia Chadbourne Carus

    There are three main objectives for residential energy conservation policies: to reduce the use of fossil fuels, reduce greenhouse gas emissions, and reduce the energy costs seen by the consumer (U.S. Department of Energy: Strategic Objectives, 2006). A prominent difficulty currently facing conservation policy makers and program managers is how to identify and communicate with households that would be good candidates for conservation intervention, in such a way that affects a change in consumption patterns and is cost-effective. This research addresses this issue by separating the problem into three components: how to identify houses that are significantly more inefficient than comparable households; how to find the maximum financially-feasible investment in energy efficiency for a household in order to reduce annual energy costs and/or improve indoor comfort; and how to prioritize low-income households for a subsidized weatherization program. Each component of the problem is presented as a paper prepared for publication. Household consumption related to physical house efficiency, thermostat settings, and daily appliance usage is studied in the first and second paper by analyzing natural gas utility meter readings associated with over 10,000 households from 2001-2006. A rich description of a house's architectural characteristics and household demographics is attained by integrating publicly available databases based on the house address. This combination of information allows for the largest number of individual households studied at this level of detail to date. The third paper uses conservation program data from two natural gas utilities that administer and sponsor the program; over 1,000 weatherized households are included in this sample. This research focuses on natural gas-related household conservation. However, the same principles and methods could be applied for electricity-related conservation programs. We find positive policy implications from each of

  9. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data

    Science.gov (United States)

    Lea, Amanda J.

    2015-01-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596

  10. Whole-exome sequencing identified a variant in EFTUD2 gene in establishing a genetic diagnosis.

    Science.gov (United States)

    Rengasamy Venugopalan, S; Farrow, E G; Lypka, M

    2017-06-01

    Craniofacial anomalies are complex and have an overlapping phenotype. Mandibulofacial Dysostosis and Oculo-Auriculo-Vertebral Spectrum are conditions that share common craniofacial phenotype and present a challenge in arriving at a diagnosis. In this report, we present a case of female proband who was given a differential diagnosis of Treacher Collins syndrome or Hemifacial Microsomia without certainty. Prior genetic testing reported negative for 22q deletion and FGFR screenings. The objective of this study was to demonstrate the critical role of whole-exome sequencing in establishing a genetic diagnosis of the proband. The participants were 14½-year-old affected female proband/parent trio. Proband/parent trio were enrolled in the study. Surgical tissue sample from the proband and parental blood samples were collected and prepared for whole-exome sequencing. Illumina HiSeq 2500 instrument was used for sequencing (125 nucleotide reads/84X coverage). Analyses of variants were performed using custom-developed software, RUNES and VIKING. Variant analyses following whole-exome sequencing identified a heterozygous de novo pathogenic variant, c.259C>T (p.Gln87*), in EFTUD2 (NM_004247.3) gene in the proband. Previous studies have reported that the variants in EFTUD2 gene were associated with Mandibulofacial Dysostosis with Microcephaly. Patients with facial asymmetry, micrognathia, choanal atresia and microcephaly should be analyzed for variants in EFTUD2 gene. Next-generation sequencing techniques, such as whole-exome sequencing offer great promise to improve the understanding of etiologies of sporadic genetic diseases. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  11. Identifying transposon insertions and their effects from RNA-sequencing data.

    Science.gov (United States)

    de Ruiter, Julian R; Kas, Sjors M; Schut, Eva; Adams, David J; Koudijs, Marco J; Wessels, Lodewyk F A; Jonkers, Jos

    2017-07-07

    Insertional mutagenesis using engineered transposons is a potent forward genetic screening technique used to identify cancer genes in mouse model systems. In the analysis of these screens, transposon insertion sites are typically identified by targeted DNA-sequencing and subsequently assigned to predicted target genes using heuristics. As such, these approaches provide no direct evidence that insertions actually affect their predicted targets or how transcripts of these genes are affected. To address this, we developed IM-Fusion, an approach that identifies insertion sites from gene-transposon fusions in standard single- and paired-end RNA-sequencing data. We demonstrate IM-Fusion on two separate transposon screens of 123 mammary tumors and 20 B-cell acute lymphoblastic leukemias, respectively. We show that IM-Fusion accurately identifies transposon insertions and their true target genes. Furthermore, by combining the identified insertion sites with expression quantification, we show that we can determine the effect of a transposon insertion on its target gene(s) and prioritize insertions that have a significant effect on expression. We expect that IM-Fusion will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and provide initial insight into the biological effects of insertions on candidate cancer genes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy

    Science.gov (United States)

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M.; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo

    2017-01-01

    Purpose Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. Methods To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). Results Of the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. Conclusions We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as a causative mutation for FEVR. These genetic deletion variations exhibit a severe form of FEVR, with tractional retinal detachments compared with other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype–phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling. PMID:28867931

  13. Deep sequencing of uveal melanoma identifies a recurrent mutation in PLCB4

    DEFF Research Database (Denmark)

    Johansson, Peter; Aoude, Lauren G; Wadt, Karin

    2016-01-01

    Next generation sequencing of uveal melanoma (UM) samples has identified a number of recurrent oncogenic or loss-of-function mutations in key driver genes including: GNAQ, GNA11, EIF1AX, SF3B1 and BAP1. To search for additional driver mutations in this tumor type we carried out whole......, instead, a BRCA mutation signature predominated. In addition to mutations in the known UM driver genes, we found a recurrent mutation in PLCB4 (c.G1888T, p.D630Y, NM_000933), which was validated using Sanger sequencing. The identical mutation was also found in published UM sequence data (1 of 56 tumors......-genome or whole-exome sequencing of 28 tumors or primary cell lines. These samples have a low mutation burden, with a mean of 10.6 protein changing mutations per sample (range 0 to 53). As expected for these sun-shielded melanomas the mutation spectrum was not consistent with an ultraviolet radiation signature...

  14. Identifying and managing the conflicts between agriculture and biodiversity conservation in Europe–A review

    NARCIS (Netherlands)

    Henle, K.; Alard, D.; Clitherow, J.; Cobb, P.; Firbank, L.G.; Kull, T.; McCracken, D.; Moritz, R.F.A.; Niemelä, J.; Rebane, M.; Wascher, D.M.; Watt, A.; Young, J.

    2008-01-01

    This paper reviews conflicts between biodiversity conservation and agricultural activities in agricultural landscapes and evaluates strategies to reconcile such conflicts. Firstly, a historical perspective on the development of conflicts related to biodiversity in agricultural landscapes is

  15. Does taxonomic diversity in indicator groups influence their effectiveness in identifying priority areas for species conservation?

    DEFF Research Database (Denmark)

    Bladt, Jesper Stentoft; Larsen, Frank Wugt; Rahbek, Carsten

    2008-01-01

    The identification of priority areas for biodiversity conservation is a cornerstone of systematic conservation planning. However, biodiversity, or even the distribution of all species, cannot be directly quantified, due to the inherent complexity of natural systems. Species indicator groups may...... serve as important tools for the identification of priority areas for conservation. Yet, it is unclear which factors make certain indicator groups perform better than others. In this study, using data on the Danish distribution of 847 species of plants, vertebrates and insects, we assessed whether...... the taxonomic diversity in species indicator groups influence their effectiveness in the identification of priority areas for species conservation. We tested whether indicator groups comprising a higher taxonomic diversity (i.e. indicator groups consisting of species from many different taxonomic groups...

  16. Backcasting the decline of a vulnerable Great Plains reproductive ecotype: identifying threats and conservation priorities

    Science.gov (United States)

    Worthington, Thomas A.; Brewer, Shannon K.; Grabowski, Timothy B.; Mueller, Julia

    2014-01-01

    Conservation efforts for threatened or endangered species are challenging because the multi-scale factors that relate to their decline or inhibit their recovery are often unknown. To further exacerbate matters, the perceptions associated with the mechanisms of species decline are often viewed myopically rather than across the entire species range. We used over 80 years of fish presence data collected from the Great Plains and associated ecoregions of the United States, to investigate the relative influence of changing environmental factors on the historic and current truncated distributions of the Arkansas River shiner Notropis girardi. Arkansas River shiner represent a threatened reproductive ecotype considered especially well adapted to the harsh environmental extremes of the Great Plains. Historic (n = 163 records) and current (n = 47 records) species distribution models were constructed using a vector-based approach in MaxEnt by splitting the available data at a time when Arkansas River shiner dramatically declined. Discharge and stream order were significant predictors in both models; however, the shape of the relationship between the predictors and species presence varied between time periods. Drift distance (river fragment length available for ichthyoplankton downstream drift before meeting a barrier) was a more important predictor in the current model and indicated river segments 375–780 km had the highest probability of species presence. Performance for the historic and current models was high (area under the curve; AUC > 0.95); however, forecasting and backcasting to alternative time periods suggested less predictive power. Our results identify fragments that could be considered refuges for endemic plains fish species and we highlight significant environmental factors (e.g., discharge) that could be manipulated to aid recovery.

  17. Identifying conservation and restoration priorities for saproxylic and old-growth forest species: a case study in Switzerland.

    Science.gov (United States)

    Lachat, Thibault; Bütler, Rita

    2009-07-01

    Saproxylic (dead-wood-associated) and old-growth species are among the most threatened species in European forest ecosystems, as they are susceptible to intensive forest management. Identifying areas with particular relevant features of biodiversity is of prime concern when developing species conservation and habitat restoration strategies and in optimizing resource investments. We present an approach to identify regional conservation and restoration priorities even if knowledge on species distribution is weak, such as for saproxylic and old-growth species in Switzerland. Habitat suitability maps were modeled for an expert-based selection of 55 focal species, using an ecological niche factor analyses (ENFA). All the maps were then overlaid, in order to identify potential species' hotspots for different species groups of the 55 focal species (e.g., birds, fungi, red-listed species). We found that hotspots for various species groups did not correspond. Our results indicate that an approach based on "richness hotspots" may fail to conserve specific species groups. We hence recommend defining a biodiversity conservation strategy prior to implementing conservation/restoration efforts in specific regions. The conservation priority setting of the five biogeographical regions in Switzerland, however, did not differ when different hotspot definitions were applied. This observation emphasizes that the chosen method is robust. Since the ENFA needs only presence data, this species prediction method seems to be useful for any situation where the species distribution is poorly known and/or absence data are lacking. In order to identify priorities for either conservation or restoration efforts, we recommend a method based on presence data only, because absence data may reflect factors unrelated to species presence.

  18. Whole exome sequencing identifies novel mutation in eight Chinese children with isolated tetralogy of Fallot.

    Science.gov (United States)

    Liu, Lin; Wang, Hong-Dan; Cui, Cun-Ying; Qin, Yun-Yun; Fan, Tai-Bing; Peng, Bang-Tian; Zhang, Lian-Zhong; Wang, Cheng-Zeng

    2017-12-05

    Tetralogy of Fallot is the most common cyanotic congenital heart disease. However, its pathogenesis remains to be clarified. The purpose of this study was to identify the genetic variants in Tetralogy of Fallot by whole exome sequencing. Whole exome sequencing was performed among eight small families with Tetralogy of Fallot. Differential single nucleotide polymorphisms and small InDels were found by alignment within families and between families and then were verified by Sanger sequencing. Tetralogy of Fallot-related genes were determined by analysis using Gene Ontology /pathway, Online Mendelian Inheritance in Man, PubMed and other databases. A total of sixteen differential single nucleotide polymorphisms loci and eight differential small InDels were discovered. The sixteen differential single nucleotide polymorphisms loci were located on Chr 1, 2, 4, 5, 11, 12, 15, 22 and X. Among the sixteen single nucleotide polymorphisms loci, six has not been reported. The eight differential small InDels were located on Chr 2, 4, 9, 12, 17, 19 and X, whereas of the eight differential small InDels, two has not been reported. Analysis using Gene Ontology /pathway, Online Mendelian Inheritance in Man, PubMed and other databases revealed that PEX5 , NACA , ATXN2 , CELA1 , PCDHB4 and CTBP1 were associated with Tetralogy of Fallot. Our findings identify PEX5 , NACA , ATXN2 , CELA1 , PCDHB4 and CTBP1 mutations as underlying genetic causes of isolated tetralogy of Fallot.

  19. Exome Sequencing Identifies Potential Risk Variants for Mendelian Disorders at High Prevalence in Qatar

    Science.gov (United States)

    Rodriguez-Flores, Juan L.; Fakhro, Khalid; Hackett, Neil R.; Salit, Jacqueline; Fuller, Jennifer; Agosto-Perez, Francisco; Gharbiah, Maey; Malek, Joel A.; Zirie, Mahmoud; Jayyousi, Amin; Badii, Ramin; Al-Marri, Ajayeb Al-Nabet; Chouchane, Lotfi; Stadler, Dora J.; Hunter-Zinck, Haley; Mezey, Jason G.; Crystal, Ronald G.

    2013-01-01

    Exome sequencing of families of related individuals has been highly successful in identifying genetic polymorphisms responsible for Mendelian disorders. Here, we demonstrate the value of the reverse approach, where we use exome sequencing of a sample of unrelated individuals to analyze allele frequencies of known causal mutations for Mendelian diseases. We sequenced the exomes of 100 individuals representing the three major genetic subgroups of the Qatari population (Q1 Bedouin, Q2 Persian-South Asian, Q3 African) and identified 37 variants in 33 genes with effects on 36 clinically significant Mendelian diseases. These include variants not present in 1000 Genomes and variants at high frequency when compared to 1000 Genomes populations. Several of these Mendelian variants were only segregating in one Qatari subpopulation, where the observed subpopulation specificity trends were confirmed in an independent population of 386 Qataris. Pre-marital genetic screening in Qatar tests for only 4 out of the 37, such that this study provides a set of Mendelian disease variants with potential impact on the epidemiological profile of the population that could be incorporated into the testing program if further experimental and clinical characterization confirms high penetrance. PMID:24123366

  20. Complete genome sequence of Clostridium estertheticum DSM 8809, a microbe identified in spoiled vacuum packed beef

    Directory of Open Access Journals (Sweden)

    Zhongyi Yu

    2016-11-01

    Full Text Available Blown pack spoilage (BPS is a major issue for the beef industry. Aetiological agents of BPS involve members of a group of Clostridium species, including Clostridium estertheticum which has the ability to produce gas, mostly carbon dioxide, under anaerobic psychotrophic growth conditions. This spore-forming bacterium grows slowly under laboratory conditions, and it can take up to 3 months to produce a workable culture. These characteristics have limited the study of this commercially challenging bacterium. Consequently information on this bacterium is limited and no effective controls are currently available to confidently detect and manage this production risk. In this study the complete genome of Clostridium estertheticum DSM 8809 was determined by SMRT® sequencing. The genome consists of a circular chromosome of 4.7 Mbp along with a single plasmid carrying a potential tellurite resistance gene tehB and a Tn3-like resolvase-encoding gene tnpR. The genome sequence was searched for central metabolic pathways that would support its biochemical profile and several enzymes contributing to this phenotype were identified. Several putative antibiotic/biocide/metal resistance-encoding genes and virulence factors were also identified in the genome, a feature that requires further research. The availability of the genome sequence will provide a basic blueprint from which to develop valuable biomarkers that could support and improve the detection and control of this bacterium along the beef production chain.

  1. Somatic mosaicism of a CDKL5 mutation identified by next-generation sequencing.

    Science.gov (United States)

    Kato, Takeshi; Morisada, Naoya; Nagase, Hiroaki; Nishiyama, Masahiro; Toyoshima, Daisaku; Nakagawa, Taku; Maruyama, Azusa; Fu, Xue Jun; Nozu, Kandai; Wada, Hiroko; Takada, Satoshi; Iijima, Kazumoto

    2015-10-01

    CDKL5-related encephalopathy is an X-linked dominantly inherited disorder that is characterized by early infantile epileptic encephalopathy or atypical Rett syndrome. We describe a 5-year-old Japanese boy with intractable epilepsy, severe developmental delay, and Rett syndrome-like features. Onset was at 2 months, when his electroencephalogram showed sporadic single poly spikes and diffuse irregular poly spikes. We conducted a genetic analysis using an Illumina® TruSight™ One sequencing panel on a next-generation sequencer. We identified two epilepsy-associated single nucleotide variants in our case: CDKL5 p.Ala40Val and KCNQ2 p.Glu515Asp. CDKL5 p.Ala40Val has been previously reported to be responsible for early infantile epileptic encephalopathy. In our case, the CDKL5 heterozygous mutation showed somatic mosaicism because the boy's karyotype was 46,XY. The KCNQ2 variant p.Glu515Asp is known to cause benign familial neonatal seizures-1, and this variant showed paternal inheritance. Although we believe that the somatic mosaic CDKL5 mutation is mainly responsible for the neurological phenotype in the patient, the KCNQ2 variant might have some neurological effect. Genetic analysis by next-generation sequencing is capable of identifying multiple variants in a patient. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

  2. Genetic mapping and exome sequencing identify variants associated with five novel diseases.

    Directory of Open Access Journals (Sweden)

    Erik G Puffenberger

    Full Text Available The Clinic for Special Children (CSC has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain children. Among the Plain people, we have used single nucleotide polymorphism (SNP microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb that contain many genes (mean = 79. For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

  3. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Directory of Open Access Journals (Sweden)

    Devier Benjamin

    2007-08-01

    Full Text Available Abstract Background The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

  4. Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

    Directory of Open Access Journals (Sweden)

    Graner Andreas

    2008-10-01

    Full Text Available Abstract Background Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads a Mathematically Defined Repeat (MDR index can be generated to map repetitive regions in genomic sequences. Results We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation drawing on the information already available for known repetitive elements revealed a significant correspondence between the two methods. MDR-based annotation allowed for the identification of dozens of novel repeat sequences, though, which were not recognised by hand-annotation. The MDR data was also used to identify gene-containing regions by masking of repetitive sequences in eight de-novo sequenced bacterial artificial chromosome (BAC clones. For half of the identified candidate gene islands indeed gene sequences could be identified. MDR data were only of limited use, when mapped on genomic sequences from the closely related species Triticum monococcum as only a fraction of the repetitive sequences was recognised. Conclusion An MDR index for barley, which was obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. Circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing sequences regions in uncharacterised genomic sequences. The restriction that a particular

  5. Full genome sequencing and genetic characterization of Eubenangee viruses identify Pata virus as a distinct species within the genus Orbivirus.

    Directory of Open Access Journals (Sweden)

    Manjunatha N Belaganahalli

    Full Text Available Eubenangee virus has previously been identified as the cause of Tammar sudden death syndrome (TSDS. Eubenangee virus (EUBV, Tilligery virus (TILV, Pata virus (PATAV and Ngoupe virus (NGOV are currently all classified within the Eubenangee virus species of the genus Orbivirus, family Reoviridae. Full genome sequencing confirmed that EUBV and TILV (both of which are from Australia show high levels of aa sequence identity (>92% in the conserved polymerase VP1(Pol, sub-core VP3(T2 and outer core VP7(T13 proteins, and are therefore appropriately classified within the same virus species. However, they show much lower amino acid (aa identity levels in their larger outer-capsid protein VP2 (<53%, consistent with membership of two different serotypes - EUBV-1 and EUBV-2 (respectively. In contrast PATAV showed significantly lower levels of aa sequence identity with either EUBV or TILV (with <71% in VP1(Pol and VP3(T2, and <57% aa identity in VP7(T13 consistent with membership of a distinct virus species. A proposal has therefore been sent to the Reoviridae Study Group of ICTV to recognise 'Pata virus' as a new Orbivirus species, with the PATAV isolate as serotype 1 (PATAV-1. Amongst the other orbiviruses, PATAV shows closest relationships to Epizootic Haemorrhagic Disease virus (EHDV, with 80.7%, 72.4% and 66.9% aa identity in VP3(T2, VP1(Pol, and VP7(T13 respectively. Although Ngoupe virus was not available for these studies, like PATAV it was isolated in Central Africa, and therefore seems likely to also belong to the new species, possibly as a distinct 'type'. The data presented will facilitate diagnostic assay design and the identification of additional isolates of these viruses.

  6. Use of a mitochondrial COI sequence to identify species of the subtribe Aphidina (Hemiptera, Aphididae

    Directory of Open Access Journals (Sweden)

    Jianfeng WANG

    2011-08-01

    Full Text Available Aphids of the subtribe Aphidina are found mainly in the North Temperate Zone. The relative lack of diagnostic morphological characteristics has obscured the identification of species in this group. However, DNA-based taxonomic methods can clarify species relationships within this group. Sequence variation in a partial segment of the mitochondrial COI gene was highly effective for resolving species relationships within Aphidina. Forty-five species were correctly identified in a neighbor-joining tree. Mean intraspecific sequence divergence was 0.17%, with a range of 0.00% to 1.54%. Mean interspecific divergence within previously recognized genera or morphologically similar species groups was 4.54%, with variation mainly in the range of 3.50% to 8.00%. Possible reasons for anomalous levels of mean nucleotide divergence within or between some taxa are discussed.

  7. Whole-genome and Transcriptome Sequencing of Prostate Cancer Identify New Genetic Alterations Driving Disease Progression

    DEFF Research Database (Denmark)

    Ren, Shancheng; Wei, Gong-Hong; Liu, Dongbing

    2018-01-01

    BACKGROUND: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. OBJECTIVE: To systematically explore the genomic complexity and define disease-driven genetic......-scale and comprehensive genomic data of prostate cancer from Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment....... alterations in PCa. DESIGN, SETTING, AND PARTICIPANTS: The study sequenced whole-genome and transcriptome of tumor-benign paired tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. OUTCOME...

  8. Microsatellite Primers Identified by 454 Sequencing in the Floodplain Tree Species Eucalyptus victrix (Myrtaceae

    Directory of Open Access Journals (Sweden)

    Paul G. Nevill

    2013-05-01

    Full Text Available Premise of the study: Microsatellite primers were developed for Eucalyptus victrix (Myrtaceae to evaluate the population and spatial genetic structure of this widespread northwestern Australian riparian tree species, which may be impacted by hydrological changes associated with mining activity. Methods and Results: 454 GS-FLX shotgun sequencing was used to obtain 1895 sequences containing putative microsatellite motifs. Ten polymorphic microsatellite loci were identified and screened for variation in individuals from two populations in the Pilbara region. Observed heterozygosities ranged from 0.44 to 0.91 (mean: 0.66 and the number of alleles per locus ranged from five to 25 (average: 11. Conclusions: These microsatellite loci will be useful in future studies of population and spatial genetic structure in E. victrix, and inform the development of seed sourcing strategies for the species.

  9. Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

    Directory of Open Access Journals (Sweden)

    Sujan Mamidi

    2016-07-01

    Full Text Available White mold, caused by the necrotrophic fungus (Lib. de Bary, is a major disease of common bean ( L.. WM7.1 and WM8.3 are two quantitative trait loci (QTL with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI population segregating for both QTL were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.

  10. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Science.gov (United States)

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  11. Identifying designatable units for intraspecific conservation prioritization: a hierarchical approach applied to the lake whitefish species complex (Coregonus spp.).

    Science.gov (United States)

    Mee, Jonathan A; Bernatchez, Louis; Reist, Jim D; Rogers, Sean M; Taylor, Eric B

    2015-06-01

    The concept of the designatable unit (DU) affords a practical approach to identifying diversity below the species level for conservation prioritization. However, its suitability for defining conservation units in ecologically diverse, geographically widespread and taxonomically challenging species complexes has not been broadly evaluated. The lake whitefish species complex (Coregonus spp.) is geographically widespread in the Northern Hemisphere, and it contains a great deal of variability in ecology and evolutionary legacy within and among populations, as well as a great deal of taxonomic ambiguity. Here, we employ a set of hierarchical criteria to identify DUs within the Canadian distribution of the lake whitefish species complex. We identified 36 DUs based on (i) reproductive isolation, (ii) phylogeographic groupings, (iii) local adaptation and (iv) biogeographic regions. The identification of DUs is required for clear discussion regarding the conservation prioritization of lake whitefish populations. We suggest conservation priorities among lake whitefish DUs based on biological consequences of extinction, risk of extinction and distinctiveness. Our results exemplify the need for extensive genetic and biogeographic analyses for any species with broad geographic distributions and the need for detailed evaluation of evolutionary history and adaptive ecological divergence when defining intraspecific conservation units.

  12. Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

    Science.gov (United States)

    Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

    1998-01-01

    Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR,SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C. The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA

  13. Exome Sequencing and Linkage Analysis Identified Novel Candidate Genes in Recessive Intellectual Disability Associated with Ataxia.

    Science.gov (United States)

    Jazayeri, Roshanak; Hu, Hao; Fattahi, Zohreh; Musante, Luciana; Abedini, Seyedeh Sedigheh; Hosseini, Masoumeh; Wienker, Thomas F; Ropers, Hans Hilger; Najmabadi, Hossein; Kahrizi, Kimia

    2015-10-01

    Intellectual disability (ID) is a neuro-developmental disorder which causes considerable socio-economic problems. Some ID individuals are also affected by ataxia, and the condition includes different mutations affecting several genes. We used whole exome sequencing (WES) in combination with homozygosity mapping (HM) to identify the genetic defects in five consanguineous families among our cohort study, with two affected children with ID and ataxia as major clinical symptoms. We identified three novel candidate genes, RIPPLY1, MRPL10, SNX14, and a new mutation in known gene SURF1. All are autosomal genes, except RIPPLY1, which is located on the X chromosome. Two are housekeeping genes, implicated in transcription and translation regulation and intracellular trafficking, and two encode mitochondrial proteins. The pathogenesis of these variants was evaluated by mutation classification, bioinformatic methods, review of medical and biological relevance, co-segregation studies in the particular family, and a normal population study. Linkage analysis and exome sequencing of a small number of affected family members is a powerful new technique which can be used to decrease the number of candidate genes in heterogenic disorders such as ID, and may even identify the responsible gene(s).

  14. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    Science.gov (United States)

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  15. Identifying care actions to conserve dignity in end-of-life care.

    Science.gov (United States)

    Brown, Hilary; Johnston, Bridget; Ostlund, Ulrika

    2011-05-01

    Community nurses have a central role in the provision of palliative and end-of-life care; helping people to die with dignity is an important component of this care. To conserve dignity, care should comprise a broad range of actions addressing the distress that might impact on the patient's sense of dignity. These care actions need to be defined. This study aims to suggest care actions that conserve dignity at the end of life based on evidence from local experience and community nursing practice. Data were collected by focus group interviews and analysed by framework analysis using the Chochinov model of dignity as a predefined framework. Suggestions on care actions were given in relation to all themes. As part of a multi-phase project developing and testing a dignity care pathway, this study might help community nurses to conserve dying patients' dignity.

  16. Identifying the role of conservation biology for solving the environmental crisis.

    Science.gov (United States)

    Dalerum, Fredrik

    2014-11-01

    Humans are altering their living environment to an extent that could cause environmental collapse. Promoting change into environmental sustainability is therefore urgent. Despite a rapid expansion in conservation biology, appreciation of underlying causes and identification of long-term solutions have largely been lacking. I summarized knowledge regarding the environmental crisis, and argue that the most important contributions toward solutions come from economy, political sciences, and psychology. Roles of conservation biology include providing environmental protection until sustainable solutions have been found, evaluating the effectiveness of implemented solutions, and providing societies with information necessary to align effectively with environmental values. Because of the potential disciplinary discrepancy between finding long-term solutions and short-term protection, we may face critical trade-offs between allocations of resources toward achieving sustainability. Since biological knowledge is required for such trade-offs, an additional role for conservation biologists may be to provide guidance toward finding optimal strategies in such trade-offs.

  17. Globicatella sanguinis bacteraemia identified by partial 16S rRNA gene sequencing

    DEFF Research Database (Denmark)

    Abdul-Redha, Rawaa Jalil; Balslew, Ulla; Christensen, Jens Jørgen

    2007-01-01

    Globicatella sanguinis is a gram-positive coccus, resembling non-haemolytic streptococci. The organism has been isolated infrequently from normally sterile sites of humans. Three isolates obtained by blood culture could not be identified by Rapid 32 ID Strep, but partial sequencing of the 16S r......RNA gene revealed the identity of the isolated bacteria, and supplementary biochemical tests confirmed the species identification. The cases histories illustrate the dilemma of finding relevant, newly recognized, opportunistic pathogens and the identification achievement (s) that can be obtained by using...

  18. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang

    2002-01-01

    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone......-end-pairing information is used to construct scaffolds that order and orient the contigs. We show with real data for human and rice that reasonable assemblies are possible even at coverages of only 4x to 6x, despite having up to 42.2% in exact repeats. Udgivelsesdato: 2002-May...

  19. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments

    Directory of Open Access Journals (Sweden)

    Bruggmann Rémy

    2007-05-01

    Full Text Available Abstract Background Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL. To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes. Results To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5. Conclusion Comparative sequence analysis revealed highly conserved collinear regions

  20. Completed Ensemble Empirical Mode Decomposition: a Robust Signal Processing Tool to Identify Sequence Strata

    Science.gov (United States)

    Purba, H.; Musu, J. T.; Diria, S. A.; Permono, W.; Sadjati, O.; Sopandi, I.; Ruzi, F.

    2018-03-01

    Well logging data provide many geological information and its trends resemble nonlinear or non-stationary signals. As long well log data recorded, there will be external factors can interfere or influence its signal resolution. A sensitive signal analysis is required to improve the accuracy of logging interpretation which it becomes an important thing to determine sequence stratigraphy. Complete Ensemble Empirical Mode Decomposition (CEEMD) is one of nonlinear and non-stationary signal analysis method which decomposes complex signal into a series of intrinsic mode function (IMF). Gamma Ray and Spontaneous Potential well log parameters decomposed into IMF-1 up to IMF-10 and each of its combination and correlation makes physical meaning identification. It identifies the stratigraphy and cycle sequence and provides an effective signal treatment method for sequence interface. This method was applied to BRK- 30 and BRK-13 well logging data. The result shows that the combination of IMF-5, IMF-6, and IMF-7 pattern represent short-term and middle-term while IMF-9 and IMF-10 represent the long-term sedimentation which describe distal front and delta front facies, and inter-distributary mouth bar facies, respectively. Thus, CEEMD clearly can determine the different sedimentary layer interface and better identification of the cycle of stratigraphic base level.

  1. Polyglutamine repeats are associated to specific sequence biases that are conserved among eukaryotes.

    Directory of Open Access Journals (Sweden)

    Matteo Ramazzotti

    Full Text Available Nine human neurodegenerative diseases, including Huntington's disease and several spinocerebellar ataxia, are associated to the aggregation of proteins comprising an extended tract of consecutive glutamine residues (polyQs once it exceeds a certain length threshold. This event is believed to be the consequence of the expansion of polyCAG codons during the replication process. This is in apparent contradiction with the fact that many polyQs-containing proteins remain soluble and are encoded by invariant genes in a number of eukaryotes. The latter suggests that polyQs expansion and/or aggregation might be counter-selected through a genetic and/or protein context. To identify this context, we designed a software that scrutinize entire proteomes in search for imperfect polyQs. The nature of residues flanking the polyQs and that of residues other than Gln within polyQs (insertions were assessed. We discovered strong amino acid residue biases robustly associated to polyQs in the 15 eukaryotic proteomes we examined, with an over-representation of Pro, Leu and His and an under-representation of Asp, Cys and Gly amino acid residues. These biases are conserved amongst unrelated proteins and are independent of specific functional classes. Our findings suggest that specific residues have been co-selected with polyQs during evolution. We discuss the possible selective pressures responsible of the observed biases.

  2. JDet: interactive calculation and visualization of function-related conservation patterns in multiple sequence alignments and structures.

    Science.gov (United States)

    Muth, Thilo; García-Martín, Juan A; Rausell, Antonio; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2012-02-15

    We have implemented in a single package all the features required for extracting, visualizing and manipulating fully conserved positions as well as those with a family-dependent conservation pattern in multiple sequence alignments. The program allows, among other things, to run different methods for extracting these positions, combine the results and visualize them in protein 3D structures and sequence spaces. JDet is a multiplatform application written in Java. It is freely available, including the source code, at http://csbg.cnb.csic.es/JDet. The package includes two of our recently developed programs for detecting functional positions in protein alignments (Xdet and S3Det), and support for other methods can be added as plug-ins. A help file and a guided tutorial for JDet are also available.

  3. Expanding protected areas beyond their terrestrial comfort zone: identifying spatial options for river conservation

    CSIR Research Space (South Africa)

    Nel, JL

    2009-08-01

    Full Text Available and processes in both new and existing protected areas. Data to address these objectives were collated in a Geographic Information System (GIS) and a conservation planning algorithm was used as a means of integrating the multiple objectives in a spatially...

  4. Participatory mapping to identify indigenous community use zones : Implications for conservation planning in southern Suriname

    NARCIS (Netherlands)

    Ramirez-Gomez, Sara O I; Brown, Greg; Verweij, Pita A.; Boot, René

    2016-01-01

    Large-scale development projects often overlap forest areas that support the livelihoods of indigenous peoples, threatening in situ conservation strategies for the protection of biological and cultural diversity. To address this problem, there is a need to integrate spatially-explicit information on

  5. A Highly Conserved GEQYQQLR Epitope Has Been Identified in the Nucleoprotein of Ebola Virus by Using an In Silico Approach

    Directory of Open Access Journals (Sweden)

    Mohammad Tuhin Ali

    2015-01-01

    Full Text Available Ebola virus (EBOV is a deadly virus that has caused several fatal outbreaks. Recently it caused another outbreak and resulted in thousands afflicted cases. Effective and approved vaccine or therapeutic treatment against this virus is still absent. In this study, we aimed to predict B-cell epitopes from several EBOV encoded proteins which may aid in developing new antibody-based therapeutics or viral antigen detection method against this virus. Multiple sequence alignment (MSA was performed for the identification of conserved region among glycoprotein (GP, nucleoprotein (NP, and viral structural proteins (VP40, VP35, and VP24 of EBOV. Next, different consensus immunogenic and conserved sites were predicted from the conserved region(s using various computational tools which are available in Immune Epitope Database (IEDB. Among GP, NP, VP40, VP35, and VP30 protein, only NP gave a 100% conserved GEQYQQLR B-cell epitope that fulfills the ideal features of an effective B-cell epitope and could lead a way in the milieu of Ebola treatment. However, successful in vivo and in vitro studies are prerequisite to determine the actual potency of our predicted epitope and establishing it as a preventing medication against all the fatal strains of EBOV.

  6. Exome sequencing of a large family identifies potential candidate genes contributing risk to bipolar disorder.

    Science.gov (United States)

    Zhang, Tianxiao; Hou, Liping; Chen, David T; McMahon, Francis J; Wang, Jen-Chyong; Rice, John P

    2018-03-01

    Bipolar disorder is a mental illness with lifetime prevalence of about 1%. Previous genetic studies have identified multiple chromosomal linkage regions and candidate genes that might be associated with bipolar disorder. The present study aimed to identify potential susceptibility variants for bipolar disorder using 6 related case samples from a four-generation family. A combination of exome sequencing and linkage analysis was performed to identify potential susceptibility variants for bipolar disorder. Our study identified a list of five potential candidate genes for bipolar disorder. Among these five genes, GRID1(Glutamate Receptor Delta-1 Subunit), which was previously reported to be associated with several psychiatric disorders and brain related traits, is particularly interesting. Variants with functional significance in this gene were identified from two cousins in our bipolar disorder pedigree. Our findings suggest a potential role for these genes and the related rare variants in the onset and development of bipolar disorder in this one family. Additional research is needed to replicate these findings and evaluate their patho-biological significance. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Transcriptome Sequencing of Chemically Induced Aquilaria sinensis to Identify Genes Related to Agarwood Formation.

    Science.gov (United States)

    Ye, Wei; Wu, Hongqing; He, Xin; Wang, Lei; Zhang, Weimin; Li, Haohua; Fan, Yunfei; Tan, Guohui; Liu, Taomei; Gao, Xiaoxia

    2016-01-01

    Agarwood is a traditional Chinese medicine used as a clinical sedative, carminative, and antiemetic drug. Agarwood is formed in Aquilaria sinensis when A. sinensis trees are threatened by external physical, chemical injury or endophytic fungal irritation. However, the mechanism of agarwood formation via chemical induction remains unclear. In this study, we characterized the transcriptome of different parts of a chemically induced A. sinensis trunk sample with agarwood. The Illumina sequencing platform was used to identify the genes involved in agarwood formation. A five-year-old Aquilaria sinensis treated by formic acid was selected. The white wood part (B1 sample), the transition part between agarwood and white wood (W2 sample), the agarwood part (J3 sample), and the rotten wood part (F5 sample) were collected for transcriptome sequencing. Accordingly, 54,685,634 clean reads, which were assembled into 83,467 unigenes, were obtained with a Q20 value of 97.5%. A total of 50,565 unigenes were annotated using the Nr, Nt, SWISS-PROT, KEGG, COG, and GO databases. In particular, 171,331,352 unigenes were annotated by various pathways, including the sesquiterpenoid (ko00909) and plant-pathogen interaction (ko03040) pathways. These pathways were related to sesquiterpenoid biosynthesis and defensive responses to chemical stimulation. The transcriptome data of the different parts of the chemically induced A. sinensis trunk provide a rich source of materials for discovering and identifying the genes involved in sesquiterpenoid production and in defensive responses to chemical stimulation. This study is the first to use de novo sequencing and transcriptome assembly for different parts of chemically induced A. sinensis. Results demonstrate that the sesquiterpenoid biosynthesis pathway and WRKY transcription factor play important roles in agarwood formation via chemical induction. The comparative analysis of the transcriptome data of agarwood and A. sinensis lays the foundation

  8. Exome sequencing identifies CTSK mutations in patients originally diagnosed as intermediate osteopetrosis☆

    Science.gov (United States)

    Pangrazio, Alessandra; Puddu, Alessandro; Oppo, Manuela; Valentini, Maria; Zammataro, Luca; Vellodi, Ashok; Gener, Blanca; Llano-Rivas, Isabel; Raza, Jamal; Atta, Irum; Vezzoni, Paolo; Superti-Furga, Andrea; Villa, Anna; Sobacchi, Cristina

    2014-01-01

    Autosomal Recessive Osteopetrosis is a genetic disorder characterized by increased bone density due to lack of resorption by the osteoclasts. Genetic studies have widely unraveled the molecular basis of the most severe forms, while cases of intermediate severity are more difficult to characterize, probably because of a large heterogeneity. Here, we describe the use of exome sequencing in the molecular diagnosis of 2 siblings initially thought to be affected by “intermediate osteopetrosis”, which identified a homozygous mutation in the CTSK gene. Prompted by this finding, we tested by Sanger sequencing 25 additional patients addressed to us for recessive osteopetrosis and found CTSK mutations in 4 of them. In retrospect, their clinical and radiographic features were found to be compatible with, but not typical for, Pycnodysostosis. We sought to identify modifier genes that might have played a role in the clinical manifestation of the disease in these patients, but our results were not informative. In conclusion, we underline the difficulties of differential diagnosis in some patients whose clinical appearance does not fit the classical malignant or benign picture and recommend that CTSK gene be included in the molecular diagnosis of high bone density conditions. PMID:24269275

  9. Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies.

    Science.gov (United States)

    Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

    2016-07-07

    The joint inference of selection and past demography remain a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and that considers both FST and difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main target for selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.

  10. Identifying and sequencing a Mycobacterium sp. strain F4 as a potential bioremediation agent for quinclorac.

    Science.gov (United States)

    Li, Yingying; Chen, Wu; Wang, Yunsheng; Luo, Kun; Li, Yue; Bai, Lianyang; Luo, Feng

    2017-01-01

    Quinclorac is a widely used herbicide in rice filed. Unfortunately, quinclorac residues are phytotoxic to many crops/vegetables. The degradation of quinclorac in nature is very slow. On the other hand, degradation of quinclorac using bacteria can be an effective and efficient method to reduce its contamination. In this study, we isolated a quinclorac bioremediation bacterium strain F4 from quinclorac contaminated soils. Based on morphological characteristics and 16S rRNA gene sequence analysis, we identified strain F4 as Mycobacterium sp. We investigated the effects of temperature, pH, inoculation size and initial quinclorac concentration on growth and degrading efficiency of F4 and determined the optimal quinclorac degrading condition of F4. Under optimal degrading conditions, F4 degraded 97.38% of quinclorac from an initial concentration of 50 mg/L in seven days. Our indoor pot experiment demonstrated that the degradation products were non-phytotoxic to tobacco. After analyzing the quinclorac degradation products of F4, we proposed that F4 could employ two pathways to degrade quinclorac: one is through methylation, the other is through dechlorination. Furthermore, we reconstructed the whole genome of F4 through single molecular sequencing and de novo assembly. We identified 77 methyltransferases and eight dehalogenases in the F4 genome to support our hypothesized degradation path.

  11. Exome sequencing identifies CTSK mutations in patients originally diagnosed as intermediate osteopetrosis.

    Science.gov (United States)

    Pangrazio, Alessandra; Puddu, Alessandro; Oppo, Manuela; Valentini, Maria; Zammataro, Luca; Vellodi, Ashok; Gener, Blanca; Llano-Rivas, Isabel; Raza, Jamal; Atta, Irum; Vezzoni, Paolo; Superti-Furga, Andrea; Villa, Anna; Sobacchi, Cristina

    2014-02-01

    Autosomal Recessive Osteopetrosis is a genetic disorder characterized by increased bone density due to lack of resorption by the osteoclasts. Genetic studies have widely unraveled the molecular basis of the most severe forms, while cases of intermediate severity are more difficult to characterize, probably because of a large heterogeneity. Here, we describe the use of exome sequencing in the molecular diagnosis of 2 siblings initially thought to be affected by "intermediate osteopetrosis", which identified a homozygous mutation in the CTSK gene. Prompted by this finding, we tested by Sanger sequencing 25 additional patients addressed to us for recessive osteopetrosis and found CTSK mutations in 4 of them. In retrospect, their clinical and radiographic features were found to be compatible with, but not typical for, Pycnodysostosis. We sought to identify modifier genes that might have played a role in the clinical manifestation of the disease in these patients, but our results were not informative. In conclusion, we underline the difficulties of differential diagnosis in some patients whose clinical appearance does not fit the classical malignant or benign picture and recommend that CTSK gene be included in the molecular diagnosis of high bone density conditions. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  12. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles

    Directory of Open Access Journals (Sweden)

    Yanara Marincevic-Zuniga

    2017-08-01

    Full Text Available Abstract Background Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL. In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. Methods We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. Results We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Conclusion Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  13. Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014.

    Science.gov (United States)

    Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J

    2018-05-01

    The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. Identifying and sequencing a Mycobacterium sp. strain F4 as a potential bioremediation agent for quinclorac.

    Directory of Open Access Journals (Sweden)

    Yingying Li

    Full Text Available Quinclorac is a widely used herbicide in rice filed. Unfortunately, quinclorac residues are phytotoxic to many crops/vegetables. The degradation of quinclorac in nature is very slow. On the other hand, degradation of quinclorac using bacteria can be an effective and efficient method to reduce its contamination. In this study, we isolated a quinclorac bioremediation bacterium strain F4 from quinclorac contaminated soils. Based on morphological characteristics and 16S rRNA gene sequence analysis, we identified strain F4 as Mycobacterium sp. We investigated the effects of temperature, pH, inoculation size and initial quinclorac concentration on growth and degrading efficiency of F4 and determined the optimal quinclorac degrading condition of F4. Under optimal degrading conditions, F4 degraded 97.38% of quinclorac from an initial concentration of 50 mg/L in seven days. Our indoor pot experiment demonstrated that the degradation products were non-phytotoxic to tobacco. After analyzing the quinclorac degradation products of F4, we proposed that F4 could employ two pathways to degrade quinclorac: one is through methylation, the other is through dechlorination. Furthermore, we reconstructed the whole genome of F4 through single molecular sequencing and de novo assembly. We identified 77 methyltransferases and eight dehalogenases in the F4 genome to support our hypothesized degradation path.

  15. Whole-genome sequencing identifies recurrent somatic NOTCH2 mutations in splenic marginal zone lymphoma.

    Science.gov (United States)

    Kiel, Mark J; Velusamy, Thirunavukkarasu; Betz, Bryan L; Zhao, Lili; Weigelin, Helmut G; Chiang, Mark Y; Huebner-Chan, David R; Bailey, Nathanael G; Yang, David T; Bhagat, Govind; Miranda, Roberto N; Bahler, David W; Medeiros, L Jeffrey; Lim, Megan S; Elenitoba-Johnson, Kojo S J

    2012-08-27

    Splenic marginal zone lymphoma (SMZL), the most common primary lymphoma of spleen, is poorly understood at the genetic level. In this study, using whole-genome DNA sequencing (WGS) and confirmation by Sanger sequencing, we observed mutations identified in several genes not previously known to be recurrently altered in SMZL. In particular, we identified recurrent somatic gain-of-function mutations in NOTCH2, a gene encoding a protein required for marginal zone B cell development, in 25 of 99 (∼25%) cases of SMZL and in 1 of 19 (∼5%) cases of nonsplenic MZLs. These mutations clustered near the C-terminal proline/glutamate/serine/threonine (PEST)-rich domain, resulting in protein truncation or, rarely, were nonsynonymous substitutions affecting the extracellular heterodimerization domain (HD). NOTCH2 mutations were not present in other B cell lymphomas and leukemias, such as chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL; n = 15), mantle cell lymphoma (MCL; n = 15), low-grade follicular lymphoma (FL; n = 44), hairy cell leukemia (HCL; n = 15), and reactive lymphoid hyperplasia (n = 14). NOTCH2 mutations were associated with adverse clinical outcomes (relapse, histological transformation, and/or death) among SMZL patients (P = 0.002). These results suggest that NOTCH2 mutations play a role in the pathogenesis and progression of SMZL and are associated with a poor prognosis.

  16. microRNA expression profiling in fetal single ventricle malformation identified by deep sequencing.

    Science.gov (United States)

    Yu, Zhang-Bin; Han, Shu-Ping; Bai, Yun-Fei; Zhu, Chun; Pan, Ya; Guo, Xi-Rong

    2012-01-01

    microRNAs (miRNAs) have emerged as key regulators in many biological processes, particularly cardiac growth and development, although the specific miRNA expression profile associated with this process remains to be elucidated. This study aimed to characterize the cellular microRNA profile involved in the development of congenital heart malformation, through the investigation of single ventricle (SV) defects. Comprehensive miRNA profiling in human fetal SV cardiac tissue was performed by deep sequencing. Differential expression of 48 miRNAs was revealed by sequencing by oligonucleotide ligation and detection (SOLiD) analysis. Of these, 38 were down-regulated and 10 were up-regulated in differentiated SV cardiac tissue, compared to control cardiac tissue. This was confirmed by real-time quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analysis. Predicted target genes of the 48 differentially expressed miRNAs were analyzed by gene ontology and categorized according to cellular process, regulation of biological process and metabolic process. Pathway-Express analysis identified the WNT and mTOR signaling pathways as the most significant processes putatively affected by the differential expression of these miRNAs. The candidate genes involved in cardiac development were identified as potential targets for these differentially expressed microRNAs and the collaborative network of microRNAs and cardiac development related-mRNAs was constructed. These data provide the basis for future investigation of the mechanism of the occurrence and development of fetal SV malformations.

  17. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Science.gov (United States)

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution1,2. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes3,4. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  18. Genomic Aberrations in Crizotinib Resistant Lung Adenocarcinoma Samples Identified by Transcriptome Sequencing.

    Directory of Open Access Journals (Sweden)

    Ali Saber

    Full Text Available ALK-break positive non-small cell lung cancer (NSCLC patients initially respond to crizotinib, but resistance occurs inevitably. In this study we aimed to identify fusion genes in crizotinib resistant tumor samples. Re-biopsies of three patients were subjected to paired-end RNA sequencing to identify fusion genes using deFuse and EricScript. The IGV browser was used to determine presence of known resistance-associated mutations. Sanger sequencing was used to validate fusion genes and digital droplet PCR to validate mutations. ALK fusion genes were detected in all three patients with EML4 being the fusion partner. One patient had no additional fusion genes. Another patient had one additional fusion gene, but without a predicted open reading frame (ORF. The third patient had three additional fusion genes, of which two were derived from the same chromosomal region as the EML4-ALK. A predicted ORF was identified only in the CLIP4-VSNL1 fusion product. The fusion genes validated in the post-treatment sample were also present in the biopsy before crizotinib. ALK mutations (p.C1156Y and p.G1269A detected in the re-biopsies of two patients, were not detected in pre-treatment biopsies. In conclusion, fusion genes identified in our study are unlikely to be involved in crizotinib resistance based on presence in pre-treatment biopsies. The detection of ALK mutations in post-treatment tumor samples of two patients underlines their role in crizotinib resistance.

  19. Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

    Science.gov (United States)

    da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

    2017-10-28

    Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis on all flaviviruses non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as a supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Whole exome sequencing identifies RAI1 mutation in a morbidly obese child diagnosed with ROHHAD syndrome.

    Science.gov (United States)

    Thaker, Vidhu V; Esteves, Kristyn M; Towne, Meghan C; Brownstein, Catherine A; James, Philip M; Crowley, Laura; Hirschhorn, Joel N; Elsea, Sarah H; Beggs, Alan H; Picker, Jonathan; Agrawal, Pankaj B

    2015-05-01

    The current obesity epidemic is attributed to complex interactions between genetic and environmental factors. However, a limited number of cases, especially those with early-onset severe obesity, are linked to single gene defects. Rapid-onset obesity with hypothalamic dysfunction, hypoventilation and autonomic dysregulation (ROHHAD) is one of the syndromes that presents with abrupt-onset extreme weight gain with an unknown genetic basis. To identify the underlying genetic etiology in a child with morbid early-onset obesity, hypoventilation, and autonomic and behavioral disturbances who was clinically diagnosed with ROHHAD syndrome. Design/Setting/Intervention: The index patient was evaluated at an academic medical center. Whole-exome sequencing was performed on the proband and his parents. Genetic variants were validated by Sanger sequencing. We identified a novel de novo nonsense mutation, c.3265 C>T (p.R1089X), in the retinoic acid-induced 1 (RAI1) gene in the proband. Mutations in the RAI1 gene are known to cause Smith-Magenis syndrome (SMS). On further evaluation, his clinical features were not typical of either SMS or ROHHAD syndrome. This study identifies a de novo RAI1 mutation in a child with morbid obesity and a clinical diagnosis of ROHHAD syndrome. Although extreme early-onset obesity, autonomic disturbances, and hypoventilation are present in ROHHAD, several of the clinical findings are consistent with SMS. This case highlights the challenges in the diagnosis of ROHHAD syndrome and its potential overlap with SMS. We also propose RAI1 as a candidate gene for children with morbid obesity.

  1. Exome sequencing identifies three novel candidate genes implicated in intellectual disability.

    Directory of Open Access Journals (Sweden)

    Zehra Agha

    Full Text Available Intellectual disability (ID is a major health problem mostly with an unknown etiology. Recently exome sequencing of individuals with ID identified novel genes implicated in the disease. Therefore the purpose of the present study was to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. Whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K-specific methyltransferase 2B (KMT2B, zinc finger protein 589 (ZNF589, as well as hedgehog acyltransferase (HHAT with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that large number of ID genes converge on limited number of common networks i.e. ZNF589 belongs to KRAB-domain zinc-finger proteins previously implicated in ID, HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID, KMT2B associated with syndromic ID fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID.

  2. Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

    Science.gov (United States)

    Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

    2008-01-01

    Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.

  3. Stakeholder-led science: engaging resource managers to identify science needs for long-term management of floodplain conservation lands

    Science.gov (United States)

    Bouska, Kristin L.; Lindner, Garth; Paukert, Craig P.; Jacobson, Robert B.

    2016-01-01

    Floodplains pose challenges to managers of conservation lands because of constantly changing interactions with their rivers. Although scientific knowledge and understanding of the dynamics and drivers of river-floodplain systems can provide guidance to floodplain managers, the scientific process often occurs in isolation from management. Further, communication barriers between scientists and managers can be obstacles to appropriate application of scientific knowledge. With the coproduction of science in mind, our objectives were the following: (1) to document management priorities of floodplain conservation lands, and (2) identify science needs required to better manage the identified management priorities under nonstationary conditions, i.e., climate change, through stakeholder queries and interactions. We conducted an online survey with 80 resource managers of floodplain conservation lands along the Upper and Middle Mississippi River and Lower Missouri River, USA, to evaluate management priority, management intensity, and available scientific information for management objectives and conservation targets. Management objectives with the least information available relative to priority included controlling invasive species, maintaining respectful relationships with neighbors, and managing native, nongame species. Conservation targets with the least information available to manage relative to management priority included pollinators, marsh birds, reptiles, and shore birds. A follow-up workshop and survey focused on clarifying science needs to achieve management objectives under nonstationary conditions. Managers agreed that metrics of inundation, including depth and extent of inundation, and frequency, duration, and timing of inundation would be the most useful metrics for management of floodplain conservation lands with multiple objectives. This assessment provides guidance for developing relevant and accessible science products to inform management of highly

  4. A freshwater biodiversity hotspot under pressure – assessing threats and identifying conservation needs for ancient Lake Ohrid

    Directory of Open Access Journals (Sweden)

    G. Kostoski

    2010-12-01

    Full Text Available Immediate conservation measures for world-wide freshwater resources are of eminent importance. This is particularly true for so-called ancient lakes. While these lakes are famous for being evolutionary theatres, often displaying an extraordinarily high degree of biodiversity and endemism, in many cases these biota are also experiencing extreme anthropogenic impact.

    Lake Ohrid, a major European biodiversity hotspot situated in a trans-frontier setting on the Balkans, is a prime example for a lake with a magnitude of narrow range endemic taxa that are under increasing anthropogenic pressure. Unfortunately, evidence for a "creeping biodiversity crisis" has accumulated over the last decades, and major socio-political changes have gone along with human-mediated environmental changes.

    Based on field surveys, monitoring data, published records, and expert interviews, we aimed to (1 assess threats to Lake Ohrids' (endemic biodiversity, (2 summarize existing conservation activities and strategies, and (3 outline future conservation needs for Lake Ohrid. We compiled threats to both specific taxa (and in cases to particular species as well as to the lake ecosystems itself. Major conservation concerns identified for Lake Ohrid are: (1 watershed impacts, (2 agriculture and forestry, (3 tourism and population growth, (4 non-indigenous species, (5 habitat alteration or loss, (6 unsustainable exploitation of fisheries, and (7 global climate change.

    Among the major (well-known threats with high impact are nutrient input (particularly of phosphorus, habitat conversion and silt load. Other threats are potentially of high impact but less well known. Such threats include pollution with hazardous substances (from sources such as mines, former industries, agriculture or climate change. We review and discuss institutional responsibilities, environmental monitoring and ecosystem management, existing parks and reserves, biodiversity and species

  5. Structural and functional analysis of mouse Msx1 gene promoter: sequence conservation with human MSX1 promoter points at potential regulatory elements.

    Science.gov (United States)

    Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E

    1998-06-01

    Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.

  6. High throughput sequencing of small RNA component of leaves and inflorescence revealed conserved and novel miRNAs as well as phasiRNA loci in chickpea.

    Science.gov (United States)

    Srivastava, Sangeeta; Zheng, Yun; Kudapa, Himabindu; Jagadeeswaran, Guru; Hivrale, Vandana; Varshney, Rajeev K; Sunkar, Ramanjulu

    2015-06-01

    Among legumes, chickpea (Cicer arietinum L.) is the second most important crop after soybean. MicroRNAs (miRNAs) play important roles by regulating target gene expression important for plant development and tolerance to stress conditions. Additionally, recently discovered phased siRNAs (phasiRNAs), a new class of small RNAs, are abundantly produced in legumes. Nevertheless, little is known about these regulatory molecules in chickpea. The small RNA population was sequenced from leaves and flowers of chickpea to identify conserved and novel miRNAs as well as phasiRNAs/phasiRNA loci. Bioinformatics analysis revealed 157 miRNA loci for the 96 highly conserved and known miRNA homologs belonging to 38 miRNA families in chickpea. Furthermore, 20 novel miRNAs belonging to 17 miRNA families were identified. Sequence analysis revealed approximately 60 phasiRNA loci. Potential target genes likely to be regulated by these miRNAs were predicted and some were confirmed by modified 5' RACE assay. Predicted targets are mostly transcription factors that might be important for developmental processes, and others include superoxide dismutases, plantacyanin, laccases and F-box proteins that could participate in stress responses and protein degradation. Overall, this study provides an inventory of miRNA-target gene interactions for chickpea, useful for the comparative analysis of small RNAs among legumes. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd.. All rights reserved.

  7. Integrated sequence analysis pipeline provides one-stop solution for identifying disease-causing mutations.

    Science.gov (United States)

    Hu, Hao; Wienker, Thomas F; Musante, Luciana; Kalscheuer, Vera M; Kahrizi, Kimia; Najmabadi, Hossein; Ropers, H Hilger

    2014-12-01

    Next-generation sequencing has greatly accelerated the search for disease-causing defects, but even for experts the data analysis can be a major challenge. To facilitate the data processing in a clinical setting, we have developed a novel medical resequencing analysis pipeline (MERAP). MERAP assesses the quality of sequencing, and has optimized capacity for calling variants, including single-nucleotide variants, insertions and deletions, copy-number variation, and other structural variants. MERAP identifies polymorphic and known causal variants by filtering against public domain databases, and flags nonsynonymous and splice-site changes. MERAP uses a logistic model to estimate the causal likelihood of a given missense variant. MERAP considers the relevant information such as phenotype and interaction with known disease-causing genes. MERAP compares favorably with GATK, one of the widely used tools, because of its higher sensitivity for detecting indels, its easy installation, and its economical use of computational resources. Upon testing more than 1,200 individuals with mutations in known and novel disease genes, MERAP proved highly reliable, as illustrated here for five families with disease-causing variants. We believe that the clinical implementation of MERAP will expedite the diagnostic process of many disease-causing defects. © 2014 WILEY PERIODICALS, INC.

  8. Whole-exome sequencing identifies novel candidate predisposition genes for familial polycythemia vera.

    Science.gov (United States)

    Hirvonen, Elina A M; Pitkänen, Esa; Hemminki, Kari; Aaltonen, Lauri A; Kilpivaara, Outi

    2017-04-20

    Polycythemia vera (PV), characterized by massive production of erythrocytes, is one of the myeloproliferative neoplasms. Most patients carry a somatic gain-of-function mutation in JAK2, c.1849G > T (p.Val617Phe), leading to constitutive activation of JAK-STAT signaling pathway. Familial clustering is also observed occasionally, but high-penetrance predisposition genes to PV have remained unidentified. We studied the predisposition to PV by exome sequencing (three cases) in a Finnish PV family with four patients. The 12 shared variants (maximum allowed minor allele frequency  G (p.Phe418Leu) in ZXDC, c.1931C > G (p.Pro644Arg) in ATN1, and c.701G > A (p.Arg234Gln) in LRRC3. We also observed a rare, predicted benign germline variant c.2912C > G (p.Ala971Gly) in BCORL1 in all four patients. Somatic mutations in BCORL1 have been reported in myeloid malignancies. We further screened the variants in eight PV patients in six other Finnish families, but no other carriers were found. Exome sequencing provides a powerful tool for the identification of novel variants, and understanding the familial predisposition of diseases. This is the first report on Finnish familial PV cases, and we identified three novel candidate variants that may predispose to the disease.

  9. Targeted exome sequencing identified novel USH2A mutations in Usher syndrome families.

    Directory of Open Access Journals (Sweden)

    Xiu-Feng Huang

    Full Text Available Usher syndrome (USH is a leading cause of deaf-blindness in autosomal recessive trait. Phenotypic and genetic heterogeneities in USH make molecular diagnosis much difficult. This is a pilot study aiming to develop an approach based on next-generation sequencing to determine the genetic defects in patients with USH or allied diseases precisely and effectively. Eight affected patients and twelve unaffected relatives from five unrelated Chinese USH families, including 2 pseudo-dominant ones, were recruited. A total of 144 known genes of inherited retinal diseases were selected for deep exome resequencing. Through systematic data analysis using established bioinformatics pipeline and segregation analysis, a number of genetic variants were released. Eleven mutations, eight of them were novel, in the USH2A gene were identified. Biparental mutations in USH2A were revealed in 2 families with pseudo-dominant inheritance. A proband was found to have triple mutations, two of them were supposed to locate in the same chromosome. In conclusion, this study revealed the genetic defects in the USH2A gene and demonstrated the robustness of targeted exome sequencing to precisely and rapidly determine genetic defects. The methodology provides a reliable strategy for routine gene diagnosis of USH.

  10. LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone

    KAUST Repository

    Chen, Peng

    2014-12-03

    Background Protein-ligand binding is important for some proteins to perform their functions. Protein-ligand binding sites are the residues of proteins that physically bind to ligands. Despite of the recent advances in computational prediction for protein-ligand binding sites, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. Results In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. We propose a combination technique to reduce the effects of different sliding residue windows in the process of encoding input feature vectors. Moreover, due to the highly imbalanced samples between the ligand-binding sites and non ligand-binding sites, we construct several balanced data sets, for each of which a random forest (RF)-based classifier is trained. The ensemble of these RF classifiers forms a sequence-based protein-ligand binding site predictor. Conclusions Experimental results on CASP9 and CASP8 data sets demonstrate that our method compares favorably with the state-of-the-art protein-ligand binding site prediction methods.

  11. WHITE-DWARF-MAIN-SEQUENCE BINARIES IDENTIFIED FROM THE LAMOST PILOT SURVEY

    International Nuclear Information System (INIS)

    Ren Juanjuan; Luo Ali; Li Yinbi; Wei Peng; Zhao Jingkun; Zhao Yongheng; Song Yihan; Zhao Gang

    2013-01-01

    We present a set of white-dwarf-main-sequence (WDMS) binaries identified spectroscopically from the Large sky Area Multi-Object fiber Spectroscopic Telescope (LAMOST, also called the Guo Shou Jing Telescope) pilot survey. We develop a color selection criteria based on what is so far the largest and most complete Sloan Digital Sky Survey (SDSS) DR7 WDMS binary catalog and identify 28 WDMS binaries within the LAMOST pilot survey. The primaries in our binary sample are mostly DA white dwarfs except for one DB white dwarf. We derive the stellar atmospheric parameters, masses, and radii for the two components of 10 of our binaries. We also provide cooling ages for the white dwarf primaries as well as the spectral types for the companion stars of these 10 WDMS binaries. These binaries tend to contain hot white dwarfs and early-type companions. Through cross-identification, we note that nine binaries in our sample have been published in the SDSS DR7 WDMS binary catalog. Nineteen spectroscopic WDMS binaries identified by the LAMOST pilot survey are new. Using the 3σ radial velocity variation as a criterion, we find two post-common-envelope binary candidates from our WDMS binary sample

  12. Microfluidic screening and whole-genome sequencing identifies mutations associated with improved protein secretion by yeast

    DEFF Research Database (Denmark)

    Huang, Mingtao; Bai, Yunpeng; Sjostrom, Staffan L.

    2015-01-01

    There is an increasing demand for biotech-based production of recombinant proteins for use as pharmaceuticals in the food and feed industry and in industrial applications. Yeast Saccharomyces cerevisiae is among preferred cell factories for recombinant protein production, and there is increasing...... interest in improving its protein secretion capacity. Due to the complexity of the secretory machinery in eukaryotic cells, it is difficult to apply rational engineering for construction of improved strains. Here we used high-throughput microfluidics for the screening of yeast libraries, generated by UV...... mutagenesis. Several screening and sorting rounds resulted in the selection of eight yeast clones with significantly improved secretion of recombinant a-amylase. Efficient secretion was genetically stable in the selected clones. We performed whole-genome sequencing of the eight clones and identified 330...

  13. Scale-dependent complementarity of climatic velocity and environmental diversity for identifying priority areas for conservation under climate change.

    Science.gov (United States)

    Carroll, Carlos; Roberts, David R; Michalak, Julia L; Lawler, Joshua J; Nielsen, Scott E; Stralberg, Diana; Hamann, Andreas; Mcrae, Brad H; Wang, Tongli

    2017-11-01

    As most regions of the earth transition to altered climatic conditions, new methods are needed to identify refugia and other areas whose conservation would facilitate persistence of biodiversity under climate change. We compared several common approaches to conservation planning focused on climate resilience over a broad range of ecological settings across North America and evaluated how commonalities in the priority areas identified by different methods varied with regional context and spatial scale. Our results indicate that priority areas based on different environmental diversity metrics differed substantially from each other and from priorities based on spatiotemporal metrics such as climatic velocity. Refugia identified by diversity or velocity metrics were not strongly associated with the current protected area system, suggesting the need for additional conservation measures including protection of refugia. Despite the inherent uncertainties in predicting future climate, we found that variation among climatic velocities derived from different general circulation models and emissions pathways was less than the variation among the suite of environmental diversity metrics. To address uncertainty created by this variation, planners can combine priorities identified by alternative metrics at a single resolution and downweight areas of high variation between metrics. Alternately, coarse-resolution velocity metrics can be combined with fine-resolution diversity metrics in order to leverage the respective strengths of the two groups of metrics as tools for identification of potential macro- and microrefugia that in combination maximize both transient and long-term resilience to climate change. Planners should compare and integrate approaches that span a range of model complexity and spatial scale to match the range of ecological and physical processes influencing persistence of biodiversity and identify a conservation network resilient to threats operating at

  14. Identifying Farm Pond Habitat Suitability for the Common Moorhen (Gallinula chloropus: A Conservation-Perspective Approach

    Directory of Open Access Journals (Sweden)

    Chun-Hsien Lai

    2018-04-01

    Full Text Available The purpose of this study was to establish a habitat-suitability assessment model for Gallinula chloropus, or the Common Moorhen, to be applied to the selection of the most suitable farm pond for habitat conservation in Chiayi County, Taiwan. First, the fuzzy Delphi method was employed to evaluate habitat selection factors and calculate the weights of these factors. The results showed that the eight crucial factors, by importance, in descending order, were (1 area ratio of farmlands within 200 m of the farm pond; (2 pond area; (3 pond perimeter; (4 aquatic plant coverage of the pond surface; (5 drought period; (6 coverage of high and low shrubs around the pond bank; (7 bank type; and (8 water-surface-to-bank distance. Subsequently, field evaluations of 75 farm ponds in Chiayi County were performed. The results indicated that 15 farm ponds had highly-suitable habitats and were inhabited by unusually high numbers of Common Moorhens; these habitats were most in need of conservation. A total of two farm ponds were found to require habitat-environment improvements, and Common Moorhens with typical reproductive capacity could be appropriately introduced into 22 farm ponds to restore the ecosystem of the species. Additionally, the habitat suitability and number of Common Moorhens in 36 farm ponds were lower than average; these ponds could be used for agricultural irrigation, detention basins, or for recreational use by community residents. Finally, the total habitat suitability scores and occurrence of Common Moorhens in each farm pond were used to verify the accuracy of the habitat-suitability assessment model for the Common Moorhen. The overall accuracy was 0.8, and the Kappa value was 0.60, which indicates that the model established in this study exhibited high credibility. To sum up, this is an applicable framework not only to assess the habitat suitability of farm ponds for Common Moorhens, but also to determine whether a particular location may

  15. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  16. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    Science.gov (United States)

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  17. Application of small RNA sequencing to identify microRNAs in acute kidney injury and fibrosis

    Energy Technology Data Exchange (ETDEWEB)

    Pellegrini, Kathryn L. [Department of Medicine, Renal Division, Brigham and Women' s Hospital, Harvard Medical School, Boston, MA (United States); Gerlach, Cory V. [Department of Medicine, Renal Division, Brigham and Women' s Hospital, Harvard Medical School, Boston, MA (United States); Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA (United States); Laboratory of Systems Pharmacology, Harvard Program in Therapeutic Sciences, Harvard Medical School, Boston, MA (United States); Craciun, Florin L.; Ramachandran, Krithika [Department of Medicine, Renal Division, Brigham and Women' s Hospital, Harvard Medical School, Boston, MA (United States); Bijol, Vanesa [Department of Pathology, Brigham and Women' s Hospital, Harvard Medical School, Boston, MA (United States); Kissick, Haydn T. [Department of Surgery, Urology Division, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA (United States); Vaidya, Vishal S., E-mail: vvaidya@bwh.harvard.edu [Department of Medicine, Renal Division, Brigham and Women' s Hospital, Harvard Medical School, Boston, MA (United States); Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA (United States); Laboratory of Systems Pharmacology, Harvard Program in Therapeutic Sciences, Harvard Medical School, Boston, MA (United States)

    2016-12-01

    Establishing a microRNA (miRNA) expression profile in affected tissues provides an important foundation for the discovery of miRNAs involved in the development or progression of pathologic conditions. We conducted small RNA sequencing to generate a temporal profile of miRNA expression in the kidneys using a mouse model of folic acid-induced (250 mg/kg i.p.) kidney injury and fibrosis. From the 103 miRNAs that were differentially expressed over the time course (> 2-fold, p < 0.05), we chose to further investigate miR-18a-5p, which is expressed during the acute stage of the injury; miR-132-3p, which is upregulated during transition between acute and fibrotic injury; and miR-146b-5p, which is highly expressed at the peak of fibrosis. Using qRT-PCR, we confirmed the increased expression of these candidate miRNAs in the folic acid model as well as in other established mouse models of acute injury (ischemia/reperfusion injury) and fibrosis (unilateral ureteral obstruction). In situ hybridization confirmed high expression of miR-18a-5p, miR-132-3p and miR-146b-5p throughout the kidney cortex in mice and humans with severe kidney injury or fibrosis. When primary human proximal tubular epithelial cells were treated with model nephrotoxicants such as cadmium chloride (CdCl{sub 2}), arsenic trioxide, aristolochic acid (AA), potassium dichromate (K{sub 2}Cr{sub 2}O{sub 7}) and cisplatin, miRNA-132-3p was upregulated 4.3-fold after AA treatment and 1.5-fold after K{sub 2}Cr{sub 2}O{sub 7} and CdCl{sub 2} treatment. These results demonstrate the application of temporal small RNA sequencing to identify miR-18a, miR-132 and miR-146b as differentially expressed miRNAs during distinct phases of kidney injury and fibrosis progression. - Highlights: • We used small RNA sequencing to identify differentially expressed miRNAs in kidney. • Distinct patterns were found for acute injury and fibrotic stages in the kidney. • Upregulation of miR-18a, -132 and -146b was confirmed in mice

  18. Molecular profiling of appendiceal epithelial tumors using massively parallel sequencing to identify somatic mutations.

    Science.gov (United States)

    Liu, Xiaoying; Mody, Kabir; de Abreu, Francine B; Pipas, J Marc; Peterson, Jason D; Gallagher, Torrey L; Suriawinata, Arief A; Ripple, Gregory H; Hourdequin, Kathryn C; Smith, Kerrington D; Barth, Richard J; Colacchio, Thomas A; Tsapakos, Michael J; Zaki, Bassem I; Gardner, Timothy B; Gordon, Stuart R; Amos, Christopher I; Wells, Wendy A; Tsongalis, Gregory J

    2014-07-01

    Some epithelial neoplasms of the appendix, including low-grade appendiceal mucinous neoplasm and adenocarcinoma, can result in pseudomyxoma peritonei (PMP). Little is known about the mutational spectra of these tumor types and whether mutations may be of clinical significance with respect to therapeutic selection. In this study, we identified somatic mutations using the Ion Torrent AmpliSeq Cancer Hotspot Panel v2. Specimens consisted of 3 nonneoplastic retention cysts/mucocele, 15 low-grade mucinous neoplasms (LAMNs), 8 low-grade/well-differentiated mucinous adenocarcinomas with pseudomyxoma peritonei, and 12 adenocarcinomas with/without goblet cell/signet ring cell features. Barcoded libraries were prepared from up to 10 ng of extracted DNA and multiplexed on single 318 chips for sequencing. Data analysis was performed using Golden Helix SVS. Variants that remained after the analysis pipeline were individually interrogated using the Integrative Genomics Viewer. A single Janus kinase 3 (JAK3) mutation was detected in the mucocele group. Eight mutations were identified in the V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) and GNAS complex locus (GNAS) genes among LAMN samples. Additional gene mutations were identified in the AKT1 (v-akt murine thymoma viral oncogene homolog 1), APC (adenomatous polyposis coli), JAK3, MET (met proto-oncogene), phosphatidylinositol-4,5-bisphosphate 3-kinase (PIK3CA), RB1 (retinoblastoma 1), STK11 (serine/threonine kinase 11), and tumor protein p53 (TP53) genes. Among the PMPs, 6 mutations were detected in the KRAS gene and also in the GNAS, TP53, and RB1 genes. Appendiceal cancers showed mutations in the APC, ATM (ataxia telangiectasia mutated), KRAS, IDH1 [isocitrate dehydrogenase 1 (NADP+)], NRAS [neuroblastoma RAS viral (v-ras) oncogene homolog], PIK3CA, SMAD4 (SMAD family member 4), and TP53 genes. Our results suggest molecular heterogeneity among epithelial tumors of the appendix. Next generation sequencing efforts

  19. Sequencing of sporadic Attention-Deficit Hyperactivity Disorder (ADHD) identifies novel and potentially pathogenic de novo variants and excludes overlap with genes associated with autism spectrum disorder.

    Science.gov (United States)

    Kim, Daniel Seung; Burt, Amber A; Ranchalis, Jane E; Wilmot, Beth; Smith, Joshua D; Patterson, Karynne E; Coe, Bradley P; Li, Yatong K; Bamshad, Michael J; Nikolas, Molly; Eichler, Evan E; Swanson, James M; Nigg, Joel T; Nickerson, Deborah A; Jarvik, Gail P

    2017-06-01

    Attention-Deficit Hyperactivity Disorder (ADHD) has high heritability; however, studies of common variation account for ADHD variance. Using data from affected participants without a family history of ADHD, we sought to identify de novo variants that could account for sporadic ADHD. Considering a total of 128 families, two analyses were conducted in parallel: first, in 11 unaffected parent/affected proband trios (or quads with the addition of an unaffected sibling) we completed exome sequencing. Six de novo missense variants at highly conserved bases were identified and validated from four of the 11 families: the brain-expressed genes TBC1D9, DAGLA, QARS, CSMD2, TRPM2, and WDR83. Separately, in 117 unrelated probands with sporadic ADHD, we sequenced a panel of 26 genes implicated in intellectual disability (ID) and autism spectrum disorder (ASD) to evaluate whether variation in ASD/ID-associated genes were also present in participants with ADHD. Only one putative deleterious variant (Gln600STOP) in CHD1L was identified; this was found in a single proband. Notably, no other nonsense, splice, frameshift, or highly conserved missense variants in the 26 gene panel were identified and validated. These data suggest that de novo variant analysis in families with independently adjudicated sporadic ADHD diagnosis can identify novel genes implicated in ADHD pathogenesis. Moreover, that only one of the 128 cases (0.8%, 11 exome, and 117 MIP sequenced participants) had putative deleterious variants within our data in 26 genes related to ID and ASD suggests significant independence in the genetic pathogenesis of ADHD as compared to ASD and ID phenotypes. © 2017 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.

  20. Identifying units for conservation using molecular systematics: the cautionary tale of the Karner blue butterfly.

    Science.gov (United States)

    Gompert, Zachariah; Nice, Chris C; Fordyce, James A; Forister, Matthew L; Shapiro, Arthur M

    2006-06-01

    The federally endangered North American Karner blue butterfly (Lycaeides melissa samuelis) and the closely related Melissa blue butterfly (L. m. melissa) can be distinguished based on life history and morphology. Western populations of L. m. samuelis share mitochondrial haplotypes with L. m. melissa populations, while eastern populations of L. m. samuelis have divergent haplotypes. Here we test two hypotheses concerning the presence of L. m. melissa mitochondrial haplotypes in western L. m. samuelis populations: (i) mitochondrial introgression has occurred from L. m. melissa populations into western L. m. samuelis populations, or (ii) western populations of the nominal L. m. samuelis are more closely related to L. m. melissa than to eastern L. m. samuelis populations, yet are phenotypically similar to the latter. A Bayesian algorithm was used to cluster 190 L. melissa individuals based on 143 informative amplified fragment length polymorphism (AFLP) loci. This method clearly differentiated L. m. samuelis and L. m. melissa. Thus, genomic divergence was greater between western L. m. samuelis populations and L. m. melissa populations than it was between western and eastern populations of L. m. samuelis. This supports the hypothesis that the presence of L. m. melissa mitochondrial haplotypes in western L. m. samuelis populations is the result of mitochondrial introgression. These data provide valuable information for conservation and management plans for the endangered L. m. samuelis, and illustrate the risks of using data from a single locus for diagnosing significant units of biodiversity for conservation.

  1. Using Range-Wide Abundance Modeling to Identify Key Conservation Areas for the Micro-Endemic Bolson Tortoise (Gopherus flavomarginatus.

    Directory of Open Access Journals (Sweden)

    Cinthya A Ureña-Aranda

    Full Text Available A widespread biogeographic pattern in nature is that population abundance is not uniform across the geographic range of species: most occurrence sites have relatively low numbers, whereas a few places contain orders of magnitude more individuals. The Bolson tortoise Gopherus flavomarginatus is endemic to a small region of the Chihuahuan Desert in Mexico, where habitat deterioration threatens this species with extinction. In this study we combined field burrows counts and the approach for modeling species abundance based on calculating the distance to the niche centroid to obtain range-wide abundance estimates. For the Bolson tortoise, we found a robust, negative relationship between observed burrows abundance and distance to the niche centroid, with a predictive capacity of 71%. Based on these results we identified four priority areas for the conservation of this microendemic and threatened tortoise. We conclude that this approach may be a useful approximation for identifying key areas for sampling and conservation efforts in elusive and rare species.

  2. Conservation and diversification of the miR166 family in soybean and potential roles of newly identified miR166s.

    Science.gov (United States)

    Li, Xuyan; Xie, Xin; Li, Ji; Cui, Yuhai; Hou, Yanming; Zhai, Lulu; Wang, Xiao; Fu, Yanli; Liu, Ranran; Bian, Shaomin

    2017-02-01

    microRNA166 (miR166) is a highly conserved family of miRNAs implicated in a wide range of cellular and physiological processes in plants. miR166 family generally comprises multiple miR166 members in plants, which might exhibit functional redundancy and specificity. The soybean miR166 family consists of 21 members according to the miRBase database. However, the evolutionary conservation and functional diversification of miR166 family members in soybean remain poorly understood. We identified five novel miR166s in soybean by data mining approach, thus enlarging the size of miR166 family from 21 to 26 members. Phylogenetic analyses of the 26 miR166s and their precursors indicated that soybean miR166 family exhibited both evolutionary conservation and diversification, and ten pairs of miR166 precursors with high sequence identity were individually grouped into a discrete clade in the phylogenetic tree. The analysis of genomic organization and evolution of MIR166 gene family revealed that eight segmental duplications and four tandem duplications might occur during evolution of the miR166 family in soybean. The cis-elements in promoters of MIR166 family genes and their putative targets pointed to their possible contributions to the functional conservation and diversification. The targets of soybean miR166s were predicted, and the cleavage of ATHB14-LIKE transcript was experimentally validated by RACE PCR. Further, the expression patterns of the five newly identified MIR166s and 12 target genes were examined during seed development and in response to abiotic stresses, which provided important clues for dissecting their functions and isoform specificity. This study enlarged the size of soybean miR166 family from 21 to 26 members, and the 26 soybean miR166s exhibited evolutionary conservation and diversification. These findings have laid a foundation for elucidating functional conservation and diversification of miR166 family members, especially during seed development or

  3. G-quadruplex DNA sequences are evolutionarily conserved and associated with distinct genomic features in Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    John A Capra

    2010-07-01

    Full Text Available G-quadruplex DNA is a four-stranded DNA structure formed by non-Watson-Crick base pairing between stacked sets of four guanines. Many possible functions have been proposed for this structure, but its in vivo role in the cell is still largely unresolved. We carried out a genome-wide survey of the evolutionary conservation of regions with the potential to form G-quadruplex DNA structures (G4 DNA motifs across seven yeast species. We found that G4 DNA motifs were significantly more conserved than expected by chance, and the nucleotide-level conservation patterns suggested that the motif conservation was the result of the formation of G4 DNA structures. We characterized the association of conserved and non-conserved G4 DNA motifs in Saccharomyces cerevisiae with more than 40 known genome features and gene classes. Our comprehensive, integrated evolutionary and functional analysis confirmed the previously observed associations of G4 DNA motifs with promoter regions and the rDNA, and it identified several previously unrecognized associations of G4 DNA motifs with genomic features, such as mitotic and meiotic double-strand break sites (DSBs. Conserved G4 DNA motifs maintained strong associations with promoters and the rDNA, but not with DSBs. We also performed the first analysis of G4 DNA motifs in the mitochondria, and surprisingly found a tenfold higher concentration of the motifs in the AT-rich yeast mitochondrial DNA than in nuclear DNA. The evolutionary conservation of the G4 DNA motif and its association with specific genome features supports the hypothesis that G4 DNA has in vivo functions that are under evolutionary constraint.

  4. Structured decision making as a conceptual framework to identify thresholds for conservation and management

    Science.gov (United States)

    Martin, J.; Runge, M.C.; Nichols, J.D.; Lubow, B.C.; Kendall, W.L.

    2009-01-01

    Thresholds and their relevance to conservation have become a major topic of discussion in the ecological literature. Unfortunately, in many cases the lack of a clear conceptual framework for thinking about thresholds may have led to confusion in attempts to apply the concept of thresholds to conservation decisions. Here, we advocate a framework for thinking about thresholds in terms of a structured decision making process. The purpose of this framework is to promote a logical and transparent process for making informed decisions for conservation. Specification of such a framework leads naturally to consideration of definitions and roles of different kinds of thresholds in the process. We distinguish among three categories of thresholds. Ecological thresholds are values of system state variables at which small changes bring about substantial changes in system dynamics. Utility thresholds are components of management objectives (determined by human values) and are values of state or performance variables at which small changes yield substantial changes in the value of the management outcome. Decision thresholds are values of system state variables at which small changes prompt changes in management actions in order to reach specified management objectives. The approach that we present focuses directly on the objectives of management, with an aim to providing decisions that are optimal with respect to those objectives. This approach clearly distinguishes the components of the decision process that are inherently subjective (management objectives, potential management actions) from those that are more objective (system models, estimates of system state). Optimization based on these components then leads to decision matrices specifying optimal actions to be taken at various values of system state variables. Values of state variables separating different actions in such matrices are viewed as decision thresholds. Utility thresholds are included in the objectives

  5. Structure-Related Roles for the Conservation of the HIV-1 Fusion Peptide Sequence Revealed by Nuclear Magnetic Resonance.

    Science.gov (United States)

    Serrano, Soraya; Huarte, Nerea; Rujas, Edurne; Andreu, David; Nieva, José L; Jiménez, María Angeles

    2017-10-17

    Despite extensive characterization of the human immunodeficiency virus type 1 (HIV-1) hydrophobic fusion peptide (FP), the structure-function relationships underlying its extraordinary degree of conservation remain poorly understood. Specifically, the fact that the tandem repeat of the FLGFLG tripeptide is absolutely conserved suggests that high hydrophobicity may not suffice to unleash FP function. Here, we have compared the nuclear magnetic resonance (NMR) structures adopted in nonpolar media by two FP surrogates, wtFP-tag and scrFP-tag, which had equal hydrophobicity but contained wild-type and scrambled core sequences LFLGFLG and FGLLGFL, respectively. In addition, these peptides were tagged at their C-termini with an epitope sequence that folded independently, thereby allowing Western blot detection without interfering with FP structure. We observed similar α-helical FP conformations for both specimens dissolved in the low-polarity medium 25% (v/v) 1,1,1,3,3,3-hexafluoro-2-propanol (HFIP), but important differences in contact with micelles of the membrane mimetic dodecylphosphocholine (DPC). Thus, whereas wtFP-tag preserved a helix displaying a Gly-rich ridge, the scrambled sequence lost in great part the helical structure upon being solubilized in DPC. Western blot analyses further revealed the capacity of wtFP-tag to assemble trimers in membranes, whereas membrane oligomers were not observed in the case of the scrFP-tag sequence. We conclude that, beyond hydrophobicity, preserving sequence order is an important feature for defining the secondary structures and oligomeric states adopted by the HIV FP in membranes.

  6. Extended exome sequencing identifies BACH2 as a novel major risk locus for Addison's disease.

    Science.gov (United States)

    Eriksson, D; Bianchi, M; Landegren, N; Nordin, J; Dalin, F; Mathioudaki, A; Eriksson, G N; Hultin-Rosenberg, L; Dahlqvist, J; Zetterqvist, H; Karlsson, Å; Hallgren, Å; Farias, F H G; Murén, E; Ahlgren, K M; Lobell, A; Andersson, G; Tandre, K; Dahlqvist, S R; Söderkvist, P; Rönnblom, L; Hulting, A-L; Wahlberg, J; Ekwall, O; Dahlqvist, P; Meadows, J R S; Bensing, S; Lindblad-Toh, K; Kämpe, O; Pielberg, G R

    2016-12-01

    Autoimmune disease is one of the leading causes of morbidity and mortality worldwide. In Addison's disease, the adrenal glands are targeted by destructive autoimmunity. Despite being the most common cause of primary adrenal failure, little is known about its aetiology. To understand the genetic background of Addison's disease, we utilized the extensively characterized patients of the Swedish Addison Registry. We developed an extended exome capture array comprising a selected set of 1853 genes and their potential regulatory elements, for the purpose of sequencing 479 patients with Addison's disease and 1394 controls. We identified BACH2 (rs62408233-A, OR = 2.01 (1.71-2.37), P = 1.66 × 10 -15 , MAF 0.46/0.29 in cases/controls) as a novel gene associated with Addison's disease development. We also confirmed the previously known associations with the HLA complex. Whilst BACH2 has been previously reported to associate with organ-specific autoimmune diseases co-inherited with Addison's disease, we have identified BACH2 as a major risk locus in Addison's disease, independent of concomitant autoimmune diseases. Our results may enable future research towards preventive disease treatment. © 2016 The Authors. Journal of Internal Medicine published by John Wiley & Sons Ltd on behalf of Association for Publication of The Journal of Internal Medicine.

  7. Next-generation sequencing identifies transportin 3 as the causative gene for LGMD1F.

    Directory of Open Access Journals (Sweden)

    Annalaura Torella

    Full Text Available Limb-girdle muscular dystrophies (LGMD are genetically and clinically heterogeneous conditions. We investigated a large family with autosomal dominant transmission pattern, previously classified as LGMD1F and mapped to chromosome 7q32. Affected members are characterized by muscle weakness affecting earlier the pelvic girdle and the ileopsoas muscles. We sequenced the whole exome of four family members and identified a shared heterozygous frame-shift variant in the Transportin 3 (TNPO3 gene, encoding a member of the importin-β super-family. The TNPO3 gene is mapped within the LGMD1F critical interval and its 923-amino acid human gene product is also expressed in skeletal muscle. In addition, we identified an isolated case of LGMD with a new missense mutation in the same gene. We localized the mutant TNPO3 around the nucleus, but not inside. The involvement of gene related to the nuclear transport suggests a novel disease mechanism leading to muscular dystrophy.

  8. Identifying resource manager information needs for the North Pacific Landscape Conservation Cooperative

    Science.gov (United States)

    Woodward, Andrea; Liedtke, Theresa; Jenni, Karen

    2014-01-01

    Landscape Conservation Cooperatives (LCCs) are a network of 22 public-private partnerships, defined by ecoregion, that share and provide science to ensure the sustainability of land, water, wildlife and cultural resources in North America. LLCs were established by the U.S. Department of Interior (DOI) in recognition that response to climate change must be coordinated on a landscape-level basis because important resources, ecosystem processes and resource management challenges extend beyond national wildlife refuges, Bureau of Land Management lands, national parks, and even international boundaries. Therefore, DOI agencies must work with other Federal, State, Tribal (U.S. indigenous peoples), First Nation (Canadian indigenous peoples), and local governments, as well as private landowners, to develop landscape-level strategies for understanding and responding to climate change.

  9. Screening of whole genome sequences identified high-impact variants for stallion fertility.

    Science.gov (United States)

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-04-14

    g.37455302G>A in NOTCH1 with the de-regressed estimated breeding values of the paternal component of the pregnancy rate per estrus (EBV-PAT). For 9 high-impact variants within the genes CFTR, OVGP1, FBXO43, TSSK6, PKD1, FOXP1, TCP11, SPATA31E1 and NOTCH1 (g.37453246G>C) absence of the homozygous mutant genotype in the validation sample of all 337 fertile stallions was obvious. Therefore, these variants were considered as potentially deleterious factors for stallion fertility. In conclusion, this study revealed 17 genetic variants with a predicted high damaging effect on protein structure and missing homozygous mutant genotype. The g.37455302G>A NOTCH1 variant was identified as a significant stallion fertility locus in Hanoverian stallions and further 9 candidate fertility loci with missing homozygous mutant genotypes were validated in a panel including 19 horse breeds. To our knowledge this is the first study in horses using next generation sequencing data to uncover strong candidate factors for stallion fertility.

  10. Conserved hypothetical protein Rv1977 in Mycobacterium tuberculosis strains contains sequence polymorphisms and might be involved in ongoing immune evasion.

    Science.gov (United States)

    Jiang, Yi; Liu, Haican; Wang, Xuezhi; Li, Guilian; Qiu, Yan; Dou, Xiangfeng; Wan, Kanglin

    2015-01-01

    Host immune pressure and associated parasite immune evasion are key features of host-pathogen co-evolution. A previous study showed that human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved and thus it was deduced that M. tuberculosis lacks antigenic variation and immune evasion. Here, we selected 151 clinical Mycobacterium tuberculosis isolates from China, amplified gene encoding Rv1977 and compared the sequences. The results showed that Rv1977, a conserved hypothetical protein, is not conserved in M. tuberculosis strains and there are polymorphisms existed in the protein. Some mutations, especially one frameshift mutation, occurred in the antigen Rv1977, which is uncommon in M.tb strains and may lead to the protein function altering. Mutations and deletion in the gene all affect one of three T cell epitopes and the changed T cell epitope contained more than one variable position, which may suggest ongoing immune evasion.

  11. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Science.gov (United States)

    Bonnefond, Amélie; Philippe, Julien; Durand, Emmanuelle; Dechaume, Aurélie; Huyvaert, Marlène; Montagne, Louise; Marre, Michel; Balkau, Beverley; Fajardy, Isabelle; Vambergue, Anne; Vatin, Vincent; Delplanque, Jérôme; Le Guilcher, David; De Graeve, Franck; Lecoeur, Cécile; Sand, Olivier; Vaxillaire, Martine; Froguel, Philippe

    2012-01-01

    Maturity-onset of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY. WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene ('MODY13'), confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  12. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Directory of Open Access Journals (Sweden)

    Amélie Bonnefond

    Full Text Available BACKGROUND: Maturity-onset of the young (MODY is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X. Here, we aimed to use whole-exome sequencing (WES in a four-generation MODY-X family to identify a new susceptibility gene for MODY. METHODOLOGY: WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. PRINCIPAL FINDINGS: By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130 present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11 co-segregated with diabetes in the family (with a LOD-score of 3.68. No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. CONCLUSIONS/SIGNIFICANCE: Beyond neonatal diabetes mellitus (NDM, KCNJ11 is also a MODY gene ('MODY13', confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS. Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  13. Diagnostic SNPs for inferring population structure in American mink (Neovison vison) identified through RAD sequencing

    DEFF Research Database (Denmark)

    2015-01-01

    Data from: "Diagnostic SNPs for inferring population structure in American mink (Neovison vison) identified through RAD sequencing" in Genomic Resources Notes accepted 1 October 2014 to 30 November 2014....

  14. The Ebola virus VP35 protein binds viral immunostimulatory and host RNAs identified through deep sequencing.

    Directory of Open Access Journals (Sweden)

    Kari A Dilley

    Full Text Available Ebola virus and Marburg virus are members of the Filovirdae family and causative agents of hemorrhagic fever with high fatality rates in humans. Filovirus virulence is partially attributed to the VP35 protein, a well-characterized inhibitor of the RIG-I-like receptor pathway that triggers the antiviral interferon (IFN response. Prior work demonstrates the ability of VP35 to block potent RIG-I activators, such as Sendai virus (SeV, and this IFN-antagonist activity is directly correlated with its ability to bind RNA. Several structural studies demonstrate that VP35 binds short synthetic dsRNAs; yet, there are no data that identify viral immunostimulatory RNAs (isRNA or host RNAs bound to VP35 in cells. Utilizing a SeV infection model, we demonstrate that both viral isRNA and host RNAs are bound to Ebola and Marburg VP35s in cells. By deep sequencing the purified VP35-bound RNA, we identified the SeV copy-back defective interfering (DI RNA, previously identified as a robust RIG-I activator, as the isRNA bound by multiple filovirus VP35 proteins, including the VP35 protein from the West African outbreak strain (Makona EBOV. Moreover, RNAs isolated from a VP35 RNA-binding mutant were not immunostimulatory and did not include the SeV DI RNA. Strikingly, an analysis of host RNAs bound by wild-type, but not mutant, VP35 revealed that select host RNAs are preferentially bound by VP35 in cell culture. Taken together, these data support a model in which VP35 sequesters isRNA in virus-infected cells to avert RIG-I like receptor (RLR activation.

  15. The Ebola virus VP35 protein binds viral immunostimulatory and host RNAs identified through deep sequencing.

    Science.gov (United States)

    Dilley, Kari A; Voorhies, Alexander A; Luthra, Priya; Puri, Vinita; Stockwell, Timothy B; Lorenzi, Hernan; Basler, Christopher F; Shabman, Reed S

    2017-01-01

    Ebola virus and Marburg virus are members of the Filovirdae family and causative agents of hemorrhagic fever with high fatality rates in humans. Filovirus virulence is partially attributed to the VP35 protein, a well-characterized inhibitor of the RIG-I-like receptor pathway that triggers the antiviral interferon (IFN) response. Prior work demonstrates the ability of VP35 to block potent RIG-I activators, such as Sendai virus (SeV), and this IFN-antagonist activity is directly correlated with its ability to bind RNA. Several structural studies demonstrate that VP35 binds short synthetic dsRNAs; yet, there are no data that identify viral immunostimulatory RNAs (isRNA) or host RNAs bound to VP35 in cells. Utilizing a SeV infection model, we demonstrate that both viral isRNA and host RNAs are bound to Ebola and Marburg VP35s in cells. By deep sequencing the purified VP35-bound RNA, we identified the SeV copy-back defective interfering (DI) RNA, previously identified as a robust RIG-I activator, as the isRNA bound by multiple filovirus VP35 proteins, including the VP35 protein from the West African outbreak strain (Makona EBOV). Moreover, RNAs isolated from a VP35 RNA-binding mutant were not immunostimulatory and did not include the SeV DI RNA. Strikingly, an analysis of host RNAs bound by wild-type, but not mutant, VP35 revealed that select host RNAs are preferentially bound by VP35 in cell culture. Taken together, these data support a model in which VP35 sequesters isRNA in virus-infected cells to avert RIG-I like receptor (RLR) activation.

  16. Identifying Genetic Differences Between Dongxiang Blue-Shelled and White Leghorn Chickens Using Sequencing Data

    Directory of Open Access Journals (Sweden)

    Qing-bo Zhao

    2018-02-01

    Full Text Available The Dongxiang Blue-shelled chicken is one of the most valuable Chinese indigenous poultry breeds. However, compared to the Italian native White Leghorn, although this Chinese breed possesses numerous favorable characteristics, it also exhibits lower growth performance and fertility. Here, we utilized genotyping sequencing data obtained via genome reduction on a sequencing platform to detect 100,114 single nucleotide polymorphisms and perform further biological analysis and functional annotation. We employed cross-population extended haplotype homozygosity, eigenvector decomposition combined with genome-wide association studies (EigenGWAS, and efficient mixed-model association expedited methods to detect areas of the genome that are potential selected regions (PSR in both chicken breeds, and performed gene ontology (GO enrichment and quantitative trait loci (QTL analyses annotating using the Kyoto Encyclopedia of Genes and Genomes. The results of this study revealed a total of 2424 outlier loci (p-value <0.01, of which 2144 occur in the White Leghorn breed and 280 occur in the Dongxiang Blue-shelled chicken. These correspond to 327 and 94 PSRs containing 297 and 54 genes, respectively. The most significantly selected genes in Blue-shelled chicken are TMEM141 and CLIC3, while the SLCO1B3 gene, related to eggshell color, was identified via EigenGWAS. We show that the White Leghorn genes JARID2, RBMS3, GPC3, TRIB2, ROBO1, SAMSN1, OSBP2, and IGFALS are involved in immunity, reproduction, and growth, and thus might represent footprints of the selection process. In contrast, we identified six significantly enriched pathways in the Dongxiang Blue-shelled chicken that are related to amino acid and lipid metabolism as well as signal transduction. Our results also reveal the presence of a GO term associated with cell metabolism that occurs mainly in the White Leghorn breed, while the most significant QTL regions mapped to the Chicken QTL Database (GG_4

  17. A protein-binding domain, EH, identified in the receptor tyrosine kinase substrate Eps15 and conserved in evolution

    DEFF Research Database (Denmark)

    Wong, W T; Schumacher, C; Salcini, A E

    1995-01-01

    In this report we structurally and functionally define a binding domain that is involved in protein association and that we have designated EH (for Eps15 homology domain). This domain was identified in the tyrosine kinase substrate Eps15 on the basis of regional conservation with several heteroge......In this report we structurally and functionally define a binding domain that is involved in protein association and that we have designated EH (for Eps15 homology domain). This domain was identified in the tyrosine kinase substrate Eps15 on the basis of regional conservation with several...... heterogeneous proteins of yeast and nematode. The EH domain spans about 70 amino acids and shows approximately 60% overall amino acid conservation. We demonstrated the ability of the EH domain to specifically bind cytosolic proteins in normal and malignant cells of mesenchymal, epithelial, and hematopoietic...... (for Eps15-related). Structural comparison of Eps15 and Eps15r defines a family of signal transducers possessing extensive networking abilities including EH-mediated binding and association with Src homology 3-containing proteins....

  18. Deep sequencing of Brachypodium small RNAs at the global genome level identifies microRNAs involved in cold stress response

    Directory of Open Access Journals (Sweden)

    Chong Kang

    2009-09-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are endogenous small RNAs having large-scale regulatory effects on plant development and stress responses. Extensive studies of miRNAs have only been performed in a few model plants. Although miRNAs are proved to be involved in plant cold stress responses, little is known for winter-habit monocots. Brachypodium distachyon, with close evolutionary relationship to cool-season cereals, has recently emerged as a novel model plant. There are few reports of Brachypodium miRNAs. Results High-throughput sequencing and whole-genome-wide data mining led to the identification of 27 conserved miRNAs, as well as 129 predicted miRNAs in Brachypodium. For multiple-member conserved miRNA families, their sizes in Brachypodium were much smaller than those in rice and Populus. The genome organization of miR395 family in Brachypodium was quite different from that in rice. The expression of 3 conserved miRNAs and 25 predicted miRNAs showed significant changes in response to cold stress. Among these miRNAs, some were cold-induced and some were cold-suppressed, but all the conserved miRNAs were up-regulated under cold stress condition. Conclusion Our results suggest that Brachypodium miRNAs are composed of a set of conserved miRNAs and a large proportion of non-conserved miRNAs with low expression levels. Both kinds of miRNAs were involved in cold stress response, but all the conserved miRNAs were up-regulated, implying an important role for cold-induced miRNAs. The different size and genome organization of miRNA families in Brachypodium and rice suggest that the frequency of duplication events or the selection pressure on duplicated miRNAs are different between these two closely related plant species.

  19. Deep sequencing of Salmonella RNA associated with heterologous Hfq proteins in vivo reveals small RNAs as a major target class and identifies RNA processing phenotypes.

    Science.gov (United States)

    Sittka, Alexandra; Sharma, Cynthia M; Rolle, Katarzyna; Vogel, Jörg

    2009-01-01

    The bacterial Sm-like protein, Hfq, is a key factor for the stability and function of small non-coding RNAs (sRNAs) in Escherichia coli. Homologues of this protein have been predicted in many distantly related organisms yet their functional conservation as sRNA-binding proteins has not entirely been clear. To address this, we expressed in Salmonella the Hfq proteins of two eubacteria (Neisseria meningitides, Aquifex aeolicus) and an archaeon (Methanocaldococcus jannaschii), and analyzed the associated RNA by deep sequencing. This in vivo approach identified endogenous Salmonella sRNAs as a major target of the foreign Hfq proteins. New Salmonella sRNA species were also identified, and some of these accumulated specifically in the presence of a foreign Hfq protein. In addition, we observed specific RNA processing defects, e.g., suppression of precursor processing of SraH sRNA by Methanocaldococcus Hfq, or aberrant accumulation of extracytoplasmic target mRNAs of the Salmonella GcvB, MicA or RybB sRNAs. Taken together, our study provides evidence of a conserved inherent sRNA-binding property of Hfq, which may facilitate the lateral transmission of regulatory sRNAs among distantly related species. It also suggests that the expression of heterologous RNA-binding proteins combined with deep sequencing analysis of RNA ligands can be used as a molecular tool to dissect individual steps of RNA metabolism in vivo.

  20. Using Water Footprints to Identify Alternatives for Conserving Local Water Resources in California

    Directory of Open Access Journals (Sweden)

    D. L. Marrin

    2016-11-01

    Full Text Available As a management tool for addressing water consumption issues, footprints have become increasingly utilized on scales ranging from global to personal. A question posed by this paper is whether water footprint data that are routinely compiled for particular regions may be used to assess the effectiveness of actions taken by local residents to conserve local water resources. The current California drought has affected an agriculturally productive region with large population centers that consume a portion of the locally produced food, and the state’s arid climate demands a large volume of blue water as irrigation from its dwindling surface and ground water resources. Although California exports most of its food products, enough is consumed within the state so that residents shifting their food choices and/or habits could save as much or more local blue water as their reduction of household or office water use. One of those shifts is reducing the intake of animal-based products that require the most water of any food group on both a gravimetric and caloric basis. Another shift is reducing food waste, which represents a shared responsibility among consumers and retailers, however, consumer preferences ultimately drive much of this waste.

  1. Highly conserved intragenic HSV-2 sequences: Results from next-generation sequencing of HSV-2 UL and US regions from genital swabs collected from 3 continents.

    Science.gov (United States)

    Johnston, Christine; Magaret, Amalia; Roychoudhury, Pavitra; Greninger, Alexander L; Cheng, Anqi; Diem, Kurt; Fitzgibbon, Matthew P; Huang, Meei-Li; Selke, Stacy; Lingappa, Jairam R; Celum, Connie; Jerome, Keith R; Wald, Anna; Koelle, David M

    2017-10-01

    Understanding the variability in circulating herpes simplex virus type 2 (HSV-2) genomic sequences is critical to the development of HSV-2 vaccines. Genital lesion swabs containing ≥ 10 7 log 10 copies HSV DNA collected from Africa, the USA, and South America underwent next-generation sequencing, followed by K-mer based filtering and de novo genomic assembly. Sites of heterogeneity within coding regions in unique long and unique short (U L _U S ) regions were identified. Phylogenetic trees were created using maximum likelihood reconstruction. Among 46 samples from 38 persons, 1468 intragenic base-pair substitutions were identified. The maximum nucleotide distance between strains for concatenated U L_ U S segments was 0.4%. Phylogeny did not reveal geographic clustering. The most variable proteins had non-synonymous mutations in < 3% of amino acids. Unenriched HSV-2 DNA can undergo next-generation sequencing to identify intragenic variability. The use of clinical swabs for sequencing expands the information that can be gathered directly from these specimens. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015

    Directory of Open Access Journals (Sweden)

    Yuki Furuse

    2015-11-01

    Full Text Available Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.

  3. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum.

    Science.gov (United States)

    Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred

    2017-06-29

    The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known

  4. Whole exome sequencing identifies mutations in Usher syndrome genes in profoundly deaf Tunisian patients.

    Science.gov (United States)

    Riahi, Zied; Bonnet, Crystel; Zainine, Rim; Lahbib, Saida; Bouyacoub, Yosra; Bechraoui, Rym; Marrakchi, Jihène; Hardelin, Jean-Pierre; Louha, Malek; Largueche, Leila; Ben Yahia, Salim; Kheirallah, Moncef; Elmatri, Leila; Besbes, Ghazi; Abdelhak, Sonia; Petit, Christine

    2015-01-01

    Usher syndrome (USH) is an autosomal recessive disorder characterized by combined deafness-blindness. It accounts for about 50% of all hereditary deafness blindness cases. Three clinical subtypes (USH1, USH2, and USH3) are described, of which USH1 is the most severe form, characterized by congenital profound deafness, constant vestibular dysfunction, and a prepubertal onset of retinitis pigmentosa. We performed whole exome sequencing in four unrelated Tunisian patients affected by apparently isolated, congenital profound deafness, with reportedly normal ocular fundus examination. Four biallelic mutations were identified in two USH1 genes: a splice acceptor site mutation, c.2283-1G>T, and a novel missense mutation, c.5434G>A (p.Glu1812Lys), in MYO7A, and two previously unreported mutations in USH1G, i.e. a frameshift mutation, c.1195_1196delAG (p.Leu399Alafs*24), and a nonsense mutation, c.52A>T (p.Lys18*). Another ophthalmological examination including optical coherence tomography actually showed the presence of retinitis pigmentosa in all the patients. Our findings provide evidence that USH is under-diagnosed in Tunisian deaf patients. Yet, early diagnosis of USH is of utmost importance because these patients should undergo cochlear implant surgery in early childhood, in anticipation of the visual loss.

  5. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    Science.gov (United States)

    Yuen, Ryan KC; Merico, Daniele; Bookman, Matt; Howe, Jennifer L; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D’Abate, Lia; Chan, Ada JS; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson WL; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A; Parr, Jeremy R; Spence, Sarah J; Vorstman, Jacob; Frey, Brendan J; Robinson, James T; Strug, Lisa J; Fernandez, Bridget A; Elsabbagh, Mayada; Carter, Melissa T; Hallmayer, Joachim; Knoppers, Bartha M; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H; Glazer, David; Pletcher, Mathew T; Scherer, Stephen W

    2017-01-01

    We are performing whole genome sequencing (WGS) of families with Autism Spectrum Disorder (ASD) to build a resource, named MSSNG, to enable the sub-categorization of phenotypes and underlying genetic factors involved. Here, we report WGS of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible in a cloud platform, and through an internet portal with controlled access. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertion/deletions (indels) or copy number variations (CNVs) per ASD subject. We identified 18 new candidate ASD-risk genes such as MED13 and PHF3, and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (p=6×10−4). In 294/2,620 (11.2%) of ASD cases, a molecular basis could be determined and 7.2% of these carried CNV/chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD. PMID:28263302

  6. Antimicrobial susceptibility among clinical Nocardia species identified by multilocus sequence analysis.

    Science.gov (United States)

    McTaggart, Lisa R; Doucet, Jennifer; Witkowska, Maria; Richardson, Susan E

    2015-01-01

    Antimicrobial susceptibility patterns of 112 clinical isolates, 28 type strains, and 9 reference strains of Nocardia were determined using the Sensititre Rapmyco microdilution panel (Thermo Fisher, Inc.). Isolates were identified by highly discriminatory multilocus sequence analysis and were chosen to represent the diversity of species recovered from clinical specimens in Ontario, Canada. Susceptibility to the most commonly used drug, trimethoprim-sulfamethoxazole, was observed in 97% of isolates. Linezolid and amikacin were also highly effective; 100% and 99% of all isolates demonstrated a susceptible phenotype. For the remaining antimicrobials, resistance was species specific with isolates of Nocardia otitidiscaviarum, N. brasiliensis, N. abscessus complex, N. nova complex, N. transvalensis complex, N. farcinica, and N. cyriacigeorgica displaying the traditional characteristic drug pattern types. In addition, the antimicrobial susceptibility profiles of a variety of rarely encountered species isolated from clinical specimens are reported for the first time and were categorized into four additional drug pattern types. Finally, MICs for the control strains N. nova ATCC BAA-2227, N. asteroides ATCC 19247(T), and N. farcinica ATCC 23826 were robustly determined to demonstrate method reproducibility and suitability of the commercial Sensititre Rapmyco panel for antimicrobial susceptibility testing of Nocardia spp. isolated from clinical specimens. The reported values will facilitate quality control and standardization among laboratories. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  7. Whole exome sequencing identifies mutations in Usher syndrome genes in profoundly deaf Tunisian patients.

    Directory of Open Access Journals (Sweden)

    Zied Riahi

    Full Text Available Usher syndrome (USH is an autosomal recessive disorder characterized by combined deafness-blindness. It accounts for about 50% of all hereditary deafness blindness cases. Three clinical subtypes (USH1, USH2, and USH3 are described, of which USH1 is the most severe form, characterized by congenital profound deafness, constant vestibular dysfunction, and a prepubertal onset of retinitis pigmentosa. We performed whole exome sequencing in four unrelated Tunisian patients affected by apparently isolated, congenital profound deafness, with reportedly normal ocular fundus examination. Four biallelic mutations were identified in two USH1 genes: a splice acceptor site mutation, c.2283-1G>T, and a novel missense mutation, c.5434G>A (p.Glu1812Lys, in MYO7A, and two previously unreported mutations in USH1G, i.e. a frameshift mutation, c.1195_1196delAG (p.Leu399Alafs*24, and a nonsense mutation, c.52A>T (p.Lys18*. Another ophthalmological examination including optical coherence tomography actually showed the presence of retinitis pigmentosa in all the patients. Our findings provide evidence that USH is under-diagnosed in Tunisian deaf patients. Yet, early diagnosis of USH is of utmost importance because these patients should undergo cochlear implant surgery in early childhood, in anticipation of the visual loss.

  8. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Science.gov (United States)

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspertate as a potential new therapeutic target in bladder cancer. PMID:24469795

  9. Global Transcriptome Sequencing Identifies Chlamydospore Specific Markers in Candida albicans and Candida dubliniensis

    LENUS (Irish Health Repository)

    Palige, Katja

    2013-04-15

    Candida albicans and Candida dubliniensis are pathogenic fungi that are highly related but differ in virulence and in some phenotypic traits. During in vitro growth on certain nutrient-poor media, C. albicans and C. dubliniensis are the only yeast species which are able to produce chlamydospores, large thick-walled cells of unknown function. Interestingly, only C. dubliniensis forms pseudohyphae with abundant chlamydospores when grown on Staib medium, while C. albicans grows exclusively as a budding yeast. In order to further our understanding of chlamydospore development and assembly, we compared the global transcriptional profile of both species during growth in liquid Staib medium by RNA sequencing. We also included a C. albicans mutant in our study which lacks the morphogenetic transcriptional repressor Nrg1. This strain, which is characterized by its constitutive pseudohyphal growth, specifically produces masses of chlamydospores in Staib medium, similar to C. dubliniensis. This comparative approach identified a set of putatively chlamydospore-related genes. Two of the homologous C. albicans and C. dubliniensis genes (CSP1 and CSP2) which were most strongly upregulated during chlamydospore development were analysed in more detail. By use of the green fluorescent protein as a reporter, the encoded putative cell wall related proteins were found to exclusively localize to C. albicans and C. dubliniensis chlamydospores. Our findings uncover the first chlamydospore specific markers in Candida species and provide novel insights in the complex morphogenetic development of these important fungal pathogens.

  10. Distinct and conserved prominin-1/CD133-positive retinal cell populations identified across species.

    Directory of Open Access Journals (Sweden)

    József Jászai

    2011-03-01

    Full Text Available Besides being a marker of various somatic stem cells in mammals, prominin-1 (CD133 plays a role in maintaining the photoreceptor integrity since mutations in the PROM1 gene are linked with retinal degeneration. In spite of that, little information is available regarding its distribution in eyes of non-mammalian vertebrates endowed with high regenerative abilities. To address this subject, prominin-1 cognates were isolated from axolotl, zebrafish and chicken, and their retinal compartmentalization was investigated and compared to that of their mammalian orthologue. Interestingly, prominin-1 transcripts--except for the axolotl--were not strictly restricted to the outer nuclear layer (i.e., photoreceptor cells, but they also marked distinct subdivisions of the inner nuclear layer (INL. In zebrafish, where the prominin-1 gene is duplicated (i.e., prominin-1a and prominin-1b, a differential expression was noted for both paralogues within the INL being localized either to its vitreal or scleral subdivision, respectively. Interestingly, expression of prominin-1a within the former domain coincided with Pax-6-positive cells that are known to act as progenitors upon injury-induced retino-neurogenesis. A similar, but minute population of prominin-1-positive cells located at the vitreal side of the INL was also detected in developing and adult mice. In chicken, however, prominin-1-positive cells appeared to be aligned along the scleral side of the INL reminiscent of zebrafish prominin-1b. Taken together our data indicate that in addition to conserved expression of prominin-1 in photoreceptors, significant prominin-1-expressing non-photoreceptor retinal cell populations are present in the vertebrate eye that might represent potential sources of stem/progenitor cells for regenerative therapies.

  11. Identifying the Needs of Opinion Leaders to Encourage Widespread Adoption of Water Conservation and Protection

    Science.gov (United States)

    Taylor, Melissa R.; Lamm, Alexa J.

    2017-01-01

    Opinion leaders are persuasive in convincing others within their social networks to adopt certain opinions and behaviors. By identifying and using opinion leaders, agricultural educators may be able to leverage individuals who have influence on others' opinions, thereby speeding up the adoption of new practices. In this article, we review a…

  12. An atlas of over 90.000 conserved noncoding sequences provides insight into crucifer regulatory regions

    NARCIS (Netherlands)

    Haudry, A.; Platts, A.E.; Vello, E.; Hoen, D.R.; Leclerq, M.; Williamson, R.J.; Forczek, E.; Joly-Lopez, Z.; Steffen, J.G.; Hazzouri, K.M.; Dewar, K.; Stinchcombe, J.R.; Schoen, D.J.; Wang, X.; Schmutz, J.; Town, C.D.; Edger, P.P.; Pires, J.C.; Schumaker, K.S.; Jarvis, D.E.; Mandakova, T.; Lysak, M.; Bergh, van den E.; Schranz, M.E.; Harrison, P.M.

    2013-01-01

    Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica,

  13. Comparison of C. elegans and C. briggsae genome sequences reveals extensive conservation of chromosome organization and synteny.

    Directory of Open Access Journals (Sweden)

    LaDeana W Hillier

    2007-07-01

    Full Text Available To determine whether the distinctive features of Caenorhabditis elegans chromosomal organization are shared with the C. briggsae genome, we constructed a single nucleotide polymorphism-based genetic map to order and orient the whole genome shotgun assembly along the six C. briggsae chromosomes. Although these species are of the same genus, their most recent common ancestor existed 80-110 million years ago, and thus they are more evolutionarily distant than, for example, human and mouse. We found that, like C. elegans chromosomes, C. briggsae chromosomes exhibit high levels of recombination on the arms along with higher repeat density, a higher fraction of intronic sequence, and a lower fraction of exonic sequence compared with chromosome centers. Despite extensive intrachromosomal rearrangements, 1:1 orthologs tend to remain in the same region of the chromosome, and colinear blocks of orthologs tend to be longer in chromosome centers compared with arms. More strikingly, the two species show an almost complete conservation of synteny, with 1:1 orthologs present on a single chromosome in one species also found on a single chromosome in the other. The conservation of both chromosomal organization and synteny between these two distantly related species suggests roles for chromosome organization in the fitness of an organism that are only poorly understood presently.

  14. Identifying core habitat and connectivity for focal species in the interior cedar-hemlock forest of North America to complete a conservation area design

    Science.gov (United States)

    Lance Craighead; Baden Cross

    2007-01-01

    To identify the remaining areas of the Interior Cedar- Hemlock Forest of North America and prioritize them for conservation planning, the Craighead Environmental Research Institute has developed a 2-scale method for mapping critical habitat utilizing 1) a broad-scale model to identify important regional locations as the basis for a Conservation Area Design (CAD), and 2...

  15. Morphological and molecular methods to identify butternut (Juglans cinerea) and butternut hybrids: relevance to butternut conservation.

    Science.gov (United States)

    Ross-Davis, Amy; Huang, Zhonglian; McKenna, James; Ostry, Michael; Woeste, Keith

    2008-07-01

    Butternut (Juglans cinerea L.) is a native, cold-tolerant, hard-mast species formerly valued for its nuts and wood, which is now endangered. The most immediate threat to butternut restoration is the spread of butternut canker disease, caused by the exotic fungus Sirococcus clavigignenti-juglandacearum Nair, Kostichka & Kuntz. Other threats include the hybridization of butternut with the exotic Japanese walnut (Juglans ailantifolia Carr.) and poor regeneration. The hybrids, known as buartnuts, are vegetatively vigorous, highly fecund, more resistant than butternut to butternut canker disease and difficult to identify. We review the vegetative and reproductive morphological traits that distinguish butternut from hybrids and identify those that can be used by field biologists to separate the taxa. No single trait was sufficient to separate butternut from hybrids, but pith color, lenticel size, shape and abundance, and the presence or absence of a notch in the upper margin of leaf scars, can be used in combination with other traits to identify butternuts and exclude most hybrids. In at least one butternut population, reduced symptoms of butternut canker disease were significantly associated with a dark barked phenotype. We also describe two randomly amplified polymorphic DNA (RAPD) markers that differentiate butternuts from hybrids based on DNA polymorphism. Together, these results should assist in the identification and testing of non-hybrid butternut for breeding and reintroduction of the species to its former habitats.

  16. SEQATOMS: a web tool for identifying missing regions in PDB in sequence context

    NARCIS (Netherlands)

    Brandt, B.W.; Heringa, J.; Leunissen, J.A.M.

    2008-01-01

    With over 46 000 proteins, the Protein Data Bank (PDB) is the most important database with structural information of biological macromolecules. PDB files contain sequence and coordinate information. Residues present in the sequence can be absent from the coordinate section, which means their

  17. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    NARCIS (Netherlands)

    Yuen, Ryan K C; Merico, Daniele; Bookman, Matt; Howe, Jennifer L.; Thiruvahindrapuram, Bhooma; Patel, Rohan V.; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A.; Walker, Susan; Marshall, Christian R.; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D'Abate, Lia; Chan, Ada J S; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L.; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J.; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R.; Nalpathamkalam, Thomas; Sung, Wilson W L; Tsoi, Fiona J.; Wei, John; Xu, Lizhen; Tasse, Anne Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie Mackinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M.; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H.; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A.; Parr, Jeremy R.; Spence, Sarah J.; Vorstman, Jacob; Frey, Brendan J.; Robinson, James T.; Strug, Lisa J.; Fernandez, Bridget A.; Elsabbagh, Mayada; Carter, Melissa T.; Hallmayer, Joachim; Knoppers, Bartha M.; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H.; Glazer, David; Pletcher, Mathew T.; Scherer, Stephen W.

    2017-01-01

    We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information,

  18. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways

    NARCIS (Netherlands)

    Cirulli, Elizabeth T.; Lasseigne, Brittany N.; Petrovski, Slavé; Sapp, Peter C.; Dion, Patrick A.; Leblond, Claire S.; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J.; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E.; Boone, Braden E.; Wimbish, Jack R.; Waite, Lindsay L.; Jones, Angela L.; Carulli, John P.; Day-Williams, Aaron G.; Staropoli, John F.; Xin, Winnie W.; Chesi, Alessandra; Raphael, Alya R.; McKenna-Yasek, Diane; Cady, Janet; de Jong, J. M. B. Vianney; Kenna, Kevin P.; Smith, Bradley N.; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H.; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E.; Baloh, Robert H.; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M.; Gibson, Summer; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Baas, Frank; ten Asbroek, Anneloor L. M. A.

    2015-01-01

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS

  19. Identifying Students' Conceptions of Basic Principles in Sequence Stratigraphy

    Science.gov (United States)

    Herrera, Juan S.; Riggs, Eric M.

    2013-01-01

    Sequence stratigraphy is a major research subject in the geosciences academia and the oil industry. However, the geoscience education literature addressing students' understanding of the basic concepts of sequence stratigraphy is relatively thin, and the topic has not been well explored. We conducted an assessment of 27 students' conceptions of…

  20. Overexpression screens identify conserved dosage chromosome instability genes in yeast and human cancer

    Science.gov (United States)

    Duffy, Supipi; Fam, Hok Khim; Wang, Yi Kan; Styles, Erin B.; Kim, Jung-Hyun; Ang, J. Sidney; Singh, Tejomayee; Larionov, Vladimir; Shah, Sohrab P.; Andrews, Brenda; Boerkoel, Cornelius F.; Hieter, Philip

    2016-01-01

    Somatic copy number amplification and gene overexpression are common features of many cancers. To determine the role of gene overexpression on chromosome instability (CIN), we performed genome-wide screens in the budding yeast for yeast genes that cause CIN when overexpressed, a phenotype we refer to as dosage CIN (dCIN), and identified 245 dCIN genes. This catalog of genes reveals human orthologs known to be recurrently overexpressed and/or amplified in tumors. We show that two genes, TDP1, a tyrosyl-DNA-phosphdiesterase, and TAF12, an RNA polymerase II TATA-box binding factor, cause CIN when overexpressed in human cells. Rhabdomyosarcoma lines with elevated human Tdp1 levels also exhibit CIN that can be partially rescued by siRNA-mediated knockdown of TDP1. Overexpression of dCIN genes represents a genetic vulnerability that could be leveraged for selective killing of cancer cells through targeting of an unlinked synthetic dosage lethal (SDL) partner. Using SDL screens in yeast, we identified a set of genes that when deleted specifically kill cells with high levels of Tdp1. One gene was the histone deacetylase RPD3, for which there are known inhibitors. Both HT1080 cells overexpressing hTDP1 and rhabdomyosarcoma cells with elevated levels of hTdp1 were more sensitive to histone deacetylase inhibitors valproic acid (VPA) and trichostatin A (TSA), recapitulating the SDL interaction in human cells and suggesting VPA and TSA as potential therapeutic agents for tumors with elevated levels of hTdp1. The catalog of dCIN genes presented here provides a candidate list to identify genes that cause CIN when overexpressed in cancer, which can then be leveraged through SDL to selectively target tumors. PMID:27551064

  1. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Full Text Available Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  2. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  3. The effects of sequence and type of chemotherapy and radiation therapy on cosmesis and complications after breast conservation therapy

    International Nuclear Information System (INIS)

    Markiewicz, Deborah A.; Schultz, Delray J.; Haas, Jonathan A.; Harris, Eleanor E. R.; Fox, Kevin R.; Glick, John H.; Solin, Lawrence J.

    1996-01-01

    Purpose: Chemotherapy plays an increasingly important role in the treatment of both node-negative and node-positive breast cancer patients, but the optimal sequencing of chemotherapy and radiation therapy is not well established. The purpose of this study is to evaluate the interaction of sequence and type of chemotherapy and hormonal therapy given with radiation therapy on the cosmetic outcome and the incidence of complications of Stage I and II breast cancer patients treated with breast-conserving therapy. Methods and Materials: The records of 1053 Stage I and II breast cancer patients treated with curative intent with breast-conserving surgery, axillary dissection, and radiation therapy between 1977-1991 were reviewed. Median follow-up after treatment was 6.7 years. Two hundred fourteen patients received chemotherapy alone, 141 patients received hormonal therapy alone, 86 patients received both, and 612 patients received no adjuvant therapy. Patients who received chemotherapy ± hormonal therapy were grouped according to sequence of chemotherapy: (a) concurrent = concurrent chemotherapy with radiation therapy followed by chemotherapy; (b) sequential = radiation followed by chemotherapy or chemotherapy followed by radiation; and (c) sandwich = chemotherapy followed by concurrent chemotherapy and radiation followed by chemotherapy. Compared to node negative patients, node-positive patients more commonly received chemotherapy (77 vs. 9%, p < 0.0001) and/or hormonal therapy (40 vs. 14%, p < 0.0001). Among patients who received chemotherapy, the majority (243 patients) received concurrent chemotherapy and radiation therapy with two cycles of cytoxan and 5-fluorouracil (5-FU) administered during radiation followed by six cycles of chemotherapy with cytoxan, 5-fluorouracil and either methotrexate(CMF) or doxorubicin(CAF). For analysis of cosmesis, patients included were relapse free with 3 years minimum follow-up. Results: The use of chemotherapy had an adverse effect

  4. Inhibition of Hepatitis C Virus in Mice by a Small Interfering RNA Targeting a Highly Conserved Sequence in Viral IRES Pseudoknot.

    Directory of Open Access Journals (Sweden)

    Jae-Su Moon

    Full Text Available The hepatitis C virus (HCV internal ribosome entry site (IRES that directs cap-independent viral translation is a primary target for small interfering RNA (siRNA-based HCV antiviral therapy. However, identification of potent siRNAs against HCV IRES by bioinformatics-based siRNA design is a challenging task given the complexity of HCV IRES secondary and tertiary structures and association with multiple proteins, which can also dynamically change the structure of this cis-acting RNA element. In this work, we utilized siRNA tiling approach whereby siRNAs were tiled with overlapping sequences that were shifted by one or two nucleotides over the HCV IRES stem-loop structures III and IV spanning nucleotides (nts 277-343. Based on their antiviral activity, we mapped a druggable region (nts 313-343 where the targets of potent siRNAs were enriched. siIE22, which showed the greatest anti-HCV potency, targeted a highly conserved sequence across diverse HCV genotypes, locating within the IRES subdomain IIIf involved in pseudoknot formation. Stepwise target shifting toward the 5' or 3' direction by 1 or 2 nucleotides reduced the antiviral potency of siIE22, demonstrating the importance of siRNA accessibility to this highly structured and sequence-conserved region of HCV IRES for RNA interference. Nanoparticle-mediated systemic delivery of the stability-improved siIE22 derivative gs_PS1 siIE22, which contains a single phosphorothioate linkage on the guide strand, reduced the serum HCV genome titer by more than 4 log10 in a xenograft mouse model for HCV replication without generation of resistant variants. Our results provide a strategy for identifying potent siRNA species against a highly structured RNA target and offer a potential pan-HCV genotypic siRNA therapy that might be beneficial for patients resistant to current treatment regimens.

  5. Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability.

    Science.gov (United States)

    Riazuddin, S; Hussain, M; Razzaq, A; Iqbal, Z; Shahzad, M; Polla, D L; Song, Y; van Beusekom, E; Khan, A A; Tomas-Roca, L; Rashid, M; Zahoor, M Y; Wissink-Lindhout, W M; Basra, M A R; Ansar, M; Agha, Z; van Heeswijk, K; Rasheed, F; Van de Vorst, M; Veltman, J A; Gilissen, C; Akram, J; Kleefstra, T; Assir, M Z; Grozeva, D; Carss, K; Raymond, F L; O'Connor, T D; Riazuddin, S A; Khan, S N; Ahmed, Z M; de Brouwer, A P M; van Bokhoven, H; Riazuddin, S

    2017-11-01

    Intellectual disability (ID) is a clinically and genetically heterogeneous disorder, affecting 1-3% of the general population. Although research into the genetic causes of ID has recently gained momentum, identification of pathogenic mutations that cause autosomal recessive ID (ARID) has lagged behind, predominantly due to non-availability of sizeable families. Here we present the results of exome sequencing in 121 large consanguineous Pakistani ID families. In 60 families, we identified homozygous or compound heterozygous DNA variants in a single gene, 30 affecting reported ID genes and 30 affecting novel candidate ID genes. Potential pathogenicity of these alleles was supported by co-segregation with the phenotype, low frequency in control populations and the application of stringent bioinformatics analyses. In another eight families segregation of multiple pathogenic variants was observed, affecting 19 genes that were either known or are novel candidates for ID. Transcriptome profiles of normal human brain tissues showed that the novel candidate ID genes formed a network significantly enriched for transcriptional co-expression (P<0.0001) in the frontal cortex during fetal development and in the temporal-parietal and sub-cortex during infancy through adulthood. In addition, proteins encoded by 12 novel ID genes directly interact with previously reported ID proteins in six known pathways essential for cognitive function (P<0.0001). These results suggest that disruptions of temporal parietal and sub-cortical neurogenesis during infancy are critical to the pathophysiology of ID. These findings further expand the existing repertoire of genes involved in ARID, and provide new insights into the molecular mechanisms and the transcriptome map of ID.

  6. The implications of drought and water conservation on the reuse of municipal wastewater: Recognizing impacts and identifying mitigation possibilities.

    Science.gov (United States)

    Tran, Quynh K; Jassby, David; Schwabe, Kurt A

    2017-11-01

    As water agencies continue to investigate opportunities to increase resilience and local water supply reliability in the face of drought and rising water scarcity, water conservation strategies and the reuse of treated municipal wastewater are garnering significant attention and adoption. Yet a simple water balance thought experiment illustrates that drought, and the conservation strategies that are often enacted in response to it, both likely limit the role reuse may play in improving local water supply reliability. For instance, as a particular drought progresses and agencies enact water conservation measures to cope with drought, influent flows likely decrease while influent pollution concentrations increase, particularly salinity, which adversely affects wastewater treatment plant (WWTP) costs and effluent quality and flow. Consequently, downstream uses of this effluent, whether to maintain streamflow and quality, groundwater recharge, or irrigation may be impacted. This is unfortunate since reuse is often heralded as a drought-proof mechanism to increase resilience. The objectives of this paper are two-fold. First, we illustrate-using a case study from Southern California during its most recent drought- how drought and water conservation strategies combine to reduce influent flow and quality and, subsequently, effluent flow and quality. Second, we use a recently developed regional water reuse decision support model (RWRM) to highlight cost-effective strategies that can be implemented to mitigate the impacts of drought on effluent water quality. While the solutions we identify cannot increase the flow of influent or effluent coming into or out of a treatment plant, they can improve the value of the remaining effluent in a cost-effective manner that takes into account the characteristics of its demand, whether it be for landscaping, golf courses, agricultural irrigation, or surface water augmentation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. A bioinformatics approach for identifying transgene insertion sites using whole genome sequencing data.

    Science.gov (United States)

    Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young

    2017-08-15

    Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice gnomes, Illumina sequencing reads are mapped and classified to the rice genome and plasmid sequence. The both mapped reads are classified to characterize the junction site between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.

  8. The complete genomic sequence of a tentative new polerovirus identified in barley in South Korea.

    Science.gov (United States)

    Zhao, Fumei; Lim, Seungmo; Yoo, Ran Hee; Igori, Davaajargal; Kim, Sang-Min; Kwak, Do Yeon; Kim, Sun Lim; Lee, Bong Choon; Moon, Jae Sun

    2016-07-01

    The complete nucleotide sequence of a new barley polerovirus, tentatively named barley virus G (BVG), which was isolated in Gimje, South Korea, has been determined using an RNA sequencing technique combined with polymerase chain reaction methods. The viral genomic RNA of BVG is 5,620 nucleotides long and contains six typical open reading frames commonly observed in other poleroviruses. Sequence comparisons revealed that BVG is most closely related to maize yellow dwarf virus-RMV, with the highest amino acid identities being less than 90 % for all of the corresponding proteins. These results suggested that BVG is a member of a new species in the genus Polerovirus.

  9. Tombusvirus-yeast interactions identify conserved cell-intrinsic viral restriction factors

    Directory of Open Access Journals (Sweden)

    Zsuzsanna eSasvari

    2014-08-01

    Full Text Available To combat viral infections, plants possess innate and adaptive immune pathways, such as RNA silencing, R gene and recessive gene-mediated resistance mechanisms. However, it is likely that additional cell-intrinsic restriction factors (CIRF are also involved in limiting plant virus replication. This review discusses novel CIRFs with antiviral functions, many of them RNA-binding proteins or affecting the RNA binding activities of viral replication proteins. The CIRFs against tombusviruses have been identified in yeast (Saccharomyces cerevisiae, which is developed as an advanced model organism. Grouping of the identified CIRFs based on their known cellular functions and subcellular localization in yeast reveals that TBSV replication is limited by a wide variety of host gene functions. Yeast proteins with the highest connectivity in the network map include the well-characterized Xrn1p 5’-3’ exoribonuclease, Act1p actin protein and Cse4p centromere protein. The protein network map also reveals an important interplay between the pro-viral Hsp70 cellular chaperone and the antiviral co-chaperones, and possibly key roles for the ribosomal or ribosome-associated factors. We discuss the antiviral functions of selected CIRFs, such as the RNA binding nucleolin, ribonucleases, WW-domain proteins, single- and multi-domain cyclophilins, TPR-domain co-chaperones and cellular ion pumps. These restriction factors frequently target the RNA-binding region in the viral replication proteins, thus interfering with the recruitment of the viral RNA for replication and the assembly of the membrane-bound viral replicase. Although many of the characterized CIRFs act directly against TBSV, we propose that the TPR-domain co-chaperones function as guardians of the cellular Hsp70 chaperone system, which is subverted efficiently by TBSV for viral replicase assembly in the absence of the TPR-domain co-chaperones.

  10. The Comparison of Biochemical and Sequencing 16S rDNA Gene Methods to Identify Nontuberculous Mycobacteria

    Directory of Open Access Journals (Sweden)

    Shafipour1, M.

    2014-11-01

    Full Text Available The identification of Mycobacteria in the species level has great medical importance. Biochemical tests are laborious and time-consuming, so new techniques could be used to identify the species. This research aimed to the comparison of biochemical and sequencing 16S rDNA gene methods to identify nontuberculous Mycobacteria in patients suspected to tuberculosis in Golestan province which is the most prevalent region of tuberculosis in Iran. Among 3336 patients suspected to tuberculosis referred to hospitals and health care centres in Golestan province during 2010-2011, 319 (9.56% culture positive cases were collected. Identification of species by using biochemical tests was done. On the samples recognized as nontuberculous Mycobacteria, after DNA extraction by boiling, 16S rDNA PCR was done and their sequencing were identified by NCBI BLAST. Of the 319 positive samples in Golestan Province, 300 cases were M.tuberculosis and 19 cases (5.01% were identified as nontuberculous Mycobacteria by biochemical tests. 15 out of 19 nontuberculous Mycobacteria were identified by PCR and sequencing method as similar by biochemical methods (similarity rate: 78.9%. But after PCR, 1 case known as M.simiae by biochemical test was identified as M. lentiflavum and 3 other cases were identified as Nocardia. Biochemical methods corresponded to the 16S rDNA PCR and sequencing in 78.9% of cases. However, in identification of M. lentiflavum and Nocaria sp. the molecular method is better than biochemical methods.

  11. VWF mutations and new sequence variations identified in healthy controls are more frequent in the African-American population.

    Science.gov (United States)

    Bellissimo, Daniel B; Christopherson, Pamela A; Flood, Veronica H; Gill, Joan Cox; Friedman, Kenneth D; Haberichter, Sandra L; Shapiro, Amy D; Abshire, Thomas C; Leissinger, Cindy; Hoots, W Keith; Lusher, Jeanne M; Ragni, Margaret V; Montgomery, Robert R

    2012-03-01

    Diagnosis and classification of VWD is aided by molecular analysis of the VWF gene. Because VWF polymorphisms have not been fully characterized, we performed VWF laboratory testing and gene sequencing of 184 healthy controls with a negative bleeding history. The controls included 66 (35.9%) African Americans (AAs). We identified 21 new sequence variations, 13 (62%) of which occurred exclusively in AAs and 2 (G967D, T2666M) that were found in 10%-15% of the AA samples, suggesting they are polymorphisms. We identified 14 sequence variations reported previously as VWF mutations, the majority of which were type 1 mutations. These controls had VWF Ag levels within the normal range, suggesting that these sequence variations might not always reduce plasma VWF levels. Eleven mutations were found in AAs, and the frequency of M740I, H817Q, and R2185Q was 15%-18%. Ten AA controls had the 2N mutation H817Q; 1 was homozygous. The average factor VIII level in this group was 99 IU/dL, suggesting that this variation may confer little or no clinical symptoms. This study emphasizes the importance of sequencing healthy controls to understand ethnic-specific sequence variations so that asymptomatic sequence variations are not misidentified as mutations in other ethnic or racial groups.

  12. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Science.gov (United States)

    Greub, Gilbert; Kebbi-Beghdadi, Carole; Bertelli, Claire; Collyn, François; Riederer, Beat M; Yersin, Camille; Croxatto, Antony; Raoult, Didier

    2009-12-23

    With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  13. Patterns of genetic differentiation at MHC class I genes and microsatellites identify conservation units in the giant panda.

    Science.gov (United States)

    Zhu, Ying; Wan, Qiu-Hong; Yu, Bin; Ge, Yun-Fa; Fang, Sheng-Guo

    2013-10-22

    Evaluating patterns of genetic variation is important to identify conservation units (i.e., evolutionarily significant units [ESUs], management units [MUs], and adaptive units [AUs]) in endangered species. While neutral markers could be used to infer population history, their application in the estimation of adaptive variation is limited. The capacity to adapt to various environments is vital for the long-term survival of endangered species. Hence, analysis of adaptive loci, such as the major histocompatibility complex (MHC) genes, is critical for conservation genetics studies. Here, we investigated 4 classical MHC class I genes (Aime-C, Aime-F, Aime-I, and Aime-L) and 8 microsatellites to infer patterns of genetic variation in the giant panda (Ailuropoda melanoleuca) and to further define conservation units. Overall, we identified 24 haplotypes (9 for Aime-C, 1 for Aime-F, 7 for Aime-I, and 7 for Aime-L) from 218 individuals obtained from 6 populations of giant panda. We found that the Xiaoxiangling population had the highest genetic variation at microsatellites among the 6 giant panda populations and higher genetic variation at Aime-MHC class I genes than other larger populations (Qinling, Qionglai, and Minshan populations). Differentiation index (FST)-based phylogenetic and Bayesian clustering analyses for Aime-MHC-I and microsatellite loci both supported that most populations were highly differentiated. The Qinling population was the most genetically differentiated. The giant panda showed a relatively higher level of genetic diversity at MHC class I genes compared with endangered felids. Using all of the loci, we found that the 6 giant panda populations fell into 2 ESUs: Qinling and non-Qinling populations. We defined 3 MUs based on microsatellites: Qinling, Minshan-Qionglai, and Daxiangling-Xiaoxiangling-Liangshan. We also recommended 3 possible AUs based on MHC loci: Qinling, Minshan-Qionglai, and Daxiangling-Xiaoxiangling-Liangshan. Furthermore, we recommend

  14. Horse domestication and conservation genetics of Przewalski's horse inferred from sex chromosomal and autosomal sequences.

    Science.gov (United States)

    Lau, Allison N; Peng, Lei; Goto, Hiroki; Chemnick, Leona; Ryder, Oliver A; Makova, Kateryna D

    2009-01-01

    Despite their ability to interbreed and produce fertile offspring, there is continued disagreement about the genetic relationship of the domestic horse (Equus caballus) to its endangered wild relative, Przewalski's horse (Equus przewalskii). Analyses have differed as to whether or not Przewalski's horse is placed phylogenetically as a separate sister group to domestic horses. Because Przewalski's horse and domestic horse are so closely related, genetic data can also be used to infer domestication-specific differences between the two. To investigate the genetic relationship of Przewalski's horse to the domestic horse and to address whether evolution of the domestic horse is driven by males or females, five homologous introns (a total of approximately 3 kb) were sequenced on the X and Y chromosomes in two Przewalski's horses and three breeds of domestic horses: Arabian horse, Mongolian domestic horse, and Dartmoor pony. Five autosomal introns (a total of approximately 6 kb) were sequenced for these horses as well. The sequences of sex chromosomal and autosomal introns were used to determine nucleotide diversity and the forces driving evolution in these species. As a result, X chromosomal and autosomal data do not place Przewalski's horses in a separate clade within phylogenetic trees for horses, suggesting a close relationship between domestic and Przewalski's horses. It was also found that there was a lack of nucleotide diversity on the Y chromosome and higher nucleotide diversity than expected on the X chromosome in domestic horses as compared with the Y chromosome and autosomes. This supports the hypothesis that very few male horses along with numerous female horses founded the various domestic horse breeds. Patterns of nucleotide diversity among different types of chromosomes were distinct for Przewalski's in contrast to domestic horses, supporting unique evolutionary histories of the two species.

  15. Chromosome-wide mapping of DNA methylation patterns in normal and malignant prostate cells reveals pervasive methylation of gene-associated and conserved intergenic sequences

    Directory of Open Access Journals (Sweden)

    De Marzo Angelo M

    2011-06-01

    Full Text Available Abstract Background DNA methylation has been linked to genome regulation and dysregulation in health and disease respectively, and methods for characterizing genomic DNA methylation patterns are rapidly emerging. We have developed/refined methods for enrichment of methylated genomic fragments using the methyl-binding domain of the human MBD2 protein (MBD2-MBD followed by analysis with high-density tiling microarrays. This MBD-chip approach was used to characterize DNA methylation patterns across all non-repetitive sequences of human chromosomes 21 and 22 at high-resolution in normal and malignant prostate cells. Results Examining this data using computational methods that were designed specifically for DNA methylation tiling array data revealed widespread methylation of both gene promoter and non-promoter regions in cancer and normal cells. In addition to identifying several novel cancer hypermethylated 5' gene upstream regions that mediated epigenetic gene silencing, we also found several hypermethylated 3' gene downstream, intragenic and intergenic regions. The hypermethylated intragenic regions were highly enriched for overlap with intron-exon boundaries, suggesting a possible role in regulation of alternative transcriptional start sites, exon usage and/or splicing. The hypermethylated intergenic regions showed significant enrichment for conservation across vertebrate species. A sampling of these newly identified promoter (ADAMTS1 and SCARF2 genes and non-promoter (downstream or within DSCR9, C21orf57 and HLCS genes hypermethylated regions were effective in distinguishing malignant from normal prostate tissues and/or cell lines. Conclusions Comparison of chromosome-wide DNA methylation patterns in normal and malignant prostate cells revealed significant methylation of gene-proximal and conserved intergenic sequences. Such analyses can be easily extended for genome-wide methylation analysis in health and disease.

  16. Catchment-scale conservation units identified for the threatened Yarra pygmy perch (Nannoperca obscura) in highly modified river systems.

    Science.gov (United States)

    Brauer, Chris J; Unmack, Peter J; Hammer, Michael P; Adams, Mark; Beheregaray, Luciano B

    2013-01-01

    Habitat fragmentation caused by human activities alters metapopulation dynamics and decreases biological connectivity through reduced migration and gene flow, leading to lowered levels of population genetic diversity and to local extinctions. The threatened Yarra pygmy perch, Nannoperca obscura, is a poor disperser found in small, isolated populations in wetlands and streams of southeastern Australia. Modifications to natural flow regimes in anthropogenically-impacted river systems have recently reduced the amount of habitat for this species and likely further limited its opportunity to disperse. We employed highly resolving microsatellite DNA markers to assess genetic variation, population structure and the spatial scale that dispersal takes place across the distribution of this freshwater fish and used this information to identify conservation units for management. The levels of genetic variation found for N. obscura are amongst the lowest reported for a fish species (mean heterozygosity of 0.318 and mean allelic richness of 1.92). We identified very strong population genetic structure, nil to little evidence of recent migration among demes and a minimum of 11 units for conservation management, hierarchically nested within four major genetic lineages. A combination of spatial analytical methods revealed hierarchical genetic structure corresponding with catchment boundaries and also demonstrated significant isolation by riverine distance. Our findings have implications for the national recovery plan of this species by demonstrating that N. obscura populations should be managed at a catchment level and highlighting the need to restore habitat and avoid further alteration of the natural hydrology.

  17. Salmonella Persistence in Tomatoes Requires a Distinct Set of Metabolic Functions Identified by Transposon Insertion Sequencing

    Science.gov (United States)

    Desai, Prerak; Porwollik, Steffen; Canals, Rocio; Perez, Daniel R.; Chu, Weiping; McClelland, Michael; Teplitski, Max

    2016-01-01

    ABSTRACT Human enteric pathogens, such as Salmonella spp. and verotoxigenic Escherichia coli, are increasingly recognized as causes of gastroenteritis outbreaks associated with the consumption of fruits and vegetables. Persistence in plants represents an important part of the life cycle of these pathogens. The identification of the full complement of Salmonella genes involved in the colonization of the model plant (tomato) was carried out using transposon insertion sequencing analysis. With this approach, 230,000 transposon insertions were screened in tomato pericarps to identify loci with reduction in fitness, followed by validation of the screen results using competition assays of the isogenic mutants against the wild type. A comparison with studies in animals revealed a distinct plant-associated set of genes, which only partially overlaps with the genes required to elicit disease in animals. De novo biosynthesis of amino acids was critical to persistence within tomatoes, while amino acid scavenging was prevalent in animal infections. Fitness reduction of the Salmonella amino acid synthesis mutants was generally more severe in the tomato rin mutant, which hyperaccumulates certain amino acids, suggesting that these nutrients remain unavailable to Salmonella spp. within plants. Salmonella lipopolysaccharide (LPS) was required for persistence in both animals and plants, exemplifying some shared pathogenesis-related mechanisms in animal and plant hosts. Similarly to phytopathogens, Salmonella spp. required biosynthesis of amino acids, LPS, and nucleotides to colonize tomatoes. Overall, however, it appears that while Salmonella shares some strategies with phytopathogens and taps into its animal virulence-related functions, colonization of tomatoes represents a distinct strategy, highlighting this pathogen's flexible metabolism. IMPORTANCE Outbreaks of gastroenteritis caused by human pathogens have been increasingly associated with foods of plant origin, with tomatoes

  18. Exome Sequencing Identifies a Novel MAP3K14 Mutation in Recessive Atypical Combined Immunodeficiency

    Directory of Open Access Journals (Sweden)

    Nikola Schlechter

    2017-11-01

    Full Text Available Primary immunodeficiency disorders (PIDs render patients vulnerable to infection with a wide range of microorganisms and thus provide good in vivo models for the assessment of immune responses during infectious challenges. Priming of the immune system, especially in infancy, depends on different environmental exposures and medical practices. This may determine the timing and phenotype of clinical appearance of immune deficits as exemplified with early exposure to Bacillus Calmette-Guérin (BCG vaccination and dissemination in combined immunodeficiencies. Varied phenotype expression poses a challenge to identification of the putative immune deficit. Without the availability of genomic diagnosis and data analysis resources and with limited capacity for functional definition of immune pathways, it is difficult to establish a definitive diagnosis and to decide on appropriate treatment. This study describes the use of exome sequencing to identify a homozygous recessive variant in MAP3K14, NIKVal345Met, in a patient with combined immunodeficiency, disseminated BCG-osis, and paradoxically elevated lymphocytes. Laboratory testing confirmed hypogammaglobulinemia with normal CD19, but failed to confirm a definitive diagnosis for targeted treatment decisions. NIKVal345Met is predicted to be deleterious and pathogenic by two in silico prediction tools and is situated in a gene crucial for effective functioning of the non-canonical nuclear factor-kappa B signaling pathway. Functional analysis of NIKVal345Met- versus NIKWT-transfected human embryonic kidney-293T cells showed that this mutation significantly affects the kinase activity of NIK leading to decreased levels of phosphorylated IkappaB kinase-alpha (IKKα, the target of NIK. BCG-stimulated RAW264.7 cells transfected with NIKVal345Met also presented with reduced levels of phosphorylated IKKα, significantly increased p100 levels and significantly decreased p52 levels compared to cells transfected

  19. Requirement of Sequences outside the Conserved Kinase Domain of Fission Yeast Rad3p for Checkpoint Control

    Science.gov (United States)

    Chapman, Carolyn Riley; Evans, Sarah Tyler; Carr, Antony M.; Enoch, Tamar

    1999-01-01

    The fission yeast Rad3p checkpoint protein is a member of the phosphatidylinositol 3-kinase-related family of protein kinases, which includes human ATMp. Mutation of the ATM gene is responsible for the disease ataxia-telangiectasia. The kinase domain of Rad3p has previously been shown to be essential for function. Here, we show that although this domain is necessary, it is not sufficient, because the isolated kinase domain does not have kinase activity in vitro and cannot complement a rad3 deletion strain. Using dominant negative alleles of rad3, we have identified two sites N-terminal to the conserved kinase domain that are essential for Rad3p function. One of these sites is the putative leucine zipper, which is conserved in other phosphatidylinositol 3-kinase-related family members. The other is a novel motif, which may also mediate Rad3p protein–protein interactions. PMID:10512862

  20. G-NEST: a gene neighborhood scoring tool to identify co-conserved, co-expressed genes

    Directory of Open Access Journals (Sweden)

    Lemay Danielle G

    2012-09-01

    Full Text Available Abstract Background In previous studies, gene neighborhoods—spatial clusters of co-expressed genes in the genome—have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Scoring Tool (G-NEST which combines genomic location, gene expression, and evolutionary sequence conservation data to score putative gene neighborhoods across all possible window sizes simultaneously. Results Using G-NEST on atlases of mouse and human tissue expression data, we found that large neighborhoods of ten or more genes are extremely rare in mammalian genomes. When they do occur, neighborhoods are typically composed of families of related genes. Both the highest scoring and the largest neighborhoods in mammalian genomes are formed by tandem gene duplication. Mammalian gene neighborhoods contain highly and variably expressed genes. Co-localized noisy gene pairs exhibit lower evolutionary conservation of their adjacent genome locations, suggesting that their shared transcriptional background may be disadvantageous. Genes that are essential to mammalian survival and reproduction are less likely to occur in neighborhoods, although neighborhoods are enriched with genes that function in mitosis. We also found that gene orientation and protein-protein interactions are partially responsible for maintenance of gene neighborhoods. Conclusions Our experiments using G-NEST confirm that tandem gene duplication is the primary driver of non-random gene order in mammalian genomes. Non-essentiality, co-functionality, gene orientation, and protein-protein interactions are additional forces that maintain gene neighborhoods, especially those formed by tandem duplicates. We expect G-NEST to be useful for other applications such as the identification of core regulatory modules, common transcriptional backgrounds, and chromatin domains. The

  1. Combinatorial binding in human and mouse embryonic stem cells identifies conserved enhancers active in early embryonic development.

    Directory of Open Access Journals (Sweden)

    Jonathan Göke

    2011-12-01

    Full Text Available Transcription factors are proteins that regulate gene expression by binding to cis-regulatory sequences such as promoters and enhancers. In embryonic stem (ES cells, binding of the transcription factors OCT4, SOX2 and NANOG is essential to maintain the capacity of the cells to differentiate into any cell type of the developing embryo. It is known that transcription factors interact to regulate gene expression. In this study we show that combinatorial binding is strongly associated with co-localization of the transcriptional co-activator Mediator, H3K27ac and increased expression of nearby genes in embryonic stem cells. We observe that the same loci bound by Oct4, Nanog and Sox2 in ES cells frequently drive expression in early embryonic development. Comparison of mouse and human ES cells shows that less than 5% of individual binding events for OCT4, SOX2 and NANOG are shared between species. In contrast, about 15% of combinatorial binding events and even between 53% and 63% of combinatorial binding events at enhancers active in early development are conserved. Our analysis suggests that the combination of OCT4, SOX2 and NANOG binding is critical for transcription in ES cells and likely plays an important role for embryogenesis by binding at conserved early developmental enhancers. Our data suggests that the fast evolutionary rewiring of regulatory networks mainly affects individual binding events, whereas "gene regulatory hotspots" which are bound by multiple factors and active in multiple tissues throughout early development are under stronger evolutionary constraints.

  2. Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing.

    Science.gov (United States)

    Hochgerner, Hannah; Zeisel, Amit; Lönnerberg, Peter; Linnarsson, Sten

    2018-02-01

    The dentate gyrus of the hippocampus is a brain region in which neurogenesis persists into adulthood; however, the relationship between developmental and adult dentate gyrus neurogenesis has not been examined in detail. Here we used single-cell RNA sequencing to reveal the molecular dynamics and diversity of dentate gyrus cell types in perinatal, juvenile, and adult mice. We found distinct quiescent and proliferating progenitor cell types, linked by transient intermediate states to neuroblast stages and fully mature granule cells. We observed shifts in the molecular identity of quiescent and proliferating radial glia and granule cells during the postnatal period that were then maintained through adult stages. In contrast, intermediate progenitor cells, neuroblasts, and immature granule cells were nearly indistinguishable at all ages. These findings demonstrate the fundamental similarity of postnatal and adult neurogenesis in the hippocampus and pinpoint the early postnatal transformation of radial glia from embryonic progenitors to adult quiescent stem cells.

  3. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    Science.gov (United States)

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensible for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis,C. pneumoniae,C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA,pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis,C. pneumoniae,C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species. © 2016 S. Karger AG, Basel.

  4. Genetic conservation planning for forest tree species in Western North America under future climate change: Employing a novel approach to identify conservation gaps

    Science.gov (United States)

    L.K. Gray; E.J. Russell; Q.E. Barber; A. Hamann

    2017-01-01

    Among the 17 provinces, territories, and states that comprise western North America, approximately 18 percent of the 8.4 million km2 of forested land base is designated as protected areas to ensure the in situ conservation of forest biodiversity. Jurisdictions vary substantially however, in their responsibilities, protected area coverage, and conservation policies....

  5. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Science.gov (United States)

    Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

  6. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Directory of Open Access Journals (Sweden)

    Zheng Ping

    2014-01-01

    Full Text Available Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  7. Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16.

    Science.gov (United States)

    Hamilton, Natasha A; Tammen, Imke; Raadsma, Herman W

    2013-01-01

    Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.

  8. Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16.

    Directory of Open Access Journals (Sweden)

    Natasha A Hamilton

    Full Text Available Angiotensin converting enzyme (ACE is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.

  9. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    Directory of Open Access Journals (Sweden)

    Jonathan A Scolnick

    Full Text Available Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET, for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE tissue RNA in both normal tissue and cancer cells.

  10. Sequence-based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families.

    Directory of Open Access Journals (Sweden)

    Janine Maimanakos

    2016-08-01

    Full Text Available Arylmalonate-Decarboxylases (AMDases, EC 4.1.1.76 are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for the use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods 58 novel AMDase candidate genes could be identified in this work. Thereby, AMDases with the conserved sequence pattern of Bordetella bronchiseptica’s prototype appeared to be limited to the classes of Alpha-, Beta- and Gammaproteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the TTT family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99% of the (R-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originated from soil, and AmdP from Polymorphum gilvum found by a data base search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes.

  11. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    Directory of Open Access Journals (Sweden)

    Apurva Barve

    2013-01-01

    Full Text Available Xeroderma pigmentosum group A (XPA is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1 and replication protein A 70 kDa subunit (RPA70 proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla.

  12. Exome sequencing identifies NBEAL2 as the causative gene for gray platelet syndrome

    NARCIS (Netherlands)

    Albers, Cornelis A.; Cvejic, Ana; Favier, Rémi; Bouwmans, Evelien E.; Alessi, Marie-Christine; Bertone, Paul; Jordan, Gregory; Kettleborough, Ross N. W.; Kiddle, Graham; Kostadima, Myrto; Read, Randy J.; Sipos, Botond; Sivapalaratnam, Suthesh; Smethurst, Peter A.; Stephens, Jonathan; Voss, Katrin; Nurden, Alan; Rendon, Augusto; Nurden, Paquita; Ouwehand, Willem H.

    2011-01-01

    Gray platelet syndrome (GPS) is a predominantly recessive platelet disorder that is characterized by mild thrombocytopenia with large platelets and a paucity of α-granules; these abnormalities cause mostly moderate but in rare cases severe bleeding. We sequenced the exomes of four unrelated

  13. Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution

    Science.gov (United States)

    Kinnebrew, John S.; Biswas, Gautam

    2012-01-01

    Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…

  14. Use of microsatellite markers derived from whole genome sequence data for identifying polymorphism in Phytophthora ramorum

    Science.gov (United States)

    Kelly Ivors; Matteo Garbelotto; Ineke De Vries; Peter Bonants

    2006-01-01

    Investigating the population genetics of Phytophthora ramorum, the causal agent of sudden oak death (SOD), is critical to understanding the biology and epidemiology of this important phytopathogen. Raw sequence data (445,000 reads) of P. ramorum was provided by the Joint Genome Institute. Our objective was to develop and utilize...

  15. Sequence analysis of the its-2 region: a tool to identify strains of Scenedesmus (Chlorophyceae)

    NARCIS (Netherlands)

    Van Hannen, E.J.; Lürling, M.; Van Donk, E.

    2000-01-01

    The genetic distances between several strains of Senedesmus obliquus (Turp,) Kutz,, S, acutus Hortobagyi, and S, naegelii Chod. calculated from ITS-2 sequences were found to be smaller than the genetic distances within other strains of Scenedesmus-that is, in S, acuminatus (Lagerh,) Chod, and S,

  16. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer

    NARCIS (Netherlands)

    Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

    Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and

  17. An Approach to Evaluate Comprehensive Plan and Identify Priority Lands for Future Land Use Development to Conserve More Ecological Values

    Directory of Open Access Journals (Sweden)

    Long Zhou

    2018-01-01

    Full Text Available Urbanization has significant impacts on the regional environmental quality through altering natural lands, converting them to urban built-up areas. One common strategy applied by urban planners to manage urbanization and preserve natural resources is to make a comprehensive plan and concentrate future land use in certain areas. However, in practice, planners used to make future land use planning mainly based on their subjective interpretations with limited ecological supporting evidence and analysis. Here, we propose a new approach composed of ecological modelling and land use zoning in the spatial matrix to evaluate the comprehensive plan and identify priority lands for sustainable land use planning. We use the city of Corvallis, OR, as the test bed to demonstrate this new approach. The results indicate that the Corvallis Comprehensive Plan 1998–2020 featured with compact development is not performing efficiently in conserving ecological values, and the land use plan featured with mixed-use spreading development generated by the proposed approach meets the city’s land demands for urban growth, and conserves 103% more ecological value of retaining storm water nitrogen, 270% more ecological value of retaining storm water phosphorus and 19% more ecological value in storing carbon in the whole watershed. This study indicates that if planned with scientific analysis and evidence, spreading urban development does not necessarily result in less sustainable urban environment than the compact development recommended in smart growth.

  18. Diverse Array of New Viral Sequences Identified in Worldwide Populations of the Asian Citrus Psyllid (Diaphorina citri) Using Viral Metagenomics.

    Science.gov (United States)

    Nouri, Shahideh; Salem, Nidá; Nigg, Jared C; Falk, Bryce W

    2015-12-16

    The Asian citrus psyllid, Diaphorina citri, is the natural vector of the causal agent of Huanglongbing (HLB), or citrus greening disease. Together; HLB and D. citri represent a major threat to world citrus production. As there is no cure for HLB, insect vector management is considered one strategy to help control the disease, and D. citri viruses might be useful. In this study, we used a metagenomic approach to analyze viral sequences associated with the global population of D. citri. By sequencing small RNAs and the transcriptome coupled with bioinformatics analysis, we showed that the virus-like sequences of D. citri are diverse. We identified novel viral sequences belonging to the picornavirus superfamily, the Reoviridae, Parvoviridae, and Bunyaviridae families, and an unclassified positive-sense single-stranded RNA virus. Moreover, a Wolbachia prophage-related sequence was identified. This is the first comprehensive survey to assess the viral community from worldwide populations of an agricultural insect pest. Our results provide valuable information on new putative viruses, some of which may have the potential to be used as biocontrol agents. Insects have the most species of all animals, and are hosts to, and vectors of, a great variety of known and unknown viruses. Some of these most likely have the potential to be important fundamental and/or practical resources. In this study, we used high-throughput next-generation sequencing (NGS) technology and bioinformatics analysis to identify putative viruses associated with Diaphorina citri, the Asian citrus psyllid. D. citri is the vector of the bacterium causing Huanglongbing (HLB), currently the most serious threat to citrus worldwide. Here, we report several novel viral sequences associated with D. citri. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  19. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    International Nuclear Information System (INIS)

    Aris, J.P.; Blobel, G.

    1991-01-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is ∼1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain ∼75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis

  20. The highly conserved codon following the slippery sequence supports -1 frameshift efficiency at the HIV-1 frameshift site.

    Directory of Open Access Journals (Sweden)

    Suneeth F Mathew

    Full Text Available HIV-1 utilises -1 programmed ribosomal frameshifting to translate structural and enzymatic domains in a defined proportion required for replication. A slippery sequence, U UUU UUA, and a stem-loop are well-defined RNA features modulating -1 frameshifting in HIV-1. The GGG glycine codon immediately following the slippery sequence (the 'intercodon' contributes structurally to the start of the stem-loop but has no defined role in current models of the frameshift mechanism, as slippage is inferred to occur before the intercodon has reached the ribosomal decoding site. This GGG codon is highly conserved in natural isolates of HIV. When the natural intercodon was replaced with a stop codon two different decoding molecules-eRF1 protein or a cognate suppressor tRNA-were able to access and decode the intercodon prior to -1 frameshifting. This implies significant slippage occurs when the intercodon is in the (perhaps distorted ribosomal A site. We accommodate the influence of the intercodon in a model of frame maintenance versus frameshifting in HIV-1.

  1. Novel Genetic Variants of Sporadic Atrial Septal Defect (ASD) in a Chinese Population Identified by Whole-Exome Sequencing (WES).

    Science.gov (United States)

    Liu, Yong; Cao, Yu; Li, Yaxiong; Lei, Dongyun; Li, Lin; Hou, Zong Liu; Han, Shen; Meng, Mingyao; Shi, Jianlin; Zhang, Yayong; Wang, Yi; Niu, Zhaoyi; Xie, Yanhua; Xiao, Benshan; Wang, Yuanfei; Li, Xiao; Yang, Lirong; Wang, Wenju; Jiang, Lihong

    2018-03-05

    BACKGROUND Recently, mutations in several genes have been described to be associated with sporadic ASD, but some genetic variants remain to be identified. The aim of this study was to use whole-exome sequencing (WES) combined with bioinformatics analysis to identify novel genetic variants in cases of sporadic congenital ASD, followed by validation by Sanger sequencing. MATERIAL AND METHODS Five Han patients with secundum ASD were recruited, and their tissue samples were analyzed by WES, followed by verification by Sanger sequencing of tissue and blood samples. Further evaluation using blood samples included 452 additional patients with sporadic secundum ASD (212 male and 240 female patients) and 519 healthy subjects (252 male and 267 female subjects) for further verification by a multiplexed MassARRAY system. Bioinformatic analyses were performed to identify novel genetic variants associated with sporadic ASD. RESULTS From five patients with sporadic ASD, a total of 181,762 genomic variants in 33 exon loci, validated by Sanger sequencing, were selected and underwent MassARRAY analysis in 452 patients with ASD and 519 healthy subjects. Three loci with high mutation frequencies, the 138665410 FOXL2 gene variant, the 23862952 MYH6 gene variant, and the 71098693 HYDIN gene variant were found to be significantly associated with sporadic ASD (PASD (PASD, and supported the use of WES and bioinformatics analysis to identify disease-associated mutations.

  2. Identifying effective actions to guide volunteer-based and nationwide conservation efforts for a ground-nesting farmland bird.

    Science.gov (United States)

    Santangeli, Andrea; Arroyo, Beatriz; Millon, Alexandre; Bretagnolle, Vincent

    2015-08-01

    1. Modern farming practices threaten wildlife in different ways, and failure to identify the complexity of multiple threats acting in synergy may result in ineffective management. To protect ground-nesting birds in farmland, monitoring and mitigating impacts of mechanical harvesting is crucial. 2. Here, we use 6 years of data from a nationwide volunteer-based monitoring scheme of the Montagu's harrier, a ground-nesting raptor, in French farmlands. We assess the effectiveness of alternative nest protection measures and map their potential benefit to the species. 3. We show that unprotected nests in cultivated land are strongly negatively affected by harvesting and thus require active management. Further, we show that protection from harvesting alone (e.g. by leaving a small unharvested buffer around the nest) is impaired by post-harvest predation at nests that become highly conspicuous after harvest. Measures that simultaneously protect from harvesting and predation (by adding a fence around the nest) significantly enhance nest productivity. 4. The map of expected gain from nest protection in relation to available volunteers' workforce pinpoints large areas of high expected gain from nest protection that are not matched by equally high workforce availability. This mismatch suggests that the impact of nest protection can be further improved by increasing volunteer efforts in key areas where they are low relative to the expected gain they could have. 5. Synthesis and applications . This study shows that synergistic interplay of multiple factors (e.g. mechanical harvesting and predation) may completely undermine the success of well-intentioned conservation efforts. However, identifying areas where the greatest expected gains can be achieved relative to effort expended can minimize the risk of wasted volunteer actions. Overall, this study underscores the importance of citizen science for collecting large-scale data useful for producing science and ultimately informs

  3. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome.

    Science.gov (United States)

    Benoit, Joshua B; Adelman, Zach N; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C; Szuter, Elise M; Hagan, Richard W; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M; Nelson, David R; Rosendale, Andrew J; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R; Ioannidis, Panagiotis; Waterhouse, Robert M; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J Spencer; Gondhalekar, Ameya D; Scharf, Michael E; Peterson, Brittany F; Raje, Kapil R; Hottel, Benjamin A; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S T; Duncan, Elizabeth J; Murali, Shwetha C; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C; Muzny, Donna M; Wheeler, David; Panfilio, Kristen A; Vargas Jentzsch, Iris M; Vargo, Edward L; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T; Anderson, Michelle A E; Jones, Jeffery W; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D; Attardo, Geoffrey M; Robertson, Hugh M; Zdobnov, Evgeny M; Ribeiro, Jose M C; Gibbs, Richard A; Werren, John H; Palli, Subba R; Schal, Coby; Richards, Stephen

    2016-02-02

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host-symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human-bed bug and symbiont-bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite.

  4. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome

    Science.gov (United States)

    Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen

    2016-01-01

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814

  5. The genome sequence of the emerging common midwife toad virus identifies an evolutionary intermediate within ranaviruses.

    Science.gov (United States)

    Mavian, Carla; López-Bueno, Alberto; Balseiro, Ana; Casais, Rosa; Alcamí, Antonio; Alejo, Alí

    2012-04-01

    Worldwide amphibian population declines have been ascribed to global warming, increasing pollution levels, and other factors directly related to human activities. These factors may additionally be favoring the emergence of novel pathogens. In this report, we have determined the complete genome sequence of the emerging common midwife toad ranavirus (CMTV), which has caused fatal disease in several amphibian species across Europe. Phylogenetic and gene content analyses of the first complete genomic sequence from a ranavirus isolated in Europe show that CMTV is an amphibian-like ranavirus (ALRV). However, the CMTV genome structure is novel and represents an intermediate evolutionary stage between the two previously described ALRV groups. We find that CMTV clusters with several other ranaviruses isolated from different hosts and locations which might also be included in this novel ranavirus group. This work sheds light on the phylogenetic relationships within this complex group of emerging, disease-causing viruses.

  6. High throughput sequencing identifies chilling responsive genes in sweetpotato (Ipomoea batatas Lam.) during storage.

    Science.gov (United States)

    Xie, Zeyi; Zhou, Zhilin; Li, Hongmin; Yu, Jingjing; Jiang, Jiaojiao; Tang, Zhonghou; Ma, Daifu; Zhang, Baohong; Han, Yonghua; Li, Zongyun

    2018-05-21

    Sweetpotato (Ipomoea batatas L.) is a globally important economic food crop. It belongs to Convolvulaceae family and origins in the tropics; however, sweetpotato is sensitive to cold stress during storage. In this study, we performed transcriptome sequencing to investigate the sweetpotato response to chilling stress during storage. A total of 110,110 unigenes were generated via high-throughput sequencing. Differentially expressed genes (DEGs) analysis showed that 18,681 genes were up-regulated and 21,983 genes were down-regulated in low temperature condition. Many DEGs were related to the cell membrane system, antioxidant enzymes, carbohydrate metabolism, and hormone metabolism, which are potentially associated with sweetpotato resistance to low temperature. The existence of DEGs suggests a molecular basis for the biochemical and physiological consequences of sweetpotato in low temperature storage conditions. Our analysis will provide a new target for enhancement of sweetpotato cold stress tolerance in postharvest storage through genetic manipulation. Copyright © 2018. Published by Elsevier Inc.

  7. Identifying spatial clustering properties of the 1997-2003 Liguria (Northern Italy) forest-fire sequence

    International Nuclear Information System (INIS)

    Telesca, Luciano; Amatulli, Giuseppe; Lasaponara, Rosa; Lovallo, Michele; Santulli, Adriano

    2007-01-01

    The spatial clustering of the forest-fire sequence (1997-2003) of Liguria Region (Northern Italy) has been analysed using the correlation dimension D C , calculated by means of the correlation integral method. Studying the variations of this parameter, we recognize the presence of a strong variability of the spatial clusterization, modulated by seasonal cycles. Furthermore, we found that the larger fires (size >400 ha) mark the cyclic behaviour of the correlation dimension

  8. Epitope Sequences in Dengue Virus NS1 Protein Identified by Monoclonal Antibodies

    Directory of Open Access Journals (Sweden)

    Leticia Barboza Rocha

    2017-10-01

    Full Text Available Dengue nonstructural protein 1 (NS1 is a multi-functional glycoprotein with essential functions both in viral replication and modulation of host innate immune responses. NS1 has been established as a good surrogate marker for infection. In the present study, we generated four anti-NS1 monoclonal antibodies against recombinant NS1 protein from dengue virus serotype 2 (DENV2, which were used to map three NS1 epitopes. The sequence 193AVHADMGYWIESALNDT209 was recognized by monoclonal antibodies 2H5 and 4H1BC, which also cross-reacted with Zika virus (ZIKV protein. On the other hand, the sequence 25VHTWTEQYKFQPES38 was recognized by mAb 4F6 that did not cross react with ZIKV. Lastly, a previously unidentified DENV2 NS1-specific epitope, represented by the sequence 127ELHNQTFLIDGPETAEC143, is described in the present study after reaction with mAb 4H2, which also did not cross react with ZIKV. The selection and characterization of the epitope, specificity of anti-NS1 mAbs, may contribute to the development of diagnostic tools able to differentiate DENV and ZIKV infections.

  9. Identifying Potential Conservation Corridors Along the Mongolia-Russia Border Using Resource Selection Functions: A Case Study on Argali Sheep

    Directory of Open Access Journals (Sweden)

    Buyanaa Chimeddorj

    2013-12-01

    Full Text Available The disruption of animal movements is known to affect wildlife populations, particularly large bodied, free-ranging mammals that require large geographic ranges to survive. Corridors commonly connect fragmented wildlife populations and their habitats, yet identifying corridors rarely uses data on habitat selection and movements of target species. New technologies and analytical tools make it possible to better integrate landscape patterns with spatial behavioral data. We show how resource selection functions can describe habitat suitability using continuous and multivariate metrics to determine potential wildlife movement corridors. During 2005–2010, we studied movements of argali sheep ( Ovis ammon near the Mongolia-Russia border using radio-telemetry and modeled their spatial distribution in relation to landscape features to create a spatially explicit habitat suitability surface to identify potential transboundary conservation corridors. Argali sheep habitat selection in western Mongolia positively correlated with elevation, ruggedness index, and distance to border. In other words, argali were tended use areas with higher elevation, rugged topography, and distances farther from the international border. We suggest that these spatial modeling approaches offer ways to design and identify wildlife corridors more objectively and holistically, and can be applied to many other target species.

  10. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety 'Amrapali' (Mangifera indica L.).

    Science.gov (United States)

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called "king of fruits" due to its sweetness, richness of taste, diversity, large production volume and a variety of end usage. Despite its huge economic importance genomic resources in mango are scarce and genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties 'Neelam', 'Dashehari' and their hybrid 'Amrapali' using next generation sequencing technologies. De-novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. Total 5,465 SSR loci were identified in 4,912 unigenes with 288 type I SSR (n ≥ 20 bp). One hundred type I SSR markers were randomly selected of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides useful genomic resource for genetic improvement of mango.

  11. Novel nonphosphorylated peptides with conserved sequences selectively bind to Grb7 SH2 domain with affinity comparable to its phosphorylated ligand.

    Directory of Open Access Journals (Sweden)

    Dan Zhang

    Full Text Available The Grb7 (growth factor receptor-bound 7 protein, a member of the Grb7 protein family, is found to be highly expressed in such metastatic tumors as breast cancer, esophageal cancer, liver cancer, etc. The src-homology 2 (SH2 domain in the C-terminus is reported to be mainly involved in Grb7 signaling pathways. Using the random peptide library, we identified a series of Grb7 SH2 domain-binding nonphosphorylated peptides in the yeast two-hybrid system. These peptides have a conserved GIPT/K/N sequence at the N-terminus and G/WD/IP at the C-terminus, and the region between the N-and C-terminus contains fifteen amino acids enriched with serines, threonines and prolines. The association between the nonphosphorylated peptides and the Grb7 SH2 domain occurred in vitro and ex vivo. When competing for binding to the Grb7 SH2 domain in a complex, one synthesized nonphosphorylated ligand, containing the twenty-two amino acid-motif sequence, showed at least comparable affinity to the phosphorylated ligand of ErbB3 in vitro, and its overexpression inhibited the proliferation of SK-BR-3 cells. Such nonphosphorylated peptides may be useful for rational design of drugs targeted against cancers that express high levels of Grb7 protein.

  12. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Directory of Open Access Journals (Sweden)

    Gilbert Greub

    Full Text Available BACKGROUND: With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. METHODS/PRINCIPAL FINDINGS: We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. CONCLUSIONS/SIGNIFICANCE: This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  13. High Throughput Sequencing of Small RNAs in the Two Cucurbita Germplasm with Different Sodium Accumulation Patterns Identifies Novel MicroRNAs Involved in Salt Stress Response.

    Science.gov (United States)

    Xie, Junjun; Lei, Bo; Niu, Mengliang; Huang, Yuan; Kong, Qiusheng; Bie, Zhilong

    2015-01-01

    MicroRNAs (miRNAs), a class of small non-coding RNAs, recognize their mRNA targets based on perfect sequence complementarity. MiRNAs lead to broader changes in gene expression after plants are exposed to stress. High-throughput sequencing is an effective method to identify and profile small RNA populations in non-model plants under salt stresses, significantly improving our knowledge regarding miRNA functions in salt tolerance. Cucurbits are sensitive to soil salinity, and the Cucurbita genus is used as the rootstock of other cucurbits to enhance salt tolerance. Several cucurbit crops have been used for miRNA sequencing but salt stress-related miRNAs in cucurbit species have not been reported. In this study, we subjected two Cucurbita germplasm, namely, N12 (Cucurbita. maxima Duch.) and N15 (Cucurbita. moschata Duch.), with different sodium accumulation patterns, to Illumina sequencing to determine small RNA populations in root tissues after 4 h of salt treatment and control. A total of 21,548,326 and 19,394,108 reads were generated from the control and salt-treated N12 root tissues, respectively. By contrast, 19,108,240 and 20,546,052 reads were obtained from the control and salt-treated N15 root tissues, respectively. Fifty-eight conserved miRNA families and 33 novel miRNAs were identified in the two Cucurbita germplasm. Seven miRNAs (six conserved miRNAs and one novel miRNAs) were up-regulated in salt-treated N12 and N15 samples. Most target genes of differentially expressed novel miRNAs were transcription factors and salt stress-responsive proteins, including dehydration-induced protein, cation/H+ antiporter 18, and CBL-interacting serine/threonine-protein kinase. The differential expression of miRNAs between the two Cucurbita germplasm under salt stress conditions and their target genes demonstrated that novel miRNAs play an important role in the response of the two Cucurbita germplasm to salt stress. The present study initially explored small RNAs in the

  14. High Throughput Sequencing of Small RNAs in the Two Cucurbita Germplasm with Different Sodium Accumulation Patterns Identifies Novel MicroRNAs Involved in Salt Stress Response.

    Directory of Open Access Journals (Sweden)

    Junjun Xie

    Full Text Available MicroRNAs (miRNAs, a class of small non-coding RNAs, recognize their mRNA targets based on perfect sequence complementarity. MiRNAs lead to broader changes in gene expression after plants are exposed to stress. High-throughput sequencing is an effective method to identify and profile small RNA populations in non-model plants under salt stresses, significantly improving our knowledge regarding miRNA functions in salt tolerance. Cucurbits are sensitive to soil salinity, and the Cucurbita genus is used as the rootstock of other cucurbits to enhance salt tolerance. Several cucurbit crops have been used for miRNA sequencing but salt stress-related miRNAs in cucurbit species have not been reported. In this study, we subjected two Cucurbita germplasm, namely, N12 (Cucurbita. maxima Duch. and N15 (Cucurbita. moschata Duch., with different sodium accumulation patterns, to Illumina sequencing to determine small RNA populations in root tissues after 4 h of salt treatment and control. A total of 21,548,326 and 19,394,108 reads were generated from the control and salt-treated N12 root tissues, respectively. By contrast, 19,108,240 and 20,546,052 reads were obtained from the control and salt-treated N15 root tissues, respectively. Fifty-eight conserved miRNA families and 33 novel miRNAs were identified in the two Cucurbita germplasm. Seven miRNAs (six conserved miRNAs and one novel miRNAs were up-regulated in salt-treated N12 and N15 samples. Most target genes of differentially expressed novel miRNAs were transcription factors and salt stress-responsive proteins, including dehydration-induced protein, cation/H+ antiporter 18, and CBL-interacting serine/threonine-protein kinase. The differential expression of miRNAs between the two Cucurbita germplasm under salt stress conditions and their target genes demonstrated that novel miRNAs play an important role in the response of the two Cucurbita germplasm to salt stress. The present study initially explored small

  15. Whole-exome sequencing identifies USH2A mutations in a pseudo-dominant Usher syndrome family.

    Science.gov (United States)

    Zheng, Sui-Lian; Zhang, Hong-Liang; Lin, Zhen-Lang; Kang, Qian-Yan

    2015-10-01

    Usher syndrome (USH) is an autosomal recessive (AR) multi-sensory degenerative disorder leading to deaf-blindness. USH is clinically subdivided into three subclasses, and 10 genes have been identified thus far. Clinical and genetic heterogeneities in USH make a precise diagnosis difficult. A dominant‑like USH family in successive generations was identified, and the present study aimed to determine the genetic predisposition of this family. Whole‑exome sequencing was performed in two affected patients and an unaffected relative. Systematic data were analyzed by bioinformatic analysis to remove the candidate mutations via step‑wise filtering. Direct Sanger sequencing and co‑segregation analysis were performed in the pedigree. One novel and two known mutations in the USH2A gene were identified, and were further confirmed by direct sequencing and co‑segregation analysis. The affected mother carried compound mutations in the USH2A gene, while the unaffected father carried a heterozygous mutation. The present study demonstrates that whole‑exome sequencing is a robust approach for the molecular diagnosis of disorders with high levels of genetic heterogeneity.

  16. Recurrent chimeric RNAs enriched in human prostate cancer identified by deep sequencing

    Science.gov (United States)

    Kannan, Kalpana; Wang, Liguo; Wang, Jianghua; Ittmann, Michael M.; Li, Wei; Yen, Laising

    2011-01-01

    Transcription-induced chimeric RNAs, possessing sequences from different genes, are expected to increase the proteomic diversity through chimeric proteins or altered regulation. Despite their importance, few studies have focused on chimeric RNAs especially regarding their presence/roles in human cancers. By deep sequencing the transcriptome of 20 human prostate cancer and 10 matched benign prostate tissues, we obtained 1.3 billion sequence reads, which led to the identification of 2,369 chimeric RNA candidates. Chimeric RNAs occurred in significantly higher frequency in cancer than in matched benign samples. Experimental investigation of a selected 46 set led to the confirmation of 32 chimeric RNAs, of which 27 were highly recurrent and previously undescribed in prostate cancer. Importantly, a subset of these chimeras was present in prostate cancer cell lines, but not detectable in primary human prostate epithelium cells, implying their associations with cancer. These chimeras contain discernable 5′ and 3′ splice sites at the RNA junction, indicating that their formation is mediated by splicing. Their presence is also largely independent of the expression of parental genes, suggesting that other factors are involved in their production and regulation. One chimera, TMEM79-SMG5, is highly differentially expressed in human cancer samples and therefore a potential biomarker. The prevalence of chimeric RNAs may allow the limited number of human genes to encode a substantially larger number of RNAs and proteins, forming an additional layer of cellular complexity. Together, our results suggest that chimeric RNAs are widespread, and increased chimeric RNA events could represent a unique class of molecular alteration in cancer. PMID:21571633

  17. Application of CHD1 Gene and EE0.6 Sequences to Identify Sexes of Several Protected Bird Species in Taiwan

    Directory of Open Access Journals (Sweden)

    E.-C. Lin

    2011-06-01

    Full Text Available Many bird species, for example: Crested Serpent Eagle (Spilornis cheela hoya, Collared Scops (Owl Otus bakkamoena, Tawny Fish Owl (Ketupa flavipes, Crested Goshawk (Accipiter trivirgatus, and Grass Owl (Tyto longimembris... etc, are monomorphic, which is difficult to identify their sex simply by their outward appearance. Especially for those monomorphic endangered species, finding an effective tool to identify their sex beside outward appearance is needed for further captive breeding programs or other conservation plans. In this study, we collected samples of Black Swan (Cygmus atratus and Nicobar Pigeon (Caloenas nicobarica, two aviaries introduced monomorphic species served as control group, and Crested Serpent Eagle, Collared Scops Owl, Tawny Fish Owl, Crested Goshawk, and Grass Owl, five protected monomorphic species in Taiwan. We used sex-specific primers of avian CHD1 (chromo-helicase-DNA-binding gene and EE0.6 (EcoRI 0.6-kb fragment sequences to identify the sex of these birds. The results showed that CHD1 gene primers could be used to correctly identify the sex of Black Swans, Nicobar Pigeons and Crested Serpent Eagles, but it could not be used to correctly identify sex in Collared Scops Owls, Tawny Fish Owls, and Crested Goshawks. In the sex identification using EE0.6 sequence fragment, A, C, D and E primer sets could be used for sexing Black Swans; A, B, C, and D primer sets could be used for sexing Crested Serpent Eagles; and E primer set could be used for sexing Nicobar Pigeons and the two owl species. Correct determination of sex is the first step if a captive breeding measure is required. We have demonstrated that several of the existing primer sets can be used for sex determination of several captive breeding and indigenous bird species.

  18. TAPDANCE: An automated tool to identify and annotate transposon insertion CISs and associations between CISs from next generation sequence data

    Directory of Open Access Journals (Sweden)

    Sarver Aaron L

    2012-06-01

    Full Text Available Abstract Background Next generation sequencing approaches applied to the analyses of transposon insertion junction fragments generated in high throughput forward genetic screens has created the need for clear informatics and statistical approaches to deal with the massive amount of data currently being generated. Previous approaches utilized to 1 map junction fragments within the genome and 2 identify Common Insertion Sites (CISs within the genome are not practical due to the volume of data generated by current sequencing technologies. Previous approaches applied to this problem also required significant manual annotation. Results We describe Transposon Annotation Poisson Distribution Association Network Connectivity Environment (TAPDANCE software, which automates the identification of CISs within transposon junction fragment insertion data. Starting with barcoded sequence data, the software identifies and trims sequences and maps putative genomic sequence to a reference genome using the bowtie short read mapper. Poisson distribution statistics are then applied to assess and rank genomic regions showing significant enrichment for transposon insertion. Novel methods of counting insertions are used to ensure that the results presented have the expected characteristics of informative CISs. A persistent mySQL database is generated and utilized to keep track of sequences, mappings and common insertion sites. Additionally, associations between phenotypes and CISs are also identified using Fisher’s exact test with multiple testing correction. In a case study using previously published data we show that the TAPDANCE software identifies CISs as previously described, prioritizes them based on p-value, allows holistic visualization of the data within genome browser software and identifies relationships present in the structure of the data. Conclusions The TAPDANCE process is fully automated, performs similarly to previous labor intensive approaches

  19. Molecular forensics in avian conservation: a DNA-based approach for identifying mammalian predators of ground-nesting birds and eggs.

    Science.gov (United States)

    Hopken, Matthew W; Orning, Elizabeth K; Young, Julie K; Piaggio, Antoinette J

    2016-01-07

    The greater sage-grouse (Centrocercus urophasianus) is a ground-nesting bird from the Northern Rocky Mountains and a species at risk of extinction in in multiple U.S. states and Canada. Herein we report results from a proof of concept that mitochondrial and nuclear DNAs from mammalian predator saliva could be non-invasively collected from depredated greater sage-grouse eggshells and carcasses and used for predator species identification. Molecular forensic approaches have been applied to identify predators from depredated remains as one strategy to better understand predator-prey dynamics and guide management strategies. This can aid conservation efforts by correctly identifying predators most likely to impact threatened and endangered species. DNA isolated from non-invasive samples around nesting sites (e.g. fecal or hair samples) is one method that can increase the success and accuracy of predator species identification when compared to relying on nest remains alone. Predator saliva DNA was collected from depredated eggshells and carcasses using swabs. We sequenced two partial fragments of two mitochondrial genes and obtained microsatellite genotypes using canid specific primers for species and individual identification, respectively. Using this multilocus approach we were able to identify predators, at least down to family, from 11 out of 14 nests (79%) and three out of seven carcasses (47%). Predators detected most frequently were canids (86%), while other taxa included rodents, a striped skunk, and cattle. We attempted to match the genotypes of individual coyotes obtained from eggshells and carcasses with those obtained from fecal samples and coyotes collected in the areas, but no genotype matches were found. Predation is a main cause of nest failure in ground-nesting birds and can impact reproduction and recruitment. To inform predator management for ground-nesting bird conservation, accurate identification of predator species is necessary. Considering

  20. Exome sequencing identifies highly recurrent MED12 somatic mutations in breast fibroadenoma.

    Science.gov (United States)

    Lim, Weng Khong; Ong, Choon Kiat; Tan, Jing; Thike, Aye Aye; Ng, Cedric Chuan Young; Rajasegaran, Vikneswari; Myint, Swe Swe; Nagarajan, Sanjanaa; Nasir, Nur Diyana Md; McPherson, John R; Cutcutache, Ioana; Poore, Gregory; Tay, Su Ting; Ooi, Wei Siong; Tan, Veronique Kiak Mien; Hartman, Mikael; Ong, Kong Wee; Tan, Benita K T; Rozen, Steven G; Tan, Puay Hoon; Tan, Patrick; Teh, Bin Tean

    2014-08-01

    Fibroadenomas are the most common breast tumors in women under 30 (refs. 1,2). Exome sequencing of eight fibroadenomas with matching whole-blood samples revealed recurrent somatic mutations solely in MED12, which encodes a Mediator complex subunit. Targeted sequencing of an additional 90 fibroadenomas confirmed highly frequent MED12 exon 2 mutations (58/98, 59%) that are probably somatic, with 71% of mutations occurring in codon 44. Using laser capture microdissection, we show that MED12 fibroadenoma mutations are present in stromal but not epithelial mammary cells. Expression profiling of MED12-mutated and wild-type fibroadenomas revealed that MED12 mutations are associated with dysregulated estrogen signaling and extracellular matrix organization. The fibroadenoma MED12 mutation spectrum is nearly identical to that of previously reported MED12 lesions in uterine leiomyoma but not those of other tumors. Benign tumors of the breast and uterus, both of which are key target tissues of estrogen, may thus share a common genetic basis underpinned by highly frequent and specific MED12 mutations.

  1. Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

    Science.gov (United States)

    Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

    2010-07-01

    We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.

  2. HMMerThread: detecting remote, functional conserved domains in entire genomes by combining relaxed sequence-database searches with fold recognition.

    Directory of Open Access Journals (Sweden)

    Charles Richard Bradshaw

    Full Text Available Conserved domains in proteins are one of the major sources of functional information for experimental design and genome-level annotation. Though search tools for conserved domain databases such as Hidden Markov Models (HMMs are sensitive in detecting conserved domains in proteins when they share sufficient sequence similarity, they tend to miss more divergent family members, as they lack a reliable statistical framework for the detection of low sequence similarity. We have developed a greatly improved HMMerThread algorithm that can detect remotely conserved domains in highly divergent sequences. HMMerThread combines relaxed conserved domain searches with fold recognition to eliminate false positive, sequence-based identifications. With an accuracy of 90%, our software is able to automatically predict highly divergent members of conserved domain families with an associated 3-dimensional structure. We give additional confidence to our predictions by validation across species. We have run HMMerThread searches on eight proteomes including human and present a rich resource of remotely conserved domains, which adds significantly to the functional annotation of entire proteomes. We find ∼4500 cross-species validated, remotely conserved domain predictions in the human proteome alone. As an example, we find a DNA-binding domain in the C-terminal part of the A-kinase anchor protein 10 (AKAP10, a PKA adaptor that has been implicated in cardiac arrhythmias and premature cardiac death, which upon stress likely translocates from mitochondria to the nucleus/nucleolus. Based on our prediction, we propose that with this HLH-domain, AKAP10 is involved in the transcriptional control of stress response. Further remotely conserved domains we discuss are examples from areas such as sporulation, chromosome segregation and signalling during immune response. The HMMerThread algorithm is able to automatically detect the presence of remotely conserved domains in

  3. Recurrent targeted genes of hepatitis B virus in the liver cancer genomes identified by a next-generation sequencing-based approach.

    Directory of Open Access Journals (Sweden)

    Dong Ding

    Full Text Available Integration of the viral DNA into host chromosomes was found in most of the hepatitis B virus (HBV-related hepatocellular carcinomas (HCCs. Here we devised a massive anchored parallel sequencing (MAPS method using next-generation sequencing to isolate and sequence HBV integrants. Applying MAPS to 40 pairs of HBV-related HCC tissues (cancer and adjacent tissues, we identified 296 HBV integration events corresponding to 286 unique integration sites (UISs with precise HBV-Human DNA junctions. HBV integration favored chromosome 17 and preferentially integrated into human transcript units. HBV targeted genes were enriched in GO terms: cAMP metabolic processes, T cell differentiation and activation, TGF beta receptor pathway, ncRNA catabolic process, and dsRNA fragmentation and cellular response to dsRNA. The HBV targeted genes include 7 genes (PTPRJ, CNTN6, IL12B, MYOM1, FNDC3B, LRFN2, FN1 containing IPR003961 (Fibronectin, type III domain, 7 genes (NRG3, MASP2, NELL1, LRP1B, ADAM21, NRXN1, FN1 containing IPR013032 (EGF-like region, conserved site, and three genes (PDE7A, PDE4B, PDE11A containing IPR002073 (3', 5'-cyclic-nucleotide phosphodiesterase. Enriched pathways include hsa04512 (ECM-receptor interaction, hsa04510 (Focal adhesion, and hsa04012 (ErbB signaling pathway. Fewer integration events were found in cancers compared to cancer-adjacent tissues, suggesting a clonal expansion model in HCC development. Finally, we identified 8 genes that were recurrent target genes by HBV integration including fibronectin 1 (FN1 and telomerase reverse transcriptase (TERT1, two known recurrent target genes, and additional novel target genes such as SMAD family member 5 (SMAD5, phosphatase and actin regulator 4 (PHACTR4, and RNA binding protein fox-1 homolog (C. elegans 1 (RBFOX1. Integrating analysis with recently published whole-genome sequencing analysis, we identified 14 additional recurrent HBV target genes, greatly expanding the HBV recurrent target list

  4. Next generation sequencing identifies abnormal Y chromosome and candidate causal variants in premature ovarian failure patients.

    Science.gov (United States)

    Lee, Yujung; Kim, Changshin; Park, YoungJoon; Pyun, Jung-A; Kwack, KyuBum

    2016-12-01

    Premature ovarian failure (POF) is characterized by heterogeneous genetic causes such as chromosomal abnormalities and variants in causal genes. Recently, development of techniques made next generation sequencing (NGS) possible to detect genome wide variants including chromosomal abnormalities. Among 37 Korean POF patients, XY karyotype with distal part deletions of Y chromosome, Yp11.32-31 and Yp12 end part, was observed in two patients through NGS. Six deleterious variants in POF genes were also detected which might explain the pathogenesis of POF with abnormalities in the sex chromosomes. Additionally, the two POF patients had no mutation in SRY but three non-synonymous variants were detected in genes regarding sex reversal. These findings suggest candidate causes of POF and sex reversal and show the propriety of NGS to approach the heterogeneous pathogenesis of POF. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Whole-exome sequencing identifies common and rare variant metabolic QTLs in a Middle Eastern population.

    Science.gov (United States)

    Yousri, Noha A; Fakhro, Khalid A; Robay, Amal; Rodriguez-Flores, Juan L; Mohney, Robert P; Zeriri, Hassina; Odeh, Tala; Kader, Sara Abdul; Aldous, Eman K; Thareja, Gaurav; Kumar, Manish; Al-Shakaki, Alya; Chidiac, Omar M; Mohamoud, Yasmin A; Mezey, Jason G; Malek, Joel A; Crystal, Ronald G; Suhre, Karsten

    2018-01-23

    Metabolomics-genome-wide association studies (mGWAS) have uncovered many metabolic quantitative trait loci (mQTLs) influencing human metabolic individuality, though predominantly in European cohorts. By combining whole-exome sequencing with a high-resolution metabolomics profiling for a highly consanguineous Middle Eastern population, we discover 21 common variant and 12 functional rare variant mQTLs, of which 45% are novel altogether. We fine-map 10 common variant mQTLs to new metabolite ratio associations, and 11 common variant mQTLs to putative protein-altering variants. This is the first work to report common and rare variant mQTLs linked to diseases and/or pharmacological targets in a consanguineous Arab cohort, with wide implications for precision medicine in the Middle East.

  6. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Alexander C Outhred

    Full Text Available Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways.We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants.Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade.Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.

  7. Some AFLP amplicons are highly conserved DNA sequences mapping to the same linkage groups in two F2 populations of carrot

    Directory of Open Access Journals (Sweden)

    Santos Carlos A.F.

    2002-01-01

    Full Text Available Amplified fragment length polymorphism (AFLP is a fast and reliable tool to generate a large number of DNA markers. In two unrelated F2 populations of carrot (Daucus carota L., Brasilia x HCM and B493 x QAL (wild carrot, it was hypothesized that DNA 1 digested with the same restriction endonuclease enzymes and amplified with the same primer combination and 2 sharing the same position in polyacrylamide gels should be conserved sequences. To test this hypothesis AFLP fragments from polyacrylamide gels were eluted, reamplified, separated in agarose gels, purified, cloned and sequenced. Among thirty-one paired fragments from each F2 population, twenty-six had identity greater than 91% and five presented identity of 24% to 44%. Among the twenty-six conserved AFLPs only one mapped to different linkage groups in the two populations while four of the five less-conserved bands mapped to different linkage groups. Of eight SCAR (sequence characterized amplified regions primers tested, one conserved AFLP resulted in co-dominant markers in both populations. Screening among 14 carrot inbreds or cultivars with three AFLP-SCAR primers revealed clear and polymorphic PCR products, with similar molecular sizes on agarose gels. The development of co-dominant markers based on conserved AFLP fragments will be useful to detect seed mixtures among hybrids, to improve and to merge linkage maps and to study diversity and phylogenetic relationships.

  8. Greater than the sum of its parts: single-nucleus sequencing identifies convergent evolution of independent EGFR mutants in GBM.

    Science.gov (United States)

    Gini, Beatrice; Mischel, Paul S

    2014-08-01

    Single-cell sequencing approaches are needed to characterize the genomic diversity of complex tumors, shedding light on their evolutionary paths and potentially suggesting more effective therapies. In this issue of Cancer Discovery, Francis and colleagues develop a novel integrative approach to identify distinct tumor subpopulations based on joint detection of clonal and subclonal events from bulk tumor and single-nucleus whole-genome sequencing, allowing them to infer a subclonal architecture. Surprisingly, the authors identify convergent evolution of multiple, mutually exclusive, independent EGFR gain-of-function variants in a single tumor. This study demonstrates the value of integrative single-cell genomics and highlights the biologic primacy of EGFR as an actionable target in glioblastoma. ©2014 American Association for Cancer Research.

  9. Integrative analysis of functional genomic annotations and sequencing data to identify rare causal variants via hierarchical modeling

    Directory of Open Access Journals (Sweden)

    Marinela eCapanu

    2015-05-01

    Full Text Available Identifying the small number of rare causal variants contributing to disease has beena major focus of investigation in recent years, but represents a formidable statisticalchallenge due to the rare frequencies with which these variants are observed. In thiscommentary we draw attention to a formal statistical framework, namely hierarchicalmodeling, to combine functional genomic annotations with sequencing data with theobjective of enhancing our ability to identify rare causal variants. Using simulations weshow that in all configurations studied, the hierarchical modeling approach has superiordiscriminatory ability compared to a recently proposed aggregate measure of deleteriousness,the Combined Annotation-Dependent Depletion (CADD score, supportingour premise that aggregate functional genomic measures can more accurately identifycausal variants when used in conjunction with sequencing data through a hierarchicalmodeling approach

  10. Use of Whole-Genus Genome Sequence Data To Develop a Multilocus Sequence Typing Tool That Accurately Identifies Yersinia Isolates to the Species and Subspecies Levels

    Science.gov (United States)

    Hall, Miquette; Chattaway, Marie A.; Reuter, Sandra; Savin, Cyril; Strauch, Eckhard; Carniel, Elisabeth; Connor, Thomas; Van Damme, Inge; Rajakaruna, Lakshani; Rajendram, Dunstan; Jenkins, Claire; Thomson, Nicholas R.

    2014-01-01

    The genus Yersinia is a large and diverse bacterial genus consisting of human-pathogenic species, a fish-pathogenic species, and a large number of environmental species. Recently, the phylogenetic and population structure of the entire genus was elucidated through the genome sequence data of 241 strains encompassing every known species in the genus. Here we report the mining of this enormous data set to create a multilocus sequence typing-based scheme that can identify Yersinia strains to the species level to a level of resolution equal to that for whole-genome sequencing. Our assay is designed to be able to accurately subtype the important human-pathogenic species Yersinia enterocolitica to whole-genome resolution levels. We also report the validation of the scheme on 386 strains from reference laboratory collections across Europe. We propose that the scheme is an important molecular typing system to allow accurate and reproducible identification of Yersinia isolates to the species level, a process often inconsistent in nonspecialist laboratories. Additionally, our assay is the most phylogenetically informative typing scheme available for Y. enterocolitica. PMID:25339391

  11. Strong minor groove base conservation in sequence logos implies DNA distortion or base flipping during replication and transcription initiation | Center for Cancer Research

    Science.gov (United States)

    Dubbed "Tom's T" by Dhruba Chattoraj, the unusually conserved thymine at position +7 in bacteriophage P1 plasmid RepA DNA binding sites rises above repressor and acceptor sequence logos. The T appears to represent base flipping prior to helix opening in this DNA replication initation protein.

  12. Activity of Posaconazole and Other Antifungal Agents against Mucorales Strains Identified by Sequencing of Internal Transcribed Spacers▿

    Science.gov (United States)

    Alastruey-Izquierdo, Ana; Castelli, Maria Victoria; Cuesta, Isabel; Monzon, Araceli; Cuenca-Estrella, Manuel; Rodriguez-Tudela, Juan Luis

    2009-01-01

    The antifungal susceptibility profiles of 77 clinical strains of Mucorales species, identified by internal transcribed spacer sequencing, were analyzed. MICs obtained at 24 and 48 h were compared. Amphotericin B was the most active agent against all isolates, except for Cunninghamella and Apophysomyces isolates. Posaconazole also showed good activity for all species but Cunninghamella bertholletiae. Voriconazole had no activity against any of the fungi tested. Terbinafine showed good activity, except for Rhizopus oryzae, Mucor circinelloides, and Rhizomucor variabilis isolates. PMID:19171801

  13. Activity of posaconazole and other antifungal agents against Mucorales strains identified by sequencing of internal transcribed spacers.

    Science.gov (United States)

    Alastruey-Izquierdo, Ana; Castelli, Maria Victoria; Cuesta, Isabel; Monzon, Araceli; Cuenca-Estrella, Manuel; Rodriguez-Tudela, Juan Luis

    2009-04-01

    The antifungal susceptibility profiles of 77 clinical strains of Mucorales species, identified by internal transcribed spacer sequencing, were analyzed. MICs obtained at 24 and 48 h were compared. Amphotericin B was the most active agent against all isolates, except for Cunninghamella and Apophysomyces isolates. Posaconazole also showed good activity for all species but Cunninghamella bertholletiae. Voriconazole had no activity against any of the fungi tested. Terbinafine showed good activity, except for Rhizopus oryzae, Mucor circinelloides, and Rhizomucor variabilis isolates.

  14. Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol

    NARCIS (Netherlands)

    L.A. Lange (Leslie); Y. Hu (Youna); H. Zhang (He); C. Xue (Chenyi); E.M. Schmidt (Ellen); Z.-Z. Tang (Zheng-Zheng); C. Bizon (Chris); E.M. Lange (Ethan); G.D. Smith; E.H. Turner (Emily); Y. Jun (Yang); H.M. Kang (Hyun Min); G.M. Peloso (Gina); P. Auer (Paul); K.-P. Li (Kuo-Ping); J. Flannick (Jason); J. Zhang (Ji); C. Fuchsberger (Christian); K. Gaulton (Kyle); C.M. Lindgren (Cecilia); A. Locke (Adam); A.K. Manning (Alisa); X. Sim (Xueling); M.A. Rivas (Manuel); O.L. Holmen (Oddgeir); R.F. Gottesman (Rebecca); Y. Lu (Yingchang); D. Ruderfer (Douglas); E.A. Stahl (Eli); Q. Duan (Qing); Y. Li (Yun); P. Durda (Peter); S. Jiao (Shuo); A.J. Isaacs (Aaron); A. Hofman (Albert); J.C. Bis (Joshua); D.D. Correa; M.D. Griswold (Michael); M. Jakobsdottir (Margret); G.D. Smith; P.J. Schreiner (Pamela); M.F. Feitosa (Mary Furlan); Q. Zhang (Qunyuan); J.E. Huffman (Jennifer); S. Crosby; C.L. Wassel (Christina); R. Do (Ron); N. Franceschini (Nora); L.W. Martin (Lisa); J.G. Robinson (Jennifer); T.L. Assimes (Themistocles); D.R. Crosslin (David); E.A. Rosenthal (Elisabeth); M.Y. Tsai (Michael); M. Rieder (Mark); D.N. Farlow (Deborah); A.R. Folsom (Aaron); T. Lumley (Thomas); E.R. Fox (Ervin); C.S. Carlson (Christopher); U. Peters (Ulrike); R.D. Jackson (Rebecca); C.M. van Duijn (Cornelia); A.G. Uitterlinden (André); D. Levy (Daniel); J.I. Rotter (Jerome); H.A. Taylor (Herman); V. Gudnason (Vilmundur); D.S. Siscovick (David); M. Fornage (Myriam); I.B. Borecki (Ingrid); C. Hayward (Caroline); I. Rudan (Igor); Y.E. Chen (Y. Eugene); E.P. Bottinger (Erwin); R.J.F. Loos (Ruth); P. Sætrom (Pål); K. Hveem (Kristian); M. Boehnke (Michael); L. Groop (Leif); M.I. McCarthy (Mark); T. Meitinger (Thomas); C. Ballantyne (Christie); S.B. Gabriel (Stacey); C.J. O'Donnell (Christopher); W.S. Post (Wendy S.); K.E. North (Kari); A. Reiner (Alexander); E.A. Boerwinkle (Eric); B.M. Psaty (Bruce); D. Altshuler (David); S. Kathiresan (Sekar); D.Y. Lin (Dan); G.P. Jarvik (Gail); L.A. Cupples (Adrienne); C. Kooperberg (Charles); J.G. Wilson (James); D.A. Nickerson (Deborah); G.R. Abecasis (Gonçalo); S.S. Rich (Stephen); R.P. Tracy (Russell); C.J. Willer (Cristen)

    2014-01-01

    textabstractElevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency

  15. Discovery and molecular characterization of a new luteovirus identified by high-throughput sequencing from apple

    Science.gov (United States)

    ‘Rapid Apple Decline’ (RAD) is a newly emerging problem of young, dwarf apple trees in the northeastern USA. The affected trees show trunk necrosis, bark cracking and canker formation before collapsing in the summer. In this study, a new luteovirus and three common viruses were identified from apple...

  16. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    NARCIS (Netherlands)

    Hu, H; Haas, S.A.; Chelly, J.; Esch, H. Van; Raynaud, M.; Brouwer, A.P. de; Weinert, S.; Froyen, G.; Frints, S.G.; Laumonnier, F.; Zemojtel, T.; Love, M.I.; Richard, H.; Emde, A.K.; Bienek, M.; Jensen, C.; Hambrock, M.; Fischer, U.; Langnick, C.; Feldkamp, M.; Wissink-Lindhout, W.; Lebrun, N.; Castelnau, L.; Rucci, J.; Montjean, R.; Dorseuil, O.; Billuart, P.; Stuhlmann, T.; Shaw, M.; Corbett, M.A.; Gardner, A.; Willis-Owen, S.; Tan, C.; Friend, K.L.; Belet, S.; Roozendaal, K.E. van; Jimenez-Pocquet, M.; Moizard, M.P.; Ronce, N.; Sun, R.; O'Keeffe, S.; Chenna, R.; Bommel, A. van; Goke, J.; Hackett, A.; Field, M.; Christie, L.; Boyle, J.; Haan, E.; Nelson, J.; Turner, G.; Baynam, G.; Gillessen-Kaesbach, G.; Muller, U.; Steinberger, D.; Budny, B.; Badura-Stronka, M.; Latos-Bielenska, A.; Ousager, L.B.; Wieacker, P.; Rodriguez Criado, G.; Bondeson, M.L.; Anneren, G.; Dufke, A.; Cohen, M.; Maldergem, L. Van; Vincent-Delorme, C.; Echenne, B.; Simon-Bouy, B.; Kleefstra, T.; Willemsen, M.H.; Fryns, J.P.; Devriendt, K.; Ullmann, R.; Vingron, M.; Wrogemann, K.; Wienker, T.F.; Tzschach, A.; Bokhoven, H. van; Gecz, J.; Jentsch, T.J.; Chen, W.; Ropers, H.H.; Kalscheuer, V.M.

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or

  17. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    DEFF Research Database (Denmark)

    Hu, H; Haas, S A; Chelly, J

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes...

  18. Exome sequencing identifies early gastric carcinoma as an early stage of advanced gastric cancer.

    Directory of Open Access Journals (Sweden)

    Guhyun Kang

    Full Text Available Gastric carcinoma is one of the major causes of cancer-related mortality worldwide. Early detection and treatment leads to an excellent prognosis in patients with early gastric cancer (EGC, whereas the prognosis of patients with advanced gastric cancer (AGC remains poor. It is unclear whether EGCs and AGCs are distinct entities or whether EGCs are the beginning stages of AGCs. We performed whole exome sequencing of four samples from patients with EGC and compared the results with those from AGCs. In both EGCs and AGCs, a total of 268 genes were commonly mutated and independent mutations were additionally found in EGCs (516 genes and AGCs (3104 genes. A higher frequency of C>G transitions was observed in intestinal-type compared to diffuse-type carcinomas (P = 0.010. The DYRK3, GPR116, MCM10, PCDH17, PCDHB1, RDH5 and UNC5C genes are recurrently mutated in EGCs and may be involved in early carcinogenesis.

  19. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition

    Directory of Open Access Journals (Sweden)

    O'Brien Kimberly

    2008-06-01

    Full Text Available Abstract Background The Solanaceae family contains a number of important crop species including potato (Solanum tuberosum which is grown for its underground storage organ known as a tuber. Albeit the 4th most important food crop in the world, other than a collection of ~220,000 Expressed Sequence Tags, limited genomic sequence information is currently available for potato and advances in potato yield and nutrition content would be greatly assisted through access to a complete genome sequence. While morphologically diverse, Solanaceae species such as potato, tomato, pepper, and eggplant share not only genes but also gene order thereby permitting highly informative comparative genomic analyses. Results In this study, we report on analysis 89.9 Mb of potato genomic sequence representing 10.2% of the genome generated through end sequencing of a potato bacterial artificial chromosome (BAC clone library (87 Mb and sequencing of 22 potato BAC clones (2.9 Mb. The GC content of potato is very similar to Solanum lycopersicon (tomato and other dicotyledonous species yet distinct from the monocotyledonous grass species, Oryza sativa. Parallel analyses of repetitive sequences in potato and tomato revealed substantial differences in their abundance, 34.2% in potato versus 46.3% in tomato, which is consistent with the increased genome size per haploid genome of these two Solanum species. Specific classes and types of repetitive sequences were also differentially represented between these two species including a telomeric-related repetitive sequence, ribosomal DNA, and a number of unclassified repetitive sequences. Comparative analyses between tomato and potato at the gene level revealed a high level of conservation of gene content, genic feature, and gene order although discordances in synteny were observed. Conclusion Genomic level analyses of potato and tomato confirm that gene sequence and gene order are conserved between these solanaceous species and that

  20. Applying Unique Molecular Identifiers in Next Generation Sequencing Reveals a Constrained Viral Quasispecies Evolution under Cross-Reactive Antibody Pressure Targeting Long Alpha Helix of Hemagglutinin

    Science.gov (United States)

    Hauck, Nastasja C.; Kirpach, Josiane; Kiefer, Christina; Farinelle, Sophie; Morris, Stephen A.; Muller, Claude P.; Lu, I-Na

    2018-01-01

    To overcome yearly efforts and costs for the production of seasonal influenza vaccines, new approaches for the induction of broadly protective and long-lasting immune responses have been developed in the past decade. To warrant safety and efficacy of the emerging crossreactive vaccine candidates, it is critical to understand the evolution of influenza viruses in response to these new immune pressures. Here we applied unique molecular identifiers in next generation sequencing to analyze the evolution of influenza quasispecies under in vivo antibody pressure targeting the hemagglutinin (HA) long alpha helix (LAH). Our vaccine targeting LAH of hemagglutinin elicited significant seroconversion and protection against homologous and heterologous influenza virus strains in mice. The vaccine not only significantly reduced lung viral titers, but also induced a well-known bottleneck effect by decreasing virus diversity. In contrast to the classical bottleneck effect, here we showed a significant increase in the frequency of viruses with amino acid sequences identical to that of vaccine targeting LAH domain. No escape mutant emerged after vaccination. These results not only support the potential of a universal influenza vaccine targeting the conserved LAH domains, but also clearly demonstrate that the well-established bottleneck effect on viral quasispecies evolution does not necessarily generate escape mutants. PMID:29587397

  1. Applying Unique Molecular Identifiers in Next Generation Sequencing Reveals a Constrained Viral Quasispecies Evolution under Cross-Reactive Antibody Pressure Targeting Long Alpha Helix of Hemagglutinin

    Directory of Open Access Journals (Sweden)

    Nastasja C. Hauck

    2018-03-01

    Full Text Available To overcome yearly efforts and costs for the production of seasonal influenza vaccines, new approaches for the induction of broadly protective and long-lasting immune responses have been developed in the past decade. To warrant safety and efficacy of the emerging crossreactive vaccine candidates, it is critical to understand the evolution of influenza viruses in response to these new immune pressures. Here we applied unique molecular identifiers in next generation sequencing to analyze the evolution of influenza quasispecies under in vivo antibody pressure targeting the hemagglutinin (HA long alpha helix (LAH. Our vaccine targeting LAH of hemagglutinin elicited significant seroconversion and protection against homologous and heterologous influenza virus strains in mice. The vaccine not only significantly reduced lung viral titers, but also induced a well-known bottleneck effect by decreasing virus diversity. In contrast to the classical bottleneck effect, here we showed a significant increase in the frequency of viruses with amino acid sequences identical to that of vaccine targeting LAH domain. No escape mutant emerged after vaccination. These results not only support the potential of a universal influenza vaccine targeting the conserved LAH domains, but also clearly demonstrate that the well-established bottleneck effect on viral quasispecies evolution does not necessarily generate escape mutants.

  2. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.

  3. Analysis of ecological context for identifying vegetation and animal conservation planning foci: An example from the arid South-western USA

    Science.gov (United States)

    Hamazaki, T.; Thompson, B.C.; Locke, B.A.; Boykin, K.G.

    2003-01-01

    In developing conservation strategies, it is important to maximize effects of conservation within a specified land tract and to maximize conservation effects on surrounding area (ecological context). The authors proposed two criteria to select biotic entities for conservation foci: (1) the relative occurrence of fauna or flora in a tract is greater than that of an ecological context region; and (2) occurrence of the fauna or flora is relatively limited in the ecological context region. Using extensive spatial data on vegetation and wildlife habitat distribution, the authors identified strategic vegetation and fauna conservation foci for the 400 000 ha Fort Bliss military reservation in New Mexico and Texas relative to a 164 km radius ecological context region intersecting seven ecological zones and the predicted habitat distribution of 616 animal species. The authors set two specific criteria: (1) predicted area of a species' occurrence is 5% (Fort Bliss is 4.2% of the region). These criteria selected one vegetation class and 40 animal species. Further, these vegetation and animal foci were primarily located in two areas of Fort Bliss. Sensitivity analyses with other analytical radii corroborated the context radius used. Conservation of the two areas and associated taxa will maximize the contribution of Fort Bliss's conservation efforts in its ecological proximity. This relatively simple but information-rich process represents economical and defensible preliminary contextual analysis for detailed conservation planning.

  4. Deep sequencing identifies ethnicity-specific bacterial signatures in the oral microbiome.

    Directory of Open Access Journals (Sweden)

    Matthew R Mason

    Full Text Available Oral infections have a strong ethnic predilection; suggesting that ethnicity is a critical determinant of oral microbial colonization. Dental plaque and saliva samples from 192 subjects belonging to four major ethnicities in the United States were analyzed using terminal restriction fragment length polymorphism (t-RFLP and 16S pyrosequencing. Ethnicity-specific clustering of microbial communities was apparent in saliva and subgingival biofilms, and a machine-learning classifier was capable of identifying an individual's ethnicity from subgingival microbial signatures. The classifier identified African Americans with a 100% sensitivity and 74% specificity and Caucasians with a 50% sensitivity and 91% specificity. The data demonstrates a significant association between ethnic affiliation and the composition of the oral microbiome; to the extent that these microbial signatures appear to be capable of discriminating between ethnicities.

  5. Exome sequencing identifies rare deleterious mutations in DNA repair genes FANCC and BLM as potential breast cancer susceptibility alleles.

    Directory of Open Access Journals (Sweden)

    Ella R Thompson

    2012-09-01

    Full Text Available Despite intensive efforts using linkage and candidate gene approaches, the genetic etiology for the majority of families with a multi-generational breast cancer predisposition is unknown. In this study, we used whole-exome sequencing of thirty-three individuals from 15 breast cancer families to identify potential predisposing genes. Our analysis identified families with heterozygous, deleterious mutations in the DNA repair genes FANCC and BLM, which are responsible for the autosomal recessive disorders Fanconi Anemia and Bloom syndrome. In total, screening of all exons in these genes in 438 breast cancer families identified three with truncating mutations in FANCC and two with truncating mutations in BLM. Additional screening of FANCC mutation hotspot exons identified one pathogenic mutation among an additional 957 breast cancer families. Importantly, none of the deleterious mutations were identified among 464 healthy controls and are not reported in the 1,000 Genomes data. Given the rarity of Fanconi Anemia and Bloom syndrome disorders among Caucasian populations, the finding of multiple deleterious mutations in these critical DNA repair genes among high-risk breast cancer families is intriguing and suggestive of a predisposing role. Our data demonstrate the utility of intra-family exome-sequencing approaches to uncover cancer predisposition genes, but highlight the major challenge of definitively validating candidates where the incidence of sporadic disease is high, germline mutations are not fully penetrant, and individual predisposition genes may only account for a tiny proportion of breast cancer families.

  6. High-throughput sequencing enhanced phage display identifies peptides that bind mycobacteria

    CSIR Research Space (South Africa)

    Ngubane, NAC

    2013-11-01

    Full Text Available . The displayed peptides are flanked by two cysteine residues, which are oxidized during phage assembly to a disulfide bond, resulting in a loop constrained peptide. We initially used the traditional clone picking method to identify the enriched clones... of the library, 1.236109 heptapeptides, it represented sufficient depth to measure the quantitative enrich- ment of relevant peptides. To confirm successful enrichment during selection, we characterized the reduction in diversity of the pool in the consecutive...

  7. Particular Candida albicans strains in the digestive tract of dyspeptic patients, identified by multilocus sequence typing.

    Directory of Open Access Journals (Sweden)

    Yan-Bing Gong

    Full Text Available BACKGROUND: Candida albicans is a human commensal that is also responsible for chronic gastritis and peptic ulcerous disease. Little is known about the genetic profiles of the C. albicans strains in the digestive tract of dyspeptic patients. The aim of this study was to evaluate the prevalence, diversity, and genetic profiles among C. albicans isolates recovered from natural colonization of the digestive tract in the dyspeptic patients. METHODS AND FINDINGS: Oral swab samples (n = 111 and gastric mucosa samples (n = 102 were obtained from a group of patients who presented dyspeptic symptoms or ulcer complaints. Oral swab samples (n = 162 were also obtained from healthy volunteers. C. albicans isolates were characterized and analyzed by multilocus sequence typing. The prevalence of Candida spp. in the oral samples was not significantly different between the dyspeptic group and the healthy group (36.0%, 40/111 vs. 29.6%, 48/162; P > 0.05. However, there were significant differences between the groups in the distribution of species isolated and the genotypes of the C. albicans isolates. C. albicans was isolated from 97.8% of the Candida-positive subjects in the dyspeptic group, but from only 56.3% in the healthy group (P < 0.001. DST1593 was the dominant C. albicans genotype from the digestive tract of the dyspeptic group (60%, 27/45, but not the healthy group (14.8%, 4/27 (P < 0.001. CONCLUSIONS: Our data suggest a possible link between particular C. albicans strain genotypes and the host microenvironment. Positivity for particular C. albicans genotypes could signify susceptibility to dyspepsia.

  8. Identifying Greater Sage-Grouse source and sink habitats for conservation planning in an energy development landscape.

    Science.gov (United States)

    Kirol, Christopher P; Beck, Jeffrey L; Huzurbazar, Snehalata V; Holloran, Matthew J; Miller, Scott N

    2015-06-01

    Conserving a declining species that is facing many threats, including overlap of its habitats with energy extraction activities, depends upon identifying and prioritizing the value of the habitats that remain. In addition, habitat quality is often compromised when source habitats are lost or fragmented due to anthropogenic development. Our objective was to build an ecological model to classify and map habitat quality in terms of source or sink dynamics for Greater Sage-Grouse (Centrocercus urophasianus) in the Atlantic Rim Project Area (ARPA), a developing coalbed natural gas field in south-central Wyoming, USA. We used occurrence and survival modeling to evaluate relationships between environmental and anthropogenic variables at multiple spatial scales and for all female summer life stages, including nesting, brood-rearing, and non-brooding females. For each life stage, we created resource selection functions (RSFs). We weighted the RSFs and combined them to form a female summer occurrence map. We modeled survival also as a function of spatial variables for nest, brood, and adult female summer survival. Our survival-models were mapped as survival probability functions individually and then combined with fixed vital rates in a fitness metric model that, when mapped, predicted habitat productivity (productivity map). Our results demonstrate a suite of environmental and anthropogenic variables at multiple scales that were predictive of occurrence and survival. We created a source-sink map by overlaying our female summer occurrence map and productivity map to predict habitats contributing to population surpluses (source habitats) or deficits (sink habitat) and low-occurrence habitats on the landscape. The source-sink map predicted that of the Sage-Grouse habitat within the ARPA, 30% was primary source, 29% was secondary source, 4% was primary sink, 6% was secondary sink, and 31% was low occurrence. Our results provide evidence that energy development and avoidance of

  9. Whole-exome sequencing identifies novel MPL and JAK2 mutations in triple-negative myeloproliferative neoplasms.

    Science.gov (United States)

    Milosevic Feenstra, Jelena D; Nivarthi, Harini; Gisslinger, Heinz; Leroy, Emilie; Rumi, Elisa; Chachoua, Ilyas; Bagienski, Klaudia; Kubesova, Blanka; Pietra, Daniela; Gisslinger, Bettina; Milanesi, Chiara; Jäger, Roland; Chen, Doris; Berg, Tiina; Schalling, Martin; Schuster, Michael; Bock, Christoph; Constantinescu, Stefan N; Cazzola, Mario; Kralovics, Robert

    2016-01-21

    Essential thrombocythemia (ET) and primary myelofibrosis (PMF) are chronic diseases characterized by clonal hematopoiesis and hyperproliferation of terminally differentiated myeloid cells. The disease is driven by somatic mutations in exon 9 of CALR or exon 10 of MPL or JAK2-V617F in >90% of the cases, whereas the remaining cases are termed "triple negative." We aimed to identify the disease-causing mutations in the triple-negative cases of ET and PMF by applying whole-exome sequencing (WES) on paired tumor and control samples from 8 patients. We found evidence of clonal hematopoiesis in 5 of 8 studied cases based on clonality analysis and presence of somatic genetic aberrations. WES identified somatic mutations in 3 of 8 cases. We did not detect any novel recurrent somatic mutations. In 3 patients with clonal hematopoiesis analyzed by WES, we identified a somatic MPL-S204P, a germline MPL-V285E mutation, and a germline JAK2-G571S variant. We performed Sanger sequencing of the entire coding region of MPL in 62, and of JAK2 in 49 additional triple-negative cases of ET or PMF. New somatic (T119I, S204F, E230G, Y591D) and 1 germline (R321W) MPL mutation were detected. All of the identified MPL mutations were gain-of-function when analyzed in functional assays. JAK2 variants were identified in 5 of 57 triple-negative cases analyzed by WES and Sanger sequencing combined. We could demonstrate that JAK2-V625F and JAK2-F556V are gain-of-function mutations. Our results suggest that triple-negative cases of ET and PMF do not represent a homogenous disease entity. Cases with polyclonal hematopoiesis might represent hereditary disorders. © 2016 by The American Society of Hematology.

  10. Conserved PCR primer set designing for closely-related species to complete mitochondrial genome sequencing using a sliding window-based PSO algorithm.

    Directory of Open Access Journals (Sweden)

    Cheng-Hong Yang

    Full Text Available BACKGROUND: Complete mitochondrial (mt genome sequencing is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. For long template sequencing, i.e., like the entire mtDNA, it is essential to design primers for Polymerase Chain Reaction (PCR amplicons which are partly overlapping each other. The presented chromosome walking strategy provides the overlapping design to solve the problem for unreliable sequencing data at the 5' end and provides the effective sequencing. However, current algorithms and tools are mostly focused on the primer design for a local region in the genomic sequence. Accordingly, it is still challenging to provide the primer sets for the entire mtDNA. METHODOLOGY/PRINCIPAL FINDINGS: The purpose of this study is to develop an integrated primer design algorithm for entire mt genome in general, and for the common primer sets for closely-related species in particular. We introduce ClustalW to generate the multiple sequence alignment needed to find the conserved sequences in closely-related species. These conserved sequences are suitable for designing the common primers for the entire mtDNA. Using a heuristic algorithm particle swarm optimization (PSO, all the designed primers were computationally validated to fit the common primer design constraints, such as the melting temperature, primer length and GC content, PCR product length, secondary structure, specificity, and terminal limitation. The overlap requirement for PCR amplicons in the entire mtDNA is satisfied by defining the overlapping region with the sliding window technology. Finally, primer sets were designed within the overlapping region. The primer sets for the entire mtDNA sequences were successfully demonstrated in the example of two closely-related fish species. The pseudo code for the primer design algorithm is provided. CONCLUSIONS/SIGNIFICANCE: In conclusion, it can be said that our proposed sliding window-based PSO

  11. Saprolegniaceae identified on amphibian eggs throughout the Pacific Northwest, USA, by internal transcribed spacer sequences and phylogenetic analysis.

    Science.gov (United States)

    Petrisko, Jill E; Pearl, Christopher A; Pilliod, David S; Sheridan, Peter P; Williams, Charles F; Peterson, Charles R; Bury, R Bruce

    2008-01-01

    We assessed the diversity and phylogeny of Saprolegniaceae on amphibian eggs from the Pacific Northwest, with particular focus on Saprolegnia ferax, a species implicated in high egg mortality. We identified isolates from eggs of six amphibians with the internal transcribed spacer (ITS) and 5.8S gene regions and BLAST of the GenBank database. We identified 68 sequences as Saprolegniaceae and 43 sequences as true fungi from at least nine genera. Our phylogenetic analysis of the Saprolegniaceae included isolates within the genera Saprolegnia, Achlya and Leptolegnia. Our phylogeny grouped S. semihypogyna with Achlya rather than with the Saprolegnia reference sequences. We found only one isolate that grouped closely with S. ferax, and this came from a hatchery-raised salmon (Idaho) that we sampled opportunistically. We had representatives of 7-12 species and three genera of Saprolegniaceae on our amphibian eggs. Further work on the ecological roles of different species of Saprolegniaceae is needed to clarify their potential importance in amphibian egg mortality and potential links to population declines.

  12. Exome Sequencing Identified a Recessive RDH12 Mutation in a Family with Severe Early-Onset Retinitis Pigmentosa

    Directory of Open Access Journals (Sweden)

    Bo Gong

    2015-01-01

    Full Text Available Retinitis pigmentosa (RP is the most important hereditary retinal disease caused by progressive degeneration of the photoreceptor cells. This study is to identify gene mutations responsible for autosomal recessive retinitis pigmentosa (arRP in a Chinese family using next-generation sequencing technology. A Chinese family with 7 members including two individuals affected with severe early-onset RP was studied. All patients underwent a complete ophthalmic examination. Exome sequencing was performed on a single RP patient (the proband of this family and direct Sanger sequencing on other family members and normal controls was followed to confirm the causal mutations. A homozygous mutation c.437Tidentified as being related to the phenotype of this arRP family. This homozygous mutation was detected in the two affected patients, but not present in other family members and 600 normal controls. Another three normal members in the family were found to carry this heterozygous missense mutation. Our results emphasize the importance of c.437T

  13. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    Science.gov (United States)

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  14. Small RNA analysis in Petunia hybrida identifies unusual tissue-specific expression patterns of conserved miRNAs and of a 24mer RNA

    Science.gov (United States)

    Tedder, Philip; Zubko, Elena; Westhead, David R.; Meyer, Peter

    2009-01-01

    Two pools of small RNAs were cloned from inflorescences of Petunia hybrida using a 5′-ligation dependent and a 5′-ligation independent approach. The two libraries were integrated into a public website that allows the screening of individual sequences against 359,769 unique clones. The library contains 15 clones with 100% identity and 53 clones with one mismatch to miRNAs described for other plant species. For two conserved miRNAs, miR159 and miR390, we find clear differences in tissue-specific distribution, compared with other species. This shows that evolutionary conservation of miRNA sequences does not necessarily include a conservation of the miRNA expression profile. Almost 60% of all clones in the database are 24-nucleotide clones. In accordance with the role of 24mers in marking repetitive regions, we find them distributed across retroviral and transposable element sequences but other 24mers map to promoter regions and to different transcript regions. For one target region we observe tissue-specific variation of matching 24mers, which demonstrates that, as for 21mers, 24mer concentrations are not necessarily identical in different tissues. Asymmetric distribution of a putative novel miRNA in the two libraries suggests that the cloning method can be selective for the representation of certain small RNAs in a collection. PMID:19369427

  15. A zebrafish screen for craniofacial mutants identifies wdr68 as a highly conserved gene required for endothelin-1 expression

    Directory of Open Access Journals (Sweden)

    Amsterdam Adam

    2006-06-01

    Full Text Available Abstract Background Craniofacial birth defects result from defects in cranial neural crest (NC patterning and morphogenesis. The vertebrate craniofacial skeleton is derived from cranial NC cells and the patterning of these cells occurs within the pharyngeal arches. Substantial efforts have led to the identification of several genes required for craniofacial skeletal development such as the endothelin-1 (edn1 signaling pathway that is required for lower jaw formation. However, many essential genes required for craniofacial development remain to be identified. Results Through screening a collection of insertional zebrafish mutants containing approximately 25% of the genes essential for embryonic development, we present the identification of 15 essential genes that are required for craniofacial development. We identified 3 genes required for hyomandibular development. We also identified zebrafish models for Campomelic Dysplasia and Ehlers-Danlos syndrome. To further demonstrate the utility of this method, we include a characterization of the wdr68 gene. We show that wdr68 acts upstream of the edn1 pathway and is also required for formation of the upper jaw equivalent, the palatoquadrate. We also present evidence that the level of wdr68 activity required for edn1 pathway function differs between the 1st and 2nd arches. Wdr68 interacts with two minibrain-related kinases, Dyrk1a and Dyrk1b, required for embryonic growth and myotube differentiation, respectively. We show that a GFP-Wdr68 fusion protein localizes to the nucleus with Dyrk1a in contrast to an engineered loss of function mutation Wdr68-T284F that no longer accumulated in the cell nucleus and failed to rescue wdr68 mutant animals. Wdr68 homologs appear to exist in all eukaryotic genomes. Notably, we found that the Drosophila wdr68 homolog CG14614 could substitute for the vertebrate wdr68 gene even though insects lack the NC cell lineage. Conclusion This work represents a systematic

  16. Validation of rearrangement break points identified by paired-end sequencing in natural populations of Drosophila melanogaster.

    Science.gov (United States)

    Cridland, Julie M; Thornton, Kevin R

    2010-01-13

    Several recent studies have focused on the evolution of recently duplicated genes in Drosophila. Currently, however, little is known about the evolutionary forces acting upon duplications that are segregating in natural populations. We used a high-throughput, paired-end sequencing platform (Illumina) to identify structural variants in a population sample of African D. melanogaster. Polymerase chain reaction and sequencing confirmation of duplications detected by multiple, independent paired-ends showed that paired-end sequencing reliably uncovered the break points of structural rearrangements and allowed us to identify a number of tandem duplications segregating within a natural population. Our confirmation experiments show that rates of confirmation are very high, even at modest coverage. Our results also compare well with previous studies using microarrays (Emerson J, Cardoso-Moreira M, Borevitz JO, Long M. 2008. Natural selection shapes genome wide patterns of copy-number polymorphism in Drosophila melanogaster. Science. 320:1629-1631. and Dopman EB, Hartl DL. 2007. A portrait of copy-number polymorphism in Drosophila melanogaster. Proc Natl Acad Sci U S A. 104:19920-19925.), which both gives us confidence in the results of this study as well as confirms previous microarray results.We were also able to identify whole-gene duplications, such as a novel duplication of Or22a, an olfactory receptor, and identify copy-number differences in genes previously known to be under positive selection, like Cyp6g1, which confers resistance to dichlorodiphenyltrichloroethane. Several "hot spots" of duplications were detected in this study, which indicate that particular regions of the genome may be more prone to generating duplications. Finally, population frequency analysis of confirmed events also showed an excess of rare variants in our population, which indicates that duplications segregating in the population may be deleterious and ultimately destined to be lost from the

  17. Genome-wide linkage, exome sequencing and functional analyses identify ABCB6 as the pathogenic gene of dyschromatosis universalis hereditaria.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    Full Text Available As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH had remained unclear until recently when ABCB6 was reported as a causative gene of DUH.We performed genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and Immunohistochemistry was performed to verify the expression of the pathogenic gene, Zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation.Genome-wide linkage (assuming autosomal dominant inheritance mode and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genome Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them.Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma.

  18. Identifying orchid hotspots for biodiversity conservation in Laos: the limestone karst vegetation of Vang Vieng District, Vientiane Province

    Directory of Open Access Journals (Sweden)

    Pankaj Kumar

    2016-10-01

    Full Text Available A project to study the phytodiversity of the Indo-Burma Biodiversity Hotspot (IBBH was initiated by Kadoorie Farm and Botanic Garden, Hong Kong, in 2011, with the aim of surveying primary forest fragments and identifying conservation priorities within this expansive but highly threatened ecoregion. Vang Vieng District of Vientiane Province, northern Laos, was chosen as a focus for a pilot expedition, since it features an extensive karst landscape that has barely been explored. Together with officials from the Ministry of Science and Technology, Government of Lao PDR, surveys of three sites were conducted in April 2012, at the end of the dry northeast monsoon season. Emphasis was placed on Orchidaceae because it is among the most species-rich and commercially exploited flowering plant families in the region. A total of 179 specimens were collected, of which approximately 135 were unique taxa accounting for 29.6% of the orchids found in Laos and 5.8% of those found in IBBH as a whole, and equivalent to 0.27 species/hectare within the area surveyed, substantially higher than published figures for other limestone areas in the region, such as Cuc Phuong National Park in Vietnam (0.0055 species/hectare and Perlis State in Peninsular Malaysia (0.0036 species/hectare.  At least one is a species new to science, nine represent new distributional records for Laos and a further nine are new records for Vientiane Province. A list of the species encountered during the study is presented and the significance of the findings is discussed. Major threats to the natural environment in northern Laos are highlighted.

  19. The Runt domain of AML1 (RUNX1) binds a sequence-conserved RNA motif that mimics a DNA element.

    Science.gov (United States)

    Fukunaga, Junichi; Nomura, Yusuke; Tanaka, Yoichiro; Amano, Ryo; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Sakamoto, Taiichi; Kozu, Tomoko

    2013-07-01

    AML1 (RUNX1) is a key transcription factor for hematopoiesis that binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. Aberrations in the AML1 gene are frequently found in human leukemia. To better understand AML1 and its potential utility for diagnosis and therapy, we obtained RNA aptamers that bind specifically to the AML1 Runt domain. Enzymatic probing and NMR analyses revealed that Apt1-S, which is a truncated variant of one of the aptamers, has a CACG tetraloop and two stem regions separated by an internal loop. All the isolated aptamers were found to contain the conserved sequence motif 5'-NNCCAC-3' and 5'-GCGMGN'N'-3' (M:A or C; N and N' form Watson-Crick base pairs). The motif contains one AC mismatch and one base bulged out. Mutational analysis of Apt1-S showed that three guanines of the motif are important for Runt binding as are the three guanines of RDE, which are directly recognized by three arginine residues of the Runt domain. Mutational analyses of the Runt domain revealed that the amino acid residues used for Apt1-S binding were similar to those used for RDE binding. Furthermore, the aptamer competed with RDE for binding to the Runt domain in vitro. These results demonstrated that the Runt domain of the AML1 protein binds to the motif of the aptamer that mimics DNA. Our findings should provide new insights into RNA function and utility in both basic and applied sciences.

  20. Use of genotyping by sequencing data to develop a high-throughput and multifunctional SNP panel for conservation applications in Pacific lamprey.

    Science.gov (United States)

    Hess, Jon E; Campbell, Nathan R; Docker, Margaret F; Baker, Cyndi; Jackson, Aaron; Lampman, Ralph; McIlraith, Brian; Moser, Mary L; Statler, David P; Young, William P; Wildbill, Andrew J; Narum, Shawn R

    2015-01-01

    Next-generation sequencing data can be mined for highly informative single nucleotide polymorphisms (SNPs) to develop high-throughput genomic assays for nonmodel organisms. However, choosing a set of SNPs to address a variety of objectives can be difficult because SNPs are often not equally informative. We developed an optimal combination of 96 high-throughput SNP assays from a total of 4439 SNPs identified in a previous study of Pacific lamprey (Entosphenus tridentatus) and used them to address four disparate objectives: parentage analysis, species identification and characterization of neutral and adaptive variation. Nine of these SNPs are FST outliers, and five of these outliers are localized within genes and significantly associated with geography, run-timing and dwarf life history. Two of the 96 SNPs were diagnostic for two other lamprey species that were morphologically indistinguishable at early larval stages and were sympatric in the Pacific Northwest. The majority (85) of SNPs in the panel were highly informative for parentage analysis, that is, putatively neutral with high minor allele frequency across the species' range. Results from three case studies are presented to demonstrate the broad utility of this panel of SNP markers in this species. As Pacific lamprey populations are undergoing rapid decline, these SNPs provide an important resource to address critical uncertainties associated with the conservation and recovery of this imperiled species. © 2014 John Wiley & Sons Ltd.

  1. Human T-cell recognition of synthetic peptides representing conserved and variant sequences from the merozoite surface protein 2 of Plasmodium falciparum

    DEFF Research Database (Denmark)

    Theander, T G; Hviid, L; Dodoo, D

    1997-01-01

    Merozoite surface protein 2 (MSP2) is a malaria vaccine candidate currently undergoing clinical trials. We analyzed the peripheral blood mononuclear cell (PBMC) response to synthetic peptides corresponding to conserved and variant regions of the FCQ-27 allelic form of MSP2 in Ghanaian individuals....... The findings are encouraging for the development of a vaccine based on these T-epitope containing regions of MSP2, as the peptides were broadly recognized suggesting that they can bind to diverse HLA alleles and also because they include conserved MSP2 sequences. Immunisation with a vaccine construct...

  2. Germ-line variants identified by next generation sequencing in a panel of estrogen and cancer associated genes correlate with poor clinical outcome in Lynch syndrome patients.

    Science.gov (United States)

    Jóri, Balazs; Kamps, Rick; Xanthoulea, Sofia; Delvoux, Bert; Blok, Marinus J; Van de Vijver, Koen K; de Koning, Bart; Oei, Felicia Trups; Tops, Carli M; Speel, Ernst Jm; Kruitwagen, Roy F; Gomez-Garcia, Encarna B; Romano, Andrea

    2015-12-01

    The risk to develop colorectal and endometrial cancers among subjects testing positive for a pathogenic Lynch syndrome mutation varies, making the risk prediction difficult. Genetic risk modifiers alter the risk conferred by inherited Lynch syndrome mutations, and their identification can improve genetic counseling. We aimed at identifying rare genetic modifiers of the risk of Lynch syndrome endometrial cancer. A family based approach was used to assess the presence of genetic risk modifiers among 35 Lynch syndrome mutation carriers having either a poor clinical phenotype (early age of endometrial cancer diagnosis or multiple cancers) or a neutral clinical phenotype. Putative genetic risk modifiers were identified by Next Generation Sequencing among a panel of 154 genes involved in endometrial physiology and carcinogenesis. A simple pipeline, based on an allele frequency lower than 0.001 and on predicted non-conservative amino-acid substitutions returned 54 variants that were considered putative risk modifiers. The presence of two or more risk modifying variants in women carrying a pathogenic Lynch syndrome mutation was associated with a poor clinical phenotype. A gene-panel is proposed that comprehends genes that can carry variants with putative modifying effects on the risk of Lynch syndrome endometrial cancer. Validation in further studies is warranted before considering the possible use of this tool in genetic counseling.

  3. Cytoplasmic male sterility-associated chimeric open reading frames identified by mitochondrial genome sequencing of four Cajanus genotypes.

    Science.gov (United States)

    Tuteja, Reetu; Saxena, Rachit K; Davila, Jaime; Shah, Trushar; Chen, Wenbin; Xiao, Yong-Li; Fan, Guangyi; Saxena, K B; Alverson, Andrew J; Spillane, Charles; Town, Christopher; Varshney, Rajeev K

    2013-10-01

    The hybrid pigeonpea (Cajanus cajan) breeding technology based on cytoplasmic male sterility (CMS) is currently unique among legumes and displays major potential for yield increase. CMS is defined as a condition in which a plant is unable to produce functional pollen grains. The novel chimeric open reading frames (ORFs) produced as a results of mitochondrial genome rearrangements are considered to be the main cause of CMS. To identify these CMS-related ORFs in pigeonpea, we sequenced the mitochondrial genomes of three C. cajan lines (the male-sterile line ICPA 2039, the maintainer line ICPB 2039, and the hybrid line ICPH 2433) and of the wild relative (Cajanus cajanifolius ICPW 29). A single, circular-mapping molecule of length 545.7 kb was assembled and annotated for the ICPA 2039 line. Sequence annotation predicted 51 genes, including 34 protein-coding and 17 RNA genes. Comparison of the mitochondrial genomes from different Cajanus genotypes identified 31 ORFs, which differ between lines within which CMS is present or absent. Among these chimeric ORFs, 13 were identified by comparison of the related male-sterile and maintainer lines. These ORFs display features that are known to trigger CMS in other plant species and to represent the most promising candidates for CMS-related mitochondrial rearrangements in pigeonpea.

  4. Whole exome sequencing of a consanguineous family identifies the possible modifying effect of a globally rare AK5 allelic variant in celiac disease development among Saudi patients.

    Directory of Open Access Journals (Sweden)

    Jumana Yousuf Al-Aama

    Full Text Available Celiac disease (CD, a multi-factorial auto-inflammatory disease of the small intestine, is known to occur in both sporadic and familial forms. Together HLA and Non-HLA genes can explain up to 50% of CD's heritability. In order to discover the missing heritability due to rare variants, we have exome sequenced a consanguineous Saudi family presenting CD in an autosomal recessive (AR pattern. We have identified a rare homozygous insertion c.1683_1684insATT, in the conserved coding region of AK5 gene that showed classical AR model segregation in this family. Sequence validation of 200 chromosomes each of sporadic CD cases and controls, revealed that this extremely rare (EXac MAF 0.000008 mutation is highly penetrant among general Saudi populations (MAF is 0.62. Genotype and allelic distribution analysis have indicated that this AK5 (c.1683_1684insATT mutation is negatively selected among patient groups and positively selected in the control group, in whom it may modify the risk against CD development [p<0.002]. Our observation gains additional support from computational analysis which predicted that Iso561 insertion shifts the existing H-bonds between 400th and 556th amino acid residues lying near the functional domain of adenylate kinase. This shuffling of amino acids and their H-bond interactions is likely to disturb the secondary structure orientation of the polypeptide and induces the gain-of-function in nucleoside phosphate kinase activity of AK5, which may eventually down-regulates the reactivity potential of CD4+ T-cells against gluten antigens. Our study underlines the need to have population-specific genome databases to avoid false leads and to identify true candidate causal genes for the familial form of celiac disease.

  5. Targeted/exome sequencing identified mutations in ten Chinese patients diagnosed with Noonan syndrome and related disorders

    Directory of Open Access Journals (Sweden)

    Shanshan Xu

    2017-10-01

    Full Text Available Abstract Background Noonan syndrome (NS and Noonan syndrome with multiple lentigines (NSML are autosomal dominant developmental disorders. NS and NSML are caused by abnormalities in genes that encode proteins related to the RAS-MAPK pathway, including PTPN11, RAF1, BRAF, and MAP2K. In this study, we diagnosed ten NS or NSML patients via targeted sequencing or whole exome sequencing (TS/WES. Methods TS/WES was performed to identify mutations in ten Chinese patients who exhibited the following manifestations: potential facial dysmorphisms, short stature, congenital heart defects, and developmental delay. Sanger sequencing was used to confirm the suspected pathological variants in the patients and their family members. Results TS/WES revealed three mutations in the PTPN11 gene, three mutations in RAF1 gene, and four mutations in BRAF gene in the NS and NSML patients who were previously diagnosed based on the abovementioned clinical features. All the identified mutations were determined to be de novo mutations. However, two patients who carried the same mutation in the RAF1 gene presented different clinical features. One patient with multiple lentigines was diagnosed with NSML, while the other patient without lentigines was diagnosed with NS. In addition, a patient who carried a hotspot mutation in the BRAF gene was diagnosed with NS instead of cardiofaciocutaneous syndrome (CFCS. Conclusions TS/WES has emerged as a useful tool for definitive diagnosis and accurate genetic counseling of atypical cases. In this study, we analyzed ten Chinese patients diagnosed with NS and related disorders and identified their correspondingPTPN11, RAF1, and BRAF mutations. Among the target genes, BRAF showed the same degree of correlation with NS incidence as that of PTPN11 or RAF1.

  6. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    Science.gov (United States)

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P 0.05) with the advantage of being able to use less structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.

  7. T cell receptor sequencing of early-stage breast cancer tumors identifies altered clonal structure of the T cell repertoire.

    Science.gov (United States)

    Beausang, John F; Wheeler, Amanda J; Chan, Natalie H; Hanft, Violet R; Dirbas, Frederick M; Jeffrey, Stefanie S; Quake, Stephen R

    2017-11-28

    Tumor-infiltrating T cells play an important role in many cancers, and can improve prognosis and yield therapeutic targets. We characterized T cells infiltrating both breast cancer tumors and the surrounding normal breast tissue to identify T cells specific to each, as well as their abundance in peripheral blood. Using immune profiling of the T cell beta-chain repertoire in 16 patients with early-stage breast cancer, we show that the clonal structure of the tumor is significantly different from adjacent breast tissue, with the tumor containing ∼2.5-fold greater density of T cells and higher clonality compared with normal breast. The clonal structure of T cells in blood and normal breast is more similar than between blood and tumor, and could be used to distinguish tumor from normal breast tissue in 14 of 16 patients. Many T cell sequences overlap between tissue and blood from the same patient, including ∼50% of T cells between tumor and normal breast. Both tumor and normal breast contain high-abundance "enriched" sequences that are absent or of low abundance in the other tissue. Many of these T cells are either not detected or detected with very low frequency in the blood, suggesting the existence of separate compartments of T cells in both tumor and normal breast. Enriched T cell sequences are typically unique to each patient, but a subset is shared between many different patients. We show that many of these are commonly generated sequences, and thus unlikely to play an important role in the tumor microenvironment. Copyright © 2017 the Author(s). Published by PNAS.

  8. Identification and characterization of putative conserved IAM ...

    African Journals Online (AJOL)

    Available putative AMI sequences from a wide array of monocot and dicot plants were identified and the phylogenetic tree was constructed and analyzed. We identified in this tree, a clade that contained sequences from species across the plant kingdom suggesting that AMI is conserved and may have a primary role in plant ...

  9. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    DEFF Research Database (Denmark)

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans

    to map risk variants involved in complex traits. We aim at utilizing samples of cases and controls of the isolated population of the Faroe Islands to conduct whole-genome-sequence analysis in order to identify rare genetic variants associated with schizophrenia. We will search for rare genetic variants...... of developing SZ. However, these studies are designed to examining only “the common variant” proportion of the genomic landscape of SZ. Due to increased genetic drift during founding and potential bottlenecks, followed by population expansion, isolated populations may be particularly useful in identifying rare...... disease variants, that may appear at higher frequencies and/or within a more clearly distinct haplotype structure compared to outbred populations. Small isolated populations also typically show reduced phenotypic, genetic and environmental heterogeneity, thus making them advantageous in studies aiming...

  10. Deletion of a conserved regulatory element required for Hmx1 expression in craniofacial mesenchyme in the dumbo rat: a newly identified cause of congenital ear malformation

    Directory of Open Access Journals (Sweden)

    Lely A. Quina

    2012-11-01

    Hmx1 is a homeodomain transcription factor expressed in the developing eye, peripheral ganglia, and branchial arches of avian and mammalian embryos. Recent studies have identified a loss-of-function allele at the HMX1 locus as the causative mutation in the oculo-auricular syndrome (OAS in humans, characterized by ear and eye malformations. The mouse dumbo (dmbo mutation, with similar effects on ear and eye development, also results from a loss-of-function mutation in the Hmx1 gene. A recessive dmbo mutation causing ear malformation in rats has been mapped to the chromosomal region containing the Hmx1 gene, but the nature of the causative allele is unknown. Here we show that dumbo rats and mice exhibit similar neonatal ear and eye phenotypes. In midgestation embryos, dumbo rats show a specific loss of Hmx1 expression in neural-crest-derived craniofacial mesenchyme (CM, whereas Hmx1 is expressed normally in retinal progenitors, sensory ganglia and in CM, which is derived from mesoderm. High-throughput resequencing of 1 Mb of rat chromosome 14 from dmbo/dmbo rats, encompassing the Hmx1 locus, reveals numerous divergences from the rat genomic reference sequence, but no coding changes in Hmx1. Fine genetic mapping narrows the dmbo critical region to an interval of ∼410 kb immediately downstream of the Hmx1 transcription unit. Further sequence analysis of this region reveals a 5777-bp deletion located ∼80 kb downstream in dmbo/dmbo rats that is not apparent in 137 other rat strains. The dmbo deletion region contains a highly conserved domain of ∼500 bp, which is a candidate distal enhancer and which exhibits a similar relationship to Hmx genes in all vertebrate species for which data are available. We conclude that the rat dumbo phenotype is likely to result from loss of function of an ultraconserved enhancer specifically regulating Hmx1 expression in neural-crest-derived CM. Dysregulation of Hmx1 expression is thus a candidate mechanism for congenital ear

  11. Exome sequencing identifies compound heterozygous mutations in CYP4V2 in a pedigree with retinitis pigmentosa.

    Directory of Open Access Journals (Sweden)

    Yun Wang

    Full Text Available Retinitis pigmentosa (RP is a heterogeneous group of progressive retinal degenerations characterized by pigmentation and atrophy in the mid-periphery of the retina. Twenty two subjects from a four-generation Chinese family with RP and thin cornea, congenital cataract and high myopia is reported in this study. All family members underwent complete ophthalmologic examinations. Patients of the family presented with bone spicule-shaped pigment deposits in retina, retinal vascular attenuation, retinal and choroidal dystrophy, as well as punctate opacity of the lens, reduced cornea thickness and high myopia. Peripheral venous blood was obtained from all patients and their family members for genetic analysis. After mutation analysis in a few known RP candidate genes, exome sequencing was used to analyze the exomes of 3 patients III2, III4, III6 and the unaffected mother II2. A total of 34,693 variations shared by 3 patients were subjected to several filtering steps against existing variation databases. Identified variations were verified in the rest family members by PCR and Sanger sequencing. Compound heterozygous c.802-8_810del17insGC and c.1091-2A>G mutations of the CYP4V2 gene, known as genetic defects for Bietti crystalline corneoretinal dystrophy, were identified as causative mutations for RP of this family.

  12. Cross-comparison of the genome sequences from human, chimpanzee, Neanderthal and a Denisovan hominin identifies novel potentially compensated mutations

    Directory of Open Access Journals (Sweden)

    Zhang Guojie

    2011-07-01

    Full Text Available Abstract The recent publication of the draft genome sequences of the Neanderthal and a ~50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.

  13. Targeted next generation sequencing identifies functionally deleterious germline mutations in novel genes in early-onset/familial prostate cancer.

    Directory of Open Access Journals (Sweden)

    Paula Paulo

    2018-04-01

    Full Text Available Considering that mutations in known prostate cancer (PrCa predisposition genes, including those responsible for hereditary breast/ovarian cancer and Lynch syndromes, explain less than 5% of early-onset/familial PrCa, we have sequenced 94 genes associated with cancer predisposition using next generation sequencing (NGS in a series of 121 PrCa patients. We found monoallelic truncating/functionally deleterious mutations in seven genes, including ATM and CHEK2, which have previously been associated with PrCa predisposition, and five new candidate PrCa associated genes involved in cancer predisposing recessive disorders, namely RAD51C, FANCD2, FANCI, CEP57 and RECQL4. Furthermore, using in silico pathogenicity prediction of missense variants among 18 genes associated with breast/ovarian cancer and/or Lynch syndrome, followed by KASP genotyping in 710 healthy controls, we identified "likely pathogenic" missense variants in ATM, BRIP1, CHEK2 and TP53. In conclusion, this study has identified putative PrCa predisposing germline mutations in 14.9% of early-onset/familial PrCa patients. Further data will be necessary to confirm the genetic heterogeneity of inherited PrCa predisposition hinted in this study.

  14. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Bernd Timmermann

    Full Text Available BACKGROUND: Colorectal cancer (CRC is with approximately 1 million cases the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope to improve early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large amount of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent not affected normal colonic tissue from microsatellite stable (MSS and microsatellite instable (MSI colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. According to predictions based on MSS and MSI pathomechanisms we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the rate of most significant mutations to 359 for MSI and 45 for MSS CRCs with predicted altered protein functions. In both CRCs, MSI and MSS, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A, BMPR1A, a gene where so far germline mutations are associated with juvenile polyposis syndrome, and show that the mutations functionally impair the protein function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.

  15. De novo sequencing of circulating miRNAs identifies novel markers predicting clinical outcome of locally advanced breast cancer

    Directory of Open Access Journals (Sweden)

    Wu Xiwei

    2012-03-01

    Full Text Available Abstract Background MicroRNAs (miRNAs have been recently detected in the circulation of cancer patients, where they are associated with clinical parameters. Discovery profiling of circulating small RNAs has not been reported in breast cancer (BC, and was carried out in this study to identify blood-based small RNA markers of BC clinical outcome. Methods The pre-treatment sera of 42 stage II-III locally advanced and inflammatory BC patients who received neoadjuvant chemotherapy (NCT followed by surgical tumor resection were analyzed for marker identification by deep sequencing all circulating small RNAs. An independent validation cohort of 26 stage II-III BC patients was used to assess the power of identified miRNA markers. Results More than 800 miRNA species were detected in the circulation, and observed patterns showed association with histopathological profiles of BC. Groups of circulating miRNAs differentially associated with ER/PR/HER2 status and inflammatory BC were identified. The relative levels of selected miRNAs measured by PCR showed consistency with their abundance determined by deep sequencing. Two circulating miRNAs, miR-375 and miR-122, exhibited strong correlations with clinical outcomes, including NCT response and relapse with metastatic disease. In the validation cohort, higher levels of circulating miR-122 specifically predicted metastatic recurrence in stage II-III BC patients. Conclusions Our study indicates that certain miRNAs can serve as potential blood-based biomarkers for NCT response, and that miR-122 prevalence in the circulation predicts BC metastasis in early-stage patients. These results may allow optimized chemotherapy treatments and preventive anti-metastasis interventions in future clinical applications.

  16. Pooled Enrichment Sequencing Identifies Diversity and Evolutionary Pressures at NLR Resistance Genes within a Wild Tomato Population.

    Science.gov (United States)

    Stam, Remco; Scheikl, Daniela; Tellier, Aurélien

    2016-06-02

    Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeats containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling ten individuals, to specifically sequence NLR genes in a resource and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are accurately verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Pooled Enrichment Sequencing Identifies Diversity and Evolutionary Pressures at NLR Resistance Genes within a Wild Tomato Population

    Science.gov (United States)

    Stam, Remco; Scheikl, Daniela; Tellier, Aurélien

    2016-01-01

    Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeats containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling ten individuals, to specifically sequence NLR genes in a resource and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are accurately verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat. PMID:27189991

  18. The First Endogenous Herpesvirus, Identified in the Tarsier Genome, and Novel Sequences from Primate Rhadinoviruses and Lymphocryptoviruses

    Science.gov (United States)

    Aswad, Amr; Katzourakis, Aris

    2014-01-01

    Herpesviridae is a diverse family of large and complex pathogens whose genomes are extremely difficult to sequence. This is particularly true for clinical samples, and if the virus, host, or both genomes are being sequenced for the first time. Although herpesviruses are known to occasionally integrate in host genomes, and can also be inherited in a Mendelian fashion, they are notably absent from the genomic fossil record comprised of endogenous viral elements (EVEs). Here, we combine paleovirological and metagenomic approaches to both explore the constituent viral diversity of mammalian genomes and search for endogenous herpesviruses. We describe the first endogenous herpesvirus from the genome of the Philippine tarsier, belonging to the Roseolovirus genus, and characterize its highly defective genome that is integrated and flanked by unambiguous host DNA. From a draft assembly of the aye-aye genome, we use bioinformatic tools to reveal over 100,000 bp of a novel rhadinovirus that is the first lemur gammaherpesvirus, closely related to Kaposi's sarcoma-associated virus. We also identify 58 genes of Pan paniscus lymphocryptovirus 1, the bonobo equivalent of human Epstein-Barr virus. For each of the viruses, we postulate gene function via comparative analysis to known viral relatives. Most notably, the evidence from gene content and phylogenetics suggests that the aye-aye sequences represent the most basal known rhadinovirus, and indicates that tumorigenic herpesviruses have been infecting primates since their emergence in the late Cretaceous. Overall, these data show that a genomic fossil record of herpesviruses exists despite their extremely large genomes, and expands the known diversity of Herpesviridae, which will aid the characterization of pathogenesis. Our analytical approach illustrates the benefit of intersecting evolutionary approaches with metagenomics, genetics and paleovirology. PMID:24945689

  19. Genotyping-by-sequencing in an orphan plant species Physocarpus opulifolius helps identify the evolutionary origins of the genus Prunus.

    Science.gov (United States)

    Buti, Matteo; Sargent, Daniel J; Mhelembe, Khethani G; Delfino, Pietro; Tobutt, Kenneth R; Velasco, Riccardo

    2016-05-11

    The Rosaceae family encompasses numerous genera exhibiting morphological diversification in fruit types and plant habit as well as a wide variety of chromosome numbers. Comparative genomics between various Rosaceous genera has led to the hypothesis that the ancestral genome of the family contained nine chromosomes, however, the synteny studies performed in the Rosaceae to date encompass species with base chromosome numbers x = 7 (Fragaria), x = 8 (Prunus), and x = 17 (Malus), and no study has included species from one of the many Rosaceous genera containing a base chromosome number of x = 9. A genetic linkage map of the species Physocarpus opulifolius (x = 9) was populated with sequence characterised SNP markers using genotyping by sequencing. This allowed for the first time, the extent of the genome diversification of a Rosaceous genus with a base chromosome number of x = 9 to be performed. Orthologous loci distributed throughout the nine chromosomes of Physocarpus and the eight chromosomes of Prunus were identified which permitted a meaningful comparison of the genomes of these two genera to be made. The study revealed a high level of macro-synteny between the two genomes, and relatively few chromosomal rearrangements, as has been observed in studies of other Rosaceous genomes, lending further support for a relatively simple model of genomic evolution in Rosaceae.

  20. Flavonoid Biosynthesis Genes Putatively Identified in the Aromatic Plant Polygonum minus via Expressed Sequences Tag (EST Analysis

    Directory of Open Access Journals (Sweden)

    Zamri Zainal

    2012-02-01

    Full Text Available P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients such as flavonoid. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Thus, large‑scale sequencing of P. minus cDNA libraries identified 4196 expressed sequences tags (ESTs which were deposited in dbEST in the National Center of Biotechnology Information (NCBI. From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs chalcone synthase, CHS (JG745304, flavonol synthase, FLS (JG705819 and leucoanthocyanidin dioxygenase, LDOX (JG745247 were selected for further examination by quantitative RT-PCR (qRT-PCR in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.

  1. De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube.

    Science.gov (United States)

    Iaria, Domenico; Chiappetta, Adriana; Muzzalupo, Innocenzo

    2016-01-01

    In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear and the molecular basis underlying this process are still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing its own pistil and in combination pollen/pistil from self-sterile and self-fertile cultivars, have a distinct gene expression profile and many of the differentially expressed sequences between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members in actin and actin depolymerization factor and fibrin gene family and member of the Ca(2+) binding gene family related to the development and polarization of pollen apical tip. The whole transcriptomic analysis, through the identification of the differentially expressed transcripts set and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.

  2. Identification of novel and conserved microRNAs related to drought stress in potato by deep sequencing.

    Science.gov (United States)

    Zhang, Ning; Yang, Jiangwei; Wang, Zemin; Wen, Yikai; Wang, Jie; He, Wenhui; Liu, Bailin; Si, Huaijun; Wang, Di

    2014-01-01

    MicroRNAs (miRNAs) are a group of small, non-coding RNAs that play important roles in plant growth, development and stress response. There have been an increasing number of investigations aimed at discovering miRNAs and analyzing their functions in model plants (such as Arabidopsis thaliana and rice). In this research, we constructed small RNA libraries from both polyethylene glycol (PEG 6,000) treated and control potato samples, and a large number of known and novel miRNAs were identified. Differential expression analysis showed that 100 of the known miRNAs were down-regulated and 99 were up-regulated as a result of PEG stress, while 119 of the novel miRNAs were up-regulated and 151 were down-regulated. Based on target prediction, annotation and expression analysis of the miRNAs and their putative target genes, 4 miRNAs were identified as regulating drought-related genes (miR811, miR814, miR835, miR4398). Their target genes were MYB transcription factor (CV431094), hydroxyproline-rich glycoprotein (TC225721), quaporin (TC223412) and WRKY transcription factor (TC199112), respectively. Relative expression trends of those miRNAs were the same as that predicted by Solexa sequencing and they showed a negative correlation with the expression of the target genes. The results provide molecular evidence for the possible involvement of miRNAs in the process of drought response and/or tolerance in the potato plant.

  3. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum).

    Science.gov (United States)

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous, approximately 21nt in length, non-coding RNA, which mediate the expression of target genes primarily at post-transcriptional levels. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, still remain totally unknown. In this study, we reported a computational identification of miRNAs and their targets in blueberry. By conducting an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs.

  4. Use of deep whole-genome sequencing data to identify structure risk variants in breast cancer susceptibility genes.

    Science.gov (United States)

    Guo, Xingyi; Shi, Jiajun; Cai, Qiuyin; Shu, Xiao-Ou; He, Jing; Wen, Wanqing; Allen, Jamie; Pharoah, Paul; Dunning, Alison; Hunter, David J; Kraft, Peter; Easton, Douglas F; Zheng, Wei; Long, Jirong

    2018-03-01

    Functional disruptions of susceptibility genes by large genomic structure variant (SV) deletions in germlines are known to be associated with cancer risk. However, few studies have been conducted to systematically search for SV deletions in breast cancer susceptibility genes. We analysed deep (> 30x) whole-genome sequencing (WGS) data generated in blood samples from 128 breast cancer patients of Asian and European descent with either a strong family history of breast cancer or early cancer onset disease. To identify SV deletions in known or suspected breast cancer susceptibility genes, we used multiple SV calling tools including Genome STRiP, Delly, Manta, BreakDancer and Pindel. SV deletions were detected by at least three of these bioinformatics tools in five genes. Specifically, we identified heterozygous deletions covering a fraction of the coding regions of BRCA1 (with approximately 80kb in two patients), and TP53 genes (with ∼1.6 kb in two patients), and of intronic regions (∼1 kb) of the PALB2 (one patient), PTEN (three patients) and RAD51C genes (one patient). We confirmed the presence of these deletions using real-time quantitative PCR (qPCR). Our study identified novel SV deletions in breast cancer susceptibility genes and the identification of such SV deletions may improve clinical testing.

  5. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Science.gov (United States)

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.

  6. Comparative anatomy of the human APRT gene and enzyme: nucleotide sequence divergence and conservation of a nonrandom CpG dinucleotide arrangement

    International Nuclear Information System (INIS)

    Broderick, T.P.; Schaff, D.A.; Bertino, A.M.; Dush, M.K.; Tischfield, J.A.; Stambrook, P.J.

    1987-01-01

    The functional human adenine phosphoribosyltransferase (APRT) gene is <2.6 kilobases in length and contains five exons. The amino acid sequences of APRTs have been highly conserved throughout evolution. The human enzyme is 82%, 90%, and 40% identical to the mouse, hamster, and Escherichia coli enzymes, respectively. The promoter region of the human APRT gene, like that of several other housekeeping genes, lacks TATA and CCAAT boxes but contains five GC boxes that are potential binding sites for the Sp1 transcription factor. The distal three, however, are dispensable for gene expression. Comparison between human and mouse APRT gene nucleotide sequences reveals a high degree of homology within protein coding regions but an absence of significant homology in 5' flanking, 3' untranslated, and intron sequences, except for similarly positioned GC boxes in the promoter region and a 26-base-pair region in intron 3. This 26-base-pair sequence is 92% identical with a similarly positioned sequence in the mouse gene and is also found in intron 3 of the hamster gene, suggesting that its retention may be a consequence of stringent selection. The positions of all introns have been precisely retained in the human and both rodent genes. Retention of an elevated CpG dinucleotide content, despite loss of sequence homology, suggests that there may be selection for CpG dinucleotides in these regions and that their maintenance may be important for APRT gene function

  7. Massively parallel signature sequencing and bioinformatics analysis identifies up-regulation of TGFBI and SOX4 in human glioblastoma.

    Directory of Open Access Journals (Sweden)

    Biaoyang Lin

    Full Text Available BACKGROUND: A comprehensive network-based understanding of molecular pathways abnormally altered in glioblastoma multiforme (GBM is essential for developing effective therapeutic approaches for this deadly disease. METHODOLOGY/PRINCIPAL FINDINGS: Applying a next generation sequencing technology, massively parallel signature sequencing (MPSS, we identified a total of 4535 genes that are differentially expressed between normal brain and GBM tissue. The expression changes of three up-regulated genes, CHI3L1, CHI3L2, and FOXM1, and two down-regulated genes, neurogranin and L1CAM, were confirmed by quantitative PCR. Pathway analysis revealed that TGF- beta pathway related genes were significantly up-regulated in GBM tumor samples. An integrative pathway analysis of the TGF beta signaling network identified two alternative TGF-beta signaling pathways mediated by SOX4 (sex determining region Y-box 4 and TGFBI (Transforming growth factor beta induced. Quantitative RT-PCR and immunohistochemistry staining demonstrated that SOX4 and TGFBI expression is elevated in GBM tissues compared with normal brain tissues at both the RNA and protein levels. In vitro functional studies confirmed that TGFBI and SOX4 expression is increased by TGF-beta stimulation and decreased by a specific inhibitor of TGF-beta receptor 1 kinase. CONCLUSIONS/SIGNIFICANCE: Our MPSS database for GBM and normal brain tissues provides a useful resource for the scientific community. The identification of non-SMAD mediated TGF-beta signaling pathways acting through SOX4 and TGFBI (GENE ID:7045 in GBM indicates that these alternative pathways should be considered, in addition to the canonical SMAD mediated pathway, in the development of new therapeutic strategies targeting TGF-beta signaling in GBM. Finally, the construction of an extended TGF-beta signaling network with overlaid gene expression changes between GBM and normal brain extends our understanding of the biology of GBM.

  8. Sequencing illustrates the transcriptional response of Legionella pneumophila during infection and identifies seventy novel small non-coding RNAs.

    LENUS (Irish Health Repository)

    Weissenmayer, Barbara A

    2011-01-01

    Second generation sequencing has prompted a number of groups to re-interrogate the transcriptomes of several bacterial and archaeal species. One of the central findings has been the identification of complex networks of small non-coding RNAs that play central roles in transcriptional regulation in all growth conditions and for the pathogen\\'s interaction with and survival within host cells. Legionella pneumophila is a gram-negative facultative intracellular human pathogen with a distinct biphasic lifestyle. One of its primary environmental hosts in the free-living amoeba Acanthamoeba castellanii and its infection by L. pneumophila mimics that seen in human macrophages. Here we present analysis of strand specific sequencing of the transcriptional response of L. pneumophila during exponential and post-exponential broth growth and during the replicative and transmissive phase of infection inside A. castellanii. We extend previous microarray based studies as well as uncovering evidence of a complex regulatory architecture underpinned by numerous non-coding RNAs. Over seventy new non-coding RNAs could be identified; many of them appear to be strain specific and in configurations not previously reported. We discover a family of non-coding RNAs preferentially expressed during infection conditions and identify a second copy of 6S RNA in L. pneumophila. We show that the newly discovered putative 6S RNA as well as a number of other non-coding RNAs show evidence for antisense transcription. The nature and extent of the non-coding RNAs and their expression patterns suggests that these may well play central roles in the regulation of Legionella spp. specific traits and offer clues as to how L. pneumophila adapts to its intracellular niche. The expression profiles outlined in the study have been deposited into Genbank\\'s Gene Expression Omnibus (GEO) database under the series accession GSE27232.

  9. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

    Science.gov (United States)

    2012-01-01

    Background Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding

  10. RNA sequencing of Populus x canadensis roots identifies key molecular mechanisms underlying physiological adaption to excess zinc.

    Directory of Open Access Journals (Sweden)

    Andrea Ariani

    Full Text Available Populus x canadensis clone I-214 exhibits a general indicator phenotype in response to excess Zn, and a higher metal uptake in roots than in shoots with a reduced translocation to aerial parts under hydroponic conditions. This physiological adaptation seems mainly regulated by roots, although the molecular mechanisms that underlie these processes are still poorly understood. Here, differential expression analysis using RNA-sequencing technology was used to identify the molecular mechanisms involved in the response to excess Zn in root. In order to maximize specificity of detection of differentially expressed (DE genes, we consider the intersection of genes identified by three distinct statistical approaches (61 up- and 19 down-regulated and validate them by RT-qPCR, yielding an agreement of 93% between the two experimental techniques. Gene Ontology (GO terms related to oxidation-reduction processes, transport and cellular iron ion homeostasis were enriched among DE genes, highlighting the importance of metal homeostasis in adaptation to excess Zn by P. x canadensis clone I-214. We identified the up-regulation of two Populus metal transporters (ZIP2 and NRAMP1 probably involved in metal uptake, and the down-regulation of a NAS4 gene involved in metal translocation. We identified also four Fe-homeostasis transcription factors (two bHLH38 genes, FIT and BTS that were differentially expressed, probably for reducing Zn-induced Fe-deficiency. In particular, we suggest that the down-regulation of FIT transcription factor could be a mechanism to cope with Zn-induced Fe-deficiency in Populus. These results provide insight into the molecular mechanisms involved in adaption to excess Zn in Populus spp., but could also constitute a starting point for the identification and characterization of molecular markers or biotechnological targets for possible improvement of phytoremediation performances of poplar trees.

  11. Analyses of Tissue Culture Adaptation of Human Herpesvirus-6A by Whole Genome Deep Sequencing Redefines the Reference Sequence and Identifies Virus Entry Complex Changes.

    Science.gov (United States)

    Tweedy, Joshua G; Escriva, Eric; Topf, Maya; Gompels, Ursula A

    2017-12-31

    Tissue-culture adaptation of viruses can modulate infection. Laboratory passage and bacterial artificial chromosome (BAC)mid cloning of human cytomegalovirus, HCMV, resulted in genomic deletions and rearrangements altering genes encoding the virus entry complex, which affected cellular tropism, virulence, and vaccine development. Here, we analyse these effects on the reference genome for related betaherpesviruses, Roseolovirus, human herpesvirus 6A (HHV-6A) strain U1102. This virus is also naturally "cloned" by germline subtelomeric chromosomal-integration in approximately 1% of human populations, and accurate references are key to understanding pathological relationships between exogenous and endogenous virus. Using whole genome next-generation deep-sequencing Illumina-based methods, we compared the original isolate to tissue-culture passaged and the BACmid-cloned virus. This re-defined the reference genome showing 32 corrections and 5 polymorphisms. Furthermore, minor variant analyses of passaged and BACmid virus identified emerging populations of a further 32 single nucleotide polymorphisms (SNPs) in 10 loci, half non-synonymous indicating cell-culture selection. Analyses of the BAC-virus genome showed deletion of the BAC cassette via loxP recombination removing green fluorescent protein (GFP)-based selection. As shown for HCMV culture effects, select HHV-6A SNPs mapped to genes encoding mediators of virus cellular entry, including virus envelope glycoprotein genes gB and the gH/gL complex. Comparative models suggest stabilisation of the post-fusion conformation. These SNPs are essential to consider in vaccine-design, antimicrobial-resistance, and pathogenesis.

  12. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Full Text Available Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1. High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs: ORFs 1 and 2 shares 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV, Petuvirus genus. ORF1 encodes a movement protein (MP; ORF2 a Reverse Transcriptase (RT and a Ribonuclease H (RNase H domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs, AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq. Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  13. Sequence Conservation and Sexually Dimorphic Expression of the Ftz-F1 Gene in the Crustacean Daphnia magna.

    Directory of Open Access Journals (Sweden)

    Nur Syafiqah Mohamad Ishak

    Full Text Available Identifying the genes required for environmental sex determination is important for understanding the evolution of diverse sex determination mechanisms in animals. Orthologs of Drosophila orphan receptor Fushi tarazu factor-1 (Ftz-F1 are known to function in genetic sex determination. In contrast, their roles in environmental sex determination remain unknown. In this study, we have cloned and characterized the Ftz-F1 ortholog in the branchiopod crustacean Daphnia magna, which produces males in response to environmental stimuli. Similar to that observed in Drosophila, D. magna Ftz-F1 (DapmaFtz-F1 produces two splicing variants, αFtz-F1 and βFtz-F1, which encode 699 and 777 amino acids, respectively. Both isoforms share a DNA-binding domain, a ligand-binding domain, and an AF-2 activation domain and differ only at the A/B domain. The phylogenetic position and genomic structure of DapmaFtz-F1 suggested that this gene has diverged from an ancestral gene common to branchiopod crustacean and insect Ftz-F1 genes. qRT-PCR showed that at the one cell and gastrulation stages, both DapmaFtz-F1 isoforms are two-fold more abundant in males than in females. In addition, in later stages, their sexual dimorphic expressions were maintained in spite of reduced expression. Time-lapse imaging of DapmaFtz-F1 RNAi embryos was performed in H2B-GFP expressing transgenic Daphnia, demonstrating that development of the RNAi embryos slowed down after the gastrulation stage and stopped at 30-48 h after ovulation. DapmaFtz-F1 shows high homology to insect Ftz-F1 orthologs based on its amino acid sequence and exon-intron organization. The sexually dimorphic expression of DapmaFtz-F1 suggests that it plays a role in environmental sex determination of D. magna.

  14. Targeted high-throughput sequencing identifies mutations in atlastin-1 as a cause of hereditary sensory neuropathy type I.

    Science.gov (United States)

    Guelly, Christian; Zhu, Peng-Peng; Leonardis, Lea; Papić, Lea; Zidar, Janez; Schabhüttl, Maria; Strohmaier, Heimo; Weis, Joachim; Strom, Tim M; Baets, Jonathan; Willems, Jan; De Jonghe, Peter; Reilly, Mary M; Fröhlich, Eleonore; Hatz, Martina; Trajanoski, Slave; Pieber, Thomas R; Janecke, Andreas R; Blackstone, Craig; Auer-Grumbach, Michaela

    2011-01-07

    Hereditary sensory neuropathy type I (HSN I) is an axonal form of autosomal-dominant hereditary motor and sensory neuropathy distinguished by prominent sensory loss that leads to painless injuries. Unrecognized, these can result in delayed wound healing and osteomyelitis, necessitating distal amputations. To elucidate the genetic basis of an HSN I subtype in a family in which mutations in the few known HSN I genes had been excluded, we employed massive parallel exon sequencing of the 14.3 Mb disease interval on chromosome 14q. We detected a missense mutation (c.1065C>A, p.Asn355Lys) in atlastin-1 (ATL1), a gene that is known to be mutated in early-onset hereditary spastic paraplegia SPG3A and that encodes the large dynamin-related GTPase atlastin-1. The mutant protein exhibited reduced GTPase activity and prominently disrupted ER network morphology when expressed in COS7 cells, strongly supporting pathogenicity. An expanded screen in 115 additional HSN I patients identified two further dominant ATL1 mutations (c.196G>C [p.Glu66Gln] and c.976 delG [p.Val326TrpfsX8]). This study highlights an unexpected major role for atlastin-1 in the function of sensory neurons and identifies HSN I and SPG3A as allelic disorders.

  15. Use of DNA sequences to identify forensically important fly species and their distribution in the coastal region of Central California.

    Science.gov (United States)

    Nakano, Angie; Honda, Jeff

    2015-08-01

    Forensic entomology has gained prominence in recent years, as improvements in DNA technology and molecular methods have allowed insect and other arthropod evidence to become increasingly useful in criminal and civil investigations. However, comprehensive faunal inventories are still needed, including cataloging local DNA sequences for forensically significant Diptera. This multi-year fly-trapping study was built upon and expanded a previous survey of these flies in Santa Clara County, including the addition of genetic barcoding data from collected species of flies. Flies from the families Calliphoridae, Sarcophagidae, and Muscidae were trapped in meat-baited traps set in a variety of locations throughout the county. Flies were identified using morphological features and confirmed by molecular analysis. A total of 16 calliphorid species, 11 sarcophagid species, and four muscid species were collected and differentiated. This study found more species of flies than previous area surveys and established new county records for two calliphorid species: Cynomya cadaverina and Chrysomya rufifacies. Differences were found in fly fauna in different areas of the county, indicating the importance of microclimates in the distribution of these flies. Molecular analysis supported the use of DNA barcoding as an effective method of identifying cryptic fly species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  16. Epidemiological study on the penicillin resistance of clinical Streptococcus pneumoniae isolates identified as the common sequence types.

    Science.gov (United States)

    Gao, Wei; Shi, Wei; Chen, Chang-hui; Wen, De-nian; Tian, Jin; Yao, Kai-hu

    2016-10-20

    There were some limitation in the current interpretation about the penicillin resistance mechanism of clinical Streptococcus pneumoniae isolates at the strain level. To explore the possibilities of studying the mechanism based on the sequence types (ST) of this bacteria, 488 isolates collected in Beijing from 1997-2014 and 88 isolates collected in Youyang County, Chongqing and Zhongjiang County, Sichuan in 2015 were analyzed by penicillin minimum inhibitory concentration (MIC) distribution and annual distribution. The results showed that the penicillin MICs of the all isolates covering by the given ST in Beijing have a defined range, either penicillin MIC penicillin MICs in the first few years after it was identified. The penicillin MIC of isolates identified as common STs and collected in Youyang County, Chongqing and Sichuan Zhongjiang County, including the ST271, ST320 and ST81, was around 0.25~2 mg/L (≥0.25 mg/L). Our study revealed the epidemiological distribution of penicillin MICs of the given STs determined in clinical S. pneumoniae isolates, suggesting that it is reasonable to research the penicillin resistance mechanism based on the STs of this bacteria.

  17. TSSer: an automated method to identify transcription start sites in prokaryotic genomes from differential RNA sequencing data.

    Science.gov (United States)

    Jorjani, Hadi; Zavolan, Mihaela

    2014-04-01

    Accurate identification of transcription start sites (TSSs) is an essential step in the analysis of transcription regulatory networks. In higher eukaryotes, the capped analysis of gene expression technology enabled comprehensive annotation of TSSs in genomes such as those of mice and humans. In bacteria, an equivalent approach, termed differential RNA sequencing (dRNA-seq), has recently been proposed, but the application of this approach to a large number of genomes is hindered by the paucity of computational analysis methods. With few exceptions, when the method has been used, annotation of TSSs has been largely done manually. In this work, we present a computational method called 'TSSer' that enables the automatic inference of TSSs from dRNA-seq data. The method rests on a probabilistic framework for identifying both genomic positions that are preferentially enriched in the dRNA-seq data as well as preferentially captured relative to neighboring genomic regions. Evaluating our approach for TSS calling on several publicly available datasets, we find that TSSer achieves high consistency with the curated lists of annotated TSSs, but identifies many additional TSSs. Therefore, TSSer can accelerate genome-wide identification of TSSs in bacterial genomes and can aid in further characterization of bacterial transcription regulatory networks. TSSer is freely available under GPL license at http://www.clipz.unibas.ch/TSSer/index.php

  18. Whole-Exome Sequencing Identified a Novel Compound Heterozygous Mutation of LRRC6 in a Chinese Primary Ciliary Dyskinesia Patient

    Directory of Open Access Journals (Sweden)

    Lv Liu

    2018-01-01

    Full Text Available Primary ciliary dyskinesia (PCD is a clinical rare peculiar disorder, mainly featured by respiratory infection, tympanitis, nasosinusitis, and male infertility. Previous study demonstrated it is an autosomal recessive disease and by 2017 almost 40 pathologic genes have been identified. Among them are the leucine-rich repeat- (LRR- containing 6 (LRRC6 codes for a 463-amino-acid cytoplasmic protein, expressed distinctively in motile cilia cells, including the testis cells and the respiratory epithelial cells. In this study, we applied whole-exome sequencing combined with PCD-known genes filtering to explore the genetic lesion of a PCD patient. A novel compound heterozygous mutation in LRRC6 (c.183T>G/p.N61K; c.179-1G>A was identified and coseparated in this family. The missense mutation (c.183T>G/p.N61K may lead to a substitution of asparagine by lysine at position 61 in exon 3 of LRRC6. The splice site mutation (c.179-1G>A may cause a premature stop codon in exon 4 and decrease the mRNA levels of LRRC6. Both mutations were not present in our 200 local controls, dbSNP, and 1000 genomes. Three bioinformatics programs also predicted that both mutations are deleterious. Our study not only further supported the importance of LRRC6 in PCD, but also expanded the spectrum of LRRC6 mutations and will contribute to the genetic diagnosis and counseling of PCD patients.

  19. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    DEFF Research Database (Denmark)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang

    2015-01-01

    . Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication...

  20. Using Personal Water Footprints to Identify Consumer Food Choices that Influence the Conservation of Local Water Resources

    Science.gov (United States)

    Marrin, D. L.

    2015-12-01

    As the global demand for water and food escalates, the emphasis is on supply side factors rather than demand side factors such as consumers, whose personal water footprints are dominated (>90%) by food. Personal footprints include the water embedded in foods that are produced locally as well as those imported, raising the question of whether local shifts in people's food choices and habits could assist in addressing local water shortages. The current situation in California is interesting in that drought has affected an agriculturally productive region where a substantial portion of its food products are consumed by the state's large population. Unlike most agricultural regions where green water is the primary source of water for crops, California's arid climate demands an enormous volume of blue water as irrigation from its dwindling surface and ground water resources. Although California exports many of its food products, enough is consumed in-state so that residents making relatively minor shifts their food choices could save as much local blue water as their implementing more drastic reductions in household water use (comprising food group on both a caloric and gravimetric basis. Another change is wasting less food, which is a shared responsibility among consumers, producers and retailers; however, consumers' actions and preferences ultimately drive much of the waste. Personal water footprints suggest a role for individuals in conserving local water resources that is neither readily obvious nor a major focus of most conservation programs.

  1. De Novo Transcriptome Sequencing in Passiflora edulis Sims to Identify Genes and Signaling Pathways Involved in Cold Tolerance

    Directory of Open Access Journals (Sweden)

    Sian Liu

    2017-11-01

    Full Text Available The passion fruit (Passiflora edulis Sims, also known as the purple granadilla, is widely cultivated as the new darling of the fruit market throughout southern China. This exotic and perennial climber is adapted to warm and humid climates, and thus is generally intolerant of cold. There is limited information about gene regulation and signaling pathways related to the cold stress response in this species. In this study, two transcriptome libraries (KEDU_AP vs. GX_AP were constructed from the aerial parts of cold-tolerant and cold-susceptible varieties of P. edulis, respectively. Overall, 126,284,018 clean reads were obtained, and 86,880 unigenes with a mean size of 1449 bp were assembled. Of these, there were 64,067 (73.74% unigenes with significant similarity to publicly available plant protein sequences. Expression profiles were generated, and 3045 genes were found to be significantly differentially expressed between the KEDU_AP and GX_AP libraries, including 1075 (35.3% up-regulated and 1970 (64.7% down-regulated. These included 36 genes in enriched pathways of plant hormone signal transduction, and 56 genes encoding putative transcription factors. Six genes involved in the ICE1–CBF–COR pathway were induced in the cold-tolerant variety, and their expression levels were further verified using quantitative real-time PCR. This report is the first to identify genes and signaling pathways involved in cold tolerance using high-throughput transcriptome sequencing in P. edulis. These findings may provide useful insights into the molecular mechanisms regulating cold tolerance and genetic breeding in Passiflora spp.

  2. Identifying Faults Associated with the 2001 Avoca Induced(?) Seismicity Sequence of Western New York State Using Potential Field Wavelets.

    Science.gov (United States)

    Horowitz, F. G.; Ebinger, C.; Jordan, T. E.

    2017-12-01

    Results from recent DOE and USGS sponsored projects in the (intraplate) northeastern portions of the US and southeastern portions of Canada have identified locations of steeply dipping structures - many previously unknown - from a Poisson wavelet multiscale edge ('worm') analysis of gravity and magnetic fields. The Avoca sequence of induced(?) seismicity in western New York state occurred during January and February of 2001. The Avoca earthquake sequence is associated with industrial hydraulic fracturing activity "related to a proposed natural gas storage facility near Avoca to be constructed by solution mining" (Kim, 2001). The main Avoca event was a felt Mb = 3.2 earthquake on Feb. 3, 2001 recorded by the Lamont Cooperative Seismic Network. Earlier, smaller events were located by the Canadian Geological Survey's seismic network north of the Canadian border - implying that the event locations might be biased because they occurred off the southern edge of the array. Some of these events were also felt locally, according to local newspaper reports. By plotting the location of the seismic events and that of the injection well - reported via it's API number - we find a strong correlation with structures detected via our potential field worms. The injection occurred near a NE-SW striking structure that was not activated. All but one of the earthquakes occurred about 5 km north of the injection well on or nearby to an E-W striking structure that appears to intersect the NE-SW structure. The final, small (MN=2.2) earthquake was located on a different complex structure about 10 km north of the other events. We suggest that potential field methods such as ours might be appropriate to locating structures of concern for induced seismic activity in association with industrial activity. Reference: Kim, W.-Y. (2001). The Lamont cooperative seismic network and the national seismic system: Earthquake hazard studies in the northeastern United States. Tech. Rep. 98-01, Lamont

  3. High-throughput sequencing of the T cell receptor β gene identifies aggressive early-stage mycosis fungoides.

    Science.gov (United States)

    de Masson, Adele; O'Malley, John T; Elco, Christopher P; Garcia, Sarah S; Divito, Sherrie J; Lowry, Elizabeth L; Tawa, Marianne; Fisher, David C; Devlin, Phillip M; Teague, Jessica E; Leboeuf, Nicole R; Kirsch, Ilan R; Robins, Harlan; Clark, Rachael A; Kupper, Thomas S

    2018-05-09

    Mycosis fungoides (MF), the most common cutaneous T cell lymphoma (CTCL) is a malignancy of skin-tropic memory T cells. Most MF cases present as early stage (stage I A/B, limited to the skin), and these patients typically have a chronic, indolent clinical course. However, a small subset of early-stage cases develop progressive and fatal disease. Because outcomes can be so different, early identification of this high-risk population is an urgent unmet clinical need. We evaluated the use of next-generation high-throughput DNA sequencing of the T cell receptor β gene ( TCRB ) in lesional skin biopsies to predict progression and survival in a discovery cohort of 208 patients with CTCL (177 with MF) from a 15-year longitudinal observational clinical study. We compared these data to the results in an independent validation cohort of 101 CTCL patients (87 with MF). The tumor clone frequency (TCF) in lesional skin, measured by high-throughput sequencing of the TCRB gene, was an independent prognostic factor of both progression-free and overall survival in patients with CTCL and MF in particular. In early-stage patients, a TCF of >25% in the skin was a stronger predictor of progression than any other established prognostic factor (stage IB versus IA, presence of plaques, high blood lactate dehydrogenase concentration, large-cell transformation, or age). The TCF therefore may accurately predict disease progression in early-stage MF. Early identification of patients at high risk for progression could help identify candidates who may benefit from allogeneic hematopoietic stem cell transplantation before their disease becomes treatment-refractory. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  4. A RAD-Based Genetic Map for Anchoring Scaffold Sequences and Identifying QTLs in Bitter Gourd (Momordica charantia)

    Science.gov (United States)

    Cui, Junjie; Luo, Shaobo; Niu, Yu; Huang, Rukui; Wen, Qingfang; Su, Jianwen; Miao, Nansheng; He, Weiming; Dong, Zhensheng; Cheng, Jiaowen; Hu, Kailin

    2018-01-01

    Genetic mapping is a basic tool necessary for anchoring assembled scaffold sequences and for identifying QTLs controlling important traits. Though bitter gourd (Momordica charantia) is both consumed and used as a medicinal, research on its genomics and genetic mapping is severely limited. Here, we report the construction of a restriction site associated DNA (RAD)-based genetic map for bitter gourd using an F2 mapping population comprising 423 individuals derived from two cultivated inbred lines, the gynoecious line ‘K44’ and the monoecious line ‘Dali-11.’ This map comprised 1,009 SNP markers and spanned a total genetic distance of 2,203.95 cM across the 11 linkage groups. It anchored a total of 113 assembled scaffolds that covered about 251.32 Mb (85.48%) of the 294.01 Mb assembled genome. In addition, three horticulturally important traits including sex expression, fruit epidermal structure, and immature fruit color were evaluated using a combination of qualitative and quantitative data. As a result, we identified three QTL/gene loci responsible for these traits in three environments. The QTL/gene gy/fffn/ffn, controlling sex expression involved in gynoecy, first female flower node, and female flower number was detected in the reported region. Particularly, two QTLs/genes, Fwa/Wr and w, were found to be responsible for fruit epidermal structure and white immature fruit color, respectively. This RAD-based genetic map promotes the assembly of the bitter gourd genome and the identified genetic loci will accelerate the cloning of relevant genes in the future. PMID:29706980

  5. Panel-based whole exome sequencing identifies novel mutations in microphthalmia and anophthalmia patients showing complex Mendelian inheritance patterns.

    Science.gov (United States)

    Riera, Marina; Wert, Ana; Nieto, Isabel; Pomares, Esther

    2017-11-01

    Microphthalmia and anophthalmia (MA) are congenital eye abnormalities that show an extremely high clinical and genetic complexity. In this study, we evaluated the implementation of whole exome sequencing (WES) for the genetic analysis of MA patients. This approach was used to investigate three unrelated families in which previous single-gene analyses failed to identify the molecular cause. A total of 47 genes previously associated with nonsyndromic MA were included in our panel. WES was performed in one affected patient from each family using the AmpliSeq TM Exome technology and the Ion Proton TM platform. A novel heterozygous OTX2 missense mutation was identified in a patient showing bilateral anophthalmia who inherited the variant from a parent who was a carrier, but showed no sign of the condition. We also describe a new PAX6 missense variant in an autosomal-dominant pedigree affected by mild bilateral microphthalmia showing high intrafamiliar variability, with germline mosaicism determined to be the most plausible molecular cause of the disease. Finally, a heterozygous missense mutation in RBP4 was found to be responsible in an isolated case of bilateral complex microphthalmia. This study highlights that panel-based WES is a reliable and effective strategy for the genetic diagnosis of MA. Furthermore, using this technique, the mutational spectrum of these diseases was broadened, with novel variants identified in each of the OTX2, PAX6, and RBP4 genes. Moreover, we report new cases of reduced penetrance, mosaicism, and variable phenotypic expressivity associated with MA, further demonstrating the heterogeneity of such disorders. © 2017 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.

  6. A RAD-Based Genetic Map for Anchoring Scaffold Sequences and Identifying QTLs in Bitter Gourd (Momordica charantia

    Directory of Open Access Journals (Sweden)

    Junjie Cui

    2018-04-01

    Full Text Available Genetic mapping is a basic tool necessary for anchoring assembled scaffold sequences and for identifying QTLs controlling important traits. Though bitter gourd (Momordica charantia is both consumed and used as a medicinal, research on its genomics and genetic mapping is severely limited. Here, we report the construction of a restriction site associated DNA (RAD-based genetic map for bitter gourd using an F2 mapping population comprising 423 individuals derived from two cultivated inbred lines, the gynoecious line ‘K44’ and the monoecious line ‘Dali-11.’ This map comprised 1,009 SNP markers and spanned a total genetic distance of 2,203.95 cM across the 11 linkage groups. It anchored a total of 113 assembled scaffolds that covered about 251.32 Mb (85.48% of the 294.01 Mb assembled genome. In addition, three horticulturally important traits including sex expression, fruit epidermal structure, and immature fruit color were evaluated using a combination of qualitative and quantitative data. As a result, we identified three QTL/gene loci responsible for these traits in three environments. The QTL/gene gy/fffn/ffn, controlling sex expression involved in gynoecy, first female flower node, and female flower number was detected in the reported region. Particularly, two QTLs/genes, Fwa/Wr and w, were found to be responsible for fruit epidermal structure and white immature fruit color, respectively. This RAD-based genetic map promotes the assembly of the bitter gourd genome and the identified genetic loci will accelerate the cloning of relevant genes in the future.

  7. Whole-Exome Sequencing Identifies One De Novo Variant in the FGD6 Gene in a Thai Family with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Chuphong Thongnak

    2018-01-01

    Full Text Available Autism spectrum disorder (ASD has a strong genetic basis, although the genetics of autism is complex and it is unclear. Genetic testing such as microarray or sequencing was widely used to identify autism markers, but they are unsuccessful in several cases. The objective of this study is to identify causative variants of autism in two Thai families by using whole-exome sequencing technique. Whole-exome sequencing was performed with autism-affected children from two unrelated families. Each sample was sequenced on SOLiD 5500xl Genetic Analyzer system followed by combined bioinformatics pipeline including annotation and filtering process to identify candidate variants. Candidate variants were validated, and the segregation study with other family members was performed using Sanger sequencing. This study identified a possible causative variant for ASD, c.2951G>A, in the FGD6 gene. We demonstrated the potential for ASD genetic variants associated with ASD using whole-exome sequencing and a bioinformatics filtering procedure. These techniques could be useful in identifying possible causative ASD variants, especially in cases in which variants cannot be identified by other techniques.

  8. A bacterial genetic screen identifies functional coding sequences of the insect mariner transposable element Famar1 amplified from the genome of the earwig, Forficula auricularia.

    Science.gov (United States)

    Barry, Elizabeth G; Witherspoon, David J; Lampe, David J

    2004-02-01

    Transposons of the mariner family are widespread in animal genomes and have apparently infected them by horizontal transfer. Most species carry only old defective copies of particular mariner transposons that have diverged greatly from their active horizontally transferred ancestor, while a few contain young, very similar, and active copies. We report here the use of a whole-genome screen in bacteria to isolate somewhat diverged Famar1 copies from the European earwig, Forficula auricularia, that encode functional transposases. Functional and nonfunctional coding sequences of Famar1 and nonfunctional copies of Ammar1 from the European honey bee, Apis mellifera, were sequenced to examine their molecular evolution. No selection for sequence conservation was detected in any clade of a tree derived from these sequences, not even on branches leading to functional copies. This agrees with the current model for mariner transposon evolution that expects neutral evolution within particular hosts, with selection for function occurring only upon horizontal transfer to a new host. Our results further suggest that mariners are not finely tuned genetic entities and that a greater amount of sequence diversification than had previously been appreciated can occur in functional copies in a single host lineage. Finally, this method of isolating active copies can be used to isolate other novel active transposons without resorting to reconstruction of ancestral sequences.

  9. Structural Conservation Despite Huge Sequence Diversity Allows EPCR Binding by the PfEMP1 Family Implicated in Severe Childhood Malaria

    DEFF Research Database (Denmark)

    Lau, Clinton K.Y.; Turner, Louise; Jespersen, Jakob S.

    2015-01-01

    with severe childhood malaria. We combine crystal structures of CIDRa1:EPCR complexes with analysis of 885 CIDRa1 sequences, showing that the EPCR-binding surfaces of CIDRa1 domains are conserved in shape and bonding potential, despite dramatic sequence diversity. Additionally, these domains mimic features...... of the natural EPCR ligand and can block this ligand interaction. Using peptides corresponding to the EPCR-binding region, antibodies can be purified from individuals in malaria-endemic regions that block EPCR binding of diverse CIDRa1 variants. This highlights the extent to which such a surface protein family......The PfEMP1 family of surface proteins is central for Plasmodium falciparum virulence and must retain the ability to bind to host receptors while also diversifying to aid immune evasion. The interaction between CIDRa1 domains of PfEMP1 and endothelial protein C receptor (EPCR) is associated...

  10. Whole exome sequencing identifies novel genes for fetal hemoglobin response to hydroxyurea in children with sickle cell anemia.

    Science.gov (United States)

    Sheehan, Vivien A; Crosby, Jacy R; Sabo, Aniko; Mortier, Nicole A; Howard, Thad A; Muzny, Donna M; Dugan-Perez, Shannon; Aygun, Banu; Nottage, Kerri A; Boerwinkle, Eric; Gibbs, Richard A; Ware, Russell E; Flanagan, Jonathan M

    2014-01-01

    Hydroxyurea has proven efficacy in children and adults with sickle cell anemia (SCA), but with considerable inter-individual variability in the amount of fetal hemoglobin (HbF) produced. Sibling and twin studies indicate that some of that drug response variation is heritable. To test the hypothesis that genetic modifiers influence pharmacological induction of HbF, we investigated phenotype-genotype associations using whole exome sequencing of children with SCA treated prospectively with hydroxyurea to maximum tolerated dose (MTD). We analyzed 171 unrelated patients enrolled in two prospective clinical trials, all treated with dose escalation to MTD. We examined two MTD drug response phenotypes: HbF (final %HbF minus baseline %HbF), and final %HbF. Analyzing individual genetic variants, we identified multiple low frequency and common variants associated with HbF induction by hydroxyurea. A validation cohort of 130 pediatric sickle cell patients treated to MTD with hydroxyurea was genotyped for 13 non-synonymous variants with the strongest association with HbF response to hydroxyurea in the discovery cohort. A coding variant in Spalt-like transcription factor, or SALL2, was associated with higher final HbF in this second independent replication sample and SALL2 represents an outstanding novel candidate gene for further investigation. These findings may help focus future functional studies and provide new insights into the pharmacological HbF upregulation by hydroxyurea in patients with SCA.

  11. Automatically Identifying Fusion Events between GLUT4 Storage Vesicles and the Plasma Membrane in TIRF Microscopy Image Sequences

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2015-01-01

    Full Text Available Quantitative analysis of the dynamic behavior about membrane-bound secretory vesicles has proven to be important in biological research. This paper proposes a novel approach to automatically identify the elusive fusion events between VAMP2-pHluorin labeled GLUT4 storage vesicles (GSVs and the plasma membrane. The differentiation is implemented to detect the initiation of fusion events by modified forward subtraction of consecutive frames in the TIRFM image sequence. Spatially connected pixels in difference images brighter than a specified adaptive threshold are grouped into a distinct fusion spot. The vesicles are located at the intensity-weighted centroid of their fusion spots. To reveal the true in vivo nature of a fusion event, 2D Gaussian fitting for the fusion spot is used to derive the intensity-weighted centroid and the spot size during the fusion process. The fusion event and its termination can be determined according to the change of spot size. The method is evaluated on real experiment data with ground truth annotated by expert cell biologists. The evaluation results show that it can achieve relatively high accuracy comparing favorably to the manual analysis, yet at a small fraction of time.

  12. A targeted sequencing panel identifies rare damaging variants in multiple genes in the cranial neural tube defect, anencephaly.

    Science.gov (United States)

    Ishida, M; Cullup, T; Boustred, C; James, C; Docker, J; English, C; Lench, N; Copp, A J; Moore, G E; Greene, N D E; Stanier, P

    2018-04-01

    Neural tube defects (NTDs) affecting the brain (anencephaly) are lethal before or at birth, whereas lower spinal defects (spina bifida) may lead to lifelong neurological handicap. Collectively, NTDs rank among the most common birth defects worldwide. This study focuses on anencephaly, which despite having a similar frequency to spina bifida and being the most common type of NTD observed in mouse models, has had more limited inclusion in genetic studies. A genetic influence is strongly implicated in determining risk of NTDs and a molecular diagnosis is of fundamental importance to families both in terms of understanding the origin of the condition and for managing future pregnancies. Here we used a custom panel of 191 NTD candidate genes to screen 90 patients with cranial NTDs (n = 85 anencephaly and n = 5 craniorachischisis) with a targeted exome sequencing platform. After filtering and comparing to our in-house control exome database (N = 509), we identified 397 rare variants (minor allele frequency, MAF < 1%), 21 of which were previously unreported and predicted damaging. This included 1 frameshift (PDGFRA), 2 stop-gained (MAT1A; NOS2) and 18 missense variations. Together with evidence for oligogenic inheritance, this study provides new information on the possible genetic causation of anencephaly. © 2017 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Whole Genome Sequencing Identifies a Missense Mutation in HES7 Associated with Short Tails in Asian Domestic Cats.

    Science.gov (United States)

    Xu, Xiao; Sun, Xin; Hu, Xue-Song; Zhuang, Yan; Liu, Yue-Chen; Meng, Hao; Miao, Lin; Yu, He; Luo, Shu-Jin

    2016-08-25

    Domestic cats exhibit abundant variations in tail morphology and serve as an excellent model to study the development and evolution of vertebrate tails. Cats with shortened and kinked tails were first recorded in the Malayan archipelago by Charles Darwin in 1868 and remain quite common today in Southeast and East Asia. To elucidate the genetic basis of short tails in Asian cats, we built a pedigree of 13 cats segregating at the trait with a founder from southern China and performed linkage mapping based on whole genome sequencing data from the pedigree. The short-tailed trait was mapped to a 5.6 Mb region of Chr E1, within which the substitution c. 5T > C in the somite segmentation-related gene HES7 was identified as the causal mutation resulting in a missense change (p.V2A). Validation in 245 unrelated cats confirmed the correlation between HES7-c. 5T > C and Chinese short-tailed feral cats as well as the Japanese Bobtail breed, indicating a common genetic basis of the two. In addition, some of our sampled kinked-tailed cats could not be explained by either HES7 or the Manx-related T-box, suggesting at least three independent events in the evolution of domestic cats giving rise to short-tailed traits.

  14. Mining environmental high-throughput sequence data sets to identify divergent amplicon clusters for phylogenetic reconstruction and morphotype visualization.

    Science.gov (United States)

    Gimmler, Anna; Stoeck, Thorsten

    2015-08-01

    Environmental high-throughput sequencing (envHTS) is a very powerful tool, which in protistan ecology is predominantly used for the exploration of diversity and its geographic and local patterns. We here used a pyrosequenced V4-SSU rDNA data set from a solar saltern pond a