WorldWideScience

Sample records for sanger sequencing identifies

  1. Sanger dideoxy sequencing of DNA.

    Science.gov (United States)

    Walker, Sarah E; Lorsch, Jon

    2013-01-01

    While the ease and reduced cost of automated DNA sequencing has largely obviated the need for manual dideoxy sequencing for routine purposes, specific applications require manual DNA sequencing. For instance, in studies of enzymes or proteins that bind or modify DNA, a DNA ladder is often used to map the site at which an enzyme is bound or a modification occurs. In these cases, the Sanger method for dideoxy sequencing provides a rapid and facile method for producing a labeled DNA ladder. Copyright © 2013 Elsevier Inc. All rights reserved.

  2. Phenotype-genotype correlation with Sanger sequencing identified retinol dehydrogenase 12 (RDH12) compound heterozygous variants in a Chinese family with Leber congenital amaurosis.

    Science.gov (United States)

    Li, Yun; Pan, Qing; Gu, Yang-Shun

    2017-05-01

    Leber congenital amaurosis (LCA) is a group of clinically and genetically heterogeneous retinal dystrophies. To date, 22 genes are known to be responsible for LCA, and some specific phenotypic features could provide significant prognostic information for a potential genetic etiology. This study aimed to identify gene variants responsible for LCA in a Chinese family using direct Sanger sequencing, with the help of phenotype-genotype correlations. A Chinese family with six members including two individuals affected with LCA was studied. All patients underwent a complete ophthalmic examination. Based on phenotype-genotype correlation, direct Sanger sequencing of the candidate gene was performed on all family members and normal controls. Targeted next-generation sequencing was used to exclude other known LCA genes. By Sanger sequencing, we identified two novel missense variants in the retinol dehydrogenase 12 (RDH12) gene: a c.164C>A transversion predicting a p.T55K substitution, and a c.535C>G transversion predicting a p.H179D substitution. The two affected subjects carried both RDH12 variants, while their parents and offspring carried only one of the heterozygous variants, showing complete cosegregation of the variants. The compound heterozygous variants were not present in 600 normal controls. In addition, the RDH12 variants were confirmed by targeted next-generation sequencing. The RDH12 compound heterozygous variants might be the cause of LCA in this family. Our study adds to the molecular spectrum of RDH12-related retinopathy and offers an effective example of the power of phenotype-genotype correlations in molecular diagnosis of LCA.

  3. CRISP-ID: decoding CRISPR mediated indels by Sanger sequencing

    National Research Council Canada - National Science Library

    Dehairs, Jonas; Talebi, Ali; Cherifi, Yacine; Swinnen, Johannes V

    2016-01-01

    .... Here, we present CRISP-ID, a web application which uses a unique algorithm for genotyping up to three alleles from a single Sanger sequencing trace, providing a robust and readily accessible platform...

  4. CRISP-ID: decoding CRISPR mediated indels by Sanger sequencing.

    Science.gov (United States)

    Dehairs, Jonas; Talebi, Ali; Cherifi, Yacine; Swinnen, Johannes V

    2016-07-01

    The advent of next generation gene editing technologies has revolutionized the fields of genome engineering in allowing the generation of gene knockout models and functional gene analysis. However, the screening of resultant clones remains challenging due to the simultaneous presence of different indels. Here, we present CRISP-ID, a web application which uses a unique algorithm for genotyping up to three alleles from a single Sanger sequencing trace, providing a robust and readily accessible platform to directly identify indels and significantly speed up the characterization of clones.

  5. Targeted next-generation sequencing can replace Sanger sequencing in clinical diagnostics

    NARCIS (Netherlands)

    Sikkema-Raddatz, B.; Johansson, L.F.; de Boer, E.N.; Almomani, R.; Boven, L.G.; van den Berg, M.P.; van Spaendonck-Zwarts, K.Y.; van Tintelen, J.P.; Sijmons, R.H.; Jongbloed, J.D.H.; Sinke, R.J.

    Mutation detection through exome sequencing allows simultaneous analysis of all coding sequences of genes. However, it cannot yet replace Sanger sequencing (SS) in diagnostics because of incomplete representation and coverage of exons leading to missing clinically relevant mutations. Targeted

  6. Comparison of the Equine Reference Sequence with Its Sanger Source Data and New Illumina Reads.

    Directory of Open Access Journals (Sweden)

    Jovan Rebolledo-Mendez

    The reference assembly for the domestic horse, EquCab2, published in 2009, was built using approximately 30 million Sanger reads from a Thoroughbred mare named Twilight. Contiguity in the assembly was facilitated using nearly 315 thousand BAC end sequences from Twilight's half brother Bravo. Since then, it has served as the foundation for many genome-wide analyses that include not only the modern horse, but ancient horses and other equid species as well. As data mapped to this reference has accumulated, consistent variation between mapped datasets and the reference, in terms of regions with no read coverage, single nucleotide variants, and small insertions/deletions, has become apparent. In many cases, it is not clear whether these differences are the result of true sequence variation between the research subjects' and Twilight's genome or due to errors in the reference. EquCab2 is regarded as "The Twilight Assembly." The objective of this study was to identify inconsistencies between the EquCab2 assembly and the source Twilight Sanger data used to build it. To that end, the original Sanger and BAC end reads have been mapped back to this equine reference and assessed with the addition of approximately 40X coverage of new Illumina Paired-End sequence data. The resulting mapped datasets identify those regions with low Sanger read coverage, as well as variation in genomic content that is not consistent with either the original Twilight Sanger data or the new genomic sequence data generated from Twilight on the Illumina platform. As the haploid EquCab2 reference assembly was created using Sanger reads derived largely from a single individual, the vast majority of variation detected in a mapped dataset comprised of those same Sanger reads should be heterozygous. In contrast, homozygous variations would represent either errors in the reference or contributions from Bravo's BAC end sequences. Our analysis identifies 720,843 homozygous discrepancies

  7. Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

    Science.gov (United States)

    Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

    2014-01-01

    A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.

  8. Translating sanger-based routine DNA diagnostics into generic massive parallel ion semiconductor sequencing

    NARCIS (Netherlands)

    Diekstra, A.; Bosgoed, E.A.J.; Rikken, A.; Lier, B. van; Kamsteeg, E.J.; Tychon, M.W.J.; Derks, R.C.; Soest, R.A.; Mensenkamp, A.R.; Scheffer, H.; Neveling, K.; Nelen, M.R.

    2015-01-01

    BACKGROUND: Dideoxy-based chain termination sequencing developed by Sanger is the gold standard sequencing approach and allows clinical diagnostics of disorders with relatively low genetic heterogeneity. Recently, new next generation sequencing (NGS) technologies have found their way into diagnostic

  9. Translating sanger-based routine DNA diagnostics into generic massive parallel ion semiconductor sequencing.

    Science.gov (United States)

    Diekstra, Adinda; Bosgoed, Ermanno; Rikken, Alwin; van Lier, Bart; Kamsteeg, Erik-Jan; Tychon, Marloes; Derks, Ronny C; van Soest, Ronald A; Mensenkamp, Arjen R; Scheffer, Hans; Neveling, Kornelia; Nelen, Marcel R

    2015-01-01

    Dideoxy-based chain termination sequencing developed by Sanger is the gold standard sequencing approach and allows clinical diagnostics of disorders with relatively low genetic heterogeneity. Recently, new next generation sequencing (NGS) technologies have found their way into diagnostic laboratories, enabling the sequencing of large targeted gene panels or exomes. The development of benchtop NGS instruments now allows the analysis of single genes or small gene panels, making these platforms increasingly competitive with Sanger sequencing. We developed a generic automated ion semiconductor sequencing work flow that can be used in a clinical setting and can serve as a substitute for Sanger sequencing. Standard amplicon-based enrichment remained identical to PCR for Sanger sequencing. A novel postenrichment pooling strategy was developed, limiting the number of library preparations and reducing sequencing costs up to 70% compared to Sanger sequencing. A total of 1224 known pathogenic variants were analyzed, yielding an analytical sensitivity of 99.92% and specificity of 99.99%. In a second experiment, a total of 100 patient-derived DNA samples were analyzed using a blind analysis. The results showed an analytical sensitivity of 99.60% and specificity of 99.98%, comparable to Sanger sequencing. Ion semiconductor sequencing can be a first choice mutation scanning technique, independent of the genes analyzed. © 2014 American Association for Clinical Chemistry.
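The sensitivity and specificity figures quoted in this abstract follow the usual definitions, which can be sketched as below. This is a minimal illustration only: the abstract reports the total of 1224 known variants but not the underlying counts, so the single missed variant used in the example is an assumption chosen because it rounds to the reported 99.92%.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant positions that the assay correctly leaves uncalled."""
    return true_neg / (true_neg + false_pos)


# 1224 known pathogenic variants is from the abstract; the split into
# 1223 detected and 1 missed is a hypothetical count, not a reported one.
known_variants = 1224
assumed_missed = 1
sens = sensitivity(known_variants - assumed_missed, assumed_missed)
print(f"analytical sensitivity: {sens:.2%}")
```

Specificity is computed the same way from the true-negative and false-positive counts, which the abstract does not give.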

  10. Simple protocol for population (Sanger) sequencing for Zika virus genomic regions

    Directory of Open Access Journals (Sweden)

    Gabriela Bastos Cabral

    2017-11-01

    BACKGROUND: A number of Zika virus (ZIKV) sequences were obtained using next-generation sequencing (NGS), a methodology widely applied in genetic diversity studies and virome discovery. However, the Sanger method is still a robust, affordable, rapid and specific tool to obtain valuable sequences. OBJECTIVE: The aim of this study was to develop a simple and robust Sanger sequencing protocol targeting relevant ZIKV genetic regions, such as envelope protein and nonstructural protein 5 (NS5). In addition, phylogenetic analysis of the ZIKV strains obtained using the present protocol and their comparison with previously published NGS sequences were also carried out. METHODS: Six Vero cell isolates from serum and one urine sample were available to develop the procedure. Primer sets were designed in order to conduct nested RT-PCR and Sanger sequencing protocols. Bayesian analysis was used to infer phylogenetic relationships. FINDINGS: Seven complete ZIKV envelope protein (1.571 kb) and six partial NS5 (0.798 kb) sequences were obtained using the protocol, with no amplification of the NS5 gene from the urine sample. Two NS5 sequences presented ambiguities at positions 495 and 196. Nucleotide analysis of a Sanger sequence and the consensus sequence of a previous NGS study revealed 100% identity. ZIKV strains described here clustered within the Asian lineage. MAIN CONCLUSIONS: The present study provided a simple and low-cost Sanger protocol to sequence relevant genes of the ZIKV genome. The identity of Sanger-generated sequences with published consensus NGS sequences supports the use of the Sanger method for ZIKV population studies. The regions evaluated were able to provide robust phylogenetic signals and may be used to conduct molecular epidemiological studies and monitor viral evolution.

  11. Simple protocol for population (Sanger) sequencing for Zika virus genomic regions.

    Science.gov (United States)

    Cabral, Gabriela Bastos; Ferreira, João Leandro de Paula; Souza, Renato Pereira de; Cunha, Mariana Sequetin; Luchs, Adriana; Figueiredo, Cristina Adelaide; Brígido, Luís Fernando de Macedo

    2018-01-01

    A number of Zika virus (ZIKV) sequences were obtained using next-generation sequencing (NGS), a methodology widely applied in genetic diversity studies and virome discovery. However, the Sanger method is still a robust, affordable, rapid and specific tool to obtain valuable sequences. The aim of this study was to develop a simple and robust Sanger sequencing protocol targeting relevant ZIKV genetic regions, such as envelope protein and nonstructural protein 5 (NS5). In addition, phylogenetic analysis of the ZIKV strains obtained using the present protocol and their comparison with previously published NGS sequences were also carried out. Six Vero cell isolates from serum and one urine sample were available to develop the procedure. Primer sets were designed in order to conduct nested RT-PCR and Sanger sequencing protocols. Bayesian analysis was used to infer phylogenetic relationships. Seven complete ZIKV envelope protein (1.571 kb) and six partial NS5 (0.798 kb) sequences were obtained using the protocol, with no amplification of the NS5 gene from the urine sample. Two NS5 sequences presented ambiguities at positions 495 and 196. Nucleotide analysis of a Sanger sequence and the consensus sequence of a previous NGS study revealed 100% identity. ZIKV strains described here clustered within the Asian lineage. The present study provided a simple and low-cost Sanger protocol to sequence relevant genes of the ZIKV genome. The identity of Sanger-generated sequences with published consensus NGS sequences supports the use of the Sanger method for ZIKV population studies. The regions evaluated were able to provide robust phylogenetic signals and may be used to conduct molecular epidemiological studies and monitor viral evolution.

  12. Homozygosity mapping and targeted sanger sequencing reveal genetic defects underlying inherited retinal disease in families from pakistan.

    Directory of Open Access Journals (Sweden)

    Maleeha Maria

    Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost- and time-effective manner to screen frequent population-specific genetic variations associated with diseases such as inherited retinal disease (IRD). We genetically screened 13 families from a cohort of 81 Pakistani IRD families diagnosed with Leber congenital amaurosis (LCA), retinitis pigmentosa (RP), congenital stationary night blindness (CSNB), or cone dystrophy (CD). We employed genome-wide single nucleotide polymorphism (SNP) array analysis to identify homozygous regions shared by affected individuals and performed Sanger sequencing of IRD-associated genes located in the sizeable homozygous regions. In addition, based on population-specific mutation data we performed targeted Sanger sequencing (TSS) of frequent variants in AIPL1, CEP290, CRB1, GUCY2D, LCA5, RPGRIP1 and TULP1, in probands from 28 LCA families. Homozygosity mapping and Sanger sequencing of IRD-associated genes revealed the underlying mutations in 10 families. TSS revealed causative variants in three families. In these 13 families four novel mutations were identified in CNGA1, CNGB1, GUCY2D, and RPGRIP1. Homozygosity mapping and TSS revealed the underlying genetic cause in 13 IRD families, which is useful for genetic counseling as well as therapeutic interventions that are likely to become available in the near future.

  13. GLASS: assisted and standardized assessment of gene variations from Sanger sequence trace data.

    Science.gov (United States)

    Pal, Karol; Bystry, Vojtech; Reigl, Tomas; Demko, Martin; Krejci, Adam; Touloumenidou, Tasoula; Stalika, Evangelia; Tichy, Boris; Ghia, Paolo; Stamatopoulos, Kostas; Pospisilova, Sarka; Malcikova, Jitka; Darzentas, Nikos

    2017-12-01

    Sanger sequencing is still being employed for sequence variant detection by many laboratories, especially in a clinical setting. However, chromatogram interpretation often requires manual inspection and in some cases, considerable expertise. We present GLASS, a web-based Sanger sequence trace viewer, editor, aligner and variant caller, built to assist with the assessment of variations in 'curated' or user-provided genes. Critically, it produces a standardized variant output as recommended by the Human Genome Variation Society. GLASS is freely available at http://bat.infspire.org/genomepd/glass/ with source code at https://github.com/infspiredBAT/GLASS. Contact: nikos.darzentas@gmail.com or malcikova.jitka@fnbrno.cz. Supplementary data are available at Bioinformatics online.

  14. Rapid Sanger sequencing of the 16S rRNA gene for identification of some common pathogens.

    Directory of Open Access Journals (Sweden)

    Linxiang Chen

    Conventional Sanger sequencing remains time-consuming and laborious. In this study, we developed a rapid improved sequencing protocol of 16S rRNA for pathogen identification by using a new combination of SYBR Green I real-time PCR and Sanger sequencing with FTA® cards. To compare the sequencing quality of this method with conventional Sanger sequencing, 12 strains previously identified by biochemical tests were targeted: 4 Pseudomonas aeruginosa, 4 Staphylococcus aureus and 4 Escherichia coli, each species represented by 1 reference strain and 3 clinical strains. Additionally, to validate the sequencing results and bacterial identification, an expanded set of 90 clinical strains, 30 of each of the three species, was processed as just described. The results showed that although statistical differences (P<0.05) were found in sequencing quality between the two methods, their identification results were all correct and consistent. The workload, the time consumption and the cost per batch were respectively light versus heavy, 8 h versus 11 h and $420 versus $400. In the 90 clinical strains, all of the Pseudomonas aeruginosa and Staphylococcus aureus strains were correctly identified, but only 26.7% of the Escherichia coli strains were recognized as Escherichia coli, while 33.3% were identified as Shigella sonnei and 40% as Shigella dysenteriae. The protocol described here is a rapid, reliable, stable and convenient method for 16S rRNA sequencing, and can be used for Pseudomonas aeruginosa and Staphylococcus aureus identification, yet it is not completely suitable for discriminating Escherichia coli and Shigella strains.

  15. Sanger sequencing solved a cryptic case of severe alpha₁-antitrypsin deficiency.

    Science.gov (United States)

    Zhan, Shing H; Abboud, Raja T; Jung, Benjamin; Kuchinka, Brian; Ralston, Diana; Casey, Brett; Mattman, Andre

    2012-04-01

    Alpha(1)-antitrypsin deficiency (AATD) is a clinically under-diagnosed genetic disorder that originates from deleterious mutations in the alpha(1)-antitrypsin (AAT) gene, SERPINA1. Severe deficiency is associated with significant pulmonary and hepatic malfunctions. Conventional clinical diagnosis involves the evaluation of serum AAT level and detection of diseased protein isoforms. In this communication, we describe the investigations of a case of severe AATD in which the AAT levels were well below those expected from the MZ phenotype determined by isoelectric focusing for protease inhibitor type (IEF PI-typing). In addition to the traditional diagnostic method that combines the assessment of serum AAT concentration and IEF PI-typing, we investigated the SERPINA1 gene of the proband and participating family members for mutations using Sanger sequencing. We identified a novel mutation (M409T) in the proband, initially missed by the standard diagnostic approach. The novel mutation was present in 4 out of 8 family members who participated in the study. This report illustrates the diagnostic value of incorporating exon sequencing of the AAT gene into the algorithm for evaluating AATD, particularly when the AAT serum level is significantly lower than expected from IEF PI-typing. Copyright © 2012 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  16. Poly Peak Parser: Method and software for identification of unknown indels using Sanger Sequencing of PCR products

    Science.gov (United States)

    Hill, Jonathon T.; Demarest, Bradley L.; Bisgrove, Brent W.; Su, Yi-chu; Smith, Megan; Yost, H. Joseph

    2015-01-01

    Background: Genome editing techniques, including ZFN, TALEN and CRISPR, have created a need to rapidly screen many F1 individuals to identify carriers of indels and determine the sequences of the mutations. Current techniques require multiple clones of the targeted region to be sequenced for each individual, which is inefficient when many individuals must be analyzed. Direct Sanger sequencing of a PCR amplified region surrounding the target site is efficient, but Sanger sequencing genomes heterozygous for an indel results in a string of "double peaks" due to the mismatched region. Results: In order to facilitate indel identification, we developed an online tool called Poly Peak Parser (available at http://yost.genetics.utah.edu/software.php) that is able to separate chromatogram data containing ambiguous base calls into wild-type and mutant allele sequences. This tool allows the nature of the indel to be determined from a single sequencing run per individual performed directly on a PCR product spanning the targeted site, without cloning. Conclusions: The method and algorithm described here facilitate rapid identification and sequence characterization of heterozygous mutant carriers generated by genome editing. Although designed for screening F1 individuals, this tool can also be used to identify heterozygous indels in many contexts. PMID:25160973

  17. Poly peak parser: Method and software for identification of unknown indels using sanger sequencing of polymerase chain reaction products.

    Science.gov (United States)

    Hill, Jonathon T; Demarest, Bradley L; Bisgrove, Brent W; Su, Yi-Chu; Smith, Megan; Yost, H Joseph

    2014-12-01

    Genome editing techniques, including ZFN, TALEN, and CRISPR, have created a need to rapidly screen many F1 individuals to identify carriers of indels and determine the sequences of the mutations. Current techniques require multiple clones of the targeted region to be sequenced for each individual, which is inefficient when many individuals must be analyzed. Direct Sanger sequencing of a polymerase chain reaction (PCR) amplified region surrounding the target site is efficient, but Sanger sequencing genomes heterozygous for an indel results in a string of "double peaks" due to the mismatched region. To facilitate indel identification, we developed an online tool called Poly Peak Parser (available at http://yost.genetics.utah.edu/software.php) that is able to separate chromatogram data containing ambiguous base calls into wild-type and mutant allele sequences. This tool allows the nature of the indel to be determined from a single sequencing run per individual performed directly on a PCR product spanning the targeted site, without cloning. The method and algorithm described here facilitate rapid identification and sequence characterization of heterozygous mutant carriers generated by genome editing. Although designed for screening F1 individuals, this tool can also be used to identify heterozygous indels in many contexts. © 2014 The Authors. Developmental Dynamics published by Wiley Periodicals, Inc. on behalf of American Association of Anatomists.
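The "double peaks" problem described in this abstract can be illustrated with a toy model. The sketch below is not the Poly Peak Parser algorithm, which the abstract does not describe in detail. It is a hypothetical brute-force search that assumes a known wild-type reference, a single heterozygous deletion, and error-free calls of which bases are visible at each trace position; the function names `overlay` and `find_deletion` are invented for the example.

```python
from typing import List, Optional, Tuple


def overlay(allele_a: str, allele_b: str) -> List[frozenset]:
    """Bases visible at each trace position when two alleles are read together."""
    return [frozenset((a, b)) for a, b in zip(allele_a, allele_b)]


def find_deletion(ref: str, observed: List[frozenset],
                  max_len: int = 10) -> Optional[Tuple[int, int]]:
    """Return (start, length) of the first deletion whose mutant allele,
    overlaid on the reference, reproduces the observed base sets."""
    n = len(observed)
    for start in range(len(ref)):
        for length in range(1, max_len + 1):
            mutant = ref[:start] + ref[start + length:]
            if len(mutant) < n:
                continue  # mutant allele too short to explain the whole trace
            if overlay(ref, mutant)[:n] == observed:
                return start, length
    return None


# Simulate a heterozygous 2-bp deletion and recover it from the mixed trace.
ref = "ACGTTGCAAGCTTAGGCT"
mutant = ref[:6] + ref[8:]          # bases 6-7 ("CA") deleted on one allele
observed = overlay(ref, mutant)     # single peaks before the indel, double peaks after
hit = find_deletion(ref, observed)
print(hit, ref[hit[0]:hit[0] + hit[1]])  # (6, 2) CA
```

Upstream of the deletion both alleles agree, so each position shows one base; downstream the alleles are out of register and most positions show two. In repetitive sequence several deletions can explain the same trace, and this sketch simply reports the first one it finds.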

  18. An effective combination of sanger and next generation sequencing in diagnostics of primary ciliary dyskinesia.

    Science.gov (United States)

    Djakow, Jana; Kramná, Lenka; Dušátková, Lenka; Uhlík, Jiří; Pursiheimo, Juha-Pekka; Svobodová, Tamara; Pohunek, Petr; Cinek, Ondřej

    2016-05-01

    Primary ciliary dyskinesia (PCD) is a multigenic autosomal recessive condition affecting respiratory tract and other organs where ciliary motility is required. The extent of its genetic heterogeneity is remarkable. The aim of the study was to develop a cost-effective pipeline for genetic diagnostics using a combination of Sanger and next generation sequencing (NGS). Data and samples of 33 families with 38 affected subjects with PCD diagnosed in childhood were collected over the territory of the Czech Republic. A panel of 18 PCD causative or candidate genes was implemented into an Illumina TruSeq Custom Amplicon NGS assay, and three ancestral mutations in SPAG1 were screened by conventional Sanger sequencing, which was also used for the confirmation of the NGS results and for the analysis of familial segregation. The causative gene was DNAH5 in 11/33 (33%) probands, SPAG1 in 8/33 (24%), and DNAI1, CCDC40, LRRC6 in one family each. If the high proportion of subjects with bi-allelic ancestral mutations in SPAG1 is corroborated in other Caucasian populations, a simple Sanger sequencing test for these three mutations may serve as an effective pre-screening step, being followed by an NGS panel for other, much larger, PCD genes. We present a combination of Sanger sequencing with an NGS panel for known and candidate PCD genes, implemented in a moderate-size national collection of patients. This strategy has proven to be cost-effective, rapid and reliable, and was able to detect the causative gene in two thirds of our PCD patients. © 2015 Wiley Periodicals, Inc.

  19. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013......, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired......-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most...

  20. Simplified large-scale Sanger genome sequencing for influenza A/H3N2 virus.

    Directory of Open Access Journals (Sweden)

    Hong Kai Lee

    BACKGROUND: The advent of next-generation sequencing technologies and the resultant lower costs of sequencing have enabled production of massive amounts of data, including the generation of full genome sequences of pathogens. However, the small genome size of the influenza virus arguably justifies the use of the more conventional Sanger sequencing technology which is still currently more readily available in most diagnostic laboratories. RESULTS: We present a simplified Sanger-based genome sequencing method for sequencing the influenza A/H3N2 virus in a large-scale format. The entire genome sequencing was completed with 19 reverse transcription-polymerase chain reactions (RT-PCRs) and 39 sequencing reactions. This method was tested on 15 native clinical samples and 15 culture isolates, respectively, collected between 2009 and 2011. The 15 native clinical samples registered quantification cycle values ranging from 21.0 to 30.56, which were equivalent to 2.4×10^3 to 1.4×10^6 viral copies/µL of RNA extract. All the PCR-amplified products were sequenced directly without PCR product purification. Notably, high quality sequencing data up to 700 bp were generated for all the samples tested. The completed sequence covered 408,810 nucleotides in total, with 13,627 nucleotides per genome, attaining 100% coding completeness. Of all the bases produced, an average of 89.49% were Phred quality value 40 (QV40) bases (representing an accuracy of circa one miscall for every 10,000 bases) or higher, and an average of 93.46% were QV30 bases (one miscall every 1000 bases) or higher. CONCLUSIONS: This sequencing protocol has been shown to be cost-effective and less labor-intensive in obtaining full influenza genomes. The constant high quality of sequences generated imparts confidence in extending the application of this non-purified amplicon sequencing approach to other gene sequencing assays, with appropriate use of suitably designed primers.
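The QV40 and QV30 figures in this abstract use the standard Phred scale, on which a quality value Q corresponds to an error probability of 10^(-Q/10). A short sketch of that conversion follows, together with a check that the reported total matches 30 genomes (15 clinical samples plus 15 isolates) of 13,627 nucleotides each; the function names are illustrative.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred quality q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality corresponding to a miscall probability p."""
    return -10 * math.log10(p)


for q in (30, 40):
    p = phred_to_error_prob(q)
    print(f"QV{q}: about one miscall per {round(1 / p):,} bases")

# Consistency check on the totals reported in the abstract:
# 15 native clinical samples + 15 culture isolates, 13,627 nt per genome.
total_nt = 13_627 * (15 + 15)
print(total_nt)  # 408810
```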

  1. Insights into bacterioplankton community structure from Sundarbans mangrove ecoregion using Sanger and Illumina MiSeq sequencing approaches: A comparative analysis

    Directory of Open Access Journals (Sweden)

    Anwesha Ghosh

    2017-03-01

    Next generation sequencing using platforms such as Illumina MiSeq provides a deeper insight into the structure and function of bacterioplankton communities in coastal ecosystems compared to traditional molecular techniques such as the clone library approach, which incorporates Sanger sequencing. In this study, the structure of bacterioplankton communities was investigated from two stations of the Sundarbans mangrove ecoregion using both Sanger and Illumina MiSeq sequencing approaches. The Illumina MiSeq data is available under the BioProject ID PRJNA35180 and Sanger sequencing data under accession numbers KX014101-KX014140 (Stn1) and KX014372-KX014410 (Stn3). Proteobacteria-, Firmicutes- and Bacteroidetes-like sequences retrieved from both approaches appeared to be abundant in the studied ecosystem. The Illumina MiSeq data (2.1 GB) provided a deeper insight into the structure of bacterioplankton communities and revealed the presence of bacterial phyla such as Actinobacteria, Cyanobacteria, Tenericutes and Verrucomicrobia, which were not recovered based on Sanger sequencing. A comparative analysis of bacterioplankton communities from both stations highlighted the presence of genera that appear in both stations and genera that occur exclusively in either station. However, both the Sanger sequencing and Illumina MiSeq data were coherent at broader taxonomic levels. Pseudomonas, Devosia, Hyphomonas and Erythrobacter-like sequences were the abundant bacterial genera found in the studied ecosystem. Both sequencing methods showed broad coherence, although, as expected, the Illumina MiSeq data helped identify rarer bacterioplankton groups and also showed the presence of unassigned OTUs, indicating the possible presence of novel bacterioplankton in the studied mangrove ecosystem.

  2. Comparing whole-genome sequencing with Sanger sequencing for spa typing of methicillin-resistant Staphylococcus aureus.

    Science.gov (United States)

    Bartels, Mette Damkjær; Petersen, Andreas; Worning, Peder; Nielsen, Jesper Boye; Larner-Svensson, Hanna; Johansen, Helle Krogh; Andersen, Leif Percival; Jarløv, Jens Otto; Boye, Kit; Larsen, Anders Rhod; Westh, Henrik

    2014-12-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most cases due to the lack of 24-bp repeats in the whole-genome-sequenced isolates. These related but incorrect spa types should have no consequence in outbreak investigations, since all epidemiologically linked isolates, regardless of spa type, will be included in the single nucleotide polymorphism (SNP) analysis. This will reveal the close relatedness of the spa types. In conclusion, our data show that WGS is a reliable method to determine the spa type of MRSA. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
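The 97% agreement reported here can be reproduced from the two counts given in the abstract (699 isolates, 19 with differing spa types). A minimal arithmetic check:

```python
total_isolates = 699   # MRSA isolates from new patients in 2013 (from the abstract)
discordant = 19        # isolates whose spa type differed between WGS and Sanger

agreement = (total_isolates - discordant) / total_isolates
print(f"{agreement:.1%}")  # 97.3%
```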

  3. Automated Sanger DNA sequencing with one label in less than four lanes on gel.

    Science.gov (United States)

    Ansorge, W; Voss, H; Wirkner, U; Schwager, C; Stegemann, J; Pepperkok, R; Zimmermann, J; Erfle, H

    1989-01-01

    Novel Sanger dideoxy sequencing with only one fluorescent dye label for the four bases of one clone and sequence determination in two lanes on polyacrylamide gel is presented, loading A greater than G in one lane and T greater than C in the other. Sequencing reactions for the two bases in each lane are carried out in one tube. At present the ratio of ddATP:ddGTP and ddTTP:ddCTP is set to 5:1 in the two tubes. The two bases in one lane are distinguished by comparing the different magnitudes of the peaks. This method increases the capacity since more clones may be run simultaneously on one gel, while keeping the reliability and simplicity that come with the use of only one fluorescent dye for the four bases of one clone. At present about 200 bases are determined with the one-dye two-lane method on the EMBL's automated fluorescent DNA sequencer, using T7 DNA polymerase. The error rate in the deduced sequence is about 1%. The technique is used for the determination of overlaps in mapping projects. In principle, it is possible to determine the sequence with one dye in only one lane on the gel by choosing the proper ddNTP ratios for all four bases, carrying out reactions in one tube and applying the product in one lane, but the error rate for this one-lane method seems too high at present and further improvements in the uniformity of peaks obtainable with the T7 DNA polymerase or other enzymes are required.
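
    The peak-magnitude comparison described above can be illustrated with a minimal sketch. It assumes that the taller peaks in a lane belong to the base whose ddNTP is in 5-fold excess and that a single threshold separates the two classes; the function name, the threshold rule and the peak heights are hypothetical and are not taken from the paper.

```python
from typing import List


def call_lane(peak_heights: List[float], strong: str, weak: str,
              ratio: float = 5.0) -> str:
    """Assign each peak in one lane to the strong or weak base.

    Hypothetical rule: the tallest peak is taken as a 'strong' peak, weak
    peaks are expected near tallest / ratio, and the threshold is the
    geometric midpoint between those two heights.
    """
    if not peak_heights:
        return ""
    tallest = max(peak_heights)
    threshold = tallest / ratio ** 0.5  # midpoint on a log scale
    return "".join(strong if h >= threshold else weak for h in peak_heights)


# Made-up peak heights for the A > G lane
lane_ag = call_lane([100.0, 22.0, 95.0, 18.0], strong="A", weak="G")
print(lane_ag)
```

    A real base caller would also have to interleave the two lanes by migration position and cope with uneven peak heights, which the abstract names as the limiting factor for the one-lane variant.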

  4. NPM1 mutation analysis in Acute Myeloid Leukemia: Comparison of three techniques Sanger Sequencing, Pyrosequencing, and Real Time PCR.

    Science.gov (United States)

    Kumar, Dushyant; Mehta, Anurag; Panigrahi, Manoj Kumar; Nath, Sukanta; Saikia, Kandarpa Kumar

    2017-11-13

    Nucleophosmin-1 (NPM1) mutations have prognostic importance in acute myeloid leukemia (AML) patients with intermediate-risk karyotype at diagnosis. Approximately 30% of newly diagnosed cytogenetically normal AML (CN-AML) patients in India harbor an NPM1 mutation. In this study we compared the efficiency of three molecular techniques in detecting NPM1 mutations in peripheral blood and bone marrow samples. In a single-centre cohort we analyzed 165 CN-AML bone marrow/peripheral blood samples for NPM1 mutations. Around 30% of CN-AML cases presented with an NPM1 mutation. Three detection methods were compared: Sanger sequencing, pyrosequencing, and real-time PCR. NPM1 exon 12 mutations were observed in 52 (31.51%) of all CN-AML cases. The sensitivity of Sanger sequencing, pyrosequencing, and real-time PCR was 80%, 90%, and 95%, whereas the specificity was 95%, 100%, and 100%, respectively. The minimum limit of mutation detection was 20%-30% for Sanger sequencing, 1-5% for pyrosequencing, and 0.1-1% for real-time PCR. Sanger sequencing, the reference method, has the lowest sensitivity and is sometimes difficult to interpret. Real-time PCR is a highly sensitive method for mutation detection but is limited to specific mutation types. In our study, pyrosequencing emerged as the most suitable technique for the detection of NPM1 mutations, being easier to interpret and less time-consuming than Sanger sequencing.

  5. Diagnostic single gene analyses beyond Sanger. Economic high-throughput sequencing of small genes involved in congenital coagulation and platelet disorders.

    Science.gov (United States)

    Najm, Juliane; Rath, Matthias; Schröder, Winnie; Felbor, Ute

    2017-07-17

    Molecular testing of congenital coagulation and platelet disorders offers confirmation of clinical diagnoses, supports genetic counselling, and enables predictive and prenatal diagnosis. In some cases, genotype-phenotype correlations are important for predicting the clinical course of the disease and adaptation of individualized therapy. Until recently, genotyping has been mainly performed by Sanger sequencing. While next generation sequencing (NGS) enables the parallel analysis of multiple genes, the cost-value ratio of custom-made panels can be unfavorable for analyses of specific small genes. The aim of this study was to transfer genotyping of small genes involved in congenital coagulation and platelet disorders from Sanger sequencing to an NGS-based method. A LR-PCR approach for target enrichment of the entire genomic regions of the genes F7, F10, F11, F12, GATA1, MYH9, TUBB1 and WAS was combined with high-throughput sequencing on a MiSeq platform. NGS detected all variants that had previously been identified by Sanger sequencing. Our results demonstrate that this approach is an accurate and flexible tool for molecular genetic diagnostics of single small genes.

  6. Diagnosis of Fanconi Anemia: Mutation Analysis by Multiplex Ligation-Dependent Probe Amplification and PCR-Based Sanger Sequencing

    Directory of Open Access Journals (Sweden)

    Johan J. P. Gille

    2012-01-01

    Fanconi anemia (FA) is a rare inherited disease characterized by developmental defects, short stature, bone marrow failure, and a high risk of malignancies. FA is heterogeneous: 15 genetic subtypes have been distinguished so far. A clinical diagnosis of FA needs to be confirmed by testing cells for sensitivity to cross-linking agents in a chromosomal breakage test. As a second step, DNA testing can be employed to elucidate the genetic subtype of the patient and to identify the familial mutations. This knowledge allows preimplantation genetic diagnosis (PGD) and enables prenatal DNA testing in future pregnancies. Although simultaneous testing of all FA genes by next generation sequencing will be possible in the near future, this technique will not be available immediately for all laboratories. In addition, in populations with strong founder mutations, a limited test using Sanger sequencing and MLPA will be a cost-effective alternative. We describe a strategy and optimized conditions for the screening of FANCA, FANCB, FANCC, FANCE, FANCF, and FANCG and present the results obtained in a cohort of 54 patients referred to our diagnostic service since 2008. In addition, the follow up with respect to genetic counseling and carrier screening in the families is discussed.

  7. Diagnosis of Fanconi Anemia: Mutation Analysis by Multiplex Ligation-Dependent Probe Amplification and PCR-Based Sanger Sequencing

    Science.gov (United States)

    Gille, Johan J. P.; Floor, Karijn; Kerkhoven, Lianne; Ameziane, Najim; Joenje, Hans; de Winter, Johan P.

    2012-01-01

    Fanconi anemia (FA) is a rare inherited disease characterized by developmental defects, short stature, bone marrow failure, and a high risk of malignancies. FA is heterogeneous: 15 genetic subtypes have been distinguished so far. A clinical diagnosis of FA needs to be confirmed by testing cells for sensitivity to cross-linking agents in a chromosomal breakage test. As a second step, DNA testing can be employed to elucidate the genetic subtype of the patient and to identify the familial mutations. This knowledge allows preimplantation genetic diagnosis (PGD) and enables prenatal DNA testing in future pregnancies. Although simultaneous testing of all FA genes by next generation sequencing will be possible in the near future, this technique will not be available immediately for all laboratories. In addition, in populations with strong founder mutations, a limited test using Sanger sequencing and MLPA will be a cost-effective alternative. We describe a strategy and optimized conditions for the screening of FANCA, FANCB, FANCC, FANCE, FANCF, and FANCG and present the results obtained in a cohort of 54 patients referred to our diagnostic service since 2008. In addition, the follow up with respect to genetic counseling and carrier screening in the families is discussed. PMID:22778927

  8. 454 next generation-sequencing outperforms allele-specific PCR, Sanger sequencing, and pyrosequencing for routine KRAS mutation analysis of formalin-fixed, paraffin-embedded samples

    Science.gov (United States)

    Altimari, Annalisa; de Biase, Dario; De Maglio, Giovanna; Gruppioni, Elisa; Capizzi, Elisa; Degiovanni, Alessio; D’Errico, Antonia; Pession, Annalisa; Pizzolitto, Stefano; Fiorentino, Michelangelo; Tallini, Giovanni

    2013-01-01

    Detection of KRAS mutations in archival pathology samples is critical for therapeutic appropriateness of anti-EGFR monoclonal antibodies in colorectal cancer. We compared the sensitivity, specificity, and accuracy of Sanger sequencing, ARMS-Scorpion (TheraScreen®) real-time polymerase chain reaction (PCR), pyrosequencing, chip array hybridization, and 454 next-generation sequencing to assess KRAS codon 12 and 13 mutations in 60 nonconsecutive selected cases of colorectal cancer. Twenty of the 60 cases were detected as wild-type KRAS by all methods with 100% specificity. Among the 40 mutated cases, 13 were discrepant with at least one method. The sensitivity was 85%, 90%, 93%, and 92%, and the accuracy was 90%, 93%, 95%, and 95% for Sanger sequencing, TheraScreen real-time PCR, pyrosequencing, and chip array hybridization, respectively. The main limitation of Sanger sequencing was its low analytical sensitivity, whereas TheraScreen real-time PCR, pyrosequencing, and chip array hybridization showed higher sensitivity but suffered from the limitations of predesigned assays. Concordance between the methods was k = 0.79 for Sanger sequencing and k > 0.85 for the other techniques. Tumor cell enrichment correlated significantly with the abundance of KRAS-mutated deoxyribonucleic acid (DNA), evaluated as ΔCt for TheraScreen real-time PCR (P = 0.03), percentage of mutation for pyrosequencing (P = 0.001), ratio for chip array hybridization (P = 0.003), and percentage of mutation for 454 next-generation sequencing (P = 0.004). Also, 454 next-generation sequencing showed the best cross correlation for quantification of mutation abundance compared with all the other methods (P < 0.001). Our comparison showed the superiority of next-generation sequencing over the other techniques in terms of sensitivity and specificity. Next-generation sequencing will replace Sanger sequencing as the reference technique for diagnostic detection of KRAS mutation in archival tumor tissues. PMID
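
    The relationship between the reported sensitivity, specificity and accuracy can be checked with a short sketch. The confusion-matrix counts below are an assumption: they are inferred from the 85% sensitivity over 40 mutated cases and the 100% specificity over 20 wild-type cases reported for Sanger sequencing, and are not stated in the abstract.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # mutated cases correctly detected
    specificity = tn / (tn + fp)            # wild-type cases correctly called
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred (assumed) counts for Sanger sequencing: 34 of 40 mutated cases
# detected, 20 of 20 wild-type cases called wild-type.
sens, spec, acc = diagnostic_metrics(tp=34, fn=6, tn=20, fp=0)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

    Under these assumed counts the accuracy is (34 + 20) / 60 = 90%, which agrees with the figure reported for Sanger sequencing.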

  9. Comparison of Illumina de novo assembled and Sanger sequenced viral genomes: A case study for RNA viruses recovered from the plant pathogenic fungus Sclerotinia sclerotiorum.

    Science.gov (United States)

    Khalifa, Mahmoud E; Varsani, Arvind; Ganley, Austen R D; Pearson, Michael N

    2016-07-02

    The advent of 'next generation sequencing' (NGS) technologies has led to the discovery of many novel mycoviruses, the majority of which are sufficiently different from previously sequenced viruses that there is no appropriate reference sequence on which to base the sequence assembly. Although many new genome sequences are generated by NGS, confirmation of the sequence by Sanger sequencing is still essential for formal classification by the International Committee for the Taxonomy of Viruses (ICTV), although this is currently under review. To empirically test the validity of de novo assembled mycovirus genomes from dsRNA extracts, we compared the results from Illumina sequencing with those from random cloning plus targeted PCR coupled with Sanger sequencing for viruses from five Sclerotinia sclerotiorum isolates. Through Sanger sequencing we detected nine viral genomes, while through Illumina sequencing we detected the same nine viruses plus one additional virus from the same samples. Critically, the Illumina-derived sequences share >99.3% identity with those obtained by cloning and Sanger sequencing. Although there is scope for errors in de novo assembled viral genomes, our results demonstrate that by maximising the proportion of viral sequence in the data and using sufficiently rigorous quality controls, it is possible to generate de novo genome sequences from Illumina sequencing of comparable accuracy to those obtained by Sanger sequencing. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Identification of novel BRCA founder mutations in Middle Eastern breast cancer patients using capture and Sanger sequencing analysis.

    Science.gov (United States)

    Bu, Rong; Siraj, Abdul K; Al-Obaisi, Khadija A S; Beg, Shaham; Al Hazmi, Mohsen; Ajarim, Dahish; Tulbah, Asma; Al-Dayel, Fouad; Al-Kuraya, Khawla S

    2016-09-01

    Ethnic differences in breast cancer genomics have prompted us to investigate the spectra of BRCA1 and BRCA2 mutations in different populations. The prevalence and effect of BRCA1 and BRCA2 mutations in the Middle Eastern population are not fully explored. To characterize the prevalence of BRCA mutations in Middle Eastern breast cancer patients, BRCA mutation screening was performed in 818 unselected breast cancer patients using Capture and/or Sanger sequencing. Nineteen short tandem repeat (STR) markers were used for founder mutation analysis. In our study, nine different types of deleterious mutation were identified in 28 (3.4%) cases, 25 (89.3%) cases in BRCA1 and 3 (10.7%) cases in BRCA2. Seven recurrent mutations accounted for 92.9% (26/28) of all the mutant cases. Haplotype analysis was performed to confirm the c.1140dupG and c.4136_4137delCT mutations as novel putative founder mutations, accounting for 46.4% (13/28) of all BRCA mutant cases and 1.6% (13/818) of all the breast cancer cases, respectively. Moreover, BRCA1 mutation was significantly associated with loss of BRCA1 protein expression (p = 0.0005). Our findings revealed that a substantial number of BRCA mutations were identified in clinically high-risk breast cancer from the Middle East region. Identification of the mutation spectrum, prevalence and founder effect in the Middle Eastern population facilitates genetic counseling, risk assessment and the development of cost-effective screening strategies. © 2016 UICC.

  11. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    Directory of Open Access Journals (Sweden)

    Himabindu Kudapa

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprises 46,369 transcript assembly contigs (TACs) and has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as with the chickpea genome. Comparative analysis of CaTA v2 against the transcriptomes of three legumes (Medicago, soybean and common bean) resulted in 27,771 TACs common to all three legumes, indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using the transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups was validated on five chickpea genotypes and showed 20% polymorphism with an average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and the novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding
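
    For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows how it is commonly computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The contig lengths are toy values, not data from the chickpea assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length of the contig at which the cumulative sum of sorted
    (longest first) contig lengths reaches half the total length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 is undefined for an empty list of contigs")


# Toy contig lengths (hypothetical): total 30, half 15; 10 + 8 = 18 >= 15
print(n50([2, 4, 6, 8, 10]))
```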

  12. A comparison of parallel pyrosequencing and sanger clone-based sequencing and its impact on the characterization of the genetic diversity of HIV-1.

    Directory of Open Access Journals (Sweden)

    Binhua Liang

    BACKGROUND: Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution. METHODOLOGY/PRINCIPAL FINDINGS: The HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary-based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. 11,050 of these variations were detected only by pyrosequencing. These undetected variations were located in the HIV-1 Gag region, which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions. CONCLUSIONS/SIGNIFICANCE: Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.

  13. 454 next generation-sequencing outperforms allele-specific PCR, Sanger sequencing, and pyrosequencing for routine KRAS mutation analysis of formalin-fixed, paraffin-embedded samples

    Directory of Open Access Journals (Sweden)

    Altimari A

    2013-08-01

    Annalisa Altimari,1,* Dario de Biase,2,* Giovanna De Maglio,3 Elisa Gruppioni,1 Elisa Capizzi,1 Alessio Degiovanni,1 Antonia D'Errico,1 Annalisa Pession,2 Stefano Pizzolitto,3 Michelangelo Fiorentino,1,# Giovanni Tallini,2,#

    1Laboratory of Molecular Oncologic and Transplantation Pathology, S. Orsola-Malpighi Hospital, Bologna, 2Laboratory of Molecular Pathology, Anatomic Pathology, Bellaria Hospital, Bologna, 3Department of Pathology, S. Maria della Misericordia Hospital, Udine, Italy

    *These authors contributed equally to this work. #These authors share senior authorship.

    Abstract: Detection of KRAS mutations in archival pathology samples is critical for therapeutic appropriateness of anti-EGFR monoclonal antibodies in colorectal cancer. We compared the sensitivity, specificity, and accuracy of Sanger sequencing, ARMS-Scorpion (TheraScreen®) real-time polymerase chain reaction (PCR), pyrosequencing, chip array hybridization, and 454 next-generation sequencing to assess KRAS codon 12 and 13 mutations in 60 nonconsecutive selected cases of colorectal cancer. Twenty of the 60 cases were detected as wild-type KRAS by all methods with 100% specificity. Among the 40 mutated cases, 13 were discrepant with at least one method. The sensitivity was 85%, 90%, 93%, and 92%, and the accuracy was 90%, 93%, 95%, and 95% for Sanger sequencing, TheraScreen real-time PCR, pyrosequencing, and chip array hybridization, respectively. The main limitation of Sanger sequencing was its low analytical sensitivity, whereas TheraScreen real-time PCR, pyrosequencing, and chip array hybridization showed higher sensitivity but suffered from the limitations of predesigned assays. Concordance between the methods was k = 0.79 for Sanger sequencing and k > 0.85 for the other techniques. Tumor cell enrichment correlated significantly with the abundance of KRAS-mutated deoxyribonucleic acid (DNA), evaluated as ΔCt for TheraScreen real-time PCR (P = 0.03), percentage of mutation for

  14. Screening for duplications, deletions and a common intronic mutation detects 35% of second mutations in patients with USH2A monoallelic mutations on Sanger sequencing.

    Science.gov (United States)

    Steele-Stallard, Heather B; Le Quesne Stabej, Polona; Lenassi, Eva; Luxon, Linda M; Claustres, Mireille; Roux, Anne-Francoise; Webster, Andrew R; Bitner-Glindzicz, Maria

    2013-08-08

    Usher Syndrome is the leading cause of inherited deaf-blindness. It is divided into three subtypes, of which the most common is Usher type 2, and the USH2A gene accounts for 75-80% of cases. Despite recent sequencing strategies, in our cohort a significant proportion of individuals with Usher type 2 have just one heterozygous disease-causing mutation in USH2A, or no convincing disease-causing mutations across nine Usher genes. The purpose of this study was to improve the molecular diagnosis in these families by screening USH2A for duplications, heterozygous deletions and a common pathogenic deep intronic variant USH2A: c.7595-2144A>G. Forty-nine Usher type 2 or atypical Usher families who had missing mutations (mono-allelic USH2A or no mutations following Sanger sequencing of nine Usher genes) were screened for duplications/deletions using the USH2A SALSA MLPA reagent kit (MRC-Holland). Identification of USH2A: c.7595-2144A>G was achieved by Sanger sequencing. Mutations were confirmed by a combination of reverse transcription PCR using RNA extracted from nasal epithelial cells or fibroblasts, and by array comparative genomic hybridisation with sequencing across the genomic breakpoints. Eight mutations were identified in 23 Usher type 2 families (35%) with one previously identified heterozygous disease-causing mutation in USH2A. These consisted of five heterozygous deletions, one duplication, and two heterozygous instances of the pathogenic variant USH2A: c.7595-2144A>G. No variants were found in the 15 Usher type 2 families with no previously identified disease-causing mutations. In 11 atypical families, none of whom had any previously identified convincing disease-causing mutations, the mutation USH2A: c.7595-2144A>G was identified in a heterozygous state in one family. All five deletions and the heterozygous duplication we report here are novel. This is the first time that a duplication in USH2A has been reported as a cause of Usher syndrome. We found that 8 of

  15. Barcoding the food chain: from Sanger to high-throughput sequencing.

    Science.gov (United States)

    Littlefair, Joanne E; Clare, Elizabeth L

    2016-11-01

    Society faces the complex challenge of supporting biodiversity and ecosystem functioning, while ensuring food security by providing safe traceable food through an ever-more-complex global food chain. The increase in human mobility brings the added threat of pests, parasites, and invaders that further complicate our agro-industrial efforts. DNA barcoding technologies allow researchers to identify both individual species, and, when combined with universal primers and high-throughput sequencing techniques, the diversity within mixed samples (metabarcoding). These tools are already being employed to detect market substitutions, trace pests through the forensic evaluation of trace "environmental DNA", and to track parasitic infections in livestock. The potential of DNA barcoding to contribute to increased security of the food chain is clear, but challenges remain in regulation and the need for validation of experimental analysis. Here, we present an overview of the current uses and challenges of applied DNA barcoding in agriculture, from agro-ecosystems within farmland to the kitchen table.

  16. The empirical power of rare variant association methods: results from sanger sequencing in 1,998 individuals.

    Directory of Open Access Journals (Sweden)

    Martin Ladouceur

    2012-02-01

    The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to assess the contribution of rare variation to complex disease etiology. Nevertheless, no empirical estimates comparing their relative power are available. We therefore assessed the parameters that influence their statistical power in 1,998 individuals Sanger-sequenced at seven genes by modeling different distributions of effect, proportions of causal variants, and direction of the associations (deleterious, protective, or both) in simulated continuous trait and case/control phenotypes. Our results demonstrate that the power of recently proposed statistical methods depends strongly on the underlying hypotheses concerning the relationship of phenotypes with each of these three factors. No method demonstrates consistently acceptable power despite this large sample size, and the performance of each method depends upon the underlying assumption of the relationship between rare variants and complex traits. Sensitivity analyses are therefore recommended to compare the stability of the results arising from different methods, and promising results should be replicated using the same method in an independent sample. These findings provide guidance in the analysis and interpretation of the role of rare base-pair variation in the etiology of complex traits and diseases.

  17. Sanger DNA-sequencing reactions performed in a solid-phase nanoreactor directly coupled to capillary gel electrophoresis.

    Science.gov (United States)

    Soper, S A; Williams, D C; Xu, Y; Lassiter, S J; Zhang, Y; Ford, S M; Bruch, R C

    1998-10-01

    A miniaturized, solid-phase nanoreactor was developed to prepare Sanger DNA-sequencing ladders and was directly interfaced to a capillary gel electrophoresis system. A biotinylated fragment of the rat brain actin gene (1 kbp) was amplified by PCR and attached to the interior wall of an (aminoalkyl)silane-derivatized fused-silica capillary tube via a biotin/streptavidin/biotin linkage. Coverage of the capillary wall with the biotinylated DNA averaged 77 ± 10%. Stability of the anchored template under pressure (33 nL/s) and electroosmotic flows (11.3 nL/s) was favorable, requiring rinsing for > 150 h to reduce the surface coverage by only 50%. In addition, the immobilized template was stable toward temperatures required for preparing sequencing ladders, even under cycling conditions. Standard Sanger dideoxynucleotide termination performed in a large-volume (approximately 8 µL) solid-phase reactor using the thermally stable polymerase enzymes Taq and Vent and the polymerases T7 and Bst with off-line slab gel electrophoresis and autoradiographic detection indicated that acceptable fragment generation was achieved only in the case of the thermally stable polymerases. Banding was not apparent for T7 and Bst since all reagents were inserted into the column in a single plug at the beginning of the reaction. A small-volume reactor (volume approximately 62 nL) was then used to perform DNA polymerase reactions and was coupled directly to a capillary gel column for separation. The capillary reactor was placed inside a thermocycler to control the temperature during chain extension and was directly connected to the gel column via zero dead volume fused-silica connectors. The complementary DNA fragments generated (C-track only) in the reactor were denatured using heat and directly injected onto the gel-filled capillary for size separation with detection accomplished using near-IR laser-induced fluorescence. Extension and single-base separation resolution of the C

  18. Comparison of targeted next-generation sequencing and Sanger sequencing for the detection of PIK3CA mutations in breast cancer.

    Science.gov (United States)

    Arsenic, Ruza; Treue, Denise; Lehmann, Annika; Hummel, Michael; Dietel, Manfred; Denkert, Carsten; Budczies, Jan

    2015-01-01

    Phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha (PIK3CA) is one of the most frequently mutated genes in breast cancer, and the mutation status of PIK3CA has clinical relevance related to response to therapy. The aim of our study was to investigate the mutation status of the PIK3CA gene, to evaluate the concordance between next-generation sequencing (NGS) and Sanger sequencing (SGS) for the most important hotspot regions in exons 9 and 20, to investigate additional hotspots outside of these exons using NGS, and to correlate the PIK3CA mutation status with the clinicopathological characteristics of the cohort. In the current study, NGS and SGS were used for the mutational analysis of PIK3CA in 186 breast carcinomas. Altogether, 64 tumors had PIK3CA mutations; 55 of these mutations occurred in exons 9 and 20. Out of these 55 mutations, 52 could also be detected by Sanger sequencing, resulting in a concordance of 98.4% between the two sequencing methods. The three mutations missed by SGS had low variant frequencies below 10%. Additionally, 4.8% of the tumors had mutations in exons 1, 4, 7, and 13 of PIK3CA that were not detected by SGS. PIK3CA mutation status was significantly associated with hormone receptor-positivity, HER2-negativity, tumor grade, and lymph node involvement. However, there was no statistically significant association between the PIK3CA mutation status and overall survival. Based on our study, NGS is recommended as follows: 1) for correctly assessing the mutation status of PIK3CA in breast cancer, especially for cases with low tumor content, 2) for the detection of subclonal mutations, and 3) for simultaneous mutation detection in multiple exons.
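
    The 98.4% concordance can be reproduced with a few lines of arithmetic. This sketch assumes that concordance is computed per tumor over all 186 carcinomas and that each of the three mutations missed by Sanger sequencing lies in a different tumor; the abstract does not state the denominator explicitly, but the reported figure is consistent with this reading.

```python
total_tumors = 186             # breast carcinomas analysed
hotspot_mutations_ngs = 55     # exon 9/20 mutations found by NGS
hotspot_mutations_sanger = 52  # of those, also found by Sanger sequencing

# Assumption: one discordant tumor per missed mutation
discordant = hotspot_mutations_ngs - hotspot_mutations_sanger
concordance = (total_tumors - discordant) / total_tumors

# Share of the NGS-detected hotspot mutations that Sanger also detected
sanger_detection_rate = hotspot_mutations_sanger / hotspot_mutations_ngs

print(f"per-tumor concordance: {concordance:.1%}")
print(f"Sanger detection rate in exons 9 and 20: {sanger_detection_rate:.1%}")
```

    The two ratios answer different questions: the first measures agreement across the whole cohort, the second how many of the hotspot mutations Sanger sequencing recovers.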

  19. Comparison of three human papillomavirus DNA detection methods: Next generation sequencing, multiplex-PCR and nested-PCR followed by Sanger based sequencing.

    Science.gov (United States)

    da Fonseca, Allex Jardim; Galvão, Renata Silva; Miranda, Angelica Espinosa; Ferreira, Luiz Carlos de Lima; Chen, Zigui

    2016-05-01

    To compare the diagnostic performance for HPV infection using three laboratory techniques. Ninety-five cervicovaginal samples were randomly selected; each was tested for HPV DNA and genotypes using three methods in parallel: Multiplex-PCR, Nested-PCR followed by Sanger sequencing, and next-generation sequencing (NGS) with two assays (NGS-A1, NGS-A2). The study was approved by the Brazilian National IRB (CONEP protocol 16,800). The prevalence of HPV by the NGS assays was higher than that using the Multiplex-PCR (64.2% vs. 45.2%, respectively; P = 0.001) and the Nested-PCR (64.2% vs. 49.5%, respectively; P = 0.003). NGS also showed better performance in detecting high-risk HPV (HR-HPV) and HPV16. There was weak interobserver agreement between the results of Multiplex-PCR and Nested-PCR in relation to NGS for the diagnosis of HPV infection, and a moderate correlation for HR-HPV detection. Both NGS assays showed a strong correlation for detection of HPVs (k = 0.86), HR-HPVs (k = 0.91), HPV16 (k = 0.92) and HPV18 (k = 0.91). NGS is more sensitive than traditional Sanger sequencing and Multiplex-PCR for genotyping HPVs, with promising ability to detect multiple infections, and may have the potential to establish an alternative method for the diagnosis and genotyping of HPV. © 2015 Wiley Periodicals, Inc.
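
    The k values above are agreement statistics; assuming they are Cohen's kappa for two assays with positive/negative results (the abstract does not name the statistic), a minimal sketch of the calculation from a 2x2 table looks like this. The counts are hypothetical and are not the study's data.

```python
def cohen_kappa(both_pos: int, only_a: int, only_b: int, both_neg: int) -> float:
    """Cohen's kappa for two binary assays A and B from a 2x2 agreement table."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n   # observed agreement
    a_pos = (both_pos + only_a) / n        # fraction positive by assay A
    b_pos = (both_pos + only_b) / n        # fraction positive by assay B
    # Agreement expected by chance if the two assays were independent
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:                      # both assays gave one constant result
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 40 positive by both, 5 by A only, 5 by B only, 50 negative by both
kappa = cohen_kappa(both_pos=40, only_a=5, only_b=5, both_neg=50)
print(round(kappa, 3))
```

    Kappa corrects the raw agreement (here 90%) for the agreement expected by chance, which is why it is lower than the simple percentage of matching calls.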

  20. Sanger sequencing as a first-line approach for molecular diagnosis of Andersen-Tawil syndrome [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Armando Totomoch-Serra

    2017-06-01

    Full Text Available In 1977, Frederick Sanger developed a new method for DNA sequencing based on chain termination, now known as the Sanger sequencing method (SSM). Recently, massively parallel sequencing, better known as next-generation sequencing (NGS), has been replacing the SSM for detecting mutations in cardiovascular diseases with a genetic background. This opinion article emphasizes that “targeted” SSM is still effective as a first-line approach for the molecular diagnosis of some specific conditions, as is the case for Andersen-Tawil syndrome (ATS). ATS is described as a rare multisystemic autosomal dominant channelopathy syndrome caused mainly by a heterozygous mutation in the KCNJ2 gene. KCNJ2 has particular characteristics that make it attractive for “directed” SSM. KCNJ2 has a sequence of 17,510 base pairs (bp) and a short coding region with two exons (exon 1=166 bp and exon 2=5220 bp); half of the mutations are located in the C-terminal cytosolic domain, a mutational hotspot has been described in residue Arg218, and this gene explains the phenotype in 60% of ATS cases that fulfill all the clinical criteria of the disease. In order to increase the diagnosis of ATS we urge cardiologists to search for facial and muscular abnormalities in subjects with frequent ventricular arrhythmias (especially bigeminy) and prominent U waves on the electrocardiogram.

  1. The empirical power of rare variant association methods: Results from sanger sequencing in 1,998 individuals

    NARCIS (Netherlands)

    M. Ladouceur (Martin); Z. Dastani (Zari); Y.S. Aulchenko (Yurii); J.P. Greenwood (John); J.B. Richards (Brent)

    2012-01-01

    The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to

  2. Sanger Sequencing for BRCA1 c.68_69del, BRCA1 c.5266dup and BRCA2 c.5946del Mutation Screen on Pap Smear Cytology Samples

    Directory of Open Access Journals (Sweden)

    Sin Hang Lee

    2016-02-01

    Full Text Available Three sets of polymerase chain reaction (PCR) primers were designed for heminested PCR amplification of the target DNA fragments in the human genome which include the sites of BRCA1 c.68_69del, BRCA1 c.5266dup and BRCA2 c.5946del, respectively, to prepare the templates for a direct Sanger sequencing screen of these three founder mutations. With a robust PCR mixture, crude proteinase K digestate of the fixed cervicovaginal cells in liquid-based Papanicolaou (Pap) cytology specimens can be used as the sample for target DNA amplification without pre-PCR DNA extraction, purification and quantitation. The post-PCR products can be used directly as the sequencing templates without further purification or quantitation. By simplifying the frontend procedures for template preparation, the cost of screening for these three founder mutations can be reduced to about US $200 per test when performed in conjunction with the human papillomavirus (HPV) assays now routinely ordered for cervical cancer prevention. With this projected price structure, selected patients in a high-risk population can be tested and each provided with a set of DNA sequencing electropherograms to document the absence or presence of these founder mutations in her genome, to help assess inherited susceptibility to breast and ovarian cancer in this era of precision molecular personalized medicine.

  3. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Full Text Available Background: Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~ 1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results: Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons, resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions: The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.
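
    Mining sequences for microsatellites, as described above, amounts to scanning for short motifs repeated in tandem. The following is a minimal sketch using a regular expression. The function name `find_ssrs`, the motif length range (2 to 6 bp), and the minimum repeat count are illustrative assumptions; they are not the parameters or the software used by the authors.

```python
import re

def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: thresholds are illustrative defaults only.
    """
    seq = seq.upper()
    # Lazy motif match so the shortest repeating unit is reported first,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAA..., which are not
            # di- to hexanucleotide repeats.
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits

print(find_ssrs("CCAGAGAGAGAGTT"))
```

    This sketch only reports perfect, non-overlapping repeats on the given strand; dedicated microsatellite-mining tools also handle imperfect and compound repeats.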

  4. Homozygosity Mapping and Targeted Sanger Sequencing Identifies Three Novel CRB1 (Crumbs homologue 1) Mutations in Iranian Retinal Degeneration Families

    NARCIS (Netherlands)

    Ghofrani, M.; Yahyaei, M.; Brunner, H.G.; Cremers, F.P.; Movasat, M.; Khan, M.; Keramatipour, M.

    2017-01-01

    Background: Inherited retinal diseases (IRDs) are a group of genetic disorders with high degrees of clinical, genetic and allelic heterogeneity. IRDs generally show progressive retinal cell death resulting in gradual vision loss. IRDs constitute a broad spectrum of disorders including retinitis

  5. Exome sequencing identifies RDH12 compound heterozygous mutations in a family with severe retinitis pigmentosa.

    Science.gov (United States)

    Chacon-Camacho, Oscar F; Jitskii, Serguei; Buentello-Volante, Beatriz; Quevedo-Martinez, Jonathan; Zenteno, Juan C

    2013-10-10

    Retinitis pigmentosa (RP) is the most prevalent type of inherited retinal degeneration and one of the commonest causes of genetically determined visual dysfunction worldwide. To date, approximately 35 genes have been associated with nonsyndromic autosomal recessive RP (arRP); however, the small contribution of each gene to the total prevalence of arRP and the lack of a clear genotype-phenotype correlation complicate the genetic analysis in affected patients. Next generation sequencing technologies are powerful and cost-effective methods for detecting causative mutations in both sporadic and familial RP cases. A Mexican family with 5 members affected by arRP was studied. All patients underwent a complete ophthalmologic examination. Molecular methods included genome-wide SNP homozygosity mapping, exome sequencing analysis, and Sanger-sequencing confirmation of causal mutations. No regions of shared homozygosity among affected subjects were identified. Exome sequencing in a single patient allowed the detection of two missense mutations in the RDH12 gene: a c.446T>C transition predicting a novel p.L149P substitution, and a c.295C>A transversion predicting a previously reported p.L99I replacement. Sanger sequencing confirmed that all affected subjects carried both RDH12 mutations. This study adds to the molecular spectrum of RDH12-related retinopathy and offers an additional example of the power of exome sequencing in the diagnosis of recessively inherited retinal degenerations. © 2013 Elsevier B.V. All rights reserved.

  6. Detection of variations and identifying genomic breakpoints for large deletions in the LDLR by Ion Torrent semiconductor sequencing.

    Science.gov (United States)

    Faiz, Fathimath; Allcock, Richard J; Hooper, Amanda J; van Bockxmeer, Frank M

    2013-10-01

    The aims of this study were to 1) compare LDLR variant detection between Ion Torrent Personal Genome Machine (PGM) sequencing and conventional methods used for familial hypercholesterolaemia (FH) diagnosis i.e. exon-by-exon sequence analysis and multiplex ligation-dependent probe amplification (MLPA) and 2) identify genomic breakpoints for 12 cases of large deletions in LDLR previously identified by MLPA. Thirty FH patient samples were selected, 22 with mutations previously determined. Primers were designed and optimised to generate six amplicons covering the entire LDLR and sequenced on a PGM. An additional twelve samples carrying MLPA variants were sequenced on the PGM followed by Sanger sequencing to establish the breakpoints. A total of 2179 LDLR variants were identified in the 30 samples, with 383 variants in the region sequenced that was common to both PGM and Sanger methods. Three discrepancies were identified; two of these were identified by visual inspection of the BAM files, whilst the remaining discrepancy was likely an artefact of the PCR approach. Approximate genomic breakpoints for the 12 MLPA variants were identified using PGM sequencing, and Sanger sequencing of these regions established causative breakpoints. Eleven different rearrangements/mutational events were found, with eight out of eleven occurring in Alus. Two of the three samples with exons 2-6del had identical breakpoints. Two samples with exons 11-12del had unique breakpoints, indicating separate ancestral origin or mutational events. This study showed that Ion Torrent PGM sequencing is an accurate and efficient method to detect LDLR variants while providing additional information such as genomic breakpoints. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  7. Single-Exome sequencing identified a novel RP2 mutation in a child with X-linked retinitis pigmentosa.

    Science.gov (United States)

    Lim, Hassol; Park, Young-Mi; Lee, Jong-Keuk; Taek Lim, Hyun

    2016-10-01

    To present an efficient and successful application of a single-exome sequencing study in a family clinically diagnosed with X-linked retinitis pigmentosa. Exome sequencing study based on clinical examination data. An 8-year-old proband and his family. The proband and his family members underwent comprehensive ophthalmologic examinations. Exome sequencing was undertaken in the proband using Agilent SureSelect Human All Exon Kit and Illumina HiSeq 2000 platform. Bioinformatic analysis used Illumina pipeline with Burrows-Wheeler Aligner-Genome Analysis Toolkit (BWA-GATK), followed by ANNOVAR to perform variant functional annotation. All variants passing filter criteria were validated by Sanger sequencing to confirm familial segregation. Analysis of exome sequence data identified a novel frameshift mutation in RP2 gene resulting in a premature stop codon (c.665delC, p.Pro222fsTer237). Sanger sequencing revealed this mutation co-segregated with the disease phenotype in the child's family. We identified a novel causative mutation in RP2 from a single proband's exome sequence data analysis. This study highlights the effectiveness of the whole-exome sequencing in the genetic diagnosis of X-linked retinitis pigmentosa, over the conventional sequencing methods. Even using a single exome, exome sequencing technology would be able to pinpoint pathogenic variant(s) for X-linked retinitis pigmentosa, when properly applied with aid of adequate variant filtering strategy. Copyright © 2016 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.

  8. Prevalence of Drug-Resistant Minority Variants in Untreated HIV-1-Infected Individuals With and Those Without Transmitted Drug Resistance Detected by Sanger Sequencing.

    Science.gov (United States)

    Clutter, Dana S; Zhou, Shuntai; Varghese, Vici; Rhee, Soo-Yon; Pinsky, Benjamin A; Jeffrey Fessel, W; Klein, Daniel B; Spielvogel, Ean; Holmes, Susan P; Hurley, Leo B; Silverberg, Michael J; Swanstrom, Ronald; Shafer, Robert W

    2017-08-01

    Minority variant human immunodeficiency virus type 1 (HIV-1) nonnucleoside reverse transcriptase inhibitor (NNRTI) resistance mutations are associated with an increased risk of virological failure during treatment with NNRTI-containing regimens. To determine whether individuals to whom variants with isolated NNRTI-associated drug resistance were transmitted are at increased risk of virological failure during treatment with a non-NNRTI-containing regimen, we identified minority variant resistance mutations in 33 individuals with isolated NNRTI-associated transmitted drug resistance and 49 matched controls. We found similar proportions of overall and nucleoside reverse transcriptase inhibitor-associated minority variant resistance mutations in both groups, suggesting that isolated NNRTI-associated transmitted drug resistance may not be a risk factor for virological failure during treatment with a non-NNRTI-containing regimen. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.

  9. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, noise, and random mutations. Here, we review computational approaches to identify somatic mutations in cancer genome sequences and to distinguish the driver mutations that are responsible for cancer from random, passenger mutations. First, we describe approaches to detect somatic mutations from high-throughput DNA sequencing data, particularly for tumor samples that comprise heterogeneous populations of cells. Next, we review computational approaches that aim to predict driver mutations according to their frequency of occurrence in a cohort of samples, or according to their predicted functional impact on protein...

  10. Electrostatic Potential Maps and Natural Bond Orbital Analysis: Visualization and Conceptualization of Reactivity in Sanger's Reagent

    Science.gov (United States)

    Mottishaw, Jeffery D.; Erck, Adam R.; Kramer, Jordan H.; Sun, Haoran; Koppang, Miles

    2015-01-01

    Frederick Sanger's early work on protein sequencing through the use of colorimetric labeling combined with liquid chromatography involves an important nucleophilic aromatic substitution (SNAr) reaction in which the N-terminus of a protein is tagged with Sanger's reagent. Understanding the inherent differences between this S[subscript…

  11. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    DEFF Research Database (Denmark)

    Hu, H; Haas, S A; Chelly, J

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes ... of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1...

  12. Analysis of unannotated equine transcripts identified by mRNA sequencing.

    Directory of Open Access Journals (Sweden)

    Stephen J Coleman

    Full Text Available Sequencing of equine mRNA (RNA-seq) identified 428 putative transcripts which do not map to any previously annotated or predicted horse genes. Most of these encode the equine homologs of known protein-coding genes described in other species, yet the potential exists to identify novel and perhaps equine-specific gene structures. A set of 36 transcripts were prioritized for further study by filtering for levels of expression (depth of RNA-seq read coverage), distance from annotated features in the equine genome, the number of putative exons, and patterns of gene expression between tissues. From these, four were selected for further investigation based on predicted open reading frames of greater than or equal to 50 amino acids and lack of detectable homology to known genes across species. Sanger sequencing of RT-PCR amplicons from additional equine samples confirmed expression and structural annotation of each transcript. Functional predictions were made by conserved domain searches. A single transcript, expressed in the cerebellum, contains a putative kruppel-associated box (KRAB) domain, suggesting a potential function associated with zinc finger proteins and transcriptional regulation. Overall levels of conserved synteny and sequence conservation across a 1 Mb region surrounding each transcript were approximately 73% compared to the human, canine, and bovine genomes; however, the four loci display some areas of low conservation and sequence inversion in regions that immediately flank these previously unannotated equine transcripts. Taken together, the evidence suggests that these four transcripts are likely to be equine-specific.
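
    The open-reading-frame criterion above (at least 50 amino acids) can be illustrated with a small filter. This is a sketch under stated assumptions: the function names are hypothetical, only the three forward reading frames are scanned, an ORF is taken to run from the first ATG to the next in-frame stop codon, and the start codon is counted as one residue. The abstract does not say how the authors predicted ORFs.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf_codons(seq):
    """Length in codons (start included, stop excluded) of the longest
    ATG...stop ORF on the forward strand. Hypothetical helper."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the current in-frame ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best

def passes_orf_filter(seq, min_aa=50):
    """True if the transcript has a forward-strand ORF of at least min_aa residues."""
    return longest_orf_codons(seq) >= min_aa

# Synthetic example sequence, not taken from the study.
example = "ATG" + "GCT" * 49 + "TAA"
print(longest_orf_codons(example), passes_orf_filter(example))
```

    A fuller treatment would also scan the reverse complement and handle ORFs that run off the end of a partial transcript.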

  13. SHROOM3 is a novel candidate for heterotaxy identified by whole exome sequencing.

    Science.gov (United States)

    Tariq, Muhammad; Belmont, John W; Lalani, Seema; Smolarek, Teresa; Ware, Stephanie M

    2011-09-21

    Heterotaxy-spectrum cardiovascular disorders are challenging for traditional genetic analyses because of clinical and genetic heterogeneity, variable expressivity, and non-penetrance. In this study, high-resolution SNP genotyping and exon-targeted array comparative genomic hybridization platforms were coupled to whole-exome sequencing to identify a novel disease candidate gene. SNP genotyping identified absence-of-heterozygosity regions in the heterotaxy proband on chromosomes 1, 4, 7, 13, 15, 18, consistent with parental consanguinity. Subsequently, whole-exome sequencing of the proband identified 26,065 coding variants, including 18 non-synonymous homozygous changes not present in dbSNP132 or 1000 Genomes. Of these 18, only 4 (one each in CXCL2, SHROOM3, CTSO, RXFP1) were mapped to the absence-of-heterozygosity regions, each of which was flanked by more than 50 homozygous SNPs, confirming recessive segregation of mutant alleles. Sanger sequencing confirmed the SHROOM3 homozygous missense mutation and it was predicted as pathogenic by four bioinformatic tools. SHROOM3 has been identified as a central regulator of morphogenetic cell shape changes necessary for organogenesis and can physically bind ROCK2, a rho kinase protein required for left-right patterning. Screening 96 sporadic heterotaxy patients identified four additional patients with rare variants in SHROOM3. Using whole exome sequencing, we identify a recessive missense mutation in SHROOM3 associated with heterotaxy syndrome and identify rare variants in subsequent screening of a heterotaxy cohort, suggesting SHROOM3 as a novel target for the control of left-right patterning. This study reveals the value of SNP genotyping coupled with high-throughput sequencing for identification of high yield candidates for rare disorders with genetic and phenotypic heterogeneity.

  14. Frederick Sanger, Erwin Chargaff, and the metamorphosis of specificity.

    Science.gov (United States)

    Judson, H F

    1993-12-15

    That a transformation of ruling ideas in genetics and biochemistry took place at the dawn of molecular biology, in the late 1940s, is a commonplace; but the nature and components of that transformation are widely misunderstood. The change is often identified with the importation into biology of new styles of thought and new rigor by the many scientists trained in physics or chemistry who came into the nascent field--notably, Max Delbrück, Max Perutz, Francis Crick, John Kendrew, Maurice Wilkins, Rosalind Franklin. Most generally, the change is supposed to be the realization that genes are made not of protein but of nucleic acid--and this change was initiated, of course, by the work of Oswald Avery and his colleagues. These changes are not mutually exclusive, and both were surely important to the genesis of molecular biology. But logically prior to them, more fundamental, was another transformation in ruling preconceptions, one that has been neglected: the revolution in understanding of the chemical structures--the sequences of subunits--of proteins and of nucleic acids which was wrought by the work of Frederick Sanger and of Erwin Chargaff. This was a metamorphosis in the understanding of biochemical specificity, and while it astonished many biochemists it set free the small groups of those who were beginning to call themselves molecular biologists, enabling them to think of the relationship between genes and proteins in entirely new ways.

  15. A founder AGL mutation causing glycogen storage disease type IIIa in Inuit identified through whole-exome sequencing: a case series.

    Science.gov (United States)

    Rousseau-Nepton, Isabelle; Okubo, Minoru; Grabs, Rosemarie; Mitchell, John; Polychronakos, Constantin; Rodd, Celia

    2015-02-03

    Glycogen storage disease type III is caused by mutations in both alleles of the AGL gene, which leads to reduced activity of glycogen-debranching enzyme. The clinical picture encompasses hypoglycemia, with glycogen accumulation leading to hepatomegaly and muscle involvement (skeletal and cardiac). We sought to identify the genetic cause of this disease within the Inuit community of Nunavik, in whom previous DNA sequencing had not identified such mutations. Five Inuit children with a clinical and biochemical diagnosis of glycogen storage disease type IIIa were recruited to undergo genetic testing: 2 underwent whole-exome sequencing and all 5 underwent Sanger sequencing to confirm the identified mutation. Selected DNA regions near the AGL gene were also sequenced to identify a potential founder effect in the community. In addition, control samples from 4 adults of European descent and 7 family members of the affected children were analyzed for the specific mutation by Sanger sequencing. We identified a homozygous frame-shift deletion, c.4456delT, in exon 33 of the AGL gene in 2 children by whole-exome sequencing. Confirmation by Sanger sequencing showed the same mutation in all 5 patients, and 5 family members were found to be carriers. With the identification of this mutation in 5 probands, the estimated prevalence of genetically confirmed glycogen storage disease type IIIa in this region is among the highest worldwide (1:2500). Despite identical mutations, we saw variations in clinical features of the disease. Our detection of a homozygous frameshift mutation in 5 Inuit children determines the cause of glycogen storage disease type IIIa and confirms a founder effect. © 2015 Canadian Medical Association or its licensors.

  16. Deep sequencing of uveal melanoma identifies a recurrent mutation in PLCB4

    DEFF Research Database (Denmark)

    Johansson, Peter; Aoude, Lauren G; Wadt, Karin

    2016-01-01

    Next generation sequencing of uveal melanoma (UM) samples has identified a number of recurrent oncogenic or loss-of-function mutations in key driver genes including: GNAQ, GNA11, EIF1AX, SF3B1 and BAP1. To search for additional driver mutations in this tumor type we carried out whole-genome or whole-exome sequencing of 28 tumors or primary cell lines. These samples have a low mutation burden, with a mean of 10.6 protein changing mutations per sample (range 0 to 53). As expected for these sun-shielded melanomas the mutation spectrum was not consistent with an ultraviolet radiation signature, instead, a BRCA mutation signature predominated. In addition to mutations in the known UM driver genes, we found a recurrent mutation in PLCB4 (c.G1888T, p.D630Y, NM_000933), which was validated using Sanger sequencing. The identical mutation was also found in published UM sequence data (1 of 56 tumors...

  17. Genetic mapping and exome sequencing identify variants associated with five novel diseases.

    Directory of Open Access Journals (Sweden)

    Erik G Puffenberger

    Full Text Available The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

  18. Identifying writing tasks using sequences of keystrokes

    NARCIS (Netherlands)

    Conijn, Rianne; van Zaanen, Menno

    2017-01-01

    The sequences of keystrokes that are generated when writing texts contain information about the writer as well as the writing task and cognitive aspects of the writing process. Much research has been conducted in the area of writer identification. However, research on the analysis of writing

  19. Deep sequencing to identify the causes of viral encephalitis.

    Directory of Open Access Journals (Sweden)

    Benjamin K Chan

    Full Text Available Deep sequencing allows for a rapid, accurate characterization of microbial DNA and RNA sequences in many types of samples. Deep sequencing (also called next generation sequencing or NGS) is being developed to assist with the diagnosis of a wide variety of infectious diseases. In this study, seven frozen brain samples from deceased subjects with recent encephalitis were investigated. RNA from each sample was extracted, randomly reverse transcribed and sequenced. The sequence analysis was performed in a blinded fashion and confirmed with pathogen-specific PCR. This analysis successfully identified measles virus sequences in two brain samples and herpes simplex virus type-1 sequences in three brain samples. No pathogen was identified in the other two brain specimens. These results were concordant with pathogen-specific PCR and partially concordant with prior neuropathological examinations, demonstrating that deep sequencing can accurately identify viral infections in frozen brain tissue.

  20. Common fusion transcripts identified in colorectal cancer cell lines by high-throughput RNA sequencing.

    Science.gov (United States)

    Nome, Torfinn; Thomassen, Gard Os; Bruun, Jarle; Ahlquist, Terje; Bakken, Anne C; Hoff, Andreas M; Rognum, Torleiv; Nesbakken, Arild; Lorenz, Susanne; Sun, Jinchang; Barros-Silva, João Diogo; Lind, Guro E; Myklebost, Ola; Teixeira, Manuel R; Meza-Zepeda, Leonardo A; Lothe, Ragnhild A; Skotheim, Rolf I

    2013-01-01

    Colorectal cancer (CRC) is the third most common cancer disease in the Western world, and about 40% of the patients die from this disease. The cancer cells are commonly genetically unstable, but only a few low-frequency recurrent fusion genes have so far been reported for this disease. In this study, we present a thorough search for novel fusion transcripts in CRC using high-throughput RNA sequencing. From altogether 220 million paired-end sequence reads from seven CRC cell lines, we identified 3391 candidate fused transcripts. By stringent requirements, we nominated 11 candidate fusion transcripts for further experimental validation, of which 10 were positive by reverse transcription-polymerase chain reaction and Sanger sequencing. Six were intrachromosomal fusion transcripts, and interestingly, three of these, AKAP13-PDE8A, COMMD10-AP3S1, and CTB-35F21.1-PSD2, were present in, respectively, 18, 18, and 20 of 21 analyzed cell lines and in, respectively, 18, 61, and 48 (17%-58%) of 106 primary cancer tissues. These three fusion transcripts were also detected in 2 to 4 of 14 normal colonic mucosa samples (14%-28%). Whole-genome sequencing identified a specific genomic breakpoint in COMMD10-AP3S1 and further indicates that both the COMMD10-AP3S1 and AKAP13-PDE8A fusion transcripts are due to genomic duplications in specific cell lines. In conclusion, we have identified AKAP13-PDE8A, COMMD10-AP3S1, and CTB-35F21.1-PSD2 as novel intrachromosomal fusion transcripts and the most highly recurring chimeric transcripts described for CRC to date. The functional and clinical relevance of these chimeric RNA molecules remains to be elucidated.

  1. Common Fusion Transcripts Identified in Colorectal Cancer Cell Lines by High-Throughput RNA Sequencing

    Science.gov (United States)

    Nome, Torfinn; Thomassen, Gard OS; Bruun, Jarle; Ahlquist, Terje; Bakken, Anne C; Hoff, Andreas M; Rognum, Torleiv; Nesbakken, Arild; Lorenz, Susanne; Sun, Jinchang; Barros-Silva, João Diogo; Lind, Guro E; Myklebost, Ola; Teixeira, Manuel R; Meza-Zepeda, Leonardo A; Lothe, Ragnhild A; Skotheim, Rolf I

    2013-01-01

    Colorectal cancer (CRC) is the third most common cancer disease in the Western world, and about 40% of the patients die from this disease. The cancer cells are commonly genetically unstable, but only a few low-frequency recurrent fusion genes have so far been reported for this disease. In this study, we present a thorough search for novel fusion transcripts in CRC using high-throughput RNA sequencing. From altogether 220 million paired-end sequence reads from seven CRC cell lines, we identified 3391 candidate fused transcripts. By stringent requirements, we nominated 11 candidate fusion transcripts for further experimental validation, of which 10 were positive by reverse transcription-polymerase chain reaction and Sanger sequencing. Six were intrachromosomal fusion transcripts, and interestingly, three of these, AKAP13-PDE8A, COMMD10-AP3S1, and CTB-35F21.1-PSD2, were present in, respectively, 18, 18, and 20 of 21 analyzed cell lines and in, respectively, 18, 61, and 48 (17%-58%) of 106 primary cancer tissues. These three fusion transcripts were also detected in 2 to 4 of 14 normal colonic mucosa samples (14%–28%). Whole-genome sequencing identified a specific genomic breakpoint in COMMD10-AP3S1 and further indicates that both the COMMD10-AP3S1 and AKAP13-PDE8A fusion transcripts are due to genomic duplications in specific cell lines. In conclusion, we have identified AKAP13-PDE8A, COMMD10-AP3S1, and CTB-35F21.1-PSD2 as novel intrachromosomal fusion transcripts and the most highly recurring chimeric transcripts described for CRC to date. The functional and clinical relevance of these chimeric RNA molecules remains to be elucidated. PMID:24151535

  2. Whole-exome sequencing identifies rare pathogenic variants in new predisposition genes for familial colorectal cancer.

    Science.gov (United States)

    Esteban-Jurado, Clara; Vila-Casadesús, Maria; Garre, Pilar; Lozano, Juan José; Pristoupilova, Anna; Beltran, Sergi; Muñoz, Jenifer; Ocaña, Teresa; Balaguer, Francesc; López-Cerón, Maria; Cuatrecasas, Miriam; Franch-Expósito, Sebastià; Piqué, Josep M; Castells, Antoni; Carracedo, Angel; Ruiz-Ponte, Clara; Abulí, Anna; Bessa, Xavier; Andreu, Montserrat; Bujanda, Luis; Caldés, Trinidad; Castellví-Bel, Sergi

    2015-02-01

    Colorectal cancer is an important cause of mortality in the developed world. Hereditary forms are due to germ-line mutations in APC, MUTYH, and the mismatch repair genes, but many cases present familial aggregation but an unknown inherited cause. The hypothesis of rare high-penetrance mutations in new genes is a likely explanation for the underlying predisposition in some of these familial cases. Exome sequencing was performed in 43 patients with colorectal cancer from 29 families with strong disease aggregation without mutations in known hereditary colorectal cancer genes. Data analysis selected only very rare variants (0-0.1%), producing a putative loss of function and located in genes with a role compatible with cancer. Variants in genes previously involved in hereditary colorectal cancer or nearby previous colorectal cancer genome-wide association study hits were also chosen. Twenty-eight final candidate variants were selected and validated by Sanger sequencing. Correct family segregation and somatic studies were used to categorize the most interesting variants in CDKN1B, XRCC4, EPHX1, NFKBIZ, SMARCA4, and BARD1. We identified new potential colorectal cancer predisposition variants in genes that have a role in cancer predisposition and are involved in DNA repair and the cell cycle, which supports their putative involvement in germ-line predisposition to this neoplasm.
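    The variant selection described in this abstract (very rare variants, 0-0.1%, with a putative loss of function) can be illustrated with a minimal sketch. The record fields, gene names, and consequence labels below are hypothetical and are not taken from the study; the third criterion (genes with a role compatible with cancer) is omitted because the abstract does not say how it was applied.

```python
from dataclasses import dataclass

# Hypothetical consequence labels treated here as putative loss of function.
LOF_CONSEQUENCES = {"stop_gained", "frameshift", "splice_donor", "splice_acceptor"}


@dataclass(frozen=True)
class Variant:
    gene: str                    # placeholder gene name, not from the study
    consequence: str             # predicted effect label (assumed vocabulary)
    population_frequency: float  # fraction, so 0.001 means 0.1%


def select_candidates(variants, max_frequency=0.001):
    """Keep variants that are very rare and have a putative LoF consequence."""
    return [
        v for v in variants
        if v.population_frequency <= max_frequency
        and v.consequence in LOF_CONSEQUENCES
    ]


example = [
    Variant("GENE_A", "stop_gained", 0.0),
    Variant("GENE_B", "missense", 0.0005),    # rare, but not loss of function
    Variant("GENE_C", "frameshift", 0.02),    # loss of function, but too common
    Variant("GENE_D", "splice_donor", 0.001),
]
candidates = select_candidates(example)
print([v.gene for v in candidates])
```

    In practice such candidates would still need validation, which the study reports doing by Sanger sequencing and family segregation analysis.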

  3. Genome-wide linkage, exome sequencing and functional analyses identify ABCB6 as the pathogenic gene of dyschromatosis universalis hereditaria.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently, when ABCB6 was reported as a causative gene of DUH. We performed a genome-wide linkage scan using the Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations, and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and immunohistochemistry were performed to verify the expression of the pathogenic gene; zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Genome-wide linkage (assuming an autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and were not present in the 1000 Genomes Project and dbSNP databases or in 1,516 unrelated healthy Chinese controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them. Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma.

  4. Sequencing of DICER1 in sarcomas identifies biallelic somatic DICER1 mutations in an adult-onset embryonal rhabdomyosarcoma

    NARCIS (Netherlands)

    de Kock, Leanne; Rivera, Barbara; Revil, Timothée; Thorner, Paul; Goudie, Catherine; Bouron-Dal Soglio, Dorothée; Choong, Catherine S.; Priest, John R.; Van Diest, Paul J.; Tanboon, Jantima; Wagner, Anja; Ragoussis, Jiannis; Choong, Peter F.M.; Foulkes, William D

    2017-01-01

    Background: Sarcomas are rare and heterogeneous cancers. We assessed the contribution of DICER1 mutations to sarcoma development. Methods: The coding region of DICER1 was sequenced in 67 sarcomas using a custom Fluidigm Access Array. The RNase III domains were Sanger sequenced in six additional

  5. Exome sequencing identifies a DNAJB6 mutation in a family with dominantly-inherited limb-girdle muscular dystrophy.

    Science.gov (United States)

    Couthouis, Julien; Raphael, Alya R; Siskind, Carly; Findlay, Andrew R; Buenrostro, Jason D; Greenleaf, William J; Vogel, Hannes; Day, John W; Flanigan, Kevin M; Gitler, Aaron D

    2014-05-01

    Limb-girdle muscular dystrophy primarily affects the muscles of the hips and shoulders (the "limb-girdle" muscles), although it is a heterogeneous disorder that can present with varying symptoms. There is currently no cure. We sought to identify the genetic basis of limb-girdle muscular dystrophy type 1 in an American family of Northern European descent using exome sequencing. Exome sequencing was performed on DNA samples from two affected siblings and one unaffected sibling and resulted in the identification of eleven candidate mutations that co-segregated with the disease. Notably, this list included a previously reported mutation in DNAJB6, p.Phe89Ile, which was recently identified as a cause of limb-girdle muscular dystrophy type 1D. Additional family members were Sanger sequenced and the mutation in DNAJB6 was only found in affected individuals. Subsequent haplotype analysis indicated that this DNAJB6 p.Phe89Ile mutation likely arose independently of the previously reported mutation. Since other published mutations are located close by in the G/F domain of DNAJB6, this suggests that the area may represent a mutational hotspot. Exome sequencing provided an unbiased and effective method for identifying the genetic etiology of limb-girdle muscular dystrophy type 1 in a previously genetically uncharacterized family. This work further confirms the causative role of DNAJB6 mutations in limb-girdle muscular dystrophy type 1D. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Whole-exome sequencing identifies a potential TTN mutation in a multiplex family with inguinal hernia.

    Science.gov (United States)

    Mihailov, E; Nikopensius, T; Reigo, A; Nikkolo, C; Kals, M; Aruaas, K; Milani, L; Seepter, H; Metspalu, A

    2017-02-01

    Inguinal hernia repair is one of the most common procedures in general surgery. Males are seven times more likely than females to develop a hernia and have a 27% lifetime 'risk' of inguinal hernia repair. Several studies have demonstrated that a positive family history is an important risk factor for the development of primary inguinal hernia, which indicates that genetic factors may play important roles in the etiology of the disease. So far, the contribution of genetic factors and the underlying mechanisms for inguinal hernia remain largely unknown. The aim of this study was to investigate a multiplex Estonian family with inguinal hernia across four generations. Whole-exome sequencing was carried out in three affected family members, and subsequent mutation screening using Sanger sequencing was performed in ten family members (six affected and four unaffected). Whole-exome sequencing in three affected family members revealed a heterozygous missense mutation c.88880A>C (p.Lys29627Thr; RefSeq NM_001256850.1) in the highly conserved myosin-binding A-band of the TTN gene. Sanger sequencing demonstrated that this mutation cosegregated with the disease in this family and was not present in ethnically matched control subjects. We report that a missense variant in the A-band of TTN is the strongest candidate mutation for autosomal-dominant inguinal hernia with incomplete penetrance.

  7. Exome Sequencing Identified a Recessive RDH12 Mutation in a Family with Severe Early-Onset Retinitis Pigmentosa

    Directory of Open Access Journals (Sweden)

    Bo Gong

    2015-01-01

    Retinitis pigmentosa (RP) is the most important hereditary retinal disease caused by progressive degeneration of the photoreceptor cells. This study aimed to identify gene mutations responsible for autosomal recessive retinitis pigmentosa (arRP) in a Chinese family using next-generation sequencing technology. A Chinese family with 7 members, including two individuals affected with severe early-onset RP, was studied. All patients underwent a complete ophthalmic examination. Exome sequencing was performed on a single RP patient (the proband of this family), followed by direct Sanger sequencing of other family members and normal controls to confirm the causal mutations. A homozygous mutation c.437T... identified as being related to the phenotype of this arRP family. This homozygous mutation was detected in the two affected patients, but was not present in other family members or in 600 normal controls. Another three normal members in the family were found to carry this missense mutation in the heterozygous state. Our results emphasize the importance of c.437T

  8. Next generation sequencing identifies a novel rearrangement in the HBB cluster permitting to-the-base characterization.

    Science.gov (United States)

    Shooter, Claire; Rooks, Helen; Thein, Swee Lay; Clark, Barnaby

    2015-01-01

    Genetic testing for hemoglobinopathies is required for prenatal diagnosis, understanding complex cases where multiple pathogenic variants may be present or investigating cases of unexplained anemia. Characterization of disease causing variants that range from single base changes to large rearrangements may require several different labor-intensive methodologies. Multiplex ligation probe amplification analysis is the current method used to detect indels, but the technique does not characterize the breakpoints or detect balanced translocations. Here, we describe a next-generation sequencing (NGS) method that is able to identify and characterize a novel rearrangement of the HBB cluster responsible for εγδβ thalassemia in an English family. The structural variant involved a 59.0 kb inversion encompassing HBG2 exon 3, HBG1, HBD, HBB, and OR51V1, juxtaposed by a deletion of 122.6 kb including 82 bp of the inverted sequence, HBG2 exon 1 and 2, HBE, and the β-locus control region. Identification of reads spanning the breakpoints provided to-the-base resolution of the rearrangement, subsequently confirmed by gap-PCR and Sanger sequence analysis. The same rearrangement, termed Inv-Del English V εγδβ thalassemia (HbVar 2935), was identified in two other unrelated English individuals with a similar hematological phenotype. Our NGS approach should be applicable as a diagnostic tool for other disorders. © 2014 WILEY PERIODICALS, INC.

  9. MetaPhinder-Identifying Bacteriophage Sequences in Metagenomic Data Sets

    DEFF Research Database (Denmark)

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole

    2016-01-01

    and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate for the mosaic...

  10. Whole exome sequencing identifies the first STRADA point mutation in a patient with polyhydramnios, megalencephaly, and symptomatic epilepsy syndrome (PMSE).

    Science.gov (United States)

    Bi, Weimin; Glass, Ian A; Muzny, Donna M; Gibbs, Richard A; Eng, Christine M; Yang, Yaping; Sun, Angela

    2016-08-01

    Polyhydramnios, megalencephaly, and symptomatic epilepsy syndrome (PMSE) is an ultra rare neurodevelopmental disorder characterized by severe, infantile-onset intractable epilepsy, neurocognitive delay, macrocephaly, and craniofacial dysmorphism. The molecular diagnosis of this condition has thus far only been made in 16 Old Order Mennonite patients carrying a homozygous 7 kb founder deletion of exons 9-13 of STRADA. We performed clinical whole exome sequencing (WES) on a 4-year-old Indian male with global developmental delay, history of failure to thrive, infantile spasms, repetitive behaviors, hypotonia, low muscle mass, marked joint laxity, and dysmorphic facial features including tall forehead, long face, arched eyebrows, small chin, wide mouth, and tented upper lip. A homozygous single nucleotide duplication, c.842dupA (p.D281fs), in exon 10 of STRADA was identified. Sanger sequencing confirmed the mutation in the individual and identified both parents as carriers. In light of the molecular discoveries, the patient's clinical phenotype was considered to be a good fit for PMSE. We identified for the first time a homozygous point mutation in STRADA causing PMSE. Additional bi-allelic mutations related to PMSE thus far have not been observed in Baylor ∼6,000 consecutive clinical WES cases, supporting the rarity of this disorder. Our findings may have treatment implications for the patient since previous studies have shown rapamycin as a potential therapeutic agent for the seizures and cognitive problems in PMSE patients. © 2016 Wiley Periodicals, Inc.

  11. Identifying novel sequence variants of RNA 3D motifs

    Science.gov (United States)

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  12. Multilocus DNA Sequence Comparisons Rapidly Identify Pathogenic Molds

    Science.gov (United States)

    Rakeman, Jennifer L.; Bui, Uyen; LaFe, Karen; Chen, Yi-Ching; Honeycutt, Rhonda J.; Cookson, Brad T.

    2005-01-01

    The increasing incidence of opportunistic fungal infections necessitates rapid and accurate identification of the associated fungi to facilitate optimal patient treatment. Traditional phenotype-based identification methods utilized in clinical laboratories rely on the production and recognition of reproductive structures, making identification difficult or impossible when these structures are not observed. We hypothesized that DNA sequence analysis of multiple loci is useful for rapidly identifying medically important molds. Our study included the analysis of the D1/D2 hypervariable region of the 28S ribosomal gene and the internal transcribed spacer (ITS) regions 1 and 2 of the rRNA operon. Two hundred one strains, including 143 clinical isolates and 58 reference and type strains, representing 43 recognized species and one possible new species, were examined. We generated a phenotypically validated database of 118 diagnostic alleles. DNA length polymorphisms detected among ITS1 and ITS2 PCR products can differentiate 20 of 33 species of molds tested, and ITS DNA sequence analysis permits identification of all species tested. For 42 of 44 species tested, conspecific strains displayed >99% sequence identity at ITS1 and ITS2; sequevars were detected in two species. For all 44 species, identifications by genotypic and traditional phenotypic methods were 100% concordant. Because dendrograms based on ITS sequence analysis are similar in topology to 28S-based trees, we conclude that ITS sequences provide phylogenetically valid information and can be utilized to identify clinically important molds. Additionally, this phenotypically validated database of ITS sequences will be useful for identifying new species of pathogenic molds. PMID:16000456
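    The ">99% sequence identity" criterion mentioned in this abstract can be made concrete with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps; the function name and the example sequences are hypothetical, and the study's actual alignment and scoring procedure is not described in the abstract.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared columns that match in two pre-aligned sequences.

    Columns where both sequences carry a gap are ignored; a gap opposite a
    base counts as a mismatch (a simplifying assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 200-column alignment with a single mismatch.
reference = "A" * 200
isolate = "A" * 199 + "C"
identity = percent_identity(reference, isolate)
print(identity, identity > 99.0)
```

    Under this definition, a single mismatch in 200 aligned columns still clears a 99% threshold, while two mismatches would sit exactly at 99% and fail a strict ">99%" test.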

  13. Exome sequencing identifies ZNF644 mutations in high myopia.

    Directory of Open Access Journals (Sweden)

    Yi Shi

    2011-06-01

    Myopia is the most common ocular disorder worldwide, and high myopia in particular is one of the leading causes of blindness. Genetic factors play a critical role in the development of myopia, especially high myopia. Recently, the exome sequencing approach has been successfully used for the disease gene identification of Mendelian disorders. Here we show a successful application of exome sequencing to identify a gene for an autosomal dominant disorder, and we have identified a gene potentially responsible for high myopia in a monogenic form. We captured exomes of two affected individuals from a Han Chinese family with high myopia and performed sequencing analysis by a second-generation sequencer with a mean coverage of 30× and sufficient depth to call variants at ∼97% of each targeted exome. The shared genetic variants of these two affected individuals in the family being studied were filtered against the 1000 Genomes Project and the dbSNP131 database. A mutation A672G in zinc finger protein 644 isoform 1 (ZNF644) was identified as being related to the phenotype of this family. After we performed sequencing analysis of the exons in the ZNF644 gene in 300 sporadic cases of high myopia, we identified an additional five mutations (I587V, R680G, C699Y, 3'UTR+12 C>G, and 3'UTR+592 G>A) in 11 different patients. All these mutations were absent in 600 normal controls. The ZNF644 gene was expressed in human retina and retinal pigment epithelium (RPE). Given that ZNF644 is predicted to be a transcription factor that may regulate genes involved in eye development, mutation may cause the axial elongation of the eyeball found in high myopia patients. Our results suggest that ZNF644 might be a causal gene for high myopia in a monogenic form.

  14. Exome sequencing identifies Laing distal myopathy MYH7 mutation in a Roma family previously diagnosed with distal neuronopathy.

    Science.gov (United States)

    Komlósi, Katalin; Hadzsiev, Kinga; Garbes, Lutz; Martínez Carrera, Lilian A; Pál, Endre; Sigurðsson, Jóhann Haukur; Magnusson, Olafur; Melegh, Béla; Wirth, Brunhilde

    2014-02-01

    We describe a Hungarian Roma family, originally investigated for autosomal dominant distal muscular atrophy. The mother started toe walking at 3 years and lost ambulation at age 27. Her three daughters presented with early steppage gait and showed variable progression. Muscle biopsies were nonspecific showing myogenic lesions in the mother and lesions resembling neurogenic atrophy in the two siblings. To identify the causative abnormality whole exome sequencing was performed in two affected girls and their unaffected father, unexpectedly revealing the MYH7 mutation c.4849_4851delAAG (p.K1617del) in both girls, reported to be causative for Laing distal myopathy. Sanger sequencing confirmed the mutation in the affected mother and third affected daughter. In line with variable severity in Laing distal myopathy our patients presented a more severe phenotype. Our case is the first demonstration of Laing distal myopathy in the Roma and the successful use of whole exome sequencing in obtaining a definitive diagnosis in ambiguous cases. Copyright © 2013 Elsevier B.V. All rights reserved.

  15. Next generation sequencing identified novel heterozygous nonsense mutation in CNGB1 gene associated with retinitis pigmentosa in a Chinese patient.

    Science.gov (United States)

    Banerjee, Santasree; Yao, Junping; Zhang, Xinxin; Niu, Jianjun; Chen, Zhongshan

    2017-10-24

    Retinitis pigmentosa (RP) is a severe hereditary eye disease characterized by progressive degeneration of photoreceptors and subsequent loss of vision. It is a clinically and genetically heterogeneous group of retinal diseases. Germline mutations of CNGB1 are associated with retinitis pigmentosa. We identified and investigated a 34-year-old Chinese man with marked night blindness and loss of the midperipheral visual field. The proband had also lost his far peripheral visual field and central vision. Fundus examination showed retinal pigment deposits, and there was primary loss of rod photoreceptor cells followed by secondary loss of cone photoreceptors. Targeted exome capture-based next generation sequencing and Sanger sequencing identified a novel nonsense mutation, c.1917G>A, and a reported mutation, c.2361C>A, in the CNGB1 gene. Both nonsense mutations are predicted to lead to the formation of a premature stop codon, resulting in a truncated CNGB1 protein product that is predicted to be disease causing. According to the variant classification guidelines of ACMG, these two variants are categorized as "likely pathogenic" variants. Our findings expand the mutational spectrum of CNGB1 and are valuable in mutation-based pre- and post-natal screening and genetic diagnosis for retinitis pigmentosa.

  16. Correlation approach to identify coding regions in DNA sequences

    Science.gov (United States)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
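    The abstract does not spell out the algorithm, so the following is only a rough illustration of how correlations in a DNA sequence can be quantified, not the authors' method. It builds a "DNA walk" (a step of +1 for a pyrimidine and -1 for anything else; this mapping is an assumption) and measures the root-mean-square fluctuation F(l) of the walk's displacement over windows of length l. For an uncorrelated sequence F(l) grows roughly as l^0.5; an exponent clearly above 0.5 would suggest long-range correlations.

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for pyrimidines (C, T), -1 for every other symbol."""
    position = 0
    walk = [0]
    for base in seq.upper():
        position += 1 if base in "CT" else -1
        walk.append(position)
    return walk


def fluctuation(walk, window):
    """Root-mean-square fluctuation of displacements over a fixed window length."""
    deltas = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(deltas) / len(deltas)
    variance = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(variance)


def scaling_exponent(seq, windows=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(l) against log l.

    Fails on sequences whose fluctuation is exactly zero for some window
    (e.g. strictly alternating purine/pyrimidine), since log(0) is undefined.
    """
    walk = dna_walk(seq)
    xs = [math.log(w) for w in windows]
    ys = [math.log(fluctuation(walk, w)) for w in windows]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Synthetic, uncorrelated sequence used purely as a sanity check.
random.seed(0)
random_sequence = "".join(random.choice("ACGT") for _ in range(20000))
alpha = scaling_exponent(random_sequence)
print(round(alpha, 2))
```

    A coding-region finder along the lines the abstract describes would presumably compare such an exponent across sliding windows of a long sequence, but the window sizes and decision thresholds are not given in the abstract.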

  17. A Novel Splicing Mutation Identified in a Chinese Family with X-linked Alport Syndrome Using Targeted Next-Generation Sequencing.

    Science.gov (United States)

    Chen, Chen; Lu, Chao-Xia; Wang, Qiong; Cao, Li-Hua; Luo, Yang; Zhang, Xue

    2016-04-01

    Alport syndrome (AS) is a genetically heterogeneous disorder, characterized by hematuria, progressive renal failure, sensorineural hearing loss, and ocular abnormalities caused by mutations in the COL4A3, COL4A4, and COL4A5 genes. The aim of this study was to identify underlying mutations in individuals from a Chinese family with X-linked AS. We performed targeted next-generation sequencing (NGS) to identify mutations associated with AS. The results were processed and visualized using Integrative Genomics Viewer software. The most likely disease-causing variants were identified and confirmed by Sanger sequencing of reverse transcription-polymerase chain reaction products. Visual inspection using Integrative Genomics Viewer software found that COL4A5 exon 10 was not covered by the disease panel, while coverage of exons 4, 17, 20, 21, 37, and 45 was incomplete. Sanger sequencing of these regions identified a novel splice-site mutation in intron 9 (c.547-3C>A) of the COL4A5 gene. Subsequent cDNA analysis revealed that c.547-3C>A led to skipping of exon 10, which resulted in an in-frame deletion of 21 amino acids from the α5 chain of type IV collagen. We determined the molecular basis of AS in a Chinese family by targeted NGS and cDNA analysis. This is the first report of the novel c.547-3C>A splicing mutation in the collagen domain of the COL4A5 gene.

  18. Whole-Exome Sequencing Identifies Novel Variants for Tooth Agenesis.

    Science.gov (United States)

    Dinckan, N; Du, R; Petty, L E; Coban-Akdemir, Z; Jhangiani, S N; Paine, I; Baugh, E H; Erdem, A P; Kayserili, H; Doddapaneni, H; Hu, J; Muzny, D M; Boerwinkle, E; Gibbs, R A; Lupski, J R; Uyguner, Z O; Below, J E; Letra, A

    2018-01-01

    Tooth agenesis is a common craniofacial abnormality in humans and represents failure to develop 1 or more permanent teeth. Tooth agenesis is complex, and variations in about a dozen genes have been reported as contributing to the etiology. Here, we combined whole-exome sequencing, array-based genotyping, and linkage analysis to identify putative pathogenic variants in candidate disease genes for tooth agenesis in 10 multiplex Turkish families. Novel homozygous and heterozygous variants in LRP6, DKK1, LAMA3, and COL17A1 genes, as well as known variants in WNT10A, were identified as likely pathogenic in isolated tooth agenesis. Novel variants in KREMEN1 were identified as likely pathogenic in 2 families with suspected syndromic tooth agenesis. Variants in more than 1 gene were identified segregating with tooth agenesis in 2 families, suggesting oligogenic inheritance. Structural modeling of missense variants suggests deleterious effects to the encoded proteins. Functional analysis of an indel variant (c.3607+3_6del) in LRP6 suggested that the predicted resulting mRNA is subject to nonsense-mediated decay. Our results support a major role for WNT pathways genes in the etiology of tooth agenesis while revealing new candidate genes. Moreover, oligogenic cosegregation was suggestive for complex inheritance and potentially complex gene product interactions during development, contributing to improved understanding of the genetic etiology of familial tooth agenesis.

  19. Identifying and calling insertions, deletions, and single-base mutations efficiently from sequence data

    Science.gov (United States)

    Whole genome sequencing studies can directly identify causative mutations for subsequent use in genomic evaluations, but sequence variant identification is a lengthy and sometimes inaccurate process. The speed and accuracy of identifying small insertions and deletions of sequence, collectively terme...

  20. Targeted Next Generation Sequencing Identifies Novel Mutations in RP1 as a Relatively Common Cause of Autosomal Recessive Rod-Cone Dystrophy

    Directory of Open Access Journals (Sweden)

    Said El Shamieh

    2015-01-01

    We report ophthalmic and genetic findings in families with autosomal recessive rod-cone dystrophy (arRCD) and RP1 mutations. Detailed ophthalmic examination was performed in 242 sporadic and arRCD subjects. Genomic DNA was investigated using our customized next generation sequencing panel targeting up to 123 genes implicated in inherited retinal disorders. Stringent filtering coupled with Sanger sequencing and followed by cosegregation analysis was performed to confirm biallelism and the implication of the most likely disease-causing variants. Sequencing identified 9 RP1 mutations in 7 index cases. Eight of the mutations were novel, and all cosegregated with a severe arRCD phenotype, found associated with additional macular changes. Among the identified mutations, 4 lie in a region previously associated with arRCD and 5 others in a region previously associated with adRCD. Our prevalence studies showed that RP1 mutations account for up to 2.5% of arRCD. These results point to the necessity of sequencing RP1 when genetically investigating sporadic and arRCD cases. They further highlight the value of an unbiased sequencing technique, which allows investigating the implication of the same gene in different modes of inheritance. Finally, they show that mutations in different regions of RP1 can also lead to arRCD.

  1. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma

    Energy Technology Data Exchange (ETDEWEB)

    Krauthammer, Michael; Kong, Yong; Ha, Byung Hak; Evans, Perry; Bacchiocchi, Antonella; McCusker, James P.; Cheng, Elaine; Davis, Matthew J.; Goh, Gerald; Choi, Murim; Ariyan, Stephan; Narayan, Deepak; Dutton-Regester, Ken; Capatana, Ana; Holman, Edna C.; Bosenberg, Marcus; Sznol, Mario; Kluger, Harriet M.; Brash, Douglas E.; Stern, David F.; Materin, Miguel A.; Lo, Roger S.; Mane, Shrikant; Ma, Shuangge; Kidd, Kenneth K.; Hayward, Nicholas K.; Lifton, Richard P.; Schlessinger, Joseph; Boggon, Titus J.; Halaban, Ruth (Yale-MED); (UCLA); (Queens)

    2012-10-11

    We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1 P29S) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1 P29S showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.

  2. Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes.

    Science.gov (United States)

    Molenaar, Jan J; Koster, Jan; Zwijnenburg, Danny A; van Sluis, Peter; Valentijn, Linda J; van der Ploeg, Ida; Hamdi, Mohamed; van Nes, Johan; Westerman, Bart A; van Arkel, Jennemiek; Ebus, Marli E; Haneveld, Franciska; Lakeman, Arjan; Schild, Linda; Molenaar, Piet; Stroeken, Peter; van Noesel, Max M; Ora, Ingrid; Santo, Evan E; Caron, Huib N; Westerhout, Ellen M; Versteeg, Rogier

    2012-02-22

    Neuroblastoma is a childhood tumour of the peripheral sympathetic nervous system. The pathogenesis has for a long time been quite enigmatic, as only very few gene defects were identified in this often lethal tumour. Frequently detected gene alterations are limited to MYCN amplification (20%) and ALK activations (7%). Here we present a whole-genome sequence analysis of 87 neuroblastoma of all stages. Few recurrent amino-acid-changing mutations were found. In contrast, analysis of structural defects identified a local shredding of chromosomes, known as chromothripsis, in 18% of high-stage neuroblastoma. These tumours are associated with a poor outcome. Structural alterations recurrently affected ODZ3, PTPRD and CSMD1, which are involved in neuronal growth cone stabilization. In addition, ATRX, TIAM1 and a series of regulators of the Rac/Rho pathway were mutated, further implicating defects in neuritogenesis in neuroblastoma. Most tumours with defects in these genes were aggressive high-stage neuroblastomas, but did not carry MYCN amplifications. The genomic landscape of neuroblastoma therefore reveals two novel molecular defects, chromothripsis and neuritogenesis gene alterations, which frequently occur in high-risk tumours.

  3. Exome sequencing covers >98% of mutations identified on targeted next generation sequencing panels.

    Directory of Open Access Journals (Sweden)

    Holly LaDuca

    With the expanded availability of next generation sequencing (NGS)-based clinical genetic tests, clinicians seeking to test patients with Mendelian diseases must weigh the superior coverage of targeted gene panels against the greater number of genes included in whole exome sequencing (WES) when considering their first-tier testing approach. Here, we use an in silico analysis to predict the analytic sensitivity of WES using pathogenic variants identified on targeted NGS panels as a reference. Corresponding nucleotide positions for 1533 different alterations classified as pathogenic or likely pathogenic identified on targeted NGS multi-gene panel tests in our laboratory were interrogated in data from 100 randomly-selected clinical WES samples to quantify the sequence coverage at each position. Pathogenic variants represented 91 genes implicated in hereditary cancer, X-linked intellectual disability, primary ciliary dyskinesia, Marfan syndrome/aortic aneurysms, cardiomyopathies and arrhythmias. When assessing coverage among 100 individual WES samples for each pathogenic variant (153,300 individual assessments), 99.7% (n = 152,798) would likely have been detected on WES. All pathogenic variants had at least some coverage on exome sequencing, with a total of 97.3% (n = 1491) detectable across all 100 individuals. For the remaining 42 pathogenic variants, the number of WES samples with adequate coverage ranged from 35 to 99. Factors such as location in GC-rich, repetitive, or homologous regions likely explain why some of these alterations were not detected across all samples. To validate study findings, a similar analysis was performed against coverage data from 60,706 exomes available through the Exome Aggregation Consortium (ExAC). Results from this validation confirmed that 98.6% (91,743,296/93,062,298) of pathogenic variants demonstrated adequate depth for detection. Results from this in silico analysis suggest that exome sequencing may achieve a diagnostic

  4. Strategies for Wheat Stripe Rust Pathogenicity Identified by Transcriptome Sequencing.

    Directory of Open Access Journals (Sweden)

    Diana P Garnica

    Stripe rust caused by the fungus Puccinia striiformis f.sp. tritici (Pst) is a major constraint to wheat production worldwide. The molecular events that underlie Pst pathogenicity are largely unknown. Like all rusts, Pst creates a specialized cellular structure within host cells called the haustorium to obtain nutrients from wheat, and to secrete pathogenicity factors called effector proteins. We purified Pst haustoria and used next-generation sequencing platforms to assemble the haustorial transcriptome as well as the transcriptome of germinated spores. 12,282 transcripts were assembled from 454-pyrosequencing data and used as reference for digital gene expression analysis to compare the germinated uredinospores and haustoria transcriptomes based on Illumina RNAseq data. More than 400 genes encoding secreted proteins which constitute candidate effectors were identified from the haustorial transcriptome, with two thirds of these up-regulated in this tissue compared to germinated spores. RT-PCR analysis confirmed the expression patterns of 94 effector candidates. The analysis also revealed that spores rely mainly on stored energy reserves for growth and development, while haustoria take up host nutrients for massive energy production for biosynthetic pathways and the ultimate production of spores. Together, these studies substantially increase our knowledge of potential Pst effectors and provide new insights into the pathogenic strategies of this important organism.

  5. Targeted capture and next-generation sequencing identifies C9orf75, encoding taperin, as the mutated gene in nonsyndromic deafness DFNB79.

    Science.gov (United States)

    Rehman, Atteeq Ur; Morell, Robert J; Belyantseva, Inna A; Khan, Shahid Y; Boger, Erich T; Shahzad, Mohsin; Ahmed, Zubair M; Riazuddin, Saima; Khan, Shaheen N; Riazuddin, Sheikh; Friedman, Thomas B

    2010-03-12

    Targeted genome capture combined with next-generation sequencing was used to analyze 2.9 Mb of the DFNB79 interval on chromosome 9q34.3, which includes 108 candidate genes. Genomic DNA from an affected member of a consanguineous family segregating recessive, nonsyndromic hearing loss was used to make a library of fragments covering the DFNB79 linkage interval defined by genetic analyses of four pedigrees. Homozygosity for eight previously unreported variants in transcribed sequences was detected by evaluating a library of 402,554 sequencing reads and was later confirmed by Sanger sequencing. Of these variants, six were determined to be polymorphisms in the Pakistani population, and one was in a noncoding gene that was subsequently excluded genetically from the DFNB79 linkage interval. The remaining variant was a nonsense mutation in a predicted gene, C9orf75, renamed TPRN. Evaluation of the other three DFNB79-linked families identified three additional frameshift mutations, for a total of four truncating alleles of this gene. Although TPRN is expressed in many tissues, immunolocalization of the protein product in the mouse cochlea shows prominent expression in the taper region of hair cell stereocilia. Consequently, we named the protein taperin. Copyright 2010 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  6. Novel mutations in CRB1 gene identified in a Chinese pedigree with retinitis pigmentosa by targeted capture and next generation sequencing

    Science.gov (United States)

    Lo, David; Weng, Jingning; Liu, Xiaohong; Yang, Juhua; He, Fen; Wang, Yun; Liu, Xuyang

    2016-01-01

    PURPOSE: To detect the disease-causing gene in a Chinese pedigree with autosomal-recessive retinitis pigmentosa (ARRP). METHODS: All subjects in this family underwent a complete ophthalmic examination. Targeted-capture next generation sequencing (NGS) was performed on the proband to detect variants. All variants were verified in the remaining family members by PCR amplification and Sanger sequencing. RESULTS: All the affected subjects in this pedigree were diagnosed with retinitis pigmentosa (RP). The compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations in the Crumbs homolog 1 (CRB1) gene were identified in all the affected patients but not in the unaffected individuals in this family. These mutations were inherited from their parents, respectively. CONCLUSION: The novel compound heterozygous mutations in CRB1 were identified in a Chinese pedigree with ARRP using targeted-capture next generation sequencing. After evaluating the significant heredity and impaired protein function, the compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations are the causal mutations of early-onset ARRP in this pedigree. To the best of our knowledge, there is no previous report regarding the compound mutations. PMID:27806333

  7. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  8. Variants in SKP1, PROB1, and IL17B genes at keratoconus 5q31.1-q35.3 susceptibility locus identified by whole-exome sequencing.

    Science.gov (United States)

    Karolak, Justyna A; Gambin, Tomasz; Pitarque, Jose A; Molinari, Andrea; Jhangiani, Shalini; Stankiewicz, Pawel; Lupski, James R; Gajecka, Marzena

    2016-01-01

    Keratoconus (KTCN) is a protrusion and thinning of the cornea, resulting in impairment of visual function. The extreme genetic heterogeneity makes it difficult to discover factors unambiguously influencing the KTCN phenotype. In this study, we used whole-exome sequencing (WES) and Sanger sequencing to reduce the number of candidate genes at the 5q31.1-q35.3 locus and to prioritize other potentially relevant variants in an Ecuadorian family with KTCN. We applied WES in two affected KTCN individuals from the Ecuadorian family that showed a suggestive linkage between the KTCN phenotype and the 5q31.1-q35.3 locus. Putative variants identified by WES were further evaluated in this family using Sanger sequencing. Exome capture discovered a total of 173 rare (minor allele frequency G in SKP1, c.671G>A in PROB1, and c.527G>A in IL17B in the 5q31.1-q35.3 linkage region, and c.850G>A in HKDC1 in the 10q22 locus completely segregated with the phenotype in the studied KTCN family. We demonstrate that a combination of various techniques significantly narrowed the studied genomic region and reduced the list of the putative exonic variants. Moreover, since this locus overlapped two other chromosomal regions previously recognized in distinct KTCN studies, our findings suggest that this 5q31.1-q35.3 locus might be linked with KTCN.

  9. Novel compound heterozygous mutations in the OTOF Gene identified by whole-exome sequencing in auditory neuropathy spectrum disorder.

    Science.gov (United States)

    Tang, Fengzhu; Ma, Dengke; Wang, Yulan; Qiu, Yuecai; Liu, Fei; Wang, Qingqing; Lu, Qiutian; Shi, Min; Xu, Liang; Liu, Min; Liang, Jianping

    2017-03-23

    Many hearing-loss diseases are demonstrated to have Mendelian inheritance caused by mutations in a single gene. However, many deaf individuals have diseases that remain genetically unexplained. Auditory neuropathy, also known as auditory neuropathy spectrum disorder (ANSD), is a sensorineural deafness in which sounds are transferred into the inner ear normally but the transmission of the signals from the inner ear to the auditory nerve and brain is impaired. The pathogenic mutations of the genes responsible for ANSD in the Chinese population remain poorly understood. A total of 127 patients with non-syndromic hearing loss (NSHL) were enrolled in Guangxi Zhuang Autonomous Region. A hereditary deafness gene mutation screening was performed to identify the mutation sites in four deafness-related genes (GJB2, GJB3, 12S rRNA, and SLC26A4). In addition, whole-exome sequencing (WES) was applied to explore unappreciated mutation sites in the cases with the singularity of its phenotype. Well-characterized mutations were found in only 8.7% (11/127) of the patients. Interestingly, two mutations in the OTOF gene were identified in two affected siblings with ANSD from a Chinese family, including one nonsense mutation c.1273C>T (p.R425X) and one missense mutation c.4994T>C (p.L1665P). Furthermore, we employed Sanger sequencing to confirm the mutations in each subject. Two compound heterozygous mutations in the OTOF gene were observed in the two affected siblings, whereas the two parents and unaffected sister were heterozygous carriers of c.1273C>T (father and sister) and c.4994T>C (mother). The nonsense mutation p.R425X introduces a premature stop codon and may result in a truncated polypeptide, which strongly suggests its pathogenicity for ANSD. The missense mutation p.L1665P results in a single amino acid substitution in a highly conserved region. Two mutations in the OTOF gene in the Chinese deaf population were recognized for the first time. These

  10. Targeted next-generation sequencing identifies novel compound heterozygous mutations of DYNC2H1 in a fetus with short rib-polydactyly syndrome, type III.

    Science.gov (United States)

    Mei, Libin; Huang, Yanru; Pan, Qian; Su, Wei; Quan, Yi; Liang, Desheng; Wu, Lingqian

    2015-07-20

    A 26-year-old woman with a past history of fetal skeletal dysplasia was referred to our institution at 24 weeks of gestation following a routine sonographic diagnosis of short limbs in the fetus. A fetal ultrasound showed short limbs, a narrow thorax, short ribs with marginal spurs, and polydactyly. Conventional cytogenetics analysis of cultured amniocytes demonstrated that the fetal karyotype was normal. Using targeted exome sequencing of 226 known genes implicated in inherited skeletal dysplasia, we identified compound heterozygous mutations in the DYNC2H1 gene in the fetus with short rib-polydactyly syndrome, type III (SRPS III), c.1151C>T (p.Ala384Val) and c.4351C>T (p.Gln1451*), which were inherited paternally and maternally, respectively. These variants were further confirmed using Sanger sequencing and have not been previously reported. To our knowledge, this is the first report of DYNC2H1 mutations causing SRPS III in the Chinese population. Our findings expand the number of reported cases of this rare disease, and indicate that targeted next-generation sequencing (NGS) is an accurate, rapid, and cost-effective method in the genetic diagnosis of fetal skeletal dysplasia. Copyright © 2015 Elsevier B.V. All rights reserved.
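
    The inheritance pattern described in this record (two different variants in one gene, one from each parent) can be checked mechanically once Sanger genotypes are available. The following is only a minimal sketch under stated assumptions: the function name and the representation of each person as a set of heterozygous variant names are hypothetical, and it is not the authors' analysis pipeline.

```python
def is_compound_het_in_trans(child, father, mother, var_a, var_b):
    """Return True if the child carries both variants and each parent
    contributes exactly one of them (i.e. the variants are in trans).

    Each person is modelled (an assumption of this sketch) as a set of
    variant names carried in the heterozygous state.
    """
    child_has_both = var_a in child and var_b in child
    # Case 1: var_a from the father only, var_b from the mother only.
    a_paternal = (var_a in father and var_a not in mother
                  and var_b in mother and var_b not in father)
    # Case 2: the mirror image of case 1.
    a_maternal = (var_a in mother and var_a not in father
                  and var_b in father and var_b not in mother)
    return child_has_both and (a_paternal or a_maternal)


# Genotypes as reported in the abstract above (DYNC2H1).
fetus = {"c.1151C>T", "c.4351C>T"}
father = {"c.1151C>T"}
mother = {"c.4351C>T"}
result = is_compound_het_in_trans(fetus, father, mother,
                                  "c.1151C>T", "c.4351C>T")
```

    If both variants came from the same parent, they would sit on one chromosome (in cis) and the check would return False, which is why parental genotyping matters for this kind of diagnosis.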

  11. Whole Exome Sequencing Identifies a Novel and a Recurrent Mutation in BBS2 Gene in a Family with Bardet-Biedl Syndrome

    Directory of Open Access Journals (Sweden)

    Yong Mong Bee

    2015-01-01

    Bardet-Biedl syndrome (BBS) is a rare autosomal recessive disorder known to be caused by mutations in at least 19 BBS genes. We report the genetic analysis of a patient with indisputable features of BBS including cardinal features such as postaxial polydactyly, retinitis pigmentosa, obesity, and kidney failure. Taking advantage of next-generation sequencing technology, we applied whole exome sequencing (WES) with Sanger direct sequencing to the proband and her unaffected mother. A pair of heterozygous nonsense mutations in the BBS2 gene was identified in the proband, one being novel and the other recurrent. The novel mutation, p.Y644X, resides in exon 16 and was also found in the heterozygous state in the mother. This mutation is not currently found in the dbSNP and 1000 Genomes SNP databases and is predicted to be disease causing by in silico analysis. This study highlights the potential for rapid and precise detection of disease-causing genes using WES in genetically heterogeneous disorders such as BBS.

  12. Whole-exome sequencing identifies a novel genotype-phenotype correlation in the entactin domain of the known deafness gene TECTA.

    Directory of Open Access Journals (Sweden)

    Byung Yoon Choi

    Postlingual progressive hearing loss, affecting primarily the high frequencies, is the clinical finding in most cases of autosomal dominant nonsyndromic hearing loss (ADNSHL). The molecular genetic etiology of ADNSHL is extremely heterogeneous. We applied whole-exome sequencing to reveal the genetic etiology of high-frequency hearing loss in a mid-sized Korean family without any prior linkage data. Whole-exome sequencing of four family members (two affected and two unaffected), together with our filtering strategy based on comprehensive bioinformatics analyses, identified 21 potential pathogenic candidates. Sanger validation of an additional five family members excluded 20 variants, leaving only one novel variant, TECTA c.710C>T (p.T237I), as the strongest candidate. This variant resides in the entactin (ENT) domain and co-segregated perfectly with non-progressive high-frequency hearing loss in the family. It was absent among 700 ethnically matched control chromosomes, and the T237 residue is conserved among species, which supports its pathogenicity. Interestingly, this finding contrasted with a previously proposed genotype-phenotype correlation in which variants of the ENT domain of TECTA were associated with mid-frequency hearing loss. Based upon what we observed, we propose a novel "genotype to phenotype" correlation in the ENT domain of TECTA. Our results shed light on another important application of whole-exome sequencing: the establishment of a novel genotype-phenotype correlation in the molecular genetic diagnosis of autosomal dominant hearing loss.

  13. Transcriptome sequencing in prostate cancer identifies inter-tumor heterogeneity

    Directory of Open Access Journals (Sweden)

    Janet Mendonca

    2015-06-01

    Given the dearth of gene mutations in prostate cancer [1],[2], it is likely that genomic rearrangements play a significant role in the evolution of prostate cancer. However, in the search for recurrent genomic alterations, "private alterations" have received less attention. Such alterations may provide insights into the evolution, behavior, and clinical outcome of an individual tumor. In a recent report in Genome Biology, Wyatt et al. [3] define unique alterations in a cohort of high-risk prostate cancer patients with a lethal phenotype. Utilizing a transcriptome sequencing approach, they observe high inter-tumor heterogeneity; however, the genes altered distill into three distinct cancer-relevant pathways. Their analysis reveals the presence of several non-ETS fusions, which may contribute to the phenotype of individual tumors, and have significance for disease progression.

  14. Exome Sequencing Identifies a Novel LMNA Splice-Site Mutation and Multigenic Heterozygosity of Potential Modifiers in a Family with Sick Sinus Syndrome, Dilated Cardiomyopathy, and Sudden Cardiac Death.

    Directory of Open Access Journals (Sweden)

    Michael V Zaragoza

    The goals are to understand the primary genetic mechanisms that cause Sick Sinus Syndrome and to identify potential modifiers that may result in intrafamilial variability within a multigenerational family. The proband is a 63-year-old male with a family history of individuals (>10) with sinus node dysfunction, ventricular arrhythmia, cardiomyopathy, heart failure, and sudden death. We used exome sequencing of a single individual to identify a novel LMNA mutation and demonstrated the importance of Sanger validation and family studies when evaluating candidates. After initial single-gene studies were negative, we conducted exome sequencing for the proband which produced 9 gigabases of sequencing data. Bioinformatics analysis showed 94% of the reads mapped to the reference and identified 128,563 unique variants with 108,795 (85%) located in 16,319 genes of 19,056 target genes. We discovered multiple variants in known arrhythmia, cardiomyopathy, or ion channel associated genes that may serve as potential modifiers in disease expression. To identify candidate mutations, we focused on ~2,000 variants located in 237 genes of 283 known arrhythmia, cardiomyopathy, or ion channel associated genes. We filtered the candidates to 41 variants in 33 genes using zygosity, protein impact, database searches, and clinical association. Only 21 of 41 (51%) variants were validated by Sanger sequencing. We selected nine confirmed variants with minor allele frequencies G, a novel heterozygous splice-site mutation as the primary mutation with rare or novel variants in HCN4, MYBPC3, PKP4, TMPO, TTN, DMPK and KCNJ10 as potential modifiers and a mechanism consistent with haploinsufficiency.

  15. Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes

    NARCIS (Netherlands)

    J. Molenaar (Jan); J. Koster (Jan); D. Zwijnenburg (Danny); P. van Sluis (Peter); L.J. Valentijn (Linda); I. van der Ploeg (Ida); M. Hamdi (Mohamed); J. van Nes (Johan); B.A. Westerman (Bart); J. van Arkel (Jennemiek); M.E. Ebus; F. Haneveld (Franciska); A. Lakeman (Arjan); L. Schild (Linda); P. Molenaar (Piet); P. Stroeken (Peter); M.M. van Noesel (Max); I. Øra (Ingrid); J.P. di Santo (James); H.N. Caron (Huib); E.M. Westerhout (Ellen); R. Versteeg (Rogier)

    2012-01-01

    Neuroblastoma is a childhood tumour of the peripheral sympathetic nervous system. The pathogenesis has for a long time been quite enigmatic, as only very few gene defects were identified in this often lethal tumour. Frequently detected gene alterations are limited to MYCN amplification

  16. Simple sequence repeat (SSR) markers are effective for identifying ...

    African Journals Online (AJOL)

    The present study characterized and identified pear cultivars growing in the southern region of Minas Gerais State, Brazil, using microsatellite markers. Nineteen (19) pear cultivars were collected from two sites of Southern Minas Gerais State: Ribeirão Vermelho and Lavras. DNA was extracted from newly formed leaves and ...

  17. Identifying statistical dependence in genomic sequences via mutual information estimates

    CERN Document Server

    Aktulga, H M; Lyznik, L A; Szpankowski, L; Grama, A Y; Szpankowski, W

    2007-01-01

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unkno...

  18. Computational and statistical approaches to analyzing variants identified by exome sequencing

    National Research Council Canada - National Science Library

    Stitziel, Nathan O; Kiezun, Adam; Sunyaev, Shamil

    2011-01-01

    New sequencing technology has enabled the identification of thousands of single nucleotide polymorphisms in the exome, and many computational and statistical approaches to identify disease-association...

  19. Identifying Statistical Dependence in Genomic Sequences via Mutual Information Estimates

    Directory of Open Access Journals (Sweden)

    Wojciech Szpankowski

    2007-12-01

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5′ untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's combined DNA index system (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling.

  20. Identifying statistical dependence in genomic sequences via mutual information estimates.

    Science.gov (United States)

    Aktulga, Hasan Metin; Kontoyiannis, Ioannis; Lyznik, L Alex; Szpankowski, Lukasz; Grama, Ananth Y; Szpankowski, Wojciech

    2007-01-01

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's combined DNA index system (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling.
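
    As a rough illustration of the idea in this record, the sketch below computes a plug-in (empirical frequency) estimate of the mutual information between two equal-length sequence segments and compares it with a cut-off. It is an assumption-laden sketch, not the authors' estimator: the abstract does not specify the form of the threshold function, so the `threshold` parameter and both function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x: str, y: str) -> float:
    """Plug-in estimate of I(X;Y) in bits from position-wise symbol pairs
    of two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    count_x = Counter(x)           # marginal counts of symbols in x
    count_y = Counter(y)           # marginal counts of symbols in y
    count_xy = Counter(zip(x, y))  # joint counts of aligned symbol pairs
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi


def is_dependent(x: str, y: str, threshold: float) -> bool:
    """Flag two segments as dependent when the estimate exceeds a
    user-chosen threshold (hypothetical stand-in for the paper's
    threshold function)."""
    return mutual_information(x, y) > threshold


identical = mutual_information("ACGT" * 5, "ACGT" * 5)
unrelated = mutual_information("AACC", "ACAC")
```

    In this toy input, identical uniform four-letter segments give the maximum of 2 bits, while the second pair, in which every symbol combination occurs equally often, gives 0. On short real segments the plug-in estimate is biased upward, which is one reason a significance threshold is needed at all.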

  1. Identifying Statistical Dependence in Genomic Sequences via Mutual Information Estimates

    Directory of Open Access Journals (Sweden)

    Kontoyiannis Ioannis

    2007-01-01

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5′ untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's combined DNA index system (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling.

  2. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based...... on transcriptional evidence. Analysis of repetitive sequences suggests that they are underrepresented in the reference assembly, reflecting an enrichment of gene-rich regions in the current assembly. Characterization of Lotus natural variation by resequencing of L. japonicus accessions and diploid Lotus species...... is currently ongoing, facilitated by the MG20 reference sequence...

  3. Exome sequencing identifies rare variants in multiple genes in atrioventricular septal defect

    NARCIS (Netherlands)

    D'Alessandro, Lisa C. A.; Al Turki, Saeed; Manickaraj, Ashok Kumar; Manase, Dorin; Mulder, Barbara J. M.; Bergin, Lynn; Rosenberg, Herschel C.; Mondal, Tapas; Gordon, Elaine; Lougheed, Jane; Smythe, John; Devriendt, Koen; Bhattacharya, Shoumo; Watkins, Hugh; Bentham, Jamie; Bowdin, Sarah; Hurles, Matthew E.; Mital, Seema

    2016-01-01

    The genetic etiology of atrioventricular septal defect (AVSD) is unknown in 40% cases. Conventional sequencing and arrays have identified the etiology in only a minority of nonsyndromic individuals with AVSD. Whole-exome sequencing was performed in 81 unrelated probands with AVSD to identify

  4. Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak

    Directory of Open Access Journals (Sweden)

    Léger Patrick

    2010-11-01

    Background: The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the Quercus family. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economical importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity. Results: We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. Blast search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoids biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these

  5. Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak.

    Science.gov (United States)

    Ueno, Saneyoshi; Le Provost, Grégoire; Léger, Valérie; Klopp, Christophe; Noirot, Céline; Frigerio, Jean-Marc; Salin, Franck; Salse, Jérôme; Abrouk, Michael; Murat, Florent; Brendel, Oliver; Derory, Jérémy; Abadie, Pierre; Léger, Patrick; Cabane, Cyril; Barré, Aurélien; de Daruvar, Antoine; Couloux, Arnaud; Wincker, Patrick; Reviron, Marie-Pierre; Kremer, Antoine; Plomion, Christophe

    2010-11-23

    The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the Quercus family. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economic importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity. We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. Blast search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoids biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these traits. Comparative orthologous sequences (COS

  6. Forensic Loci Allele Database (FLAD): Automatically generated, permanent identifiers for sequenced forensic alleles.

    Science.gov (United States)

    Van Neste, Christophe; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2016-01-01

    It is difficult to predict if and when massively parallel sequencing of forensic STR loci will replace capillary electrophoresis as the new standard technology in forensic genetics. The main benefits of sequencing are increased multiplexing scales and SNP detection. There is not yet a consensus on how sequenced profiles should be reported. We present the Forensic Loci Allele Database (FLAD) service, made freely available on http://forensic.ugent.be/FLAD/. It offers permanent identifiers for sequenced forensic alleles (STR or SNP) and their microvariants for use in forensic allele nomenclature. Analogous to Genbank, its aim is to provide permanent identifiers for forensically relevant allele sequences. Researchers that are developing forensic sequencing kits or are performing population studies, can register on http://forensic.ugent.be/FLAD/ and add loci and allele sequences with a short and simple application interface (API). Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  7. Comparative analysis of oncogenes identified by microarray and RNA-sequencing as biomarkers for clinical prognosis.

    Science.gov (United States)

    Liu, Yuan; Jing, Runyu; Xu, Junmei; Liu, Keqin; Xue, Jiwei; Wen, Zhining; Li, Menglong

    2015-01-01

    Although RNA-sequencing has been widely used to identify differentially expressed genes (DEGs) as biomarkers to guide therapeutic treatment, it is necessary to investigate the concordance of DEGs identified by microarray and RNA-sequencing for clinical prognosis. By using The Cancer Genome Atlas data sets, we thoroughly investigated the concordance of DEGs identified from microarray and RNA-sequencing data and their molecular functions. The DEGs identified by both technologies averaged ~98.6% overlap. The cancer-related gene sets were significantly enriched with the DEGs and consistent between the two technologies. The high consistency of DEGs in their regulation directionality and molecular functions indicated good reproducibility between microarray and RNA-sequencing in identifying potential oncogenes for clinical prognosis.

  8. RISCI - Repeat Induced Sequence Changes Identifier: a comprehensive, comparative genomics-based, in silico subtractive hybridization pipeline to identify repeat induced sequence changes in closely related genomes

    Science.gov (United States)

    2010-01-01

    Background - The availability of multiple whole genome sequences has facilitated in silico identification of fixed and polymorphic transposable elements (TE). Whereas polymorphic loci serve as markers for phylogenetic and forensic analysis, fixed species-specific transposon insertions, when compared to orthologous loci in other closely related species, may give insights into their evolutionary significance. Besides, TE insertions are not isolated events and are frequently associated with subtle sequence changes concurrent with insertion or post insertion. These include duplication of target site, 3' and 5' flank transduction, deletion of the target locus, 5' truncation or partial deletion and inversion of the transposon, and post insertion changes like inter or intra element recombination, disruption etc. Although such changes have been studied independently, no automated platform to identify differential transposon insertions and the associated array of sequence changes in genomes of the same or closely related species is available to date. To this end, we have designed RISCI - 'Repeat Induced Sequence Changes Identifier' - a comprehensive, comparative genomics-based, in silico subtractive hybridization pipeline to identify differential transposon insertions and associated sequence changes using specific alignment signatures, which may then be examined for their downstream effects. Results - We showcase the utility of RISCI by comparing full length and truncated L1HS and AluYa5 retrotransposons in the reference human genome with the chimpanzee genome and the alternate human assemblies (Celera and HuRef). Comparison of the reference human genome with alternate human assemblies using RISCI predicts 14 novel polymorphisms in full length L1HS, 24 in truncated L1HS and 140 novel polymorphisms in AluYa5 insertions, besides several insertion and post insertion changes. We present comparison with two previous studies to show that RISCI predictions are broadly in

  9. Inferring short-range linkage information from sequencing chromatograms.

    Directory of Open Access Journals (Sweden)

    Bastian Beggel

    Full Text Available Direct Sanger sequencing of viral genome populations yields multiple ambiguous sequence positions. It is not straightforward to derive linkage information from sequencing chromatograms, which in turn hampers the correct interpretation of the sequence data. We present a method for determining the variants existing in a viral quasispecies in the case of two nearby ambiguous sequence positions by exploiting the effect of sequence context-dependent incorporation of dideoxynucleotides. The computational model was trained on data from sequencing chromatograms of clonal variants and was evaluated on two test sets of in vitro mixtures. The approach achieved high accuracies in identifying the mixture components of 97.4% on a test set in which the positions to be analyzed are only one base apart from each other, and of 84.5% on a test set in which the ambiguous positions are separated by three bases. In silico experiments suggest two major limitations of our approach in terms of accuracy. First, due to a basic limitation of Sanger sequencing, it is not possible to reliably detect minor variants with a relative frequency of no more than 10%. Second, the model cannot distinguish between mixtures of two or four clonal variants, if one of two sets of linear constraints is fulfilled. Furthermore, the approach requires repetitive sequencing of all variants that might be present in the mixture to be analyzed. Nevertheless, the effectiveness of our method on the two in vitro test sets shows that short-range linkage information of two ambiguous sequence positions can be inferred from Sanger sequencing chromatograms without any further assumptions on the mixture composition. Additionally, our model provides new insights into the established and widely used Sanger sequencing technology. The source code of our method is made available at http://bioinf.mpi-inf.mpg.de/publications/beggel/linkageinformation.zip.

  10. Quantitative group testing-based overlapping pool sequencing to identify rare variant carriers

    Science.gov (United States)

    2014-01-01

    Background: Genome-wide association studies have revealed that rare variants are responsible for a large portion of the heritability of some complex human diseases. This highlights the increasing importance of detecting and screening for rare variants. Although the massively parallel sequencing technologies have greatly reduced the cost of DNA sequencing, the identification of rare variant carriers by large-scale re-sequencing remains prohibitively expensive because of the huge challenge of constructing libraries for thousands of samples. Recently, several studies have reported that techniques from group testing theory and compressed sensing could help identify rare variant carriers in large-scale samples with few pooled sequencing experiments and a dramatically reduced cost. Results: Based on quantitative group testing, we propose an efficient overlapping pool sequencing strategy that allows the efficient recovery of variant carriers in numerous individuals with much lower costs than conventional methods. We used random k-set pool designs to mix samples, and optimized the design parameters according to an indicative probability. Based on a mathematical model of sequencing depth distribution, an optimal threshold was selected to declare a pool positive or negative. Then, using the quantitative information contained in the sequencing results, we designed a heuristic Bayesian probability decoding algorithm to identify variant carriers. Finally, we conducted in silico experiments to find variant carriers among 200 simulated Escherichia coli strains. With the simulated pools and publicly available Illumina sequencing data, our method correctly identified the variant carriers for 91.5–97.9% variants with the variant frequency ranging from 0.5 to 1.5%. Conclusions: Using the number of reads, variant carriers could be identified precisely even though samples were randomly selected and pooled. Our method performed better than the published DNA Sudoku design and compressed
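
    The pooling idea in this abstract can be illustrated with a minimal sketch. It assumes a noiseless setting and replaces the authors' heuristic Bayesian decoder with a much simpler rule (a sample is kept as a candidate carrier only if every pool it was placed in is positive). The function names and parameters below are hypothetical and are not taken from the paper.

```python
import random

def random_k_set_design(n_samples, n_pools, k, seed=0):
    """Assign each sample to k distinct pools chosen uniformly at random."""
    rng = random.Random(seed)
    return [frozenset(rng.sample(range(n_pools), k)) for _ in range(n_samples)]

def pool_results(design, carriers, n_pools):
    """Noiseless readout: a pool is positive if it contains at least one carrier."""
    positive = [False] * n_pools
    for sample in carriers:
        for pool in design[sample]:
            positive[pool] = True
    return positive

def naive_decode(design, positive):
    """Keep every sample whose pools are all positive (simple stand-in decoder,
    NOT the Bayesian decoder described in the abstract)."""
    return [s for s, pools in enumerate(design) if all(positive[p] for p in pools)]

if __name__ == "__main__":
    design = random_k_set_design(n_samples=200, n_pools=40, k=3, seed=1)
    carriers = [5, 117]                      # hypothetical rare-variant carriers
    positive = pool_results(design, carriers, n_pools=40)
    candidates = naive_decode(design, positive)
    print(len(candidates), "candidate carriers from 40 pools instead of 200 libraries")
```

    In the noiseless case this rule never misses a true carrier but can return false candidates; using read counts quantitatively, as the abstract describes, is one way to narrow that candidate list.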

  11. Functional Brain Activation Differences in Stuttering Identified with a Rapid fMRI Sequence

    Science.gov (United States)

    Loucks, Torrey; Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.

    2011-01-01

    The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech…

  12. Whole exome sequencing identifies lncRNA GAS8-AS1 and LPAR4 as novel papillary thyroid carcinoma driver alternations.

    Science.gov (United States)

    Pan, Wenting; Zhou, Liqing; Ge, Minghua; Zhang, Bin; Yang, Xinyu; Xiong, Xiangyu; Fu, Guobin; Zhang, Jian; Nie, Xilin; Li, Hongmin; Tang, Xiaohu; Wei, Jinyu; Shao, Mingming; Zheng, Jian; Yuan, Qipeng; Tan, Wen; Wu, Chen; Yang, Ming; Lin, Dongxin

    2016-05-01

    Papillary thyroid carcinoma (PTC) is the most common type of thyroid cancer. However, we know little of its mutational spectrum in the Chinese population. Thus, here we report the identification of somatic mutations for Chinese PTC using 402 tumor-normal pairs (discovery: 91 pairs via exome sequencing; validation: 311 pairs via Sanger sequencing). We observed three distinct mutational signatures, evidently different from the two mutational signatures among Caucasian PTCs. Ten significantly mutated genes were identified, most previously uncharacterized. Notably, we found that long non-coding RNA (lncRNA) GAS8-AS1 is the second most frequently altered gene and acts as a novel tumor suppressor in PTC. As a mutation hotspot, the c.713A>G/714T>C dinucleotide substitution was found among 89.1% of patients with GAS8-AS1 mutations and associated with advanced PTC disease (P = 0.009). Interestingly, the wild-type lncRNA GAS8-AS1 (A713T714) showed consistently higher capability to inhibit cancer cell growth compared to the mutated lncRNA (G713C714). Further studies also elucidated the oncogenic nature of the G protein-coupled receptor LPAR4 and its c.872T>G (p.Ile291Ser) mutation in PTC malignant transformation. The BRAF c.1799T>A (p.Val600Glu) substitution was present in 59.0% of Chinese PTCs, more frequently observed in patients with lymph node metastasis (P = 1.6 × 10^-4). Together our study defines an exome mutational spectrum of PTC in the Chinese population and highlights lncRNA GAS8-AS1 and LPAR4 as potential diagnostic and therapeutic targets. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  13. A machine learning strategy to identify candidate binding sites in human protein-coding sequence

    Directory of Open Access Journals (Sweden)

    Leong Bernard

    2006-09-01

    Full Text Available Abstract Background: The splicing of RNA transcripts is thought to be partly promoted and regulated by sequences embedded within exons. Known sequences include binding sites for SR proteins, which are thought to mediate interactions between splicing factors bound to the 5' and 3' splice sites. It would be useful to identify further candidate sequences; however, identifying them computationally is hard, since exon sequences are also constrained by their functional role in coding for proteins. Results: This strategy identified a collection of motifs including several previously reported splice enhancer elements. Although only trained on coding exons, the model discriminates both coding and non-coding exons from intragenic sequence. Conclusion: We have trained a computational model able to detect signals in coding exons which seem to be orthogonal to the sequences' primary function of coding for proteins. We believe that many of the motifs detected here represent binding sites for previously unrecognized proteins which influence RNA splicing, as well as other regulatory elements.

  14. Whole Genome Sequencing Demonstrates Limited Transmission within Identified Mycobacterium tuberculosis Clusters in New South Wales, Australia.

    Directory of Open Access Journals (Sweden)

    Ulziijargal Gurjav

    Full Text Available Australia has a low tuberculosis incidence rate with most cases occurring among recent immigrants. Given suboptimal cluster resolution achieved with 24-locus mycobacterial interspersed repetitive unit (MIRU-24) genotyping, the added value of whole genome sequencing was explored. MIRU-24 profiles of all Mycobacterium tuberculosis culture-confirmed tuberculosis cases diagnosed between 2009 and 2013 in New South Wales (NSW), Australia, were examined and clusters identified. The relatedness of cases within the largest MIRU-24 clusters was assessed using whole genome sequencing and phylogenetic analyses. Of 1841 culture-confirmed TB cases, 91.9% (1692/1841) had complete demographic and genotyping data. East-African Indian (474; 28.0%) and Beijing (470; 27.8%) lineage strains predominated. The overall rate of MIRU-24 clustering was 20.1% (340/1692) and was highest among Beijing lineage strains (35.7%; 168/470). One Beijing and three East-African Indian (EAI) clonal complexes were responsible for the majority of observed clusters. Whole genome sequencing of the 4 largest clusters (30 isolates) demonstrated diverse single nucleotide polymorphisms (SNPs) within identified clusters. All sequenced EAI strains and 70% of Beijing lineage strains clustered by MIRU-24 typing demonstrated distinct SNP profiles. The superior resolution provided by whole genome sequencing demonstrated limited M. tuberculosis transmission within NSW, even within identified MIRU-24 clusters. Routine whole genome sequencing could provide valuable public health guidance in low burden settings.

  15. Increased genetic diversity and prevalence of co-infection with Trypanosoma spp. in koalas (Phascolarctos cinereus) and their ticks identified using next-generation sequencing (NGS).

    Directory of Open Access Journals (Sweden)

    Amanda D Barbosa

    Full Text Available Infections with Trypanosoma spp. have been associated with poor health and decreased survival of koalas (Phascolarctos cinereus), particularly in the presence of concurrent pathogens such as Chlamydia and koala retrovirus. The present study describes the application of a next-generation sequencing (NGS)-based assay to characterise the prevalence and genetic diversity of trypanosome communities in koalas and two native species of ticks (Ixodes holocyclus and I. tasmani) removed from koala hosts. Among 168 koalas tested, 32.2% (95% CI: 25.2-39.8%) were positive for at least one Trypanosoma sp. Previously described Trypanosoma spp. from koalas were identified, including T. irwini (32.1%, 95% CI: 25.2-39.8%), T. gilletti (25%, 95% CI: 18.7-32.3%), T. copemani (27.4%, 95% CI: 20.8-34.8%) and T. vegrandis (10.1%, 95% CI: 6.0-15.7%). Trypanosoma noyesi was detected for the first time in koalas, although at a low prevalence (0.6%, 95% CI: 0-3.3%), and a novel species (Trypanosoma sp. AB-2017) was identified at a prevalence of 4.8% (95% CI: 2.1-9.2%). Mixed infections with up to five species were present in 27.4% (95% CI: 21-35%) of the koalas, which was significantly higher than the prevalence of single infections, 4.8% (95% CI: 2-9%). Overall, a considerably higher proportion (79.7%) of the Trypanosoma sequences isolated from koala blood samples were identified as T. irwini, suggesting this is the dominant species. Co-infections involving T. gilletti, T. irwini, T. copemani, T. vegrandis and Trypanosoma sp. AB-2017 were also detected in ticks, with T. gilletti and T. copemani being the dominant species within the invertebrate hosts. Direct Sanger sequencing of Trypanosoma 18S rRNA gene amplicons was also performed and results revealed that this method was only able to identify the genotypes with a greater number of reads (according to NGS) within koala samples, which highlights the advantages of NGS in detecting mixed infections. The present study provides new insights
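
    The prevalence figures above are reported with 95% confidence intervals. The abstract does not say how these were computed; the sketch below assumes an exact binomial (Clopper-Pearson) interval, which is a common choice for small counts, and uses only the Python standard library. The helper names are hypothetical. As an illustration, a prevalence of 0.6% among 168 koalas corresponds to a single positive animal.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def _solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x positives out of n (assumed method)."""
    # Lower bound: P(X >= x | p) = alpha/2, increasing in p.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2, i.e. 1 - cdf = 1 - alpha/2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper

if __name__ == "__main__":
    for positives in (1, 8):  # 1/168 is about 0.6%, 8/168 is about 4.8%
        lo, hi = exact_ci(positives, 168)
        print(f"{positives}/168: {100 * lo:.1f}% to {100 * hi:.1f}%")
```

    Other interval methods (Wilson, Wald) would give somewhat different bounds, so agreement with the reported intervals should be read as a consistency check and not as confirmation of the authors' method.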

  16. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  17. Identifying recombinants in human and primate immunodeficiency virus sequence alignments using quartet scanning

    Directory of Open Access Journals (Sweden)

    Martin Darren P

    2009-04-01

    Full Text Available Abstract Background: Recombination has a profound impact on the evolution of viruses, but characterizing recombination patterns in molecular sequences remains a challenging endeavor. Despite its importance in molecular evolutionary studies, identifying the sequences that exhibit such patterns has received comparatively less attention in the recombination detection framework. Here, we extend a quartet-mapping based recombination detection method to enable identification of recombinant sequences without prior specifications of either query or reference sequences. Through simulations we evaluate different recombinant identification statistics and significance tests. We compare the quartet approach with triplet-based methods that employ additional heuristic tests to identify parental and recombinant sequences. Results: Analysis of phylogenetic simulations reveals that identifying the descendants of relatively old recombination events is a challenging task for all methods available, and that quartet scanning performs relatively well compared to the triplet-based methods. The use of quartet scanning is further demonstrated by analyzing both well-established and putative HIV-1 recombinant strains. In agreement with recent findings, we provide evidence that the presumed circulating recombinant CRF02_AG is a 'pure' lineage, whereas the presumed parental lineage subtype G has a recombinant origin. We also demonstrate HIV-1 intrasubtype recombination, confirm the hybrid origin of SIV in chimpanzees and further disentangle the recombinant history of SIV lineages in a primate immunodeficiency virus data set. Conclusion: Quartet scanning makes a valuable addition to triplet-based methods for identifying recombinant sequences without prior specifications of either query or reference sequences. The new method is available in the VisRD v.3.0 package http://www.cmp.uea.ac.uk/~vlm/visrd.

  18. Mutations in the DDR2 Kinase Gene identify a Novel therapeutic target in squamous cell lung cancer

    NARCIS (Netherlands)

    Hammerman, Peter S.; Sos, Martin L.; Ramos, Alex H.; Xu, Chunxiao; Dutt, Amit; Zhou, Wenjun; Brace, Lear E.; Woods, Brittany A.; Lin, Wenchu; Zhang, Jianming; Deng, Xianming; Lim, Sang Min; Heynck, Stefanie; Peifer, Martin; Simard, Jeffrey R.; Lawrence, Michael S.; Onofrio, Robert C.; Salvesen, Helga B.; Seidel, Danila; Zander, Thomas; Heuckmann, Johannes M.; Soltermann, Alex; Moch, Holger; Koker, Mirjam; Leenders, Frauke; Gabler, Franziska; Querings, Silvia; Ansen, Sascha; Brambilla, Elisabeth; Brambilla, Christian; Lorimier, Philippe; Brustugun, Odd Terje; Helland, Aslaug; Petersen, Iver; Clement, Joachim H.; Groen, Harry; Timens, Wim; Sietsma, Hannie; Stoelben, Erich; Wolf, Juergen; Beer, David G.; Tsao, Ming Sound; Hanna, Megan; Hatton, Charles; Eck, Michael J.; Janne, Pasi A.; Johnson, Bruce E.; Winckler, Wendy; Greulich, Heidi; Bass, Adam J.; Cho, Jeonghee; Rauh, Daniel; Gray, Nathanael S.; Wong, Kwok-Kin; Haura, Eric B.; Thomas, Roman K.; Meyerson, Matthew

    Although genomically targeted therapies have improved outcomes for patients with lung adenocarcinoma, little is known about the genomic alterations that drive squamous cell cancer (SCC) of the lung. Sanger sequencing of the tyrosine kinome identified mutations in the DDR2 kinase gene in 3.8% of lung

  19. A Bayesian framework to identify methylcytosines from high-throughput bisulfite sequencing data.

    Directory of Open Access Journals (Sweden)

    Qing Xie

    2014-09-01

    Full Text Available High-throughput bisulfite sequencing technologies have provided a comprehensive and well-fitted way to investigate DNA methylation at single-base resolution. However, there are substantial bioinformatic challenges to distinguish precisely methylcytosines from unconverted cytosines based on bisulfite sequencing data. The challenges arise, at least in part, from cell heterozygosis caused by multicellular sequencing and the still limited number of statistical methods that are available for methylcytosine calling based on bisulfite sequencing data. Here, we present an algorithm, termed Bycom, a new Bayesian model that can perform methylcytosine calling with high accuracy. Bycom considers cell heterozygosis along with sequencing errors and bisulfite conversion efficiency to improve calling accuracy. Bycom performance was compared with the performance of Lister, the method most widely used to identify methylcytosines from bisulfite sequencing data. The results showed that the performance of Bycom was better than that of Lister for data with high methylation levels. Bycom also showed higher sensitivity and specificity for low methylation level samples (<1%) than Lister. A validation experiment based on reduced representation bisulfite sequencing data suggested that Bycom had a false positive rate of about 4% while maintaining an accuracy of close to 94%. This study demonstrated that Bycom had a low false calling rate at any methylation level and accurate methylcytosine calling at high methylation levels. Bycom will contribute significantly to studies aimed at recalibrating the methylation level of genomic regions based on the presence of methylcytosines.
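
    To make the calling problem concrete, the following is a minimal sketch of a generic binomial-tail test for a single cytosine site. It is not Bycom and is not claimed to reproduce the Lister method; it only illustrates why sequencing error and incomplete bisulfite conversion matter. The function names, the combined false-"C" rate of 1% and the significance threshold are all illustrative assumptions.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def call_methylated(c_reads, total_reads, false_c_rate=0.01, alpha=0.01):
    """Call a site methylated if seeing at least c_reads unconverted reads
    would be unlikely under non-conversion and sequencing error alone.

    false_c_rate is an assumed combined rate of unconverted or miscalled
    reads at a truly unmethylated cytosine (hypothetical value).
    """
    return binom_tail(c_reads, total_reads, false_c_rate) < alpha

if __name__ == "__main__":
    print(call_methylated(10, 20))  # many unconverted reads at this site
    print(call_methylated(1, 50))   # a single unconverted read among 50
```

    A test of this kind treats every read as coming from one homogeneous state; the abstract's point is that mixed cell populations break that assumption, which is what a model accounting for cell heterozygosis is meant to address.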

  20. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data.

    Science.gov (United States)

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E; Greenwood, Alex D

    2015-11-24

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals.
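
    The abstract mentions age estimates based on proviral LTR divergence without giving the calculation. A commonly used relation, shown here as an assumption and not as the authors' stated method, relies on the two LTRs of a provirus being identical at integration and diverging independently afterwards. The symbols and the numbers in the worked line are illustrative only.

```latex
% T   : time since integration (years)
% K   : per-site divergence between the 5' and 3' LTRs of one provirus
% \mu : assumed host substitution rate (substitutions per site per year)
\[
  T = \frac{K}{2\mu}
\]
% Illustrative values, not taken from the abstract:
\[
  K = 0.01,\quad \mu = 2 \times 10^{-9}
  \;\Rightarrow\;
  T = \frac{0.01}{2 \cdot 2 \times 10^{-9}} = 2.5 \times 10^{6}\ \text{years}
\]
```

    The factor of 2 reflects that both LTRs accumulate substitutions, so low LTR divergence translates into an estimate on the order of a few million years or less.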

  1. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    Directory of Open Access Journals (Sweden)

    Kyriakos Tsangaras

    2015-11-01

    Full Text Available Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals.

  2. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    Directory of Open Access Journals (Sweden)

    Wei Li

    Full Text Available Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are not cost-efficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, 99.73-99.95% of the SeTRs are covered with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information.

  3. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    Science.gov (United States)

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E.; Greenwood, Alex D.

    2015-01-01

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals. PMID:26610552

  4. Novel ZEB2-BCL11B Fusion Gene Identified by RNA-Sequencing in Acute Myeloid Leukemia with t(2;14)(q22;q32).

    Directory of Open Access Journals (Sweden)

    Synne Torkildsen

    Full Text Available RNA-sequencing of a case of acute myeloid leukemia with the bone marrow karyotype 46,XY,t(2;14)(q22;q32)[5]/47,XY,idem,+?4,del(6)(q13q21)[cp6]/46,XY[4] showed that the t(2;14) generated a ZEB2-BCL11B chimera in which exon 2 of ZEB2 (nucleotide 595 in the sequence with accession number NM_014795.3) was fused to exon 2 of BCL11B (nucleotide 554 in the sequence with accession number NM_022898.2). RT-PCR together with Sanger sequencing verified the presence of the above-mentioned fusion transcript. All functional domains of BCL11B are retained in the chimeric protein. Abnormal expression of BCL11B coding regions subjected to control by the ZEB2 promoter seems to be the leukemogenic mechanism behind the translocation.

  5. Discordant Haplotype Sequencing Identifies Functional Variants at the 2q33 Breast Cancer Risk Locus.

    Science.gov (United States)

    Camp, Nicola J; Lin, Wei-Yu; Bigelow, Alex; Burghel, George J; Mosbruger, Timothy L; Parry, Marina A; Waller, Rosalie G; Rigas, Sushilaben H; Tai, Pei-Yi; Berrett, Kristofer; Rajamanickam, Venkatesh; Cosby, Rachel; Brock, Ian W; Jones, Brandt; Connley, Dan; Sargent, Robert; Wang, Guoying; Factor, Rachel E; Bernard, Philip S; Cannon-Albright, Lisa; Knight, Stacey; Abo, Ryan; Werner, Theresa L; Reed, Malcolm W R; Gertz, Jason; Cox, Angela

    2016-04-01

    The findings from genome-wide association studies hold enormous potential for novel insight into disease mechanisms. A major challenge in the field is to map these low-risk association signals to their underlying functional sequence variants (FSV). Simple sequence study designs are insufficient, as the vast numbers of statistically comparable variants and a limited knowledge of noncoding regulatory elements complicate prioritization. Furthermore, large sample sizes are typically required for adequate power to identify the initial association signals. One important question is whether similar sample sizes need to be sequenced to identify the FSVs. Here, we present a proof-of-principle example of an extreme discordant design to map FSVs within the 2q33 low-risk breast cancer locus. Our approach employed DNA sequencing of a small number of discordant haplotypes to efficiently identify candidate FSVs. Our results were consistent with those from a 2,000-fold larger, traditional imputation-based fine-mapping study. To prioritize further, we used expression-quantitative trait locus analysis of RNA sequencing from breast tissues, gene regulation annotations from the ENCODE consortium, and functional assays for differential enhancer activities. Notably, we implicate three regulatory variants at 2q33 that target CASP8 (rs3769823, rs3769821 in CASP8, and rs10197246 in ALS2CR12) as functionally relevant. We conclude that nested discordant haplotype sequencing is a promising approach to aid mapping of low-risk association loci. The ability to include more efficient sequencing designs into mapping efforts presents an opportunity for the field to capitalize on the potential of association loci and accelerate translation of association signals to their underlying FSVs. Cancer Res; 76(7); 1916-25. ©2016 AACR. ©2016 American Association for Cancer Research.

  6. Description and interpretation of various SNPs identified by BRCA2 gene sequencing

    Directory of Open Access Journals (Sweden)

    Anca Negura

    2011-12-01

    Full Text Available Molecular diagnosis for hereditary breast and ovarian cancer (HBOC) involves systematic DNA sequencing of predisposition genes like BRCA1 or BRCA2. Deleterious mutations within such genes are responsible for developing the disease, but other sequence variants can also be identified. Common Single Nucleotide Polymorphisms (SNPs) are usually present in the human genome, defining alleles whose frequencies vary widely in different populations. Whether intragenic or intronic, silent or generating amino acid substitutions, SNPs cannot themselves be assigned a predisposition status. However, prevalent SNPs can be used to define gene haplotypes, which also vary in frequency. Since some mutations can easily be assigned to haplotypes (as is the case for the BRCA1 gene), SNPs can therefore provide useful information in interpreting the effects of gene mutations on hereditary predisposition to cancer. Here we describe 10 BRCA2 SNPs identified by complete gene sequencing

  7. Genomic Aberrations in Crizotinib Resistant Lung Adenocarcinoma Samples Identified by Transcriptome Sequencing

    NARCIS (Netherlands)

    Saber, Ali; van der Wekken, Anthonie J.; Kok, Klaas; Terpstra, M. Martijn; Bosman, Lisette J.; Mastik, Mirjam F.; Timens, Wim; Schuuring, Ed; Hiltermann, T. Jeroen N.; Groen, Harry J. M.; van den Berg, Anke

    2016-01-01

    ALK-break positive non-small cell lung cancer (NSCLC) patients initially respond to crizotinib, but resistance occurs inevitably. In this study we aimed to identify fusion genes in crizotinib resistant tumor samples. Re-biopsies of three patients were subjected to paired-end RNA sequencing to

  8. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Science.gov (United States)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  9. Internet-accessible DNA sequence database for identifying fusaria from human and animal infections

    NARCIS (Netherlands)

    O'Donnell, K.; Sutton, D.A.; Rinaldi, M.G.; Sarver, B.A.; Balajee, S.; Schroers, H.J.; Summerbell, R.C.; Robert, V.A.R.G.; Crous, P.W.; Zhang, N.; Aoki, T.; Jung, K.; Park, J.; Lee, Y.A.; Kang, S.; Park, B.; Geiser, D.M.

    2010-01-01

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated

  10. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang

    2002-01-01

    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone-end-pairing i...
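The repeat-identification step mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the RePS implementation: the function names, the `max_count` threshold, and the choice to mask with `N` (the abstract says the repeats are removed before assembly) are assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(reads, k=20):
    """Count every exact k-mer across all shotgun reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mask_repeats(read, counts, k=20, max_count=2):
    """Replace bases covered by over-represented k-mers with 'N'.

    max_count is a hypothetical threshold; in practice it would be
    chosen relative to the expected shotgun coverage.
    """
    masked = list(read)
    for i in range(len(read) - k + 1):
        if counts[read[i:i + k]] > max_count:
            masked[i:i + k] = "N" * k
    return "".join(masked)


# Toy example with k=4 so the effect is visible on short strings.
reads = ["ACGTAAAA", "ACGTCCCC", "ACGTGGGG"]
kmer_counts = count_kmers(reads, k=4)
masked_reads = [mask_repeats(r, kmer_counts, k=4, max_count=2) for r in reads]
```

In the toy data only the 4-mer `ACGT` occurs more than twice, so only the first four bases of each read are masked.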

  11. SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations.

    Directory of Open Access Journals (Sweden)

    Steven N Hart

    Full Text Available BACKGROUND: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. RESULTS: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are (1) not requiring secondary (or exhaustive primary) alignment, (2) portability into established sequencing workflows, and (3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. CONCLUSIONS: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.

  12. Integrated sequence-structure motifs suffice to identify microRNA precursors.

    Directory of Open Access Journals (Sweden)

    Xiuqin Liu

    Full Text Available BACKGROUND: Upwards of 1200 miRNA loci have hitherto been annotated in the human genome. The specific features defining a miRNA precursor and deciding its recognition and subsequent processing are not yet exhaustively described and miRNA loci can thus not be computationally identified with sufficient confidence. RESULTS: We rendered pre-miRNA and non-pre-miRNA hairpins as strings of integrated sequence-structure information, and used the software Teiresias to identify sequence-structure motifs (ss-motifs) of variable length in these data sets. Using only ss-motifs as features in a Support Vector Machine (SVM) algorithm for pre-miRNA identification achieved 99.2% specificity and 97.6% sensitivity on a human test data set, which is comparable to previously published algorithms employing combinations of sequence-structure and additional features. Further analysis of the ss-motif information contents revealed strongly significant deviations from those of the respective training sets, revealing important potential clues as to how the sequence and structural information of RNA hairpins are utilized by the miRNA processing apparatus. CONCLUSION: Integrated sequence-structure motifs of variable length apparently capture nearly all information required to distinguish miRNA precursors from other stem-loop structures.

  13. Implementing a genomic data management system using iRODS in the Wellcome Trust Sanger Institute

    Directory of Open Access Journals (Sweden)

    Sale Kevin

    2011-09-01

    Full Text Available Background: Increasingly large amounts of DNA sequencing data are being generated within the Wellcome Trust Sanger Institute (WTSI). The traditional file system struggles to handle these increasing amounts of sequence data. A good data management system therefore needs to be implemented and integrated into the current WTSI infrastructure. Such a system enables good management of the IT infrastructure of the sequencing pipeline and allows biologists to track their data. Results: We have chosen a data grid system, iRODS (Rule-Oriented Data management systems), to act as the data management system for the WTSI. iRODS provides a rule-based system management approach which makes data replication much easier and provides extra data protection. Unlike the metadata provided by traditional file systems, the metadata system of iRODS is comprehensive and allows users to customize their own application level metadata. Users and IT experts in the WTSI can then query the metadata to find and track data. The aim of this paper is to describe how we designed and used (from both system and user viewpoints) iRODS as a data management system. Details are given about the problems faced and the solutions found when iRODS was implemented. A simple use case describing how users within the WTSI use iRODS is also introduced. Conclusions: iRODS has been implemented and works as the production system for the sequencing pipeline of the WTSI. Both biologists and IT experts can now track and manage data, which could not previously be achieved. This novel approach allows biologists to define their own metadata and query the genomic data using those metadata.

  14. Unique Trichomonas vaginalis gene sequences identified in multinational regions of Northwest China.

    Science.gov (United States)

    Liu, Jun; Feng, Meng; Wang, Xiaolan; Fu, Yongfeng; Ma, Cailing; Cheng, Xunjia

    2017-07-24

    Trichomonas vaginalis (T. vaginalis) is a flagellated protozoan parasite that infects humans worldwide. This study determined the sequence of the 18S ribosomal RNA gene of T. vaginalis infecting both females and males in Xinjiang, China. Samples from 73 females and 28 males were collected and confirmed for infection with T. vaginalis; a total of 110 sequences were identified when the T. vaginalis 18S ribosomal RNA gene was sequenced. These sequences were used to prepare a phylogenetic network. The rooted network comprised three large clades and several independent branches. Most of the Xinjiang sequences were in one group. Preliminary results suggest that Xinjiang T. vaginalis isolates might be genetically unique, as indicated by the sequence of their 18S ribosomal RNA gene. The low migration rate of local people in this province may contribute to the genetic conservation of T. vaginalis. The unique genetic features of our isolates may suggest a different clinical presentation of trichomoniasis, including metronidazole susceptibility and T. vaginalis virus or Mycoplasma co-infection characteristics. The transmission and evolution of Xinjiang T. vaginalis is of interest and should be studied further. More attention should be given to T. vaginalis infection in both females and males in Xinjiang.

  15. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Full Text Available Background: The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results: SeqAnt (Sequence Annotator) is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion: SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.

  16. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    Science.gov (United States)

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Christina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Dozor, Allen; Drumm, Mitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  17. HIV-1 envelope sequence-based diversity measures for identifying recent infections.

    Directory of Open Access Journals (Sweden)

    Alexis Kafando

    Full Text Available Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections) env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS) using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC) of the receiver operating characteristic (ROC). Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001). Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806), gp120 C2_3 (AUC = 0.805) and gp120 V3 (AUC = 0.812). Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency.
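Two of the diversity measures named in this abstract, Shannon entropy and number of haplotypes, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each read is treated as one haplotype string for a given env segment, the natural logarithm is used (the abstract does not state the base), and the function names are hypothetical rather than taken from the study.

```python
import math
from collections import Counter


def haplotype_count(reads):
    """Number of distinct haplotypes (unique read sequences)."""
    return len(set(reads))


def shannon_entropy(reads):
    """Shannon entropy H = -sum(p_i * ln p_i) over haplotype frequencies.

    Assumes one haplotype per read; the log base is an assumption.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# A homogeneous population (as might be expected soon after infection)
# versus a more diverse one; the sequences are toy data.
homogeneous = ["ACGT", "ACGT", "ACGT", "ACGT"]
diverse = ["ACGT", "ACGA", "ACCT", "TCGT"]
h_low = shannon_entropy(homogeneous)
h_high = shannon_entropy(diverse)
```

With four equally frequent haplotypes the entropy equals ln 4, while a single haplotype gives zero, which is the contrast the measure relies on.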

  18. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    Science.gov (United States)

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. © 2014 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
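The tagging scheme described in this abstract implies a demultiplexing step in which pooled reads are assigned back to specimens by their 10-mer tag. A minimal Python sketch follows; it assumes the tag sits at the very start of each read and must match exactly, and the function name, tag sequences and specimen labels are hypothetical, not taken from the study.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_specimen, tag_len=10):
    """Group reads by specimen using the leading tag.

    Reads whose tag is not in tag_to_specimen are discarded.
    The tag itself is stripped from the returned sequence.
    """
    by_specimen = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        specimen = tag_to_specimen.get(tag)
        if specimen is not None:
            by_specimen[specimen].append(insert)
    return dict(by_specimen)


# Hypothetical tags and reads for illustration.
tags = {"AAAAAAAAAA": "specimen_1", "CCCCCCCCCC": "specimen_2"}
pooled = [
    "AAAAAAAAAAGATTACA",
    "CCCCCCCCCCTTGACA",
    "AAAAAAAAAACATCAT",
    "GGGGGGGGGGACGT",  # unknown tag, dropped
]
grouped = demultiplex(pooled, tags)
```

A production workflow would typically also tolerate sequencing errors in the tag and trim the PCR primer; both are left out here to keep the sketch short.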

  19. Identification of innate immunodeficiencies by whole genome sequencing

    DEFF Research Database (Denmark)

    Mogensen, Trine; Christiansen, Mette; Veirum, Jens Erik

    2014-01-01

    . Exome sequencing coverage was on average 100. Any identified mutations were confirmed with Sanger sequencing. Next in vitro studies to evaluate the functional consequences of identified mutations will be done. We plan to express the mutated and wild type versions of these molecules by stable......, TBK1 and Unc93B) may contribute to the development of herpes encephalitis. Common to these genetic defects is that they lead to reduced antiviral interferon (IFN) responses. In this study whole exome sequencing (WES) was performed to identify mutations associated with susceptibility to herpes...

  20. New mutations in chronic lymphocytic leukemia identified by target enrichment and deep sequencing.

    Directory of Open Access Journals (Sweden)

    Elena Doménech

    Full Text Available Chronic lymphocytic leukemia (CLL) is a heterogeneous disease without a well-defined genetic alteration responsible for the onset of the disease. Several lines of evidence coincide in identifying stimulatory and growth signals delivered by the B-cell receptor (BCR) and co-receptors, together with the NFkB pathway, as being the driving force in B-cell survival in CLL. However, the molecular mechanism responsible for this activation has not been identified. Based on the hypothesis that BCR activation may depend on somatic mutations of the BCR and related pathways, we have performed a complete mutational screening of 301 selected genes associated with BCR signaling and related pathways using massive parallel sequencing technology in 10 CLL cases. Four mutated genes in coding regions (KRAS, SMARCA2, NFKBIE and PRKD3) have been confirmed by capillary sequencing. In conclusion, this study identifies new genes mutated in CLL, all of them in cases with progressive disease, and demonstrates that next-generation sequencing technologies applied to selected genes or pathways of interest are powerful tools for identifying novel mutational changes.

  1. Whole exome sequencing identifies genetic variants in inherited thrombocytopenia with secondary qualitative function defects

    Science.gov (United States)

    Johnson, Ben; Lowe, Gillian C.; Futterer, Jane; Lordkipanidzé, Marie; MacDonald, David; Simpson, Michael A.; Sanchez-Guiú, Isabel; Drake, Sian; Bem, Danai; Leo, Vincenzo; Fletcher, Sarah J.; Dawood, Ban; Rivera, José; Allsup, David; Biss, Tina; Bolton-Maggs, Paula HB; Collins, Peter; Curry, Nicola; Grimley, Charlotte; James, Beki; Makris, Mike; Motwani, Jayashree; Pavord, Sue; Talks, Katherine; Thachil, Jecko; Wilde, Jonathan; Williams, Mike; Harrison, Paul; Gissen, Paul; Mundell, Stuart; Mumford, Andrew; Daly, Martina E.; Watson, Steve P.; Morgan, Neil V.

    2016-01-01

    Inherited thrombocytopenias are a heterogeneous group of disorders characterized by abnormally low platelet counts which can be associated with abnormal bleeding. Next-generation sequencing has previously been employed in these disorders for the confirmation of suspected genetic abnormalities, and more recently in the discovery of novel disease-causing genes. However, its full potential has not yet been exploited. Over the past 6 years we have sequenced the exomes from 55 patients, including 37 index cases and 18 additional family members, all of whom were recruited to the UK Genotyping and Phenotyping of Platelets study. All patients had inherited or sustained thrombocytopenia of unknown etiology with platelet counts varying from 11×10⁹/L to 186×10⁹/L. Of the 51 patients phenotypically tested, 37 (73%) had an additional secondary qualitative platelet defect. Using whole exome sequencing analysis we have identified “pathogenic” or “likely pathogenic” variants in 46% (17/37) of our index patients with thrombocytopenia. In addition, we report variants of uncertain significance in 12 index cases, including novel candidate genetic variants in previously unreported genes in four index cases. These results demonstrate that whole exome sequencing is an efficient method for elucidating potential pathogenic genetic variants in inherited thrombocytopenia. Whole exome sequencing also has the added benefit of discovering potentially pathogenic genetic variants for further study in novel genes not previously implicated in inherited thrombocytopenia. PMID:27479822

  2. SERPINA1 Full-Gene Sequencing Identifies Rare Mutations Not Detected in Targeted Mutation Analysis.

    Science.gov (United States)

    Graham, Rondell P; Dina, Michelle A; Howe, Sarah C; Butz, Malinda L; Willkomm, Kurt S; Murray, David L; Snyder, Melissa R; Rumilla, Kandelaria M; Halling, Kevin C; Highsmith, W Edward

    2015-11-01

    Genetic α-1 antitrypsin (AAT) deficiency is characterized by low serum AAT levels and the identification of causal mutations or an abnormal protein. It needs to be distinguished from deficiency because of nongenetic causes, and diagnostic delay may contribute to worse patient outcome. Current routine clinical testing assesses for only the most common mutations. We wanted to determine the proportion of unexplained cases of AAT deficiency that harbor causal mutations not identified through current standard allele-specific genotyping and isoelectric focusing (IEF). All prospective cases from December 1, 2013, to October 1, 2014, with a low serum AAT level not explained by allele-specific genotyping and IEF were assessed through full-gene sequencing with a direct sequencing method for pathogenic mutations. We reviewed the results using American College of Medical Genetics criteria. Of 3523 cases, 42 (1.2%) met study inclusion criteria. Pathogenic or likely pathogenic mutations not identified through clinical testing were detected through full-gene sequencing in 16 (38%) of the 42 cases. Rare mutations not detected with current allele-specific testing and IEF underlie a substantial proportion of genetic AAT deficiency. Full-gene sequencing, therefore, has the ability to improve accuracy in the diagnosis of AAT deficiency. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  3. Functional brain activation differences in stuttering identified with a rapid fMRI sequence

    Science.gov (United States)

    Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.

    2011-01-01

    The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech motor and auditory brain activity in children who stutter closer to the age at which recovery from stuttering is documented. Rapid sequences may be preferred for individuals or populations who do not tolerate long scanning sessions. In this report, we document the application of a picture naming and phoneme monitoring task in three-minute fMRI sequences with adults who stutter (AWS). If relevant brain differences are found in AWS with these approaches that conform to previous reports, then these approaches can be extended to younger populations. Pairwise contrasts of brain BOLD activity between AWS and normally fluent adults indicated the AWS showed higher BOLD activity in the right inferior frontal gyrus (IFG), right temporal lobe and sensorimotor cortices during picture naming and higher activity in the right IFG during phoneme monitoring. The right lateralized pattern of BOLD activity together with higher activity in sensorimotor cortices is consistent with previous reports, which indicates rapid fMRI sequences can be considered for investigating stuttering in younger participants. PMID:22133409

  4. Microfluidic screening and whole-genome sequencing identifies mutations associated with improved protein secretion by yeast

    DEFF Research Database (Denmark)

    Huang, Mingtao; Bai, Yunpeng; Sjostrom, Staffan L.

    2015-01-01

    interest in improving its protein secretion capacity. Due to the complexity of the secretory machinery in eukaryotic cells, it is difficult to apply rational engineering for construction of improved strains. Here we used high-throughput microfluidics for the screening of yeast libraries, generated by UV...... mutagenesis. Several screening and sorting rounds resulted in the selection of eight yeast clones with significantly improved secretion of recombinant α-amylase. Efficient secretion was genetically stable in the selected clones. We performed whole-genome sequencing of the eight clones and identified 330...... to construct efficient cell factories for protein secretion. The combined use of microfluidics screening and whole-genome sequencing to map the mutations associated with the improved phenotype can easily be adapted for other products and cell types to identify novel engineering targets, and this approach could...

  5. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    DEFF Research Database (Denmark)

    Hu, H; Haas, S A; Chelly, J

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset...... with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  6. A novel PCCB mutation in a Thai patient with propionic acidemia identified by exome sequencing.

    Science.gov (United States)

    Porntaveetus, Thantrira; Srichomthong, Chalurmpon; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk

    2015-01-01

    Propionic acidemia (PA) is an inborn error of metabolism, caused by mutations in either the PCCA or PCCB gene, leading to mitochondrial accumulation of propionyl-CoA and its by-products. Here we report a 6-year-old Thai boy with PA who was born to consanguineous parents. Exome sequencing identified a novel homozygous frameshift insertion (c.379_380insA; p.T127NfsX160) in the PCCB gene, expanding its mutational spectrum.

  7. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections.

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    Shot-gun next generation sequencing (NGS) on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads) on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools constructed each from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2), PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18). The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18) previous studies reported their first occurrence much later (from 5 to more than 10 years) than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets non-originally derived by metagenomics experiments for viral metagenomics analyses in a livestock species.

  8. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections.

    Science.gov (United States)

    Bovo, Samuele; Mazzoni, Gianluca; Ribani, Anisa; Utzeri, Valerio Joe; Bertolini, Francesca; Schiavo, Giuseppina; Fontanesi, Luca

    2017-01-01

    Shot-gun next generation sequencing (NGS) on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads) on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools constructed each from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2), PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18). The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18) previous studies reported their first occurrence much later (from 5 to more than 10 years) than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets non-originally derived by metagenomics experiments for viral metagenomics analyses in a livestock species.

  9. Next-Generation Whole Genome Sequencing Identifies the Direction of Norovirus Transmission in Linked Patients

    Science.gov (United States)

    Kundu, Samit; Lockwood, Julianne; Depledge, Daniel P.; Chaudhry, Yasmin; Aston, Antony; Rao, Kanchan; Hartley, John C.; Goodfellow, Ian; Breuer, Judith

    2013-01-01

    Background. Noroviruses are a highly transmissible and major cause of nosocomial gastroenteritis resulting in bed and hospital-ward closures. Where hospital outbreaks are suspected, it is important to determine the routes of spread so that appropriate infection-control procedures can be implemented. To investigate a cluster of norovirus cases occurring in children undergoing bone marrow transplant, we undertook norovirus genome sequencing by next-generation methods. Detailed comparison of sequence data from 2 linked cases enabled us to identify the likely direction of spread. Methods. Norovirus complementary DNA was amplified by overlapping polymerase chain reaction (PCR) from 13 stool samples from 5 diagnostic real-time PCR–positive patients. The amplicons were sequenced by Roche 454, the genomes assembled by de novo assembly, and the data analyzed phylogenetically. Results. Phylogenetic analysis indicated that patients were infected by viruses similar to 4 distinct GII.4 subtypes and 2 patients were linked by the same virus. Of the 14 sites at which there were differences between the consensus sequences of the 2 linked viral genomes, 9 had minor variants present within one or the other patient. Further analysis confirmed that minor variants at all 9 sites in patient B were present as the consensus sequence in patient A. Conclusions. Phylogenetic analysis excluded a common source of infection in this apparent outbreak. Two of 3 patients on the same ward had closely related viruses, raising the possibility of cross-infection despite protective isolation. Analysis of deep sequencing data enabled us to establish the likely direction of nosocomial transmission. PMID:23645848
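    The comparison described in the Results above (sites where the two consensus genomes differ, checked against the minor variants of one patient) can be sketched roughly as follows. This is a minimal illustration only: the function name, the dictionary layout and the toy data are assumptions, not the authors' pipeline, and the sketch only counts sites without asserting a direction of transmission.

```python
def shared_minor_variant_sites(consensus_a, consensus_b, minor_b):
    """Compare two linked viral genomes site by site.

    consensus_a, consensus_b: dict mapping position -> consensus base
    minor_b: dict mapping position -> set of minor-variant bases seen in patient B
    (All three structures are hypothetical, chosen for illustration.)

    Returns a pair of lists:
      differing: positions where the two consensus sequences disagree
      matching:  differing positions where patient A's consensus base
                 is present as a minor variant in patient B
    """
    differing = sorted(
        pos for pos in consensus_a
        if pos in consensus_b and consensus_a[pos] != consensus_b[pos]
    )
    matching = [
        pos for pos in differing
        if consensus_a[pos] in minor_b.get(pos, set())
    ]
    return differing, matching


# Toy data: three positions, two of which differ between the consensus genomes.
consensus_a = {10: "A", 20: "C", 30: "G"}
consensus_b = {10: "A", 20: "T", 30: "A"}
minor_b = {20: {"C"}, 30: {"T"}}

differing, matching = shared_minor_variant_sites(consensus_a, consensus_b, minor_b)
print(len(differing), "differing sites;", len(matching), "matched by a minor variant in B")
```

    In the study itself, the corresponding counts reported in the abstract are 14 differing sites, 9 of which carried minor variants.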

  10. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Amanda J Lea

    2015-11-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html.
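    For orientation, a generic binomial mixed model of the kind the abstract describes might be written as below. All symbols are assumptions introduced here for illustration (the abstract gives no notation), and this should not be read as MACAU's exact parameterization: y_i is the methylated read count and r_i the total read count at a site for individual i, x_i the predictor of interest, w_i other covariates, K a genetic relatedness matrix, and e an independent term absorbing over-dispersion.

```latex
\begin{aligned}
y_i &\sim \mathrm{Binomial}(r_i, \pi_i), \\
\operatorname{logit}(\pi_i) &= \mathbf{w}_i^{\top}\boldsymbol{\alpha} + x_i \beta + g_i + e_i, \\
\mathbf{g} &\sim \mathcal{N}(\mathbf{0}, \sigma_g^2 \mathbf{K}), \qquad
\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \sigma_e^2 \mathbf{I}).
\end{aligned}
```

    Under this sketch, testing for differential methylation amounts to testing whether the coefficient β is zero, while g accounts for relatedness among individuals and e for over-dispersion in the counts.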

  11. Exome sequencing identifies potential risk variants for Mendelian disorders at high prevalence in Qatar.

    Science.gov (United States)

    Rodriguez-Flores, Juan L; Fakhro, Khalid; Hackett, Neil R; Salit, Jacqueline; Fuller, Jennifer; Agosto-Perez, Francisco; Gharbiah, Maey; Malek, Joel A; Zirie, Mahmoud; Jayyousi, Amin; Badii, Ramin; Al-Nabet Al-Marri, Ajayeb; Chouchane, Lotfi; Stadler, Dora J; Mezey, Jason G; Crystal, Ronald G

    2014-01-01

    Exome sequencing of families of related individuals has been highly successful in identifying genetic polymorphisms responsible for Mendelian disorders. Here, we demonstrate the value of the reverse approach, where we use exome sequencing of a sample of unrelated individuals to analyze allele frequencies of known causal mutations for Mendelian diseases. We sequenced the exomes of 100 individuals representing the three major genetic subgroups of the Qatari population (Q1 Bedouin, Q2 Persian-South Asian, Q3 African) and identified 37 variants in 33 genes with effects on 36 clinically significant Mendelian diseases. These include variants not present in 1000 Genomes and variants at high frequency when compared with 1000 Genomes populations. Several of these Mendelian variants were only segregating in one Qatari subpopulation, where the observed subpopulation specificity trends were confirmed in an independent population of 386 Qataris. Premarital genetic screening in Qatar tests for only four out of the 37, such that this study provides a set of Mendelian disease variants with potential impact on the epidemiological profile of the population that could be incorporated into the testing program if further experimental and clinical characterization confirms high penetrance. © 2013 WILEY PERIODICALS, INC.

  12. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Directory of Open Access Journals (Sweden)

    Devier Benjamin

    2007-08-01

    Background: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

  13. [Detection of TSC1/TSC2 gene mutations among patients with tuberous sclerosis complex by Ion Torrent semiconductor sequencing].

    Science.gov (United States)

    Wang, Yuguo; Lin, Ying; Luo, Chunyu; Liang, Dong; Ji, Xiuqing; Jiang, Tao; Ma, Dingyuan; Xu, Zhengfeng

    2016-04-01

    To develop and validate a method for mutation screening and prenatal diagnosis of TSC1/TSC2 mutations among patients with tuberous sclerosis complex (TSC) by Ion Torrent semiconductor sequencing. Potential mutations of the TSC1/TSC2 genes were screened for in 2 TSC families and 1 sporadic TSC patient using an Ion Torrent PGM sequencer. Candidate variants were validated by Sanger sequencing. The corresponding site of TSC2 in the fetus of family 2 was also examined with Sanger sequencing. Ion Torrent semiconductor sequencing identified a probably pathogenic TSC2 mutation (c.311-312insGCTG) in the patient from family 1, and a probably pathogenic TSC2 mutation (c.1790A>G) in the patient from family 2. Targeted Ion Torrent PGM sequencing is an accurate and efficient method to detect TSC1/TSC2 mutations in TSC.

  14. Whole Exome Sequencing Identifies de Novo Mutations in GATA6 Associated with Congenital Diaphragmatic Hernia

    Science.gov (United States)

    Yu, Lan; Bennett, James T.; Wynn, Julia; Carvill, Gemma L.; Cheung, Yee Him; Shen, Yufeng; Mychaliska, George B.; Azarow, Kenneth S.; Crombleholme, Timothy M.; Chung, Dai H.; Potoka, Douglas; Warner, Brad W.; Bucher, Brian; Lim, Foong-Yen; Pietsch, John; Stolar, Charles; Aspelund, Gudrun; Arkovitz, Marc S.; Mefford, Heather; Chung, Wendy K.

    2014-01-01

    Background: Congenital diaphragmatic hernia (CDH) is a common birth defect affecting 1 in 3,000 births. It is characterized by herniation of abdominal viscera through an incompletely formed diaphragm. Although chromosomal anomalies and mutations in several genes have been implicated, the cause for most patients is unknown. Methods: We used whole exome sequencing in two families with CDH and congenital heart disease, and identified mutations in GATA6 in both. Results: In the first family, we identified a de novo missense mutation (c.1366C>T, p.R456C) in a sporadic CDH patient with tetralogy of Fallot. In the second, a nonsense mutation (c.712G>T, p.G238*) was identified in two siblings with CDH and a large ventricular septal defect. The G238* mutation was inherited from their mother, who was clinically affected with congenital absence of the pericardium, patent ductus arteriosus, and intestinal malrotation. Deep sequencing of blood and saliva derived DNA from the mother suggested somatic mosaicism as an explanation for her milder phenotype, with only approximately 15% mutant alleles. To determine the frequency of GATA6 mutations in CDH, we sequenced the gene in 378 patients with CDH. We identified one additional de novo mutation (c.1071delG, p.V358Cfs34*). Conclusions: Mutations in GATA6 have been previously associated with pancreatic agenesis and congenital heart disease. We conclude that, in addition to the heart and the pancreas, GATA6 is involved in development of two additional organs, the diaphragm and the pericardium. In addition, we have shown that de novo mutations can contribute to the development of CDH, a common birth defect. PMID:24385578

  15. Whole exome sequencing identifies de novo mutations in GATA6 associated with congenital diaphragmatic hernia.

    Science.gov (United States)

    Yu, Lan; Bennett, James T; Wynn, Julia; Carvill, Gemma L; Cheung, Yee Him; Shen, Yufeng; Mychaliska, George B; Azarow, Kenneth S; Crombleholme, Timothy M; Chung, Dai H; Potoka, Douglas; Warner, Brad W; Bucher, Brian; Lim, Foong-Yen; Pietsch, John; Stolar, Charles; Aspelund, Gudrun; Arkovitz, Marc S; Mefford, Heather; Chung, Wendy K

    2014-03-01

    Congenital diaphragmatic hernia (CDH) is a common birth defect affecting 1 in 3000 births. It is characterised by herniation of abdominal viscera through an incompletely formed diaphragm. Although chromosomal anomalies and mutations in several genes have been implicated, the cause for most patients is unknown. We used whole exome sequencing in two families with CDH and congenital heart disease, and identified mutations in GATA6 in both. In the first family, we identified a de novo missense mutation (c.1366C>T, p.R456C) in a sporadic CDH patient with tetralogy of Fallot. In the second, a nonsense mutation (c.712G>T, p.G238*) was identified in two siblings with CDH and a large ventricular septal defect. The G238* mutation was inherited from their mother, who was clinically affected with congenital absence of the pericardium, patent ductus arteriosus and intestinal malrotation. Deep sequencing of blood and saliva-derived DNA from the mother suggested somatic mosaicism as an explanation for her milder phenotype, with only approximately 15% mutant alleles. To determine the frequency of GATA6 mutations in CDH, we sequenced the gene in 378 patients with CDH. We identified one additional de novo mutation (c.1071delG, p.V358Cfs34*). Mutations in GATA6 have been previously associated with pancreatic agenesis and congenital heart disease. We conclude that, in addition to the heart and the pancreas, GATA6 is involved in development of two additional organs, the diaphragm and the pericardium. In addition, we have shown that de novo mutations can contribute to the development of CDH, a common birth defect.

  16. Determining DNA Sequence Specificity of Natural and Artificial Transcription Factors by Cognate Site Identifier Analysis

    Science.gov (United States)

    Ozers, Mary S.; Warren, Christopher L.; Ansari, Aseem Z.

    Artificial transcription factors (ATFs) are designed to mimic natural transcription factors in the control of gene expression and are comprised of domains for DNA binding and gene regulation. ATF domains are modular, interchangeable, and can be composed of protein-based or nonpeptidic moieties, yielding DNA-interacting regulatory molecules that can either activate or inhibit transcription. Sequence-specific targeting is a key determinant in ATF activity, and DNA-binding domains such as natural zinc fingers and synthetic polyamides have emerged as useful DNA targeting molecules. Defining the comprehensive DNA binding specificity of these targeting molecules for accurate manipulations of the genome can be achieved using cognate site identifier DNA microarrays to explore the entire sequence space of binding sites. Design of ATFs that regulate gene expression with temporal control will generate important molecular tools to probe cell- and tissue-specific gene regulation and to function as potential therapeutic agents.

  17. Use of a mitochondrial COI sequence to identify species of the subtribe Aphidina (Hemiptera, Aphididae)

    Directory of Open Access Journals (Sweden)

    Jianfeng WANG

    2011-08-01

    Aphids of the subtribe Aphidina are found mainly in the North Temperate Zone. The relative lack of diagnostic morphological characteristics has obscured the identification of species in this group. However, DNA-based taxonomic methods can clarify species relationships within this group. Sequence variation in a partial segment of the mitochondrial COI gene was highly effective for resolving species relationships within Aphidina. Forty-five species were correctly identified in a neighbor-joining tree. Mean intraspecific sequence divergence was 0.17%, with a range of 0.00% to 1.54%. Mean interspecific divergence within previously recognized genera or morphologically similar species groups was 4.54%, with variation mainly in the range of 3.50% to 8.00%. Possible reasons for anomalous levels of mean nucleotide divergence within or between some taxa are discussed.
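    As a rough illustration of what a pairwise sequence divergence figure means, the sketch below computes the uncorrected proportion of differing sites (p-distance) between two aligned sequences. The abstract does not say which distance model the authors used, so the choice of p-distance, the function name and the toy sequences are all assumptions made here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped. Returns the fraction of compared sites that differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # ignore gaps and ambiguous calls
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy example: one mismatch in eight compared sites.
print(f"{100 * p_distance('ACGTACGT', 'ACGTACGA'):.2f}% divergence")
```

    Averaging such values over all pairs within a species, or between species of a genus, gives mean intraspecific and interspecific divergences of the kind reported above.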

  18. Microsatellite Primers Identified by 454 Sequencing in the Floodplain Tree Species Eucalyptus victrix (Myrtaceae)

    Directory of Open Access Journals (Sweden)

    Paul G. Nevill

    2013-05-01

    Premise of the study: Microsatellite primers were developed for Eucalyptus victrix (Myrtaceae) to evaluate the population and spatial genetic structure of this widespread northwestern Australian riparian tree species, which may be impacted by hydrological changes associated with mining activity. Methods and Results: 454 GS-FLX shotgun sequencing was used to obtain 1895 sequences containing putative microsatellite motifs. Ten polymorphic microsatellite loci were identified and screened for variation in individuals from two populations in the Pilbara region. Observed heterozygosities ranged from 0.44 to 0.91 (mean: 0.66) and the number of alleles per locus ranged from five to 25 (average: 11). Conclusions: These microsatellite loci will be useful in future studies of population and spatial genetic structure in E. victrix, and inform the development of seed sourcing strategies for the species.
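    The two per-locus statistics quoted above, observed heterozygosity and number of alleles, can be computed from diploid genotype calls along the following lines. This is a minimal sketch: the function name, the tuple representation of genotypes and the allele sizes in the example are hypothetical and are not taken from the study.

```python
def locus_summary(genotypes):
    """Summarise one microsatellite locus.

    genotypes: list of (allele1, allele2) tuples, one per individual,
               with None marking a missing genotype.
    Returns (observed_heterozygosity, number_of_alleles).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes at this locus")
    # An individual is heterozygous when its two alleles differ.
    heterozygotes = sum(1 for a, b in called if a != b)
    # Count distinct alleles observed across all called individuals.
    alleles = {allele for genotype in called for allele in genotype}
    return heterozygotes / len(called), len(alleles)


# Toy locus: allele sizes in base pairs for four individuals, one missing.
ho, n_alleles = locus_summary([(100, 104), (100, 100), (104, 108), None])
print(f"Ho = {ho:.2f}, alleles = {n_alleles}")
```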

  19. Identification of a novel DMD duplication identified by a combination of MLPA and targeted exome sequencing.

    Science.gov (United States)

    Wu, Beibei; Wang, Liying; Dong, Ting; Jin, Jiahui; Lu, Yili; Wu, Huiping; Luo, Yue; Shan, Xiaoou

    2017-01-01

    Duchenne muscular dystrophy (DMD) is an X-linked recessive muscle-wasting disease caused by a mutation in the DMD gene. The aim of this study was to identify a de novo mutation of the DMD gene in the family of a 9-month-old Chinese male patient, as well as to describe the phenotypic characteristics of this patient. The patient was suspected to suffer from DMD according to physical examination, biochemical analyses, and electromyogram. We identified a duplication of exons 4-42 in the DMD gene with targeted exome sequencing and multiplex ligation-dependent probe amplification (MLPA). In addition, the patient's mother was a carrier of the same mutation. We identified a de novo duplication of exons 4-42 in a patient with early stage DMD. The discovery of this mutation may provide insights into future investigations.

  20. Deep sequencing identifies new and regulated microRNAs in Schmidtea mediterranea.

    Science.gov (United States)

    Lu, Yi-Chien; Smielewska, Magda; Palakodeti, Dasaradhi; Lovci, Michael T; Aigner, Stefan; Yeo, Gene W; Graveley, Brenton R

    2009-08-01

    MicroRNAs (miRNAs) play important roles in directing the differentiation of cells down a variety of cell lineage pathways. The planarian Schmidtea mediterranea can regenerate all lost body tissue after amputation due to a population of pluripotent somatic stem cells called neoblasts, and is therefore an excellent model organism to study the roles of miRNAs in stem cell function. Here, we use a combination of deep sequencing and bioinformatics to discover 66 new miRNAs in S. mediterranea. We also identify 21 miRNAs that are specifically expressed in either sexual or asexual animals. Finally, we identified five miRNAs whose expression is sensitive to gamma-irradiation, suggesting they are expressed in neoblasts or early neoblast progeny. Together, these results increase the known repertoire of S. mediterranea miRNAs and identify numerous regulated miRNAs that may play important roles in regeneration, homeostasis, neoblast function, and reproduction.

  1. Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

    Directory of Open Access Journals (Sweden)

    Sujan Mamidi

    2016-07-01

    White mold, caused by the necrotrophic fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a major disease of common bean (Phaseolus vulgaris L.). WM7.1 and WM8.3 are two quantitative trait loci (QTL) with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI) population segregating for both QTL were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs) relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.
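    The idea of flagging introgressed regions by their unusually high SNP counts can be illustrated with a simple window scan. The sketch below is an assumption-laden simplification, not the authors' method: it works on a single chromosome, uses fixed-size windows, and treats a window as SNP-dense when its count exceeds the mean by more than a chosen number of standard deviations. The function name, window size and cutoff are all hypothetical.

```python
from statistics import mean, pstdev


def snp_dense_windows(snp_positions, chrom_length, window=100_000, z_cutoff=2.0):
    """Return windows whose SNP count is unusually high.

    snp_positions: iterable of 1-based SNP coordinates on one chromosome
    chrom_length:  chromosome length in bases
    window:        window size in bases (illustrative default)
    z_cutoff:      how many standard deviations above the mean counts as dense

    Returns a list of (start, end, count) tuples with 1-based inclusive bounds.
    """
    n_windows = (chrom_length + window - 1) // window
    counts = [0] * n_windows
    for pos in snp_positions:
        if not 1 <= pos <= chrom_length:
            raise ValueError(f"position {pos} is outside the chromosome")
        counts[(pos - 1) // window] += 1

    mu = mean(counts)
    sd = pstdev(counts)
    if sd == 0:
        return []  # every window has the same count, nothing stands out

    return [
        (i * window + 1, min((i + 1) * window, chrom_length), c)
        for i, c in enumerate(counts)
        if (c - mu) / sd > z_cutoff
    ]


# Toy chromosome of 1,000 bases: one SNP per 100-base window,
# except the fourth window, which carries 20 SNPs.
background = [i * 100 + 50 for i in range(10) if i != 3]
dense = list(range(301, 321))
print(snp_dense_windows(background + dense, chrom_length=1000, window=100))
```

    A real analysis would compare tolerant and susceptible pools genome-wide and use a proper significance test; the z-score cutoff here is only a stand-in for that step.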

  2. Whole exome sequencing identifies novel candidate genes that modify chronic obstructive pulmonary disease susceptibility.

    Science.gov (United States)

    Bruse, Shannon; Moreau, Michael; Bromberg, Yana; Jang, Jun-Ho; Wang, Nan; Ha, Hongseok; Picchi, Maria; Lin, Yong; Langley, Raymond J; Qualls, Clifford; Klensney-Tait, Julia; Zabner, Joseph; Leng, Shuguang; Mao, Jenny; Belinsky, Steven A; Xing, Jinchuan; Nyunoya, Toru

    2016-01-07

    Chronic obstructive pulmonary disease (COPD) is characterized by an irreversible airflow limitation in response to inhalation of noxious stimuli, such as cigarette smoke. However, only 15-20% of smokers manifest COPD, suggesting a role for genetic predisposition. Although genome-wide association studies have identified common genetic variants that are associated with susceptibility to COPD, effect sizes of the identified variants are modest, as is the total heritability accounted for by these variants. In this study, an extreme phenotype exome sequencing study was combined with in vitro modeling to identify COPD candidate genes. We performed whole exome sequencing of 62 highly susceptible smokers and 30 exceptionally resistant smokers to identify rare variants that may contribute to disease risk or resistance to COPD. This was a cross-sectional case-control study without therapeutic intervention or longitudinal follow-up information. We identified candidate genes based on rare variant analyses and evaluated exonic variants to pinpoint individual genes whose function was computationally established to be significantly different between susceptible and resistant smokers. Top scoring candidate genes from these analyses were further filtered by requiring that each gene be expressed in human bronchial epithelial cells (HBECs). A total of 81 candidate genes were thus selected for in vitro functional testing in cigarette smoke extract (CSE)-exposed HBECs. Using small interfering RNA (siRNA)-mediated gene silencing experiments, we showed that silencing of several candidate genes augmented CSE-induced cytotoxicity in vitro. Our integrative analysis through both genetic and functional approaches identified two candidate genes (TACC2 and MYO1E) that augment cigarette smoke (CS)-induced cytotoxicity and, potentially, COPD susceptibility.

  3. The Dynamics of DNA Sequencing.

    Science.gov (United States)

    Morvillo, Nancy

    1997-01-01

    Describes a paper-and-pencil activity that helps students understand DNA sequencing and expands student understanding of DNA structure, replication, and gel electrophoresis. Appropriate for advanced biology students who are familiar with the Sanger method. (DDR)

  4. Development and analytical validation of a 25-gene next generation sequencing panel that includes the BRCA1 and BRCA2 genes to assess hereditary cancer risk.

    Science.gov (United States)

    Judkins, Thaddeus; Leclair, Benoît; Bowles, Karla; Gutin, Natalia; Trost, Jeff; McCulloch, James; Bhatnagar, Satish; Murray, Adam; Craft, Jonathan; Wardell, Bryan; Bastian, Mark; Mitchell, Jeffrey; Chen, Jian; Tran, Thanh; Williams, Deborah; Potter, Jennifer; Jammulapati, Srikanth; Perry, Michael; Morris, Brian; Roa, Benjamin; Timms, Kirsten

    2015-04-02

    Germline DNA mutations that increase the susceptibility of a patient to certain cancers have been identified in various genes, and patients can be screened for mutations in these genes to assess their level of risk for developing cancer. Traditional methods using Sanger sequencing focus on small groups of genes and therefore are unable to screen for numerous genes from several patients simultaneously. The goal of the present study was to validate a 25-gene panel to assess genetic risk for cancer in 8 different tissues using next generation sequencing (NGS) techniques. Twenty-five genes associated with hereditary cancer syndromes were selected for development of a panel to screen for risk of these cancers using NGS. In an initial technical assessment, NGS results for BRCA1 and BRCA2 were compared with Sanger sequencing in 1864 anonymized DNA samples from patients who had undergone previous clinical testing. Next, the entire gene panel was validated using parallel NGS and Sanger sequencing in 100 anonymized DNA samples. Large rearrangement analysis was validated using NGS, microarray comparative genomic hybridization (CGH), and multiplex ligation-dependent probe amplification analyses (MLPA). NGS identified 15,877 sequence variants, while Sanger sequencing identified 15,878 in the BRCA1 and BRCA2 comparison study of the same regions. Based on these results, the NGS process was refined prior to the validation of the full gene panel. In the validation study, NGS and Sanger sequencing were 100% concordant for the 3,923 collective variants across all genes for an analytical sensitivity of the NGS assay of >99.92% (lower limit of 95% confidence interval). NGS, microarray CGH and MLPA correctly identified all expected positive and negative large rearrangement results for the 25-gene panel. This study provides a thorough validation of the 25-gene NGS panel and indicates that this analysis tool can be used to collect clinically significant information related to risk of
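    The reported analytical sensitivity (>99.92% as the lower limit of the 95% confidence interval, from 3,923 of 3,923 concordant variants) can be reproduced with a short calculation. The sketch below assumes an exact binomial (Clopper-Pearson) one-sided bound; the record does not state which interval method the authors used, and the function name is illustrative.

```python
def lower_bound_all_concordant(n, confidence=0.95):
    """One-sided exact lower confidence bound when all n trials succeed.

    With n successes out of n, the Clopper-Pearson lower bound p solves
    p**n = alpha, so p = alpha ** (1 / n). (Assumed method, see lead-in.)
    """
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n)


bound = lower_bound_all_concordant(3923)
print(f"lower 95% bound on sensitivity: {bound:.4%}")
```

    Under this assumption the bound is about 99.92%, consistent with the figure in the abstract; a two-sided interval would give a slightly lower limit.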

  5. Deep sequencing identifies genetic heterogeneity and recurrent convergent evolution in chronic lymphocytic leukemia.

    Science.gov (United States)

    Ojha, Juhi; Ayres, Jackline; Secreto, Charla; Tschumper, Renee; Rabe, Kari; Van Dyke, Daniel; Slager, Susan; Shanafelt, Tait; Fonseca, Rafael; Kay, Neil E; Braggio, Esteban

    2015-01-15

    Recent high-throughput sequencing and microarray studies have characterized the genetic landscape and clonal complexity of chronic lymphocytic leukemia (CLL). Here, we performed a longitudinal study in a homogeneously treated cohort of 12 patients, with sequential samples obtained at comparable stages of disease. We identified clonal competition between 2 or more genetic subclones in 70% of the patients with relapse, and stable clonal dynamics in the remaining 30%. By deep sequencing, we identified a high reservoir of genetic heterogeneity in the form of several driver genes mutated in small subclones underlying the disease course. Furthermore, in 2 patients, we identified convergent evolution, characterized by the combination of genetic lesions affecting the same genes or copy number abnormality in different subclones. The phenomenon affects multiple CLL putative driver abnormalities, including mutations in NOTCH1, SF3B1, DDX3X, and del(11q23). This is the first report documenting convergent evolution as a recurrent event in the CLL genome. Furthermore, this finding suggests the selective advantage of specific combinations of genetic lesions for CLL pathogenesis in a subset of patients. © 2015 by The American Society of Hematology.

  6. Buckyballs conjugated with nucleic acid sequences identifies microorganisms in live cell assays.

    Science.gov (United States)

    Cheng, Qingsu; Parvin, Bahram

    2017-11-09

    Rapid identification of bacteria can play an important role at the point of care, evaluating the health of the ecosystem, and discovering spatiotemporal distributions of a bacterial community. We introduce a method for rapid identification of bacteria in live cell assays based on cargo delivery of a nucleic acid sequence and demonstrate how a mixed culture can be differentiated using a simple microfluidic system. C60 Buckyballs are functionalized with nucleic acid sequences and a fluorescent reporter to show that a diversity of microorganisms can be detected and identified in live cell assays. The nucleic acid complexes include an RNA detector, targeting a species-specific sequence in the 16S rRNA, and a complementary DNA with an attached fluorescent reporter. As a result, each bacterium can be detected and visualized at a specific emission frequency through fluorescence microscopy. The C60 probe complexes can detect and identify a diversity of microorganisms that include gram-positive and gram-negative bacteria, yeast, and fungi. More specifically, nucleic-acid probes are designed to identify mixed cultures of Bacillus subtilis and Streptococcus sanguinis, or Bacillus subtilis and Pseudomonas aeruginosa. The efficiency, cross talk, and accuracy for the C60 probe complexes are reported. Finally, to demonstrate that mixed cultures can be separated, a microfluidic system is designed that connects a single source well to multiple sink wells, where chemo-attractants are placed in the sink wells. The microfluidic system allows for differentiating a mixed culture. The technology allows profiling of bacteria composition, at a very low cost, for field studies and point of care.

  7. Cytochrome c Oxidase Sequences of Zambian Wildlife Helps to Identify Species of Origin of Meat

    Directory of Open Access Journals (Sweden)

    Michelo Syakalima

    2016-01-01

    Full Text Available Accurate species identification is a crucial tool in wildlife conservation. Enforcement of antipoaching law is more achievable with robust molecular identification of poached meat. Determining the region where the animal may have been taken from would also be a useful tool in suppression of cross-border trade of poached meat. We present data from a cytochrome c oxidase “barcoding” study of Zambian ruminants that adequately identifies the species of origin of meat samples. Furthermore, the method demonstrates possible improvement and application in regional variation in sequence identity that has a potential for discriminating meat samples from different subpopulations.

  8. Globicatella sanguinis bacteraemia identified by partial 16S rRNA gene sequencing

    DEFF Research Database (Denmark)

    Abdul-Redha, Rawaa Jalil; Balslew, Ulla; Christensen, Jens Jørgen

    2007-01-01

    Globicatella sanguinis is a gram-positive coccus, resembling non-haemolytic streptococci. The organism has been isolated infrequently from normally sterile sites of humans. Three isolates obtained by blood culture could not be identified by Rapid 32 ID Strep, but partial sequencing of the 16S rRNA gene revealed the identity of the isolated bacteria, and supplementary biochemical tests confirmed the species identification. The case histories illustrate the dilemma of finding relevant, newly recognized, opportunistic pathogens and the identification achievement(s) that can be obtained by using...

  9. Neonatal Meningitis by Multidrug Resistant Elizabethkingia meningosepticum Identified by 16S Ribosomal RNA Gene Sequencing

    Directory of Open Access Journals (Sweden)

    V. V. Shailaja

    2014-01-01

    Full Text Available Clinical and microbiological profile of 9 neonates with meningitis by Elizabethkingia meningosepticum identified by 16S ribosomal gene sequencing was studied. All the clinical isolates were resistant to cephalosporins, aminoglycosides, trimethoprim-sulfamethoxazole, β-lactam combinations, and carbapenems, and only one isolate was susceptible to ciprofloxacin. All the isolates were susceptible to vancomycin. Six of nine neonates died even after using vancomycin, based on susceptibility results. E. meningosepticum meningitis in neonates results in a high mortality rate. Though the organism is susceptible to vancomycin in vitro, its efficacy in vivo is questionable and it is difficult to determine the most appropriate antibiotic for treating E. meningosepticum meningitis in neonates.

  10. Exome Sequencing Fails to Identify the Genetic Cause of Aicardi Syndrome

    DEFF Research Database (Denmark)

    Lund, Caroline; Striano, Pasquale; Sorte, Hanne Sørmo

    2016-01-01

    Aicardi syndrome (AS) is a well-characterized neurodevelopmental disorder with an unknown etiology. In this study, we performed whole-exome sequencing in 11 female patients with the diagnosis of AS, in order to identify the disease-causing gene. In particular, we focused on detecting variants......-exonic region or that the mutation is somatic and not detectable by our approach. Alternatively, it is possible that AS is genetically heterogeneous and that 11 patients are not sufficient to reveal the causative genes. Future studies of AS should consider designs where also non-exonic regions are explored...

  11. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    DEFF Research Database (Denmark)

    Hu, H; Haas, S A; Chelly, J

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes...... with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases....

  12. Characterization of functional methylomes by next-generation capture sequencing identifies novel disease-associated variants

    Science.gov (United States)

    Allum, Fiona; Shao, Xiaojian; Guénard, Frédéric; Simon, Marie-Michelle; Busche, Stephan; Caron, Maxime; Lambourne, John; Lessard, Julie; Tandre, Karolina; Hedman, Åsa K.; Kwan, Tony; Ge, Bing; Rönnblom, Lars; McCarthy, Mark I.; Deloukas, Panos; Richmond, Todd; Burgess, Daniel; Spector, Timothy D.; Tchernof, André; Marceau, Simon; Lathrop, Mark; Vohl, Marie-Claude; Pastinen, Tomi; Grundberg, Elin; Ahmadi, Kourosh R.; Ainali, Chrysanthi; Barrett, Amy; Bataille, Veronique; Bell, Jordana T.; Buil, Alfonso; Dermitzakis, Emmanouil T.; Dimas, Antigone S.; Durbin, Richard; Glass, Daniel; Hassanali, Neelam; Ingle, Catherine; Knowles, David; Krestyaninova, Maria; Lindgren, Cecilia M.; Lowe, Christopher E.; Meduri, Eshwar; di Meglio, Paola; Min, Josine L.; Montgomery, Stephen B.; Nestle, Frank O.; Nica, Alexandra C.; Nisbet, James; O'Rahilly, Stephen; Parts, Leopold; Potter, Simon; Sandling, Johanna; Sekowska, Magdalena; Shin, So-Youn; Small, Kerrin S.; Soranzo, Nicole; Surdulescu, Gabriela; Travers, Mary E.; Tsaprouni, Loukia; Tsoka, Sophia; Wilk, Alicja; Yang, Tsun-Po; Zondervan, Krina T.

    2015-01-01

    Most genome-wide methylation studies (EWAS) of multifactorial disease traits use targeted arrays or enrichment methodologies preferentially covering CpG-dense regions, to characterize sufficiently large samples. To overcome this limitation, we present here a new customizable, cost-effective approach, methylC-capture sequencing (MCC-Seq), for sequencing functional methylomes, while simultaneously providing genetic variation information. To illustrate MCC-Seq, we use whole-genome bisulfite sequencing on adipose tissue (AT) samples and public databases to design AT-specific panels. We establish its efficiency for high-density interrogation of methylome variability by systematic comparisons with other approaches and demonstrate its applicability by identifying novel methylation variation within enhancers strongly correlated to plasma triglyceride and HDL-cholesterol, including at CD36. Our more comprehensive AT panel assesses tissue methylation and genotypes in parallel at ∼4 and ∼3 M sites, respectively. Our study demonstrates that MCC-Seq provides comparable accuracy to alternative approaches but enables more efficient cataloguing of functional and disease-relevant epigenetic and genetic variants for large-scale EWAS. PMID:26021296

  13. Genome sequence of a novel mitovirus identified in the phytopathogenic fungus Alternaria arborescens.

    Science.gov (United States)

    Komatsu, Ken; Katayama, Yukie; Omatsu, Tsutomu; Mizutani, Tetsuya; Fukuhara, Toshiyuki; Kodama, Motoichiro; Arie, Tsutomu; Teraoka, Tohru; Moriyama, Hiromitsu

    2016-09-01

    The phytopathogenic fungus Alternaria spp. contains a variety of double-stranded RNA (dsRNA) elements of different sizes. Detailed analysis of next-generation sequencing data obtained using dsRNA purified from Alternaria arborescens, from which we had previously found Alternaria arborescens victorivirus 1, revealed the presence of another mycoviral-like dsRNA of approximately 2.5 kbp in length. When using the fungal mitochondrial genetic code, this dsRNA has a single open reading frame that potentially encodes an RNA-dependent RNA polymerase (RdRp) with significant sequence similarity to those of viruses of the genus Mitovirus. Moreover, both the 5'- and 3'-untranslated regions have the potential to fold into stable stem-loop structures, which is characteristic of mitoviruses. Pairwise comparisons and phylogenetic analysis of the deduced amino acid sequences of RdRp indicated that the virus we identified in A. arborescens is a distinct member of the genus Mitovirus in the family Narnaviridae, designated as "Alternaria arborescens mitovirus 1" (AaMV1).

  14. Exome sequencing of a large family identifies potential candidate genes contributing risk to bipolar disorder.

    Science.gov (United States)

    Zhang, Tianxiao; Hou, Liping; Chen, David T; McMahon, Francis J; Wang, Jen-Chyong; Rice, John P

    2018-03-01

    Bipolar disorder is a mental illness with lifetime prevalence of about 1%. Previous genetic studies have identified multiple chromosomal linkage regions and candidate genes that might be associated with bipolar disorder. The present study aimed to identify potential susceptibility variants for bipolar disorder using 6 related case samples from a four-generation family. A combination of exome sequencing and linkage analysis was performed to identify potential susceptibility variants for bipolar disorder. Our study identified a list of five potential candidate genes for bipolar disorder. Among these five genes, GRID1 (Glutamate Receptor Delta-1 Subunit), which was previously reported to be associated with several psychiatric disorders and brain related traits, is particularly interesting. Variants with functional significance in this gene were identified from two cousins in our bipolar disorder pedigree. Our findings suggest a potential role for these genes and the related rare variants in the onset and development of bipolar disorder in this one family. Additional research is needed to replicate these findings and evaluate their patho-biological significance. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Characterization of NIST human mitochondrial DNA SRM-2392 and SRM-2392-I standard reference materials by next generation sequencing.

    Science.gov (United States)

    Riman, Sarah; Kiesler, Kevin M; Borsuk, Lisa A; Vallone, Peter M

    2017-07-01

    Standard Reference Materials SRM 2392 and 2392-I are intended to provide quality control when amplifying and sequencing human mitochondrial genome sequences. The National Institute of Standards and Technology (NIST) offers these SRMs to laboratories performing DNA-based forensic human identification, molecular diagnosis of mitochondrial diseases, mutation detection, evolutionary anthropology, and genetic genealogy. The entire mtGenome (∼16,569 bp) of SRM 2392 and 2392-I has previously been characterized at NIST by Sanger sequencing. Herein, we used the sensitivity, specificity, and accuracy offered by next generation sequencing (NGS) to: (1) re-sequence the certified values of the SRM 2392 and 2392-I; (2) confirm Sanger data with a high coverage new sequencing technology; (3) detect lower level heteroplasmies (sequencing communities in the adoption of NGS methods. To obtain a consensus sequence for the SRMs as well as identify and control any bias, sequencing was performed using two NGS platforms and data was analyzed using different bioinformatics pipelines. Our results confirm five low level heteroplasmy sites that were not previously observed with Sanger sequencing: three sites in the GM09947A template in SRM 2392 and two sites in the HL-60 template in SRM 2392-I. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Small RNA sequencing identifies miRNA roles in ovule and fibre development.

    Science.gov (United States)

    Xie, Fuliang; Jones, Don C; Wang, Qinglian; Sun, Runrun; Zhang, Baohong

    2015-04-01

    MicroRNAs (miRNAs) have been found to be differentially expressed during cotton fibre development. However, which specific miRNAs are involved in fibre development, and how, is unclear. Here, using deep sequencing, 65 conserved miRNA families were identified and 32 families were differentially expressed between leaf and ovule. At least 40 miRNAs were either leaf or ovule specific, whereas 62 miRNAs were shared in both leaf and ovule. qRT-PCR confirmed these miRNAs were differentially expressed during early fibre development. A total of 820 genes were potentially targeted by the identified miRNAs, whose functions are involved in a series of biological processes including fibre development, metabolism and signal transduction. Many predicted miRNA-target pairs were subsequently validated by degradome sequencing analysis. GO and KEGG analyses showed that the identified miRNAs and their targets were classified to 1027 GO terms including 568 biological processes, 324 molecular functions and 135 cellular components and were enriched to 78 KEGG pathways. At least seven unique miRNAs participate in the trichome regulatory interaction network. Eleven trans-acting siRNA (tasiRNA) candidate genes were also identified in cotton. One has never been found in other plant species and two of them were derived from MYB and ARF, both of which play important roles in cotton fibre development. Sixteen genes were predicted to be tasiRNA targets, including sucrose synthase and MYB2. Together, this study discovered new miRNAs in cotton and offered evidence that miRNAs play important roles in cotton ovule/fibre development. The identification of tasiRNA genes and their targets broadens our understanding of the complicated regulatory mechanism of miRNAs in cotton. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  17. Analysis of the Campylobacter jejuni genome by SMRT DNA sequencing identifies restriction-modification motifs.

    Directory of Open Access Journals (Sweden)

    Jason L O'Loughlin

    Full Text Available Campylobacter jejuni is a leading bacterial cause of human gastroenteritis. The goal of this study was to analyze the C. jejuni F38011 strain, recovered from an individual with severe enteritis, at a genomic and proteomic level to gain insight into microbial processes. The C. jejuni F38011 genome is comprised of 1,691,939 bp, with a mol.% (G+C) content of 30.5%. PacBio sequencing coupled with REBASE analysis was used to predict C. jejuni F38011 genomic sites and enzymes that may be involved in DNA restriction-modification. A total of five putative methylation motifs were identified as well as the C. jejuni enzymes that could be responsible for the modifications. Peptides corresponding to the deduced amino acid sequence of the C. jejuni enzymes were identified using proteomics. This work sets the stage for studies to dissect the precise functions of the C. jejuni putative restriction-modification enzymes. Taken together, the data generated in this study contribute to our knowledge of the genomic content, methylation profile, and encoding capacity of C. jejuni.

  18. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles.

    Science.gov (United States)

    Marincevic-Zuniga, Yanara; Dahlberg, Johan; Nilsson, Sara; Raine, Amanda; Nystedt, Sara; Lindqvist, Carl Mårten; Berglund, Eva C; Abrahamsson, Jonas; Cavelier, Lucia; Forestier, Erik; Heyman, Mats; Lönnerholm, Gudmar; Nordlund, Jessica; Syvänen, Ann-Christine

    2017-08-14

    Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL). In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  19. Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies.

    Science.gov (United States)

    Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

    2016-07-07

    The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and that considers both FST and difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.
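    The two per-SNP quantities mentioned above, FST and the allele frequency difference between the two domains, can be sketched as follows. This uses the textbook heterozygosity-based definition of FST for a biallelic site with two equally weighted populations; it is not the estimator used in the study, which would also need to account for pooled-sequencing sampling error. The function name and example frequencies are illustrative.

```python
def fst_and_frequency_difference(p1, p2):
    """Return (FST, |p1 - p2|) for one biallelic SNP in two populations.

    FST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled populations and H_S is the mean within-population heterozygosity.
    Assumes equal population weights (a simplification).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    fst = 0.0 if h_total == 0.0 else (h_total - h_within) / h_total
    return fst, abs(p1 - p2)


# Hypothetical allele frequencies in the two domains.
print(fst_and_frequency_difference(0.2, 0.8))
print(fst_and_frequency_difference(0.5, 0.5))
```

    Looking at both values together is useful because a given frequency difference yields a different FST depending on how close the alleles are to fixation.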

  20. Identifying and sequencing a Mycobacterium sp. strain F4 as a potential bioremediation agent for quinclorac.

    Directory of Open Access Journals (Sweden)

    Yingying Li

    Full Text Available Quinclorac is a widely used herbicide in rice fields. Unfortunately, quinclorac residues are phytotoxic to many crops/vegetables. The degradation of quinclorac in nature is very slow. On the other hand, degradation of quinclorac using bacteria can be an effective and efficient method to reduce its contamination. In this study, we isolated a quinclorac bioremediation bacterium strain F4 from quinclorac contaminated soils. Based on morphological characteristics and 16S rRNA gene sequence analysis, we identified strain F4 as Mycobacterium sp. We investigated the effects of temperature, pH, inoculation size and initial quinclorac concentration on growth and degrading efficiency of F4 and determined the optimal quinclorac degrading condition of F4. Under optimal degrading conditions, F4 degraded 97.38% of quinclorac from an initial concentration of 50 mg/L in seven days. Our indoor pot experiment demonstrated that the degradation products were non-phytotoxic to tobacco. After analyzing the quinclorac degradation products of F4, we proposed that F4 could employ two pathways to degrade quinclorac: one is through methylation, the other is through dechlorination. Furthermore, we reconstructed the whole genome of F4 through single-molecule sequencing and de novo assembly. We identified 77 methyltransferases and eight dehalogenases in the F4 genome to support our hypothesized degradation path.

  1. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles

    Directory of Open Access Journals (Sweden)

    Yanara Marincevic-Zuniga

    2017-08-01

    Full Text Available Background: Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL). In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. Methods: We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. Results: We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Conclusion: Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  2. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Science.gov (United States)

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution [1,2]. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes [3,4]. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  3. Exome sequencing identifies a TCF4 mutation in a Chinese pedigree with symmetrical acral keratoderma.

    Science.gov (United States)

    Chen, P; Sun, S; Zeng, K; Li, C; Wen, J; Liang, J; Tian, X; Jiang, Y; Zhang, J; Zhang, S; Han, K; Han, C; Zhang, X

    2017-09-18

    Symmetrical acral keratoderma (SAK) is a rare skin disorder and its pathogenesis and inheritability are unknown. To investigate the inheritance and pathogenesis of SAK. Four SAK cases occurred in a four-generation Chinese family. Exome sequencing identified SNPs with potential SAK-related mutations, and a potentially responsible gene transcription factor 4 (TCF4) was identified. TCF4 was then sequenced in all 11 family members, and pedigree analysis was performed. Histopathology and immunohistochemistry evaluated TCF4 expression in skin lesions. The gene mutation was investigated in human keratinocytes for keratin-related protein expression. A novel heterozygous missense mutation, c.85C>A (p.Pro29Thr) was found in TCF4. The mutation showed autosomal dominant inheritance and perfectly cosegregated with the SAK phenotype in all family members. In skin lesions, TCF4 was present in the cytoplasm and membranes of the basal layer, the stratum spinosum and the stratum granulosum of the epidermis. The mutant TCF4 induced overexpression of differentiation markers including KRT1, KRT14, loricrin and involucrin. A SAK-related gene mutation in TCF4 may function through transcriptional regulation of keratin. © 2017 European Academy of Dermatology and Venereology.

  4. OS049. Exome sequencing identifies likely functional variants influencing preeclampsia and CVD risk.

    Science.gov (United States)

    Johnson, M; Løset, M; Brennecke, S; Peralta, J; Dyer, T; East, C; Pennell, C; Huang, R-C; Mori, T; Beilin, L; Blangero, J; Moses, E

    2012-07-01

    Next-generation sequencing (NGS) in family-based study designs will be pivotal in unlocking the missing heritability of common complex diseases. Whilst our prior linkage- and association-based positional cloning studies in family- and population-based Australian cohorts, respectively, have discovered novel preeclampsia candidate genes (INHBB, ACVR2A, LCT, LRP1B, RND3, GCA, ERAP2, TNFSF13B), the full complement of causal genetic variation remains largely unknown. We have now sequenced the exomes of two Australian preeclampsia families in another step forward to unlocking preeclampsia's complex allelic architecture. The objective was to identify family-specific exon-centric loci segregating in preeclamptic women only. The exomes of 18 women (7 preeclamptics, 11 controls) from two Australian families contributing to our chromosome 5q (Family 1) and 13q (Family 2) susceptibility loci, respectively, were sequenced using Illumina's TruSeq Exome Enrichment assay and NGS technology. Sequence alignments, quality control assessment and variant calling were conducted on our 8000 parallel processor compute server, MEDUSA. As a first pass, we prioritized exome sequence data to non-synonymous variants within the 1-LOD drop intervals of our 5q and 13q loci. Prioritized exonic variants were also genotyped in the Western Australian Pregnancy (Raine) Cohort to assess their significance against a plethora of cardiovascular disease (CVD) related traits. In Family 1 we identified two missense SNPs, and in Family 2 one missense SNP, that segregate in the preeclamptic women but not in the unaffected women. The first SNP in Family 1 (rs62375061) resides within the LYSMD3 gene and is predicted to "possibly" damage the focal protein; the only public record of this SNP is within the Watson genome. The second SNP in Family 1 (rs111033530) resides within the GPR98 gene, is predicted to "probably" damage the focal protein and is rare (1.7% population prevalence). The SNP in Family 2 (rs1805388) resides within the

  5. Exome sequencing identifies three novel candidate genes implicated in intellectual disability.

    Science.gov (United States)

    Agha, Zehra; Iqbal, Zafar; Azam, Maleeha; Ayub, Humaira; Vissers, Lisenka E L M; Gilissen, Christian; Ali, Syeda Hafiza Benish; Riaz, Moeen; Veltman, Joris A; Pfundt, Rolph; van Bokhoven, Hans; Qamar, Raheel

    2014-01-01

    Intellectual disability (ID) is a major health problem mostly with an unknown etiology. Recently exome sequencing of individuals with ID identified novel genes implicated in the disease. Therefore the purpose of the present study was to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. Whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K)-specific methyltransferase 2B (KMT2B), zinc finger protein 589 (ZNF589), as well as hedgehog acyltransferase (HHAT) with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that large number of ID genes converge on limited number of common networks i.e. ZNF589 belongs to KRAB-domain zinc-finger proteins previously implicated in ID, HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID, KMT2B associated with syndromic ID fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID.

  6. Targeted exome sequencing identified novel USH2A mutations in Usher syndrome families.

    Directory of Open Access Journals (Sweden)

    Xiu-Feng Huang

    Usher syndrome (USH) is a leading cause of deaf-blindness and is inherited as an autosomal recessive trait. Phenotypic and genetic heterogeneity in USH makes molecular diagnosis difficult. This pilot study aimed to develop an approach based on next-generation sequencing to determine the genetic defects in patients with USH or allied diseases precisely and effectively. Eight affected patients and twelve unaffected relatives from five unrelated Chinese USH families, including two pseudo-dominant ones, were recruited. A total of 144 known genes of inherited retinal diseases were selected for deep exome resequencing. Systematic data analysis using an established bioinformatics pipeline, followed by segregation analysis, revealed a number of genetic variants. Eleven mutations in the USH2A gene were identified, eight of them novel. Biparental mutations in USH2A were revealed in two families with pseudo-dominant inheritance. One proband was found to carry three mutations, two of which were presumed to lie on the same chromosome. In conclusion, this study revealed the genetic defects in the USH2A gene and demonstrated the robustness of targeted exome sequencing to precisely and rapidly determine genetic defects. The methodology provides a reliable strategy for routine gene diagnosis of USH.

  7. LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone

    KAUST Repository

    Chen, Peng

    2014-12-03

    Background: Protein-ligand binding is important for some proteins to perform their functions. Protein-ligand binding sites are the residues of proteins that physically bind to ligands. Despite recent advances in computational prediction of protein-ligand binding sites, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. Results: In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. We propose a combination technique to reduce the effects of different sliding residue windows in the process of encoding input feature vectors. Moreover, because the samples are highly imbalanced between ligand-binding sites and non-ligand-binding sites, we construct several balanced data sets, for each of which a random forest (RF)-based classifier is trained. The ensemble of these RF classifiers forms a sequence-based protein-ligand binding site predictor. Conclusions: Experimental results on CASP9 and CASP8 data sets demonstrate that our method compares favorably with the state-of-the-art protein-ligand binding site prediction methods.
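    The balanced-data-set idea in the record above can be sketched in a few lines of Python. This is a minimal illustration only: the abstract does not say how the balanced sets are drawn, so the chunking scheme, the function names `balanced_subsets` and `majority_vote`, and the toy residue identifiers are all assumptions, and the random forest training step itself is left out.

```python
import random


def balanced_subsets(positives, negatives, seed=0):
    """Pair all minority-class samples with equally sized chunks of the majority class.

    Assumed scheme (not taken from the paper): shuffle the majority class,
    cut it into chunks as large as the minority class, and drop any remainder.
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    size = len(positives)
    subsets = []
    for start in range(0, len(shuffled) - size + 1, size):
        chunk = shuffled[start:start + size]
        subsets.append((list(positives), chunk))
    return subsets


def majority_vote(votes):
    """Combine 0/1 predictions from the per-subset classifiers; ties count as 0."""
    return int(sum(votes) * 2 > len(votes))


# Hypothetical toy data: 10 binding residues versus 95 non-binding residues.
binding = list(range(10))
non_binding = list(range(100, 195))
subsets = balanced_subsets(binding, non_binding)
```

    One classifier would then be trained per subset and the per-residue predictions combined, for example with `majority_vote`; the paper's actual combination rule may differ.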

  8. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder.

    Science.gov (United States)

    C Yuen, Ryan K; Merico, Daniele; Bookman, Matt; L Howe, Jennifer; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D'Abate, Lia; Chan, Ada J S; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson W L; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A; Parr, Jeremy R; Spence, Sarah J; Vorstman, Jacob; Frey, Brendan J; Robinson, James T; Strug, Lisa J; Fernandez, Bridget A; Elsabbagh, Mayada; Carter, Melissa T; Hallmayer, Joachim; Knoppers, Bartha M; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H; Glazer, David; Pletcher, Mathew T; Scherer, Stephen W

    2017-04-01

    We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible on a cloud platform and through a controlled-access internet portal. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertions and deletions or copy number variations per ASD subject. We identified 18 new candidate ASD-risk genes and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (P = 6 × 10⁻⁴). In 294 of 2,620 (11.2%) ASD cases, a molecular basis could be determined, and 7.2% of these carried copy number variations and/or chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD.

  9. iACP: a sequence-based tool for identifying anticancer peptides.

    Science.gov (United States)

    Chen, Wei; Ding, Hui; Feng, Pengmian; Lin, Hao; Chou, Kuo-Chen

    2016-03-29

    Cancer remains a major killer worldwide. Traditional methods of cancer treatment are expensive and have some deleterious side effects on normal cells. Fortunately, the discovery of anticancer peptides (ACPs) has paved a new way for cancer treatment. With the explosive growth of peptide sequences generated in the post genomic age, it is highly desired to develop computational methods for rapidly and effectively identifying ACPs, so as to speed up their application in treating cancer. Here we report a sequence-based predictor called iACP developed by the approach of optimizing the g-gap dipeptide components. It was demonstrated by rigorous cross-validations that the new predictor remarkably outperformed the existing predictors for the same purpose in both overall accuracy and stability. For the convenience of most experimental scientists, a publicly accessible web-server for iACP has been established at http://lin.uestc.edu.cn/server/iACP, by which users can easily obtain their desired results.
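    As a rough illustration of the feature encoding named in the record above, the sketch below computes a g-gap dipeptide composition in Python. It assumes the common definition, in which each feature is the frequency of a residue pair separated by g intervening positions; the function name and the toy peptide are hypothetical, and the feature optimization and classifier used by iACP are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(peptide, g):
    """Return a 400-entry dict of g-gap dipeptide frequencies.

    A g-gap dipeptide is the pair (peptide[i], peptide[i + g + 1]);
    g = 0 gives the ordinary adjacent dipeptide composition.
    Pairs containing non-standard residues are skipped.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = len(peptide) - g - 1  # number of pairs available at this gap
    if total <= 0:
        raise ValueError("peptide is too short for this gap")
    for i in range(total):
        pair = peptide[i] + peptide[i + g + 1]
        if pair in counts:
            counts[pair] += 1
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy peptide, used only to show the shape of the output.
features = g_gap_dipeptide_composition("ACAC", g=1)
```

    Each peptide becomes a fixed-length vector of 400 values whatever its length, which is what lets a standard classifier be trained on peptides of different sizes.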

  10. Exome sequencing identifies somatic mutations of DDX3X in natural killer/T-cell lymphoma.

    Science.gov (United States)

    Jiang, Lu; Gu, Zhao-Hui; Yan, Zi-Xun; Zhao, Xia; Xie, Yin-Yin; Zhang, Zi-Guan; Pan, Chun-Ming; Hu, Yuan; Cai, Chang-Ping; Dong, Ying; Huang, Jin-Yan; Wang, Li; Shen, Yang; Meng, Guoyu; Zhou, Jian-Feng; Hu, Jian-Da; Wang, Jin-Fen; Liu, Yuan-Hua; Yang, Lin-Hua; Zhang, Feng; Wang, Jian-Min; Wang, Zhao; Peng, Zhi-Gang; Chen, Fang-Yuan; Sun, Zi-Min; Ding, Hao; Shi, Ju-Mei; Hou, Jian; Yan, Jin-Song; Shi, Jing-Yi; Xu, Lan; Li, Yang; Lu, Jing; Zheng, Zhong; Xue, Wen; Zhao, Wei-Li; Chen, Zhu; Chen, Sai-Juan

    2015-09-01

    Natural killer/T-cell lymphoma (NKTCL) is a malignant proliferation of CD56(+) and cytoCD3(+) lymphocytes with aggressive clinical course, which is prevalent in Asian and South American populations. The molecular pathogenesis of NKTCL has largely remained elusive. We identified somatic gene mutations in 25 people with NKTCL by whole-exome sequencing and confirmed them in an extended validation group of 80 people by targeted sequencing. Recurrent mutations were most frequently located in the RNA helicase gene DDX3X (21/105 subjects, 20.0%), tumor suppressors (TP53 and MGA), JAK-STAT-pathway molecules (STAT3 and STAT5B) and epigenetic modifiers (MLL2, ARID1A, EP300 and ASXL3). As compared to wild-type protein, DDX3X mutants exhibited decreased RNA-unwinding activity, loss of suppressive effects on cell-cycle progression in NK cells and transcriptional activation of NF-κB and MAPK pathways. Clinically, patients with DDX3X mutations presented a poor prognosis. Our work thus contributes to the understanding of the disease mechanism of NKTCL.

  11. Huntington's disease biomarker progression profile identified by transcriptome sequencing in peripheral blood.

    Science.gov (United States)

    Mastrokolias, Anastasios; Ariyurek, Yavuz; Goeman, Jelle J; van Duijn, Erik; Roos, Raymund A C; van der Mast, Roos C; van Ommen, GertJan B; den Dunnen, Johan T; 't Hoen, Peter A C; van Roon-Mom, Willeke M C

    2015-10-01

    With several therapeutic approaches in development for Huntington's disease, there is a need for easily accessible biomarkers to monitor disease progression and therapy response. We performed next-generation sequencing-based transcriptome analysis of total RNA from peripheral blood of 91 mutation carriers (27 presymptomatic and 64 symptomatic) and 33 controls. Transcriptome analysis by DeepSAGE identified 167 genes significantly associated with clinical total motor score in Huntington's disease patients. Relative to previous studies, this yielded novel genes and confirmed previously identified genes, such as H2AFY, an overlap in results that has proven difficult in the past. Pathway analysis showed enrichment of genes of the immune system and target genes of miRNAs, which are downregulated in Huntington's disease models. Using a highly parallelized microfluidics array chip (Fluidigm), we validated 12 of the top 20 significant genes in our discovery cohort and 7 in a second independent cohort. The five genes (PROK2, ZNF238, AQP9, CYSTM1 and ANXA3) that were validated independently in both cohorts present a candidate biomarker panel for stage determination and therapeutic readout in Huntington's disease. Finally, we suggest a first empiric formula predicting total motor score from the expression levels of our biomarker panel. Our data support the view that peripheral blood is a useful source to identify biomarkers for Huntington's disease and monitor disease progression in future clinical trials.

  12. Transcriptome Sequencing of Lima Bean (Phaseolus lunatus) to Identify Putative Positive Selection in Phaseolus and Legumes.

    Science.gov (United States)

    Li, Fengqi; Cao, Depan; Liu, Yang; Yang, Ting; Wang, Guirong

    2015-07-03

    The identification of genes under positive selection is a central goal of evolutionary biology. Many legume species, including Phaseolus vulgaris (common bean) and Phaseolus lunatus (lima bean), have important ecological and economic value. In this study, we sequenced and assembled the transcriptome of one Phaseolus species, lima bean. A comparison with the genomes of six other legume species, including the common bean, Medicago, lotus, soybean, chickpea, and pigeonpea, revealed 15 and 4 orthologous groups with signatures of positive selection among the two Phaseolus species and among the seven legume species, respectively. Characterization of these positively selected genes using Non redundant (nr) annotation, gene ontology (GO) classification, GO term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses revealed that these genes are mostly involved in thylakoids, photosynthesis and metabolism. This study identified genes that may be related to the divergence of the Phaseolus and legume species. These detected genes are particularly good candidates for subsequent functional studies.

  13. Transcriptome Sequencing of Lima Bean (Phaseolus lunatus) to Identify Putative Positive Selection in Phaseolus and Legumes

    Directory of Open Access Journals (Sweden)

    Fengqi Li

    2015-07-01

    The identification of genes under positive selection is a central goal of evolutionary biology. Many legume species, including Phaseolus vulgaris (common bean) and Phaseolus lunatus (lima bean), have important ecological and economic value. In this study, we sequenced and assembled the transcriptome of one Phaseolus species, lima bean. A comparison with the genomes of six other legume species, including the common bean, Medicago, lotus, soybean, chickpea, and pigeonpea, revealed 15 and 4 orthologous groups with signatures of positive selection among the two Phaseolus species and among the seven legume species, respectively. Characterization of these positively selected genes using Non redundant (nr) annotation, gene ontology (GO) classification, GO term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses revealed that these genes are mostly involved in thylakoids, photosynthesis and metabolism. This study identified genes that may be related to the divergence of the Phaseolus and legume species. These detected genes are particularly good candidates for subsequent functional studies.

  14. A new system identification approach to identify genetic variants in sequencing studies for a binary phenotype.

    Science.gov (United States)

    Kang, Guolian; Bi, Wenjian; Zhao, Yanlong; Zhang, Ji-Feng; Yang, Jun J; Xu, Heng; Loh, Mignon L; Hunger, Stephen P; Relling, Mary V; Pounds, Stanley; Cheng, Cheng

    2014-01-01

    We propose in this paper a set-valued (SV) system model, which is a generalized form of logistic (LG) and Probit (Probit) regression, to be considered as a method for discovering genetic variants, especially rare genetic variants in next-generation sequencing studies, for a binary phenotype. We propose a new SV system identification method to estimate all underlying key system parameters for the Probit model and compare it with the LG model in the setting of genetic association studies. Across an extensive series of simulation studies, the Probit method maintained type I error control and had similar or greater power than the LG method, which is robust to different distributions of noise: logistic, normal, or t distributions. Additionally, the Probit association parameter estimate was 2.7-46.8-fold less variable than the LG log-odds ratio association parameter estimate. Less variability in the association parameter estimate translates to greater power and robustness across the spectrum of minor allele frequencies (MAFs), and these advantages are the most pronounced for rare variants. For instance, in a simulation that generated data from an additive logistic model with an odds ratio of 7.4 for a rare single nucleotide polymorphism with a MAF of 0.005 and a sample size of 2,300, the Probit method had 60% power whereas the LG method had 25% power at the α = 10⁻⁶ level. Consistent with these simulation results, the set of variants identified by the LG method was a subset of those identified by the Probit method in two example analyses. Thus, we suggest the Probit method may be a competitive alternative to the LG method in genetic association studies such as candidate gene, genome-wide, or next-generation sequencing studies for a binary phenotype.
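    To make the two link functions in the record above concrete, the Python sketch below evaluates the logistic and probit links under an additive genotype model and plugs in the simulation settings quoted in the abstract (odds ratio 7.4, MAF 0.005, 2,300 subjects). It is not the authors' set-valued estimation method; the baseline intercept and all function names are assumptions made for illustration.

```python
import math


def logistic_cdf(eta):
    """Logistic link: maps a linear predictor to a case probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def probit_cdf(eta):
    """Probit link: the standard normal CDF of the linear predictor."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def case_probability(genotype, intercept, effect, link):
    """Additive model: genotype is the minor-allele count (0, 1 or 2)."""
    return link(intercept + effect * genotype)


def odds(p):
    return p / (1.0 - p)


N_SUBJECTS = 2300   # sample size quoted in the abstract
MAF = 0.005         # minor allele frequency quoted in the abstract
ODDS_RATIO = 7.4    # per-allele odds ratio quoted in the abstract
INTERCEPT = -2.0    # assumption: baseline log-odds, not given in the abstract

# With 2 alleles per subject, only about 23 minor-allele copies are expected.
expected_minor_alleles = 2 * N_SUBJECTS * MAF

log_or = math.log(ODDS_RATIO)
p_noncarrier = case_probability(0, INTERCEPT, log_or, logistic_cdf)
p_carrier = case_probability(1, INTERCEPT, log_or, logistic_cdf)
```

    The small expected number of minor-allele copies shows why power is limited for rare variants at these settings, whichever link is used.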

  15. A New System Identification Approach to Identifying Genetic Variants in Sequencing Studies for A Binary Phenotype

    Science.gov (United States)

    Kang, Guolian; Bi, Wenjian; Zhao, Yanlong; Zhang, Ji-Feng; Yang, Jun J.; Xu, Heng; Loh, Mignon L.; Hunger, Stephen P.; Relling, Mary V.; Pounds, Stanley; Cheng, Cheng

    2014-01-01

    We propose in this paper a set-valued (SV) system model, which is a generalized form of Logistic (LG) and Probit (Probit) regression, to be considered as a method for discovering genetic variants, especially rare genetic variants in next generation sequencing studies, for a binary phenotype. We propose a new set-valued system identification method to estimate all the underlying key system parameters for the Probit model and compare it with the LG model in the setting of genetic association studies. Across an extensive series of simulation studies, the Probit method maintained Type I error control and had similar or greater power than the LG method which is robust to different distributions of noise: logistic, normal or t distributions. Additionally, the Probit association parameter estimate was 2.7–46.8 fold less variable than the LG log-odds ratio association parameter estimate. Less variability in the association parameter estimate translates to greater power and robustness across the spectrum of minor allele frequencies (MAFs), and these advantages are the most pronounced for rare variants. For instance, in a simulation that generated data from an additive logistic model with odds ratio of 7.4 for a rare single nucleotide polymorphism with a MAF of 0.005 and a sample size of 2300, the Probit method had 60% power whereas the LG method had 25% power at the α = 10⁻⁶ level. Consistent with these simulation results, the set of variants identified by the LG method was a subset of those identified by the Probit method in two example analyses. Thus, we suggest the Probit method may be a competitive alternative to the LG method in genetic association studies such as candidate gene, genome-wide, or next generation sequencing studies for a binary phenotype. PMID:25096228

  16. Complete genome sequence of a novel extrachromosomal virus-like element identified in planarian Girardia tigrina

    Directory of Open Access Journals (Sweden)

    Vagner Loura L

    2002-06-01

    Background: Freshwater planarians are widely used as models for investigation of pattern formation and studies on genetic variation in populations. Despite extensive information on the biology and genetics of planaria, the occurrence and distribution of viruses in these animals remains an unexplored area of research. Results: Using a combination of Suppression Subtractive Hybridization (SSH) and Mirror Orientation Selection (MOS), we compared the genomes of two strains of freshwater planarian, Girardia tigrina. The novel extrachromosomal DNA-containing virus-like element denoted PEVE (Planarian Extrachromosomal Virus-like Element) was identified in one planarian strain. The PEVE genome (about 7.5 kb) consists of two unique regions (Ul and Us) flanked by inverted repeats. Sequence analyses reveal that PEVE comprises two helicase-like sequences in the genome, of which the first is a homolog of a circoviral replication initiator protein (Rep), and the second is similar to the papillomavirus E1 helicase domain. The PEVE genome exists in at least two variant forms with different arrangements of single-stranded and double-stranded DNA stretches that correspond to the Us and Ul regions. Using PCR analysis and whole-mount in situ hybridization, we characterized PEVE distribution and expression in the planarian body. Conclusions: PEVE is the first viral element identified in free-living flatworms. This element differs from all known viruses and viral elements, and comprises two potential helicases that are homologous to proteins from distant viral phyla. PEVE is unevenly distributed in the worm body, and is detected in specific parenchyma cells.

  17. Transcriptome Sequencing of Chemically Induced Aquilaria sinensis to Identify Genes Related to Agarwood Formation.

    Directory of Open Access Journals (Sweden)

    Wei Ye

    Agarwood is a traditional Chinese medicine used as a clinical sedative, carminative, and antiemetic drug. Agarwood is formed in Aquilaria sinensis when the trees are threatened by external physical or chemical injury or by endophytic fungal irritation. However, the mechanism of agarwood formation via chemical induction remains unclear. In this study, we characterized the transcriptome of different parts of a chemically induced A. sinensis trunk sample with agarwood. The Illumina sequencing platform was used to identify the genes involved in agarwood formation. A five-year-old Aquilaria sinensis tree treated with formic acid was selected. The white wood part (B1 sample), the transition part between agarwood and white wood (W2 sample), the agarwood part (J3 sample), and the rotten wood part (F5 sample) were collected for transcriptome sequencing. Accordingly, 54,685,634 clean reads, which were assembled into 83,467 unigenes, were obtained with a Q20 value of 97.5%. A total of 50,565 unigenes were annotated using the Nr, Nt, SWISS-PROT, KEGG, COG, and GO databases. In particular, 171,331,352 unigenes were annotated by various pathways, including the sesquiterpenoid (ko00909) and plant-pathogen interaction (ko03040) pathways. These pathways were related to sesquiterpenoid biosynthesis and defensive responses to chemical stimulation. The transcriptome data of the different parts of the chemically induced A. sinensis trunk provide a rich source of materials for discovering and identifying the genes involved in sesquiterpenoid production and in defensive responses to chemical stimulation. This study is the first to use de novo sequencing and transcriptome assembly for different parts of chemically induced A. sinensis. Results demonstrate that the sesquiterpenoid biosynthesis pathway and WRKY transcription factor play important roles in agarwood formation via chemical induction. The comparative analysis of the transcriptome data of agarwood and A. sinensis lays the

  18. Shirky and Sanger, or the costs of crowdsourcing

    Directory of Open Access Journals (Sweden)

    Mathieu O'Neil

    2010-03-01

    Online knowledge production sites do not rely on isolated experts but on collaborative processes, on the wisdom of the group or “crowd”. Some authors have argued that it is possible to combine traditional or credentialled expertise with collective production; others believe that traditional expertise's focus on correctness has been superseded by the affordances of digital networking, such as re-use and verifiability. This paper examines the costs of two kinds of “crowdsourced” encyclopedic projects: Citizendium, based on the work of credentialled and identified experts, faces a recruitment deficit; in contrast Wikipedia has proved wildly popular, but anti-credentialism and anonymity result in uncertainty, irresponsibility, the development of cliques and the growing importance of pseudo-legal competencies for conflict resolution. Finally the paper reflects on the wider social implications of focusing on what experts are rather than on what they are for.

  19. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
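    The third step described above, finding simple microsatellites that meet user-specified parameters, can be illustrated with a short regular-expression search in Python. This is a minimal sketch and not SSR_pipeline's own algorithm or interface: the function name, the parameter names, their defaults and the toy read are assumptions, and compound repeats are not handled.

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each simple tandem repeat.

    The lazy motif group tries the shortest motif first, and the
    backreference requires at least min_repeats - 1 further copies.
    """
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(2)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        repeats = len(match.group(1)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical composite read containing an (AC)6 and a (GAT)5 repeat.
read = "GGTT" + "AC" * 6 + "TTCC" + "GAT" * 5 + "CC"
hits = find_microsatellites(read)
```

    Raising `max_motif` toward the 25 bp mentioned in the abstract works with the same pattern, at the cost of more backtracking on long reads.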

  20. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing.

    Science.gov (United States)

    Awad, Mohamed; Ouda, Osama; El-Refy, Ali; El-Feky, Fawzy A; Mosa, Kareem A; Helmy, Mohamed

    2015-01-01

    Sequencing and restriction analysis of genes like 16S rRNA and HSP60 are intensively used for molecular identification in the microbial communities. With aid of the rapid progress in bioinformatics, genome sequencing became the method of choice for bacterial identification. However, the genome sequencing technology is still out of reach in the developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequences data available in GenBank and other databases and the two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested using three different and diverse bacterial populations (members of Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and sequences of 16S rRNA gene. The analysis of the restriction maps of the members of three groups using the fragment numbers information only or along with fragments sizes successfully identified all of the members of the three groups using a minimum of four and maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of FN-Identify method and its two algorithms as an alternative method that uses the standard microbiology laboratories techniques when the genome sequencing is not available.
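    The fragment-number idea in the record above can be sketched as a small in silico digest in Python. This is an illustration under stated assumptions, not the CreateScheme or GeneIdentify algorithms: the function names and the three toy sequences are hypothetical, molecules are treated as linear and fully digested, and only the well-known recognition sites of EcoRI, HindIII and BamHI are used.

```python
# Recognition sites of three common enzymes; all are palindromic,
# so searching one strand is sufficient.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def fragment_count(seq, site):
    """Fragments from a complete digest of a linear molecule: cuts + 1."""
    cuts = 0
    start = seq.find(site)
    while start != -1:
        cuts += 1
        start = seq.find(site, start + 1)
    return cuts + 1


def fingerprint(seq, enzymes):
    """Tuple of fragment numbers, one entry per enzyme."""
    return tuple(fragment_count(seq, RECOGNITION_SITES[e]) for e in enzymes)


def distinguishes_all(sequences, enzymes):
    """True if every sequence gets a unique fragment-number fingerprint."""
    prints = [fingerprint(s, enzymes) for s in sequences.values()]
    return len(set(prints)) == len(prints)


# Hypothetical toy "16S" sequences, far shorter than real genes.
toy_genes = {
    "taxon_a": "ATGC" + "GAATTC" + "TTAA" + "GGATCC" + "CG",
    "taxon_b": "ATGC" + "GAATTC" + "TTAA" + "AAGCTT" + "CG",
    "taxon_c": "ATGCTTAACG",
}
```

    With these toy sequences EcoRI alone cannot separate taxon_a from taxon_b, whereas EcoRI together with HindIII gives every taxon a distinct fingerprint, which mirrors the abstract's point that a small panel of enzymes can suffice.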

  1. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing

    Directory of Open Access Journals (Sweden)

    Mohamed Awad

    2015-01-01

    Sequencing and restriction analysis of genes such as 16S rRNA and HSP60 are intensively used for molecular identification in microbial communities. With the aid of rapid progress in bioinformatics, genome sequencing became the method of choice for bacterial identification. However, genome sequencing technology is still out of reach in developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequence data available in GenBank and other databases and the two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested using three different and diverse bacterial populations (members of Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and sequences of the 16S rRNA gene. The analysis of the restriction maps of the members of the three groups, using fragment number information alone or together with fragment sizes, successfully identified all of the members of the three groups using a minimum of four and a maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of the FN-Identify method and its two algorithms as an alternative that uses standard microbiology laboratory techniques when genome sequencing is not available.

  2. Next-generation sequencing identifies transportin 3 as the causative gene for LGMD1F.

    Directory of Open Access Journals (Sweden)

    Annalaura Torella

    Limb-girdle muscular dystrophies (LGMD) are genetically and clinically heterogeneous conditions. We investigated a large family with an autosomal dominant transmission pattern, previously classified as LGMD1F and mapped to chromosome 7q32. Affected members are characterized by muscle weakness that first affects the pelvic girdle and the iliopsoas muscles. We sequenced the whole exome of four family members and identified a shared heterozygous frame-shift variant in the Transportin 3 (TNPO3) gene, encoding a member of the importin-β super-family. The TNPO3 gene maps within the LGMD1F critical interval and its 923-amino acid human gene product is also expressed in skeletal muscle. In addition, we identified an isolated case of LGMD with a new missense mutation in the same gene. We localized the mutant TNPO3 around the nucleus, but not inside. The involvement of a gene related to nuclear transport suggests a novel disease mechanism leading to muscular dystrophy.

  3. Comparison of inherently essential genes of Porphyromonas gingivalis identified in two transposon-sequencing libraries.

    Science.gov (United States)

    Hutcherson, J A; Gogeneni, H; Yoder-Himes, D; Hendrickson, E L; Hackett, M; Whiteley, M; Lamont, R J; Scott, D A

    2016-08-01

    Porphyromonas gingivalis is a Gram-negative anaerobe and keystone periodontal pathogen. A mariner transposon insertion mutant library has recently been used to define 463 genes as putatively essential for the in vitro growth of P. gingivalis ATCC 33277 in planktonic culture (Library 1). We have independently generated a transposon insertion mutant library (Library 2) for the same P. gingivalis strain and herein compare genes that are putatively essential for in vitro growth in complex media, as defined by both libraries. In all, 281 genes (61%) identified by Library 1 were common to Library 2. Many of these common genes are involved in fundamentally important metabolic pathways, notably pyrimidine cycling as well as lipopolysaccharide, peptidoglycan, pantothenate and coenzyme A biosynthesis, and nicotinate and nicotinamide metabolism. Also in common are genes encoding heat-shock protein homologues, sigma factors, enzymes with proteolytic activity, and the majority of sec-related protein export genes. In addition to facilitating a better understanding of critical physiological processes, transposon-sequencing technology has the potential to identify novel strategies for the control of P. gingivalis infections. Those genes defined as essential by two independently generated TnSeq mutant libraries are likely to represent particularly attractive therapeutic targets. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
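
    The comparison of the two libraries amounts to a set intersection followed by a percentage. The sketch below uses hypothetical placeholder gene names rather than real P. gingivalis loci; only the counts 281 and 463 are taken from the abstract.

```python
def shared_essential(library1, library2):
    """Genes called essential in both libraries, and their share of library 1."""
    lib1, lib2 = set(library1), set(library2)
    shared = lib1 & lib2
    return shared, 100.0 * len(shared) / len(lib1)

# Hypothetical toy gene lists (placeholders, not real loci).
lib1 = ["geneA", "geneB", "geneC", "geneD"]
lib2 = ["geneB", "geneC", "geneE"]
shared, pct = shared_essential(lib1, lib2)

# Counts reported in the abstract: 281 of the 463 Library 1 genes were shared.
reported_pct = 100.0 * 281 / 463
```

    Rounding `reported_pct` to the nearest whole number reproduces the 61% quoted in the abstract.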

  4. Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma.

    Science.gov (United States)

    Comino-Méndez, Iñaki; Gracia-Aznárez, Francisco J; Schiavi, Francesca; Landa, Iñigo; Leandro-García, Luis J; Letón, Rocío; Honrado, Emiliano; Ramos-Medina, Rocío; Caronia, Daniela; Pita, Guillermo; Gómez-Graña, Alvaro; de Cubas, Aguirre A; Inglada-Pérez, Lucía; Maliszewska, Agnieszka; Taschin, Elisa; Bobisse, Sara; Pica, Giuseppe; Loli, Paola; Hernández-Lavado, Rafael; Díaz, José A; Gómez-Morales, Mercedes; González-Neira, Anna; Roncador, Giovanna; Rodríguez-Antona, Cristina; Benítez, Javier; Mannelli, Massimo; Opocher, Giuseppe; Robledo, Mercedes; Cascón, Alberto

    2011-06-19

    Hereditary pheochromocytoma (PCC) is often caused by germline mutations in one of nine susceptibility genes described to date, but there are familial cases without mutations in these known genes. We sequenced the exomes of three unrelated individuals with hereditary PCC (cases) and identified mutations in MAX, the MYC associated factor X gene. Absence of MAX protein in the tumors and loss of heterozygosity caused by uniparental disomy supported the involvement of MAX alterations in the disease. A follow-up study of a selected series of 59 cases with PCC identified five additional MAX mutations and suggested an association with malignant outcome and preferential paternal transmission of MAX mutations. The involvement of the MYC-MAX-MXD1 network in the development and progression of neural crest cell tumors is further supported by the lack of functional MAX in rat PCC (PC12) cells and by the amplification of MYCN in neuroblastoma and suggests that loss of MAX function is correlated with metastatic potential.

  5. RNA sequencing on Solanum lycopersicum trichomes identifies transcription factors that activate terpene synthase promoters.

    Science.gov (United States)

    Spyropoulou, Eleni A; Haring, Michel A; Schuurink, Robert C

    2014-05-27

    Glandular trichomes are production and storage organs of specialized metabolites such as terpenes, which play a role in the plant's defense system. The present study aimed to shed light on the regulation of terpene biosynthesis in Solanum lycopersicum trichomes by identification of transcription factors (TFs) that control the expression of terpene synthases. A trichome transcriptome database was created with a total of 27,195 contigs that contained 743 annotated TFs. Furthermore a quantitative expression database was obtained of jasmonic acid-treated trichomes. Sixteen candidate TFs were selected for further analysis. One TF of the MYC bHLH class and one of the WRKY class were able to transiently transactivate S. lycopersicum terpene synthase promoters in Nicotiana benthamiana leaves. Strikingly, SlMYC1 was shown to act synergistically with a previously identified zinc finger-like TF, Expression of Terpenoids 1 (SlEOT1) in transactivating the SlTPS5 promoter. High-throughput sequencing of tomato stem trichomes led to the discovery of two transcription factors that activated several terpene synthase promoters. Our results identified new elements of the transcriptional regulation of tomato terpene biosynthesis in trichomes, a largely unexplored field.

  6. Exome sequencing in 53 sporadic cases of schizophrenia identifies 18 putative candidate genes.

    Directory of Open Access Journals (Sweden)

    Michel Guipponi

    Schizophrenia (SCZ) is a severe, debilitating mental illness which has a significant genetic component. The identification of genetic factors related to SCZ has been challenging and these factors remain largely unknown. To evaluate the contribution of de novo variants (DNVs) to SCZ, we sequenced the exomes of 53 individuals with sporadic SCZ and of their non-affected parents. We identified 49 DNVs, 18 of which were predicted to alter gene function, including 13 damaging missense mutations, 2 conserved splice site mutations, 2 nonsense mutations, and 1 frameshift deletion. The average number of exonic DNVs per proband was 0.88, which corresponds to an exonic point mutation rate of 1.7×10^-8 per nucleotide per generation. The non-synonymous-to-synonymous mutation ratio of 2.06 did not differ from neutral expectations. Overall, this study provides a list of 18 putative candidate genes for sporadic SCZ, and when combined with the results of similar reports, identifies a second proband carrying a non-synonymous DNV in the RGS12 gene.

  7. Multilocus sequence typing of Dientamoeba fragilis identified a major clone with widespread geographical distribution.

    Science.gov (United States)

    Cacciò, Simone M; Sannella, Anna Rosa; Bruno, Antonella; Stensvold, Christen R; David, Erica Boarato; Guimarães, Semiramis; Manuali, Elisabetta; Magistrali, Chiara; Mahdad, Karim; Beaman, Miles; Maserati, Roberta; Tosini, Fabio; Pozio, Edoardo

    2016-11-01

    The flagellated protozoan Dientamoeba fragilis is often detected in humans with gastrointestinal symptoms, but it is also commonly found in healthy subjects. As for other intestinal protozoa, the hypothesis that genetically dissimilar parasite isolates differ in their ability to cause symptoms has also been raised for D. fragilis. To date, only two D. fragilis genotypes (1 and 2) have been described, of which genotype 1 largely predominates worldwide. However, very few markers are available for genotyping studies and therefore the extent of genetic variation among isolates remains largely unknown. Here, we performed metagenomics experiments on two D. fragilis-positive stool samples, and identified a number of candidate markers based on sequence similarity to the phylogenetically related species Trichomonas vaginalis. Markers corresponding to structural genes and to genes encoding proteases were selected for this study, and PCR experiments confirmed that they belong to the D. fragilis genome; two previously described markers (small subunit ribosomal DNA and large subunit of RNA polymerase II) were also included. Using this panel of markers, 111 isolates of human origin were genotyped, all of which, except one, belonged to genotype 1. These isolates had been collected at different times from symptomatic and asymptomatic persons of different age groups in Italy, Denmark, Brazil and Australia. By sequencing approximately 160 kb from 500 PCR products, a very low level of polymorphism was observed across all the investigated loci, suggesting the existence of a major clone of D. fragilis with a widespread geographical distribution. Copyright © 2016 Australian Society for Parasitology. Published by Elsevier Ltd. All rights reserved.

  8. The Ebola virus VP35 protein binds viral immunostimulatory and host RNAs identified through deep sequencing.

    Directory of Open Access Journals (Sweden)

    Kari A Dilley

    Ebola virus and Marburg virus are members of the Filoviridae family and causative agents of hemorrhagic fever with high fatality rates in humans. Filovirus virulence is partially attributed to the VP35 protein, a well-characterized inhibitor of the RIG-I-like receptor pathway that triggers the antiviral interferon (IFN) response. Prior work demonstrates the ability of VP35 to block potent RIG-I activators, such as Sendai virus (SeV), and this IFN-antagonist activity is directly correlated with its ability to bind RNA. Several structural studies demonstrate that VP35 binds short synthetic dsRNAs; yet, there are no data that identify viral immunostimulatory RNAs (isRNA) or host RNAs bound to VP35 in cells. Utilizing a SeV infection model, we demonstrate that both viral isRNA and host RNAs are bound to Ebola and Marburg VP35s in cells. By deep sequencing the purified VP35-bound RNA, we identified the SeV copy-back defective interfering (DI) RNA, previously identified as a robust RIG-I activator, as the isRNA bound by multiple filovirus VP35 proteins, including the VP35 protein from the West African outbreak strain (Makona EBOV). Moreover, RNAs isolated from a VP35 RNA-binding mutant were not immunostimulatory and did not include the SeV DI RNA. Strikingly, an analysis of host RNAs bound by wild-type, but not mutant, VP35 revealed that select host RNAs are preferentially bound by VP35 in cell culture. Taken together, these data support a model in which VP35 sequesters isRNA in virus-infected cells to avert RIG-I like receptor (RLR) activation.

  9. Diagnostic SNPs for inferring population structure in American mink (Neovison vison) identified through RAD sequencing

    DEFF Research Database (Denmark)

    2015-01-01

    Data from: "Diagnostic SNPs for inferring population structure in American mink (Neovison vison) identified through RAD sequencing" in Genomic Resources Notes accepted 1 October 2014 to 30 November 2014....

  10. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    Science.gov (United States)

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers that is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domain and microRNA binding annotations. We applied the novel workflow to RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post-Deep Brain Stimulation (DBS) treatment and compared them to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected control brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  11. Whole-exome sequencing identifies novel ECHS1 mutations in Leigh syndrome.

    Science.gov (United States)

    Tetreault, Martine; Fahiminiya, Somayyeh; Antonicka, Hana; Mitchell, Grant A; Geraghty, Michael T; Lines, Matthew; Boycott, Kym M; Shoubridge, Eric A; Mitchell, John J; Michaud, Jacques L; Majewski, Jacek

    2015-09-01

    Leigh syndrome (LS) is a rare heterogeneous progressive neurodegenerative disorder usually presenting in infancy or early childhood. Clinical presentation is variable and includes psychomotor delay or regression, acute neurological or acidotic episodes, hypotonia, ataxia, spasticity, movement disorders, and corresponding anomalies of the basal ganglia and brain stem on magnetic resonance imaging. To date, 35 genes have been associated with LS, mostly involved in mitochondrial respiratory chain function and encoded in either nuclear or mitochondrial DNA. We used whole-exome sequencing to identify disease-causing variants in four patients with basal ganglia abnormalities and clinical presentations consistent with LS. Compound heterozygote variants in ECHS1, encoding the enzyme enoyl-CoA hydratase were identified. One missense variant (p.Thr180Ala) was common to all four patients and the haplotype surrounding this variant was also shared, suggesting a common ancestor of French-Canadian origin. Rare mutations in ECHS1 as well as in HIBCH, the enzyme downstream in the valine degradation pathway, have been associated with LS or LS-like disorders. A clear clinical overlap is observed between our patients and the reported cases with ECHS1 or HIBCH deficiency. The main clinical features observed in our cohort are T2-hyperintense signal in the globus pallidus and putamen, failure to thrive, developmental delay or regression, and nystagmus. Respiratory chain studies are not strikingly abnormal in our patients: one patient had a mild reduction of complex I and III and another of complex IV. The identification of four additional patients with mutations in ECHS1 highlights the emerging importance of this pathway in LS.

  12. Sequencing ASMT identifies rare mutations in Chinese Han patients with autism.

    Directory of Open Access Journals (Sweden)

    Lifang Wang

    Melatonin is involved in the regulation of circadian and seasonal rhythms and immune function. Prior research reported low melatonin levels in autism spectrum disorders (ASD). ASMT, located in pseudo-autosomal region 1, encodes the last enzyme of the melatonin biosynthesis pathway. A previous study reported an association between ASD and single nucleotide polymorphisms (SNPs) rs4446909 and rs5989681 located in the promoter of ASMT. Furthermore, rare deleterious mutations were identified in a subset of patients. To investigate the association between ASMT and autism, we sequenced all ASMT exons and its neighboring region in 398 Chinese Han individuals with autism and 437 healthy controls. Although our study did not detect significant differences in the genotypic distribution and allele frequencies of the common SNPs in ASMT between patients with autism and healthy controls, we identified new rare coding mutations of ASMT. Among these rare variants, 4 were detected exclusively in patients with autism (p.R115W, p.V166I, p.V179G, and the stop mutation p.W257X). These four coding variants were observed in 6 of 398 (1.51%) patients with autism and in none of 437 controls (chi-square test with continuity correction, p = 0.032, two-sided). Functional prediction of the impact of the amino acid changes showed that p.R115W might affect protein function. These results indicate that ASMT might be a susceptibility gene for autism. Further studies in larger samples are needed to better understand the degree of variation in this gene as well as to understand the biochemical and clinical impacts of ASMT/melatonin deficiency.
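
    The carrier comparison can be checked with a chi-square test with Yates continuity correction on the 2×2 table implied by the abstract (6 carriers and 392 non-carriers among patients, 0 and 437 among controls). The sketch below is a standard-library implementation written for illustration, not the authors' analysis. The function name is hypothetical, and the 392 and 437 non-carrier counts are an assumption derived from the stated totals.

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi-square statistic with continuity correction and two-sided p-value
    for the 2x2 table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Rows: patients, controls. Columns: carriers, non-carriers.
chi2, p_value = yates_chi_square(6, 392, 0, 437)
carrier_pct = 100.0 * 6 / 398
```

    Under these assumptions the p-value comes out at roughly 0.03, the same order as the reported 0.032; a small difference could reflect the software used or the exact counts entering the authors' test.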

  13. RNA sequencing identifies gene regulatory networks controlling extracellular matrix synthesis in intervertebral disk tissues.

    Science.gov (United States)

    Riester, Scott M; Lin, Yang; Wang, Wei; Cong, Lin; Mohamed Ali, Abdel-Moneim; Peck, Sun H; Smith, Lachlan J; Currier, Bradford L; Clark, Michelle; Huddleston, Paul; Krauss, William; Yaszemski, Michael J; Morrey, Mark E; Abdel, Matthew P; Bydon, Mohamad; Qu, Wenchun; Larson, Annalise N; van Wijnen, Andre J; Nassr, Ahmad

    2017-12-11

    Degenerative disk disease of the spine is a major cause of back pain and disability. Optimization of regenerative medical therapies for degenerative disk disease requires a deep mechanistic understanding of the factors controlling the structural integrity of spinal tissues. In this investigation, we sought to identify candidate regulatory genes controlling extracellular matrix synthesis in spinal tissues. To achieve this goal we performed high throughput next generation RNA sequencing on 39 annulus fibrosus and 21 nucleus pulposus human tissue samples. Specimens were collected from patients undergoing surgical discectomy for the treatment of degenerative disk disease. Our studies identified associations between extracellular matrix genes, growth factors, and other important regulatory molecules. The fibrous matrix characteristic of annulus fibrosus was associated with expression of the growth factors platelet derived growth factor beta (PDGFB), vascular endothelial growth factor C (VEGFC), and fibroblast growth factor 9 (FGF9). Additionally we observed high expression of multiple signaling proteins involved in the NOTCH and WNT signaling cascades. Nucleus pulposus extracellular matrix related genes were associated with the expression of numerous diffusible growth factors largely associated with the transforming growth factor signaling cascade, including transforming growth factor alpha (TGFA), inhibin alpha (INHA), inhibin beta A (INHBA), bone morphogenetic proteins (BMP2, BMP6), and others. This investigation provides important data on extracellular matrix gene regulatory networks in disk tissues. This information can be used to optimize pharmacologic, stem cell, and tissue engineering strategies for regeneration of the intervertebral disk and the treatment of back pain. © 2017 Orthopaedic Research Society. Published by Wiley Periodicals, Inc. J Orthop Res.

  14. Next-Generation Sequencing of Lung Cancer EGFR Exons 18-21 Allows Effective Molecular Diagnosis of Small Routine Samples (Cytology and Biopsy)

    Science.gov (United States)

    de Biase, Dario; Visani, Michela; Malapelle, Umberto; Simonato, Francesca; Cesari, Valentina; Bellevicine, Claudio; Pession, Annalisa; Troncone, Giancarlo; Fassina, Ambrogio; Tallini, Giovanni

    2013-01-01

    Selection of lung cancer patients for therapy with tyrosine kinase inhibitors directed at EGFR requires the identification of specific EGFR mutations. In most patients with advanced, inoperable lung carcinoma, limited tumor samples often represent the only material available for both histologic typing and molecular analysis. We defined a next generation sequencing protocol targeted to EGFR exons 18-21 suitable for the routine diagnosis of such clinical samples. The protocol was validated in an unselected series of 80 small biopsies (n=14) and cytology (n=66) specimens representative of the material ordinarily submitted for diagnostic evaluation to three referral medical centers in Italy. Specimens were systematically evaluated for tumor cell number and proportion relative to non-neoplastic cells. They were analyzed in batches of 100-150 amplicons per run, reaching an analytical sensitivity of 1% and obtaining an adequate number of reads to cover all exons on all samples analyzed. Next generation sequencing was compared with Sanger sequencing. The latter identified 15 EGFR mutations in 14/80 cases (17.5%) but did not detect mutations when the proportion of neoplastic cells was below 40%. Next generation sequencing identified 31 EGFR mutations in 24/80 cases (30.0%). Mutations were detected with a proportion of neoplastic cells as low as 5%. All mutations identified by the Sanger method were confirmed. In 6 cases next generation sequencing identified exon 19 deletions or the L858R mutation not seen after Sanger sequencing, allowing the patient to be treated with tyrosine kinase inhibitors. In one additional case the R831H mutation associated with treatment resistance was identified in an EGFR wild type tumor after Sanger sequencing. Next generation sequencing is robust, cost-effective and greatly improves the detection of EGFR mutations. Its use should be promoted for the clinical diagnosis of mutations in specimens with unfavorable tumor cell content. PMID

  15. SPG2 mimicking multiple sclerosis in a family identified using next generation sequencing.

    Science.gov (United States)

    Rubegni, Anna; Battisti, Carla; Tessa, Alessandra; Cerase, Alfonso; Doccini, Stefano; Malandrini, Alessandro; Santorelli, Filippo M; Federico, Antonio

    2017-04-15

    Several single gene disorders can potentially be overlooked in the differential diagnostic evaluation of patients with multiple sclerosis (MS). Pelizaeus-Merzbacher disease and spastic paraplegia type 2 are allelic X-linked disorders associated with defective myelination of the central nervous system and mutations in PLP1. Neurological symptoms are occasionally observed in female carriers of these mutations. Two women - the proposita (Pt1) and her mother (Pt2) - reported walking difficulties since adolescence and showed progressive cognitive decline. Their neurological examinations revealed spastic gait, pyramidal tract involvement and distal muscle atrophy in the legs. Peripheral neuropathy and diffuse white matter (WM) changes on brain MRI were also observed. Both patients had a preliminary diagnosis of primary progressive MS. Using a targeted next generation sequencing method, the novel heterozygous mutation c.210T>G/p.Y70* in PLP1 was identified in Pt2. The same mutation was also found in Pt1 but not in five healthy relatives. The mutation showed moderate-to-severe skewed X inactivation in tissues, and Western blotting revealed a significant reduction of PLP1 and DM20 in the sural nerve of Pt2. In conclusion, a mother and daughter presented with an X-linked dominant disorder with skewed X inactivation. The authors suggest that PLP1 testing might be considered in the evaluation of women with spastic paraparesis, cognitive decline and WM changes. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Science.gov (United States)

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis and a simpler genotype: WT tumor protein p53, relatively few SNVs (average of 5.9 per megabase), and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  17. Global transcriptome sequencing identifies chlamydospore specific markers in Candida albicans and Candida dubliniensis.

    Directory of Open Access Journals (Sweden)

    Katja Palige

    Candida albicans and Candida dubliniensis are pathogenic fungi that are highly related but differ in virulence and in some phenotypic traits. During in vitro growth on certain nutrient-poor media, C. albicans and C. dubliniensis are the only yeast species which are able to produce chlamydospores, large thick-walled cells of unknown function. Interestingly, only C. dubliniensis forms pseudohyphae with abundant chlamydospores when grown on Staib medium, while C. albicans grows exclusively as a budding yeast. In order to further our understanding of chlamydospore development and assembly, we compared the global transcriptional profile of both species during growth in liquid Staib medium by RNA sequencing. We also included a C. albicans mutant in our study which lacks the morphogenetic transcriptional repressor Nrg1. This strain, which is characterized by its constitutive pseudohyphal growth, specifically produces masses of chlamydospores in Staib medium, similar to C. dubliniensis. This comparative approach identified a set of putatively chlamydospore-related genes. Two of the homologous C. albicans and C. dubliniensis genes (CSP1 and CSP2), which were most strongly upregulated during chlamydospore development, were analysed in more detail. By use of the green fluorescent protein as a reporter, the encoded putative cell wall related proteins were found to exclusively localize to C. albicans and C. dubliniensis chlamydospores. Our findings uncover the first chlamydospore specific markers in Candida species and provide novel insights into the complex morphogenetic development of these important fungal pathogens.

  18. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    Science.gov (United States)

    Yuen, Ryan KC; Merico, Daniele; Bookman, Matt; Howe, Jennifer L; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D’Abate, Lia; Chan, Ada JS; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson WL; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A; Parr, Jeremy R; Spence, Sarah J; Vorstman, Jacob; Frey, Brendan J; Robinson, James T; Strug, Lisa J; Fernandez, Bridget A; Elsabbagh, Mayada; Carter, Melissa T; Hallmayer, Joachim; Knoppers, Bartha M; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H; Glazer, David; Pletcher, Mathew T; Scherer, Stephen W

    2017-01-01

    We are performing whole genome sequencing (WGS) of families with Autism Spectrum Disorder (ASD) to build a resource, named MSSNG, to enable the sub-categorization of phenotypes and underlying genetic factors involved. Here, we report WGS of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible in a cloud platform, and through an internet portal with controlled access. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertion/deletions (indels) or copy number variations (CNVs) per ASD subject. We identified 18 new candidate ASD-risk genes such as MED13 and PHF3, and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (p = 6×10^-4). In 294/2,620 (11.2%) of ASD cases, a molecular basis could be determined and 7.2% of these carried CNV/chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD. PMID:28263302

  19. MicroRNA sequence analysis identifies microRNAs associated with peri-implantitis in dogs.

    Science.gov (United States)

    Wu, Xiaolin; Chen, Xipeng; Mi, Wenxiang; Wu, Tingting; Gu, Qinhua; Huang, Hui

    2017-10-31

    Peri-implantitis, which is characterized by dense inflammatory infiltrates and increased osteoclast activity, can lead to alveolar bone destruction and implantation failure. miRNAs participate in the regulation of various inflammatory diseases, such as periodontitis and osteoporosis. Therefore, the present study aimed to investigate the differential expression of miRNAs in canine peri-implantitis and to explore the functions of their target genes. An miRNA sequence analysis was used to identify differentially expressed miRNAs in peri-implantitis. Under the criteria of a fold-change >1.5 and P [...] peri-implantitis through an intricate mechanism. The results of quantitative real-time PCR (qRT-PCR) revealed that let-7g, miR-27a, and miR-145 may play important roles in peri-implantitis and are worth further investigation. The results of the present study provide insights into the potential biological effects of the differentially expressed miRNAs, and specific enrichment of target genes involved in the mitogen-activated protein kinase (MAPK) signaling pathway was observed. These findings highlight the intricate and specific roles of miRNAs in inflammation and osteoclastogenesis, both of which are key aspects of peri-implantitis, and thus may contribute to future investigations of the etiology, underlying mechanism, and treatment of peri-implantitis. © 2017 The Author(s).

  20. Whole exome sequencing identifies mutations in Usher syndrome genes in profoundly deaf Tunisian patients.

    Directory of Open Access Journals (Sweden)

    Zied Riahi

    Full Text Available Usher syndrome (USH) is an autosomal recessive disorder characterized by combined deafness-blindness. It accounts for about 50% of all hereditary deafness-blindness cases. Three clinical subtypes (USH1, USH2, and USH3) are described, of which USH1 is the most severe form, characterized by congenital profound deafness, constant vestibular dysfunction, and a prepubertal onset of retinitis pigmentosa. We performed whole exome sequencing in four unrelated Tunisian patients affected by apparently isolated, congenital profound deafness, with reportedly normal ocular fundus examination. Four biallelic mutations were identified in two USH1 genes: a splice acceptor site mutation, c.2283-1G>T, and a novel missense mutation, c.5434G>A (p.Glu1812Lys), in MYO7A, and two previously unreported mutations in USH1G, i.e. a frameshift mutation, c.1195_1196delAG (p.Leu399Alafs*24), and a nonsense mutation, c.52A>T (p.Lys18*). Another ophthalmological examination including optical coherence tomography actually showed the presence of retinitis pigmentosa in all the patients. Our findings provide evidence that USH is under-diagnosed in Tunisian deaf patients. Yet, early diagnosis of USH is of utmost importance because these patients should undergo cochlear implant surgery in early childhood, in anticipation of the visual loss.

  1. Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability.

    Science.gov (United States)

    Riazuddin, S; Hussain, M; Razzaq, A; Iqbal, Z; Shahzad, M; Polla, D L; Song, Y; van Beusekom, E; Khan, A A; Tomas-Roca, L; Rashid, M; Zahoor, M Y; Wissink-Lindhout, W M; Basra, M A R; Ansar, M; Agha, Z; van Heeswijk, K; Rasheed, F; Van de Vorst, M; Veltman, J A; Gilissen, C; Akram, J; Kleefstra, T; Assir, M Z; Grozeva, D; Carss, K; Raymond, F L; O'Connor, T D; Riazuddin, S A; Khan, S N; Ahmed, Z M; de Brouwer, A P M; van Bokhoven, H; Riazuddin, S

    2016-07-26

    Intellectual disability (ID) is a clinically and genetically heterogeneous disorder, affecting 1-3% of the general population. Although research into the genetic causes of ID has recently gained momentum, identification of pathogenic mutations that cause autosomal recessive ID (ARID) has lagged behind, predominantly due to non-availability of sizeable families. Here we present the results of exome sequencing in 121 large consanguineous Pakistani ID families. In 60 families, we identified homozygous or compound heterozygous DNA variants in a single gene, 30 affecting reported ID genes and 30 affecting novel candidate ID genes. Potential pathogenicity of these alleles was supported by co-segregation with the phenotype, low frequency in control populations and the application of stringent bioinformatics analyses. In another eight families segregation of multiple pathogenic variants was observed, affecting 19 genes that were either known or are novel candidates for ID. Transcriptome profiles of normal human brain tissues showed that the novel candidate ID genes formed a network significantly enriched for transcriptional co-expression (P [...] novel ID genes directly interact with previously reported ID proteins in six known pathways essential for cognitive function (P [...] genes involved in ARID, and provide new insights into the molecular mechanisms and the transcriptome map of ID. Molecular Psychiatry advance online publication, 26 July 2016; doi:10.1038/mp.2016.109.

  2. Antimicrobial susceptibility among clinical Nocardia species identified by multilocus sequence analysis.

    Science.gov (United States)

    McTaggart, Lisa R; Doucet, Jennifer; Witkowska, Maria; Richardson, Susan E

    2015-01-01

    Antimicrobial susceptibility patterns of 112 clinical isolates, 28 type strains, and 9 reference strains of Nocardia were determined using the Sensititre Rapmyco microdilution panel (Thermo Fisher, Inc.). Isolates were identified by highly discriminatory multilocus sequence analysis and were chosen to represent the diversity of species recovered from clinical specimens in Ontario, Canada. Susceptibility to the most commonly used drug, trimethoprim-sulfamethoxazole, was observed in 97% of isolates. Linezolid and amikacin were also highly effective; 100% and 99% of all isolates demonstrated a susceptible phenotype. For the remaining antimicrobials, resistance was species specific with isolates of Nocardia otitidiscaviarum, N. brasiliensis, N. abscessus complex, N. nova complex, N. transvalensis complex, N. farcinica, and N. cyriacigeorgica displaying the traditional characteristic drug pattern types. In addition, the antimicrobial susceptibility profiles of a variety of rarely encountered species isolated from clinical specimens are reported for the first time and were categorized into four additional drug pattern types. Finally, MICs for the control strains N. nova ATCC BAA-2227, N. asteroides ATCC 19247(T), and N. farcinica ATCC 23826 were robustly determined to demonstrate method reproducibility and suitability of the commercial Sensititre Rapmyco panel for antimicrobial susceptibility testing of Nocardia spp. isolated from clinical specimens. The reported values will facilitate quality control and standardization among laboratories. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  3. Global Transcriptome Sequencing Identifies Chlamydospore Specific Markers in Candida albicans and Candida dubliniensis

    LENUS (Irish Health Repository)

    Palige, Katja

    2013-04-15

    Candida albicans and Candida dubliniensis are pathogenic fungi that are highly related but differ in virulence and in some phenotypic traits. During in vitro growth on certain nutrient-poor media, C. albicans and C. dubliniensis are the only yeast species which are able to produce chlamydospores, large thick-walled cells of unknown function. Interestingly, only C. dubliniensis forms pseudohyphae with abundant chlamydospores when grown on Staib medium, while C. albicans grows exclusively as a budding yeast. In order to further our understanding of chlamydospore development and assembly, we compared the global transcriptional profile of both species during growth in liquid Staib medium by RNA sequencing. We also included a C. albicans mutant in our study which lacks the morphogenetic transcriptional repressor Nrg1. This strain, which is characterized by its constitutive pseudohyphal growth, specifically produces masses of chlamydospores in Staib medium, similar to C. dubliniensis. This comparative approach identified a set of putatively chlamydospore-related genes. Two of the homologous C. albicans and C. dubliniensis genes (CSP1 and CSP2) which were most strongly upregulated during chlamydospore development were analysed in more detail. By use of the green fluorescent protein as a reporter, the encoded putative cell wall related proteins were found to exclusively localize to C. albicans and C. dubliniensis chlamydospores. Our findings uncover the first chlamydospore specific markers in Candida species and provide novel insights in the complex morphogenetic development of these important fungal pathogens.

  4. Identifying Highly Penetrant Disease Causal Mutations Using Next Generation Sequencing: Guide to Whole Process

    Directory of Open Access Journals (Sweden)

    A. Mesut Erzurumluoglu

    2015-01-01

    Full Text Available Recent technological advances have created challenges for geneticists and a need to adapt to a wide range of new bioinformatics tools and an expanding wealth of publicly available data (e.g., mutation databases), and software. This wide range of methods and the diversity of file formats used in sequence analysis are a significant issue, with a considerable amount of time spent before anyone can even attempt to analyse the genetic basis of human disorders. Another point to consider is that although many possess “just enough” knowledge to analyse their data, they do not make full use of the tools and databases that are available and also do not fully understand how their data was created. The primary aim of this review is to document some of the key approaches and provide an analysis schema to make the analysis process more efficient and reliable in the context of discovering highly penetrant causal mutations/genes. This review will also compare the methods used to identify highly penetrant variants when data is obtained from consanguineous individuals as opposed to nonconsanguineous; and when Mendelian disorders are analysed as opposed to common-complex disorders.

  5. Identifying sugarcane expressed sequences associated with nutrient transporters and peptide metal chelators

    Directory of Open Access Journals (Sweden)

    Antonio Figueira

    2001-12-01

    Full Text Available Plant nutrient uptake is an active process, requiring energy to accumulate essential elements at higher levels in plant tissues than in the soil solution, while the presence of toxic metals or excess of nutrients requires mechanisms to modulate the accumulation of ions. Genes encoding ion transporters isolated from plants and yeast were used to identify sugarcane putative homologues in the sugarcane expressed sequence tag (SUCEST) database. Five cluster consensi with sequence homology to plant high-affinity phosphate transporter genes were identified. One cluster consensus allowed the prediction of a full-length protein containing 541 amino acids, with 81% amino acid identity to the Nicotiana tabacum NtPT1 gene, consisting of 12 membrane-spanning domains divided by a large hydrophilic charged region. Putative homologues to Arabidopsis thaliana micronutrient transporter genes were also detected in some of the SUCEST libraries. Iron uptake in grasses involves the release of the phytosiderophore mugineic acid (MA), which chelates Fe3+, which is then absorbed by a specific transporter. Sugarcane expressed sequence tags (ESTs) homologous to genes coding for three enzymes of the mugineic acid biosynthetic pathway [nicotianamine synthase; nicotianamine transferase; and putative mugineic acid synthetase (ids3)] and a putative Fe3+-phytosiderophore transporter were detected. Seven sugarcane sequence clusters were identified with strong homology to members of the ZIP gene family (ZIP1, ZIP3, ZIP4, IRT1 and ZNT1), while four clusters homologous to ZIP2 and three to ZAT were found. Homologues to members of another gene family, Nramp, which code for broad-specificity transition metal transporters, were also detected with constitutive expression. Partial transcripts homologous to genes encoding gamma-glutamylcysteine synthetase, glutathione synthetase, and phytochelatin synthase (responsible for biosynthesis of the metal chelator phytochelatin) and all four types of the

  6. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    NARCIS (Netherlands)

    Yuen, Ryan K C; Merico, Daniele; Bookman, Matt; Howe, Jennifer L.; Thiruvahindrapuram, Bhooma; Patel, Rohan V.; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A.; Walker, Susan; Marshall, Christian R.; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D'Abate, Lia; Chan, Ada J S; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L.; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J.; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R.; Nalpathamkalam, Thomas; Sung, Wilson W L; Tsoi, Fiona J.; Wei, John; Xu, Lizhen; Tasse, Anne Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie Mackinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M.; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H.; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A.; Parr, Jeremy R.; Spence, Sarah J.; Vorstman, Jacob|info:eu-repo/dai/nl/304817023; Frey, Brendan J.; Robinson, James T.; Strug, Lisa J.; Fernandez, Bridget A.; Elsabbagh, Mayada; Carter, Melissa T.; Hallmayer, Joachim; Knoppers, Bartha M.; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H.; Glazer, David; Pletcher, Mathew T.; Scherer, Stephen W.

    2017-01-01

    We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information,

  7. A method for identifying alternative or cryptic donor splice sites within gene and mRNA sequences. Comparisons among sequences from vertebrates, echinoderms and other groups.

    Science.gov (United States)

    Buckley, Katherine M; Florea, Liliana D; Smith, L Courtney

    2009-07-16

    As the amount of genome sequencing data grows, so does the problem of computational gene identification, and in particular, the splicing signals that flank exon borders. Traditional methods for identifying splicing signals have been created and optimized using sequences from model organisms, mostly vertebrate and yeast species. However, as genome sequencing extends across the animal kingdom and includes various invertebrate species, the need for mechanisms to recognize splice signals in these organisms increases as well. With that aim in mind, we generated a model for identifying donor and acceptor splice sites that was optimized using sequences from the purple sea urchin, Strongylocentrotus purpuratus. This model was then used to assess the possibility of alternative or cryptic splicing within the highly variable immune response gene family known as 185/333. A donor splice site model was generated from S. purpuratus sequences that incorporates non-adjacent dependences among positions within the 9 nt splice signal and uses position weight matrices to determine the probability that the site is used for splicing. The Purpuratus model was shown to predict splice signals better than a similar model created from vertebrate sequences. Although the Purpuratus model was able to correctly predict the true splice sites within the 185/333 genes, no evidence for alternative or trans-gene splicing was observed. The data presented herein describe the first published analyses of echinoderm splice sites and suggest that the previous methods of identifying splice signals that are based largely on vertebrate sequences may be insufficient. Furthermore, alternative or trans-gene splicing does not appear to be acting as a diversification mechanism in the 185/333 gene family.
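    The position-weight-matrix scoring mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' Purpuratus model: the training sites are made-up 9 nt strings rather than S. purpuratus data, every function name is hypothetical, and the sketch treats positions as independent, whereas the published model also incorporates non-adjacent dependences among positions.

```python
import math

BASES = "ACGT"

# Hypothetical 9 nt donor sites (3 exonic + 6 intronic bases).
# Toy data for illustration only, NOT taken from the paper.
TRAINING_SITES = [
    "CAGGTAAGT",
    "AAGGTGAGT",
    "CAGGTAAGA",
    "TAGGTAAGT",
    "CAGGTGAGG",
]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds} dict per position of the aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        # Pseudocounts keep unseen bases from producing log(0).
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score_site(pwm, candidate):
    """Sum the per-position log-odds; higher means more donor-like."""
    return sum(column[base] for column, base in zip(pwm, candidate))


def scan(pwm, sequence):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    return [
        (start, score_site(pwm, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


pwm = build_pwm(TRAINING_SITES)
consensus_score = score_site(pwm, "CAGGTAAGT")
unrelated_score = score_site(pwm, "TTTCCCAAA")
best_start, best_score = max(scan(pwm, "TTTTCAGGTAAGTCCCC"), key=lambda hit: hit[1])
```

    Ranking windows by log-odds score is what lets a candidate cryptic site be compared against the annotated one; a dependence-aware model would replace the per-position sum with terms conditioned on other positions.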

  8. A method for identifying alternative or cryptic donor splice sites within gene and mRNA sequences. Comparisons among sequences from vertebrates, echinoderms and other groups

    Directory of Open Access Journals (Sweden)

    Florea Liliana D

    2009-07-01

    Full Text Available Abstract. Background: As the amount of genome sequencing data grows, so does the problem of computational gene identification, and in particular, the splicing signals that flank exon borders. Traditional methods for identifying splicing signals have been created and optimized using sequences from model organisms, mostly vertebrate and yeast species. However, as genome sequencing extends across the animal kingdom and includes various invertebrate species, the need for mechanisms to recognize splice signals in these organisms increases as well. With that aim in mind, we generated a model for identifying donor and acceptor splice sites that was optimized using sequences from the purple sea urchin, Strongylocentrotus purpuratus. This model was then used to assess the possibility of alternative or cryptic splicing within the highly variable immune response gene family known as 185/333. Results: A donor splice site model was generated from S. purpuratus sequences that incorporates non-adjacent dependences among positions within the 9 nt splice signal and uses position weight matrices to determine the probability that the site is used for splicing. The Purpuratus model was shown to predict splice signals better than a similar model created from vertebrate sequences. Although the Purpuratus model was able to correctly predict the true splice sites within the 185/333 genes, no evidence for alternative or trans-gene splicing was observed. Conclusion: The data presented herein describe the first published analyses of echinoderm splice sites and suggest that the previous methods of identifying splice signals that are based largely on vertebrate sequences may be insufficient. Furthermore, alternative or trans-gene splicing does not appear to be acting as a diversification mechanism in the 185/333 gene family.

  9. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  10. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Full Text Available Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  11. Next generation sequencing identifies 'interactome' signatures in relapsed and refractory metastatic colorectal cancer.

    Science.gov (United States)

    Johnson, Benny; Cooke, Laurence; Mahadevan, Daruka

    2017-02-01

    In the management of metastatic colorectal cancer (mCRC), KRAS, NRAS and BRAF mutational status individualizes therapeutic options and identifies a cohort of patients (pts) with an aggressive clinical course. We hypothesized that relapsed and refractory mCRC pts develop unique mutational signatures that may guide therapy, predict for a response and highlight key signaling pathways important for clinical decision making. Relapsed and refractory mCRC pts (N=32) were molecularly profiled utilizing commercially available next generation sequencing (NGS) platforms. Web-based bioinformatics tools (Reactome/Enrichr) were utilized to elucidate mutational profile linked pathways-networks that have the potential to guide therapy. Pts had progressed on fluoropyrimidines, oxaliplatin, irinotecan, bevacizumab, cetuximab and/or panitumumab. Most common histology was adenocarcinoma (colon N=29; rectal N=3). Of the mutations, TP53 was the most common, followed by APC, KRAS, PIK3CA, BRAF, SMAD4, SPTA1, FAT1, PDGFRA, ATM, ROS1, ALK, CDKN2A, FBXW7, TGFBR2, NOTCH1 and HER3. Pts had on average ≥5 unique mutations. The most frequent activated signaling pathways were: HER2, fibroblast growth factor receptor (FGFR), p38 through BRAF-MEK cascade via RIT and RIN, ARMS-mediated activation of MAPK cascade, and VEGFR2. Dominant driver oncogene mutations do not always equate to oncogenic dependence, hence understanding pathogenic 'interactome(s)' in individual pts is key to both clinically relevant targets and in choosing the next best therapy. Mutational signatures derived from corresponding 'pathway-networks' represent a meaningful tool to (I) evaluate functional investigation in the laboratory; (II) predict response to drug therapy; and (III) guide rational drug combinations in relapsed and refractory mCRC pts.

  12. A bioinformatics approach for identifying transgene insertion sites using whole genome sequencing data.

    Science.gov (United States)

    Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young

    2017-08-15

    Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice genomes, Illumina sequencing reads are mapped to the rice genome and the plasmid sequence and classified accordingly. Reads that map to both are then used to characterize the junction site between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
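    The read-classification idea in the abstract above (reads matching the host genome, the plasmid, or both) can be sketched in a few lines. This is a toy illustration under loud assumptions: the sequences are invented, exact k-mer and substring matching stand in for a real read aligner, all names are hypothetical, and only the genome-to-plasmid orientation of a junction is handled. It is not the pipeline used in the study.

```python
K = 8  # assumed k-mer size for the toy example

# Invented sequences, NOT rice or vector data.
GENOME = "ACGTTGCAAGCTTAGGCTAACCGGTTAGCATCGA"
PLASMID = "GGGCCCTTTAAAGGGCCCATATATCGCGCG"


def kmers(seq, k=K):
    """Set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, genome_kmers, plasmid_kmers, k=K):
    """Label a read by which reference(s) it shares k-mers with."""
    read_kmers = kmers(read, k)
    hits_genome = bool(read_kmers & genome_kmers)
    hits_plasmid = bool(read_kmers & plasmid_kmers)
    if hits_genome and hits_plasmid:
        return "junction"
    if hits_genome:
        return "genome"
    if hits_plasmid:
        return "plasmid"
    return "unmapped"


def junction_position(read, genome, plasmid, k=K):
    """Genome coordinate where a genome->plasmid junction read switches
    reference, or None if no exact split is found."""
    for split in range(k, len(read) - k + 1):
        left, right = read[:split], read[split:]
        if left in genome and right in plasmid:
            return genome.find(left) + split
    return None


genome_kmers = kmers(GENOME)
plasmid_kmers = kmers(PLASMID)

reads = {
    "host_only": GENOME[5:25],
    "vector_only": PLASMID[3:23],
    "spanning": GENOME[10:22] + PLASMID[0:12],  # simulated insertion at 22
    "noise": "TGTGTGTGTGTGTGTGTGTG",
}
labels = {
    name: classify_read(seq, genome_kmers, plasmid_kmers)
    for name, seq in reads.items()
}
insertion_site = junction_position(reads["spanning"], GENOME, PLASMID)
```

    In practice the both-mapped reads would come from aligner output (for example split or soft-clipped alignments), and several independent junction reads would be needed before calling an insertion site.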

  13. The Classification And Functional Characterization Of RYR1 Sequence Variants Associated With Malignant Hyperthermia Susceptibility Identified Through Exome Sequencing

    Science.gov (United States)

    2014-09-15

    structure of the receptor (145; 146; 171-174). The basic structure has been described as a mushroom, with a large tetrameric cap representing 80% of the...genomic DNA of unrelated Japanese patients diagnosed with MHS by the calcium-induced calcium release (CICR) test—the Japanese equivalent of the North...since it was identified in five unrelated Japanese patients diagnosed as MH-susceptible by the calcium-induced calcium release test (i.e., Japanese

  14. Exome Sequencing of Cell-Free DNA from Metastatic Cancer Patients Identifies Clinically Actionable Mutations Distinct from Primary Disease.

    Science.gov (United States)

    Butler, Timothy M; Johnson-Camacho, Katherine; Peto, Myron; Wang, Nicholas J; Macey, Tara A; Korkola, James E; Koppie, Theresa M; Corless, Christopher L; Gray, Joe W; Spellman, Paul T

    2015-01-01

    The identification of the molecular drivers of cancer by sequencing is the backbone of precision medicine and the basis of personalized therapy; however, biopsies of primary tumors provide only a snapshot of the evolution of the disease and may miss potential therapeutic targets, especially in the metastatic setting. A liquid biopsy, in the form of cell-free DNA (cfDNA) sequencing, has the potential to capture the inter- and intra-tumoral heterogeneity present in metastatic disease, and, through serial blood draws, track the evolution of the tumor genome. In order to determine the clinical utility of cfDNA sequencing we performed whole-exome sequencing on cfDNA and tumor DNA from two patients with metastatic disease; only minor modifications to our sequencing and analysis pipelines were required for sequencing and mutation calling of cfDNA. The first patient had metastatic sarcoma and 47 of 48 mutations present in the primary tumor were also found in the cell-free DNA. The second patient had metastatic breast cancer and sequencing identified an ESR1 mutation in the cfDNA and metastatic site, but not in the primary tumor. This likely explains tumor progression on Anastrozole. Significant heterogeneity between the primary and metastatic tumors, with cfDNA reflecting the metastases, suggested separation from the primary lesion early in tumor evolution. This is best illustrated by an activating PIK3CA mutation (H1047R) which was clonal in the primary tumor, but completely absent from either the metastasis or cfDNA. Here we show that cfDNA sequencing supplies clinically actionable information with minimal risks compared to metastatic biopsies. This study demonstrates the utility of whole-exome sequencing of cell-free DNA from patients with metastatic disease. cfDNA sequencing identified an ESR1 mutation, potentially explaining a patient's resistance to aromatase inhibition, and gave insight into how metastatic lesions differ from the primary tumor.

  15. The genome sequence of Rickettsia felis identifies the first putative conjugative plasmid in an obligate intracellular parasite.

    OpenAIRE

    Hiroyuki Ogata; Patricia Renesto; Stéphane Audic; Catherine Robert; Guillaume Blanc; Pierre-Edouard Fournier; Hugues Parinello; Jean-Michel Claverie; Didier Raoult

    2005-01-01

    We sequenced the genome of Rickettsia felis, a flea-associated obligate intracellular alpha-proteobacterium causing spotted fever in humans. Besides a circular chromosome of 1,485,148 bp, R. felis exhibits the first putative conjugative plasmid identified among obligate intracellular bacteria. This plasmid is found in a short (39,263 bp) and a long (62,829 bp) form. R. felis contrasts with previously sequenced Rickettsia in terms of many other features, including a number of transposases, sev...

  16. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    Energy Technology Data Exchange (ETDEWEB)

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.; Loots, Gabriela G.; Houston, Kathryn A.; Dubchak, Inna; Speed, Terence P.; Rubin, Edward M.

    2002-01-01

    Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  17. Culture-independent sequence analysis of Chlamydia trachomatis in urogenital specimens identifies regions of recombination and in-patient sequence mutations.

    Science.gov (United States)

    Putman, Timothy E; Suchland, Robert J; Ivanovitch, John D; Rockey, Daniel D

    2013-10-01

    A culture-independent genome sequencing approach was developed and used to examine genomic variability in Chlamydia trachomatis-positive specimens that were collected from patients in the Seattle, WA, USA, area. The procedure is based on an immunomagnetic separation approach with chlamydial LPS-specific mAbs, followed by DNA purification and total DNA amplification, and subsequent Illumina-based sequence analysis. Quality of genome sequencing was independent of the total number of inclusion-forming units determined for the sample and the amount of non-chlamydial DNA in the Illumina libraries. A geographically and temporally linked clade of isolates was identified with evidence of several different regions of recombination and variable ompA sequence types, suggesting that recombination is common within outbreaks. Culture-independent sequence analysis revealed a linkage pattern at two nucleotide positions that was unique to the genomes of isolates from patients, but not in C. trachomatis recombinants generated in vitro. These data demonstrated that culture-independent sequence analysis can be used to rapidly and inexpensively collect genome data from patients infected by C. trachomatis, and that this approach can be used to examine genomic variation within this species.

  18. Evaluation of the Ion Torrent PGM sequencing workflow for the routine rapid detection of BRCA1 and BRCA2 germline mutations.

    Science.gov (United States)

    Zanella, Isabella; Merola, Francesca; Biasiotto, Giorgio; Archetti, Silvana; Spinelli, Elide; Di Lorenzo, Diego

    2017-04-01

    Conventional methods used to identify BRCA1/2 germline mutations in hereditary cancers are time-consuming and expensive, due to the large size of the genes. The recent introduction of next generation sequencing (NGS) benchtop platforms holds great promise and is rapidly revolutionizing genetic screening in diagnostic and clinical applications. We recently transferred our methodology for routine BRCA1/2 mutation screening (denaturing High Performance Liquid Chromatography plus Sanger sequencing) to the Ion Torrent PGM platform with the Ion Ampliseq BRCA1 and BRCA2 panel and tested the performance of the system. We first validated the NGS approach in a cohort of 33 patients who had previously undergone genetic diagnosis in our laboratory by conventional methods. Then, we tested 29 newly diagnosed and uncharacterized patients by NGS, and Sanger sequencing was used to confirm results from the NGS platform. In the validation cohort, all previously identified single nucleotide variants, insertions and deletions (also composed of multiple bases and within complex homopolymeric stretches) were identified by NGS in their correct zygosity status except for variants in a complex multinucleotide region within intron 7 of the BRCA1 gene. The NGS approach was further able to identify previously undetected variants. In the prospective cohort, almost all (99.3%) called variants were confirmed by Sanger. In both cohorts, in addition to the false positive (31) and false negative (110) results in intron 7 of the BRCA1 gene, the NGS method detected 10 false positives that were resolved by Sanger. The Ion Torrent PGM NGS approach in BRCA1/2 germline mutation identification is highly sensitive, easy to use, faster and cheaper than traditional approaches. Therefore, in accordance with other recently published works, we highly recommend this system for routine diagnostic testing on BRCA1/2 genes, along with Sanger confirmation of the called variants, and support the usefulness of the approach also in

  19. Fatal Psychrobacter sp. infection in a pediatric patient with meningitis identified by metagenomic next-generation sequencing in cerebrospinal fluid.

    Science.gov (United States)

    Ortiz-Alcántara, Joanna María; Segura-Candelas, José Miguel; Garcés-Ayala, Fabiola; Gonzalez-Durán, Elizabeth; Rodríguez-Castillo, Araceli; Alcántara-Pérez, Patricia; Wong-Arámbula, Claudia; González-Villa, Maribel; León-Ávila, Gloria; García-Chéquer, Adda Jeanette; Diaz-Quiñonez, José Alberto; Méndez-Tenorio, Alfonso; Ramírez-González, José Ernesto

    2016-03-01

    The genus Psychrobacter contains environmental, psychrophilic and halotolerant gram-negative bacteria considered rare opportunistic pathogens in humans. Metagenomics was performed on the cerebrospinal fluid (CSF) of a pediatric patient with meningitis. Nucleic acids were extracted, randomly amplified, and sequenced with the 454 GS FLX Titanium next-generation sequencing (NGS) system. Sequencing reads were assembled, and potential virulence genes were predicted. Phylogenomic and phylogenetic studies were performed. Psychrobacter sp. 310 was identified, and several virulence genes characteristic of pathogenic bacteria were found. The phylogenomic study and 16S rRNA gene phylogenetic analysis showed that the closest relative of Psychrobacter sp. 310 was Psychrobacter sanguinis. To our knowledge, this is the first report of a meningitis case associated with Psychrobacter sp. identified by NGS metagenomics in CSF from a pediatric patient. The metagenomic strategy based on NGS was a powerful tool to identify a rare unknown pathogen in a clinical case.

  20. Salmonella Persistence in Tomatoes Requires a Distinct Set of Metabolic Functions Identified by Transposon Insertion Sequencing.

    Science.gov (United States)

    de Moraes, Marcos H; Desai, Prerak; Porwollik, Steffen; Canals, Rocio; Perez, Daniel R; Chu, Weiping; McClelland, Michael; Teplitski, Max

    2017-03-01

    Human enteric pathogens, such as Salmonella spp. and verotoxigenic Escherichia coli, are increasingly recognized as causes of gastroenteritis outbreaks associated with the consumption of fruits and vegetables. Persistence in plants represents an important part of the life cycle of these pathogens. The identification of the full complement of Salmonella genes involved in the colonization of the model plant (tomato) was carried out using transposon insertion sequencing analysis. With this approach, 230,000 transposon insertions were screened in tomato pericarps to identify loci with reduction in fitness, followed by validation of the screen results using competition assays of the isogenic mutants against the wild type. A comparison with studies in animals revealed a distinct plant-associated set of genes, which only partially overlaps with the genes required to elicit disease in animals. De novo biosynthesis of amino acids was critical to persistence within tomatoes, while amino acid scavenging was prevalent in animal infections. Fitness reduction of the Salmonella amino acid synthesis mutants was generally more severe in the tomato rin mutant, which hyperaccumulates certain amino acids, suggesting that these nutrients remain unavailable to Salmonella spp. within plants. Salmonella lipopolysaccharide (LPS) was required for persistence in both animals and plants, exemplifying some shared pathogenesis-related mechanisms in animal and plant hosts. Similarly to phytopathogens, Salmonella spp. required biosynthesis of amino acids, LPS, and nucleotides to colonize tomatoes. Overall, however, it appears that while Salmonella shares some strategies with phytopathogens and taps into its animal virulence-related functions, colonization of tomatoes represents a distinct strategy, highlighting this pathogen's flexible metabolism. IMPORTANCE Outbreaks of gastroenteritis caused by human pathogens have been increasingly associated with foods of plant origin, with tomatoes being

  1. Exome Sequencing Identifies a Novel MAP3K14 Mutation in Recessive Atypical Combined Immunodeficiency

    Directory of Open Access Journals (Sweden)

    Nikola Schlechter

    2017-11-01

    Primary immunodeficiency disorders (PIDs) render patients vulnerable to infection with a wide range of microorganisms and thus provide good in vivo models for the assessment of immune responses during infectious challenges. Priming of the immune system, especially in infancy, depends on different environmental exposures and medical practices. This may determine the timing and phenotype of clinical appearance of immune deficits as exemplified with early exposure to Bacillus Calmette-Guérin (BCG) vaccination and dissemination in combined immunodeficiencies. Varied phenotype expression poses a challenge to identification of the putative immune deficit. Without the availability of genomic diagnosis and data analysis resources and with limited capacity for functional definition of immune pathways, it is difficult to establish a definitive diagnosis and to decide on appropriate treatment. This study describes the use of exome sequencing to identify a homozygous recessive variant in MAP3K14, NIKVal345Met, in a patient with combined immunodeficiency, disseminated BCG-osis, and paradoxically elevated lymphocytes. Laboratory testing confirmed hypogammaglobulinemia with normal CD19, but failed to confirm a definitive diagnosis for targeted treatment decisions. NIKVal345Met is predicted to be deleterious and pathogenic by two in silico prediction tools and is situated in a gene crucial for effective functioning of the non-canonical nuclear factor-kappa B signaling pathway. Functional analysis of NIKVal345Met- versus NIKWT-transfected human embryonic kidney-293T cells showed that this mutation significantly affects the kinase activity of NIK leading to decreased levels of phosphorylated IkappaB kinase-alpha (IKKα), the target of NIK. BCG-stimulated RAW264.7 cells transfected with NIKVal345Met also presented with reduced levels of phosphorylated IKKα, significantly increased p100 levels and significantly decreased p52 levels compared to cells transfected

  2. [Antimicrobial susceptibilities of clinical Nocardia isolates identified by 16S rRNA gene sequence analysis].

    Science.gov (United States)

    Uner, Mahmut Celalettin; Hasçelik, Gülşen; Müştak, Hamit Kaan

    2016-01-01

    Nocardia species are ubiquitous in the environment and responsible for various human infections such as pulmonary, cutaneous, central nervous system and disseminated nocardiosis. Since the clinical pictures and antimicrobial susceptibilities of Nocardia species exhibit variability, susceptibility testing is recommended for every Nocardia isolate. The aims of this study were to determine the antimicrobial susceptibilities of Nocardia clinical isolates and to compare the results of broth microdilution and disk diffusion susceptibility tests. A total of 45 clinical Nocardia isolates (isolated from 17 respiratory tract, 8 brain abscess, 7 pus, 3 skin, 3 conjunctiva, 2 blood, 2 tissue, 2 pleural fluid and 1 cerebrospinal fluid samples) were identified by using conventional methods and 16S rRNA gene sequence analysis. Susceptibility testing was performed for amikacin, ciprofloxacin, ceftriaxone, linezolid and trimethoprim-sulfamethoxazole (TMP-SMX) by the broth microdilution method according to the Clinical and Laboratory Standards Institute (CLSI) criteria recommended in the 2011 approved standard (M24-A2); the disk diffusion method was used as an alternative comparative susceptibility testing method. Among the 45 Nocardia strains, N. cyriacigeorgica (n: 26, 57.8%) was the most common species, followed by N. farcinica (n: 12, 26.7%), N. otitiscaviarum (n: 4, 8.9%), N. asteroides (n: 1, 2.2%), N. neocaledoniensis (n: 1, 2.2%) and N. abscessus (n: 1, 2.2%). Amikacin and linezolid were the only two antimicrobials to which all isolates were susceptible in both the broth microdilution and disk diffusion tests. In the broth microdilution test, resistance rates to TMP-SMX, ceftriaxone and ciprofloxacin were 15.6%, 37.8% and 84.4% respectively, whereas in the disk diffusion test, the highest resistance rate was observed against ciprofloxacin (n: 33, 73.3%), followed by TMP-SMX (n: 22, 48.9%) and ceftriaxone (n: 15, 33.3%). In both of these tests, N. cyriacigeorgica was the species with the
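    The percentages quoted in this record can be checked against the stated total of 45 isolates. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative assumptions, and only the counts come from the abstract.

```python
# Counts copied from the abstract above; names are illustrative only.
TOTAL_ISOLATES = 45

species_counts = {
    "N. cyriacigeorgica": 26,
    "N. farcinica": 12,
    "N. otitiscaviarum": 4,
    "N. asteroides": 1,
    "N. neocaledoniensis": 1,
    "N. abscessus": 1,
}

# Number of isolates reported resistant in the disk diffusion test.
disk_diffusion_resistant = {
    "ciprofloxacin": 33,
    "TMP-SMX": 22,
    "ceftriaxone": 15,
}


def percent(count, total=TOTAL_ISOLATES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for name, n in species_counts.items():
        print(f"{name}: {n}/{TOTAL_ISOLATES} = {percent(n)}%")
    for drug, n in disk_diffusion_resistant.items():
        print(f"{drug} resistant: {n}/{TOTAL_ISOLATES} = {percent(n)}%")
```

    Under these inputs the species counts sum to 45 and each computed percentage matches the figure given in the abstract.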

  3. Identification of a disease-causing mutation in a Chinese patient with retinitis pigmentosa by targeted next-generation sequencing

    DEFF Research Database (Denmark)

    Xiao, Jianping; Guo, Xueqin; Wang, Yong

    2017-01-01

    Purpose: To identify disease-causing mutations in a Chinese patient with retinitis pigmentosa (RP). Methods: A detailed clinical examination was performed on the proband. Targeted next-generation sequencing (NGS) combined with bioinformatics analysis was performed on the proband to detect candidate disease-causing mutations. Sanger sequencing was performed on all subjects to confirm the candidate mutations and assess cosegregation within the family. Results: Clinical examinations of the proband showed typical characteristics of RP. Three candidate heterozygous mutations in 3 genes associated with RP were detected in the proband by targeted NGS. The 3 mutations were confirmed by Sanger sequencing and the deletion (c.357_358delAA) in PRPF31 was shown to cosegregate with RP phenotype in 7 affected family members, but not in 3 unaffected family members. Conclusions: The deletion (c.357_358del...

  4. Whole-genome sequencing in autism identifies hot spots for de novo germline mutation

    DEFF Research Database (Denmark)

    Michaelson, Jacob J.; Shi, Yujian; Gujral, Madhusudan

    2012-01-01

    De novo mutation plays an important role in autism spectrum disorders (ASDs). Notably, pathogenic copy number variants (CNVs) are characterized by high mutation rates. We hypothesize that hypermutability is a property of ASD genes and may also include nucleotide-substitution hot spots. We investigated global patterns of germline mutation by whole-genome sequencing of monozygotic twins concordant for ASD and their parents. Mutation rates varied widely throughout the genome (by 100-fold) and could be explained by intrinsic characteristics of DNA sequence and chromatin structure. Dense clusters of mutations within individual genomes were attributable to compound mutation or gene conversion. Hypermutability was a characteristic of genes involved in ASD and other diseases. In addition, genes impacted by mutations in this study were associated with ASD in independent exome-sequencing data sets. Our...

  5. LAMTOR1-PRKCD and NUMA1-SFMBT1 fusion genes identified by RNA sequencing in aneurysmal benign fibrous histiocytoma with t(3;11)(p21;q13).

    Science.gov (United States)

    Panagopoulos, Ioannis; Gorunova, Ludmila; Bjerkehagen, Bodil; Lobmaier, Ingvild; Heim, Sverre

    2015-11-01

    RNA sequencing of an aneurysmal benign fibrous histiocytoma with the karyotype 46,XY,t(3;11)(p21;q13),del(6)(p23)[17]/46,XY[2] showed that the t(3;11) generated two fusion genes: LAMTOR1-PRKCD and NUMA1-SFMBT1. RT-PCR together with Sanger sequencing verified the presence of fusion transcripts from both fusion genes. In the LAMTOR1-PRKCD fusion, the part of the PRKCD gene coding for the catalytic domain of the serine/threonine kinase is under control of the LAMTOR1 promoter. In the NUMA1-SFMBT1 fusion, the part of the SFMBT1 gene coding for two of four malignant brain tumor domains and the sterile alpha motif domain is controlled by the NUMA1 promoter. The data support a neoplastic genesis of aneurysmal benign fibrous histiocytoma and indicate a pathogenetic role for LAMTOR1-PRKCD and NUMA1-SFMBT1. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. Targeted next generation sequencing identified a novel mutation in MYO7A causing Usher syndrome type 1 in an Iranian consanguineous pedigree.

    Science.gov (United States)

    Kooshavar, Daniz; Razipour, Masoumeh; Movasat, Morteza; Keramatipour, Mohammad

    2018-01-01

    Usher syndrome (USH) is characterized by congenital hearing loss and retinitis pigmentosa (RP) with a later onset. It is an autosomal recessive trait with clinical and genetic heterogeneity, which makes molecular diagnosis difficult. In this study, we introduce a pedigree with two affected members with USH type 1 and present a cost- and time-effective approach for genetic diagnosis of USH as a genetically heterogeneous disorder. Target region capture in the genes of interest, followed by next generation sequencing (NGS), was used to determine the causative mutations in one of the probands. Segregation analysis in the pedigree was then conducted using PCR-Sanger sequencing. Targeted NGS detected a novel homozygous nonsense variant, c.4513G>T (p.Glu1505Ter), in MYO7A. The variant segregates in the pedigree with an autosomal recessive pattern. In this study, a novel stop-gained variant, c.4513G>T (p.Glu1505Ter), in MYO7A was found in an Iranian pedigree with two affected members with USH type 1. Bioinformatic as well as pedigree segregation analyses were in line with the pathogenic nature of this variant. The targeted NGS panel was shown to be an efficient method for mutation detection in hereditary disorders with locus heterogeneity. Copyright © 2017 Elsevier B.V. All rights reserved.
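    The relationship between the coding-DNA and protein descriptions of this variant can be illustrated with a short calculation. The sketch below assumes the usual convention that coding position 1 is the A of the ATG start codon; the function names are hypothetical, and the actual reference codon in MYO7A is not stated in the abstract, so both glutamate codons are tried.

```python
def codon_of_cdna_position(c_pos):
    """Map a 1-based coding-DNA position to (codon number, position in codon)."""
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    codon_number = (c_pos - 1) // 3 + 1
    position_in_codon = (c_pos - 1) % 3 + 1
    return codon_number, position_in_codon


def substitute(codon, position_in_codon, alt_base):
    """Return the codon with the base at the 1-based position replaced."""
    i = position_in_codon - 1
    return codon[:i] + alt_base + codon[i + 1:]


# Standard genetic code: glutamate codons and stop codons (DNA alphabet).
GLU_CODONS = ("GAA", "GAG")
STOP_CODONS = {"TAA", "TAG", "TGA"}

codon_number, position = codon_of_cdna_position(4513)
mutated = [substitute(codon, position, "T") for codon in GLU_CODONS]

if __name__ == "__main__":
    print(codon_number, position)  # codon 1505, first base of the codon
    print(mutated)                 # both candidates become stop codons
```

    Position 4513 falls on the first base of codon 1505, and a G>T change at the first base of either glutamate codon yields a stop codon, which is consistent with the reported p.Glu1505Ter.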

  7. Multilocus Sequence Typing Identifies Epidemic Clones of Flavobacterium psychrophilum in Nordic Countries

    DEFF Research Database (Denmark)

    Nilsen, Hanne; Sundell, Krister; Duchaud, Eric

    2014-01-01

    , Norway, and Sweden. Multilocus sequence typing of 560 geographically and temporally disparate F. psychrophilum isolates collected from various sources between 1983 and 2012 revealed 81 different sequence types (STs) belonging to 12 clonal complexes (CCs) and 30 singleton STs. The largest CC, CC-ST10...... of genetically distinct CCs in the Nordic countries and points out specific F. psychrophilum STs posing a threat to the salmonid production. The study provides a significant contribution toward mapping the genetic diversity of F. psychrophilum globally and support for the existence of an epidemic population...

  8. Mining small RNA sequencing data: a new approach to identify small nucleolar RNAs in Arabidopsis

    OpenAIRE

    Chen, Ho-Ming; Wu, Shu-Hsing

    2009-01-01

    Small nucleolar RNAs (snoRNAs) are noncoding RNAs that direct 2′-O-methylation or pseudouridylation on ribosomal RNAs or spliceosomal small nuclear RNAs. These modifications are needed to modulate the activity of ribosomes and spliceosomes. A comprehensive repertoire of snoRNAs is needed to expand the knowledge of these modifications. The sequences corresponding to snoRNAs in 18–26-nt small RNA sequencing data have been rarely explored and remain as a hidden treasure for snoRNA annotation. He...

  9. Exome sequencing positively identified relevant alterations in more than half of cases with an indication of prenatal ultrasound anomalies.

    Science.gov (United States)

    Alamillo, Christina L; Powis, Zöe; Farwell, Kelly; Shahmirzadi, Layla; Weltmer, Elaine C; Turocy, John; Lowe, Thomas; Kobelka, Christine; Chen, Emily; Basel, Donald; Ashkinadze, Elena; D'Augelli, Lisa; Chao, Elizabeth; Tang, Sha

    2015-11-01

    Exome sequencing is a successful option for diagnosing individuals with previously uncharacterized genetic conditions; however, little has been reported regarding its utility in a prenatal setting. The goal of this study is to describe the results from a cohort of fetuses for which exome sequencing was performed. We performed a retrospective analysis of the first seven cases referred to our laboratory for exome sequencing following fetal demise or termination of pregnancy. All seven pregnancies had multiple congenital anomalies identified by level II ultrasound. Exome sequencing was performed on trios using cultured amniocytes or products of conception from the affected fetuses. Relevant alterations were identified in more than half of the cases (4/7). Three of the four were categorized as 'positive' results, and one of the four was categorized as a 'likely positive' result. The provided diagnoses included osteogenesis imperfecta II (COL1A2), glycogen storage disease IV (GBE1), oral-facial-digital syndrome 1 (OFD1), and RAPSN-associated fetal akinesia deformation sequence. These data suggest that exome sequencing is likely to be a valuable diagnostic testing option for pregnancies with multiple congenital anomalies detected by prenatal ultrasound; however, additional studies with larger cohorts of affected pregnancies are necessary to confirm these findings. © 2015 John Wiley & Sons, Ltd.

  10. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    Science.gov (United States)

    Scolnick, Jonathan A; Dimon, Michelle; Wang, I-Ching; Huelga, Stephanie C; Amorese, Douglas A

    2015-01-01

    Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells.

  11. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    Directory of Open Access Journals (Sweden)

    Jonathan A Scolnick

    Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells.

  12. Sequence analysis of the its-2 region: a tool to identify strains of Scenedesmus (Chlorophyceae)

    NARCIS (Netherlands)

    Van Hannen, E.J.; Lürling, M.; Van Donk, E.

    2000-01-01

    The genetic distances between several strains of Scenedesmus obliquus (Turp.) Kutz., S. acutus Hortobagyi, and S. naegelii Chod. calculated from ITS-2 sequences were found to be smaller than the genetic distances within other strains of Scenedesmus, that is, in S. acuminatus (Lagerh.) Chod. and S.

  13. Use of microsatellite markers derived from whole genome sequence data for identifying polymorphism in Phytophthora ramorum

    Science.gov (United States)

    Kelly Ivors; Matteo Garbelotto; Ineke De Vries; Peter Bonants

    2006-01-01

    Investigating the population genetics of Phytophthora ramorum, the causal agent of sudden oak death (SOD), is critical to understanding the biology and epidemiology of this important phytopathogen. Raw sequence data (445,000 reads) of P. ramorum was provided by the Joint Genome Institute. Our objective was to develop and utilize...

  14. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer

    NARCIS (Netherlands)

    Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

    Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and

  15. Genome sequencing identifies Listeria fleischmannii subsp. coloradonensis subsp. nov., isolated from a ranch.

    Science.gov (United States)

    den Bakker, Henk C; Manuel, Clyde S; Fortes, Esther D; Wiedmann, Martin; Nightingale, Kendra K

    2013-09-01

    Twenty Listeria-like isolates were obtained from environmental samples collected on a cattle ranch in northern Colorado; all of these isolates were found to share an identical partial sigB sequence, suggesting close relatedness. The isolates were similar to members of the genus Listeria in that they were Gram-stain-positive, short rods, oxidase-negative and catalase-positive; the isolates were similar to Listeria fleischmannii because they were non-motile at 25 °C. 16S rRNA gene sequencing for representative isolates and whole genome sequencing for one isolate was performed. The genome of the type strain of Listeria fleischmannii (strain LU2006-1(T)) was also sequenced. The draft genomes were very similar in size and the average MUMmer nucleotide identity across 91% of the genomes was 95.16%. Genome sequence data were used to design primers for a six-gene multi-locus sequence analysis (MLSA) scheme. Phylogenies based on (i) the near-complete 16S rRNA gene, (ii) 31 core genes and (iii) six housekeeping genes illustrated the close relationship of these Listeria-like isolates to Listeria fleischmannii LU2006-1(T). Sufficient genetic divergence of the Listeria-like isolates from the type strain of Listeria fleischmannii and differing phenotypic characteristics warrant these isolates to be classified as members of a distinct infraspecific taxon, for which the name Listeria fleischmannii subsp. coloradonensis subsp. nov. is proposed. The type strain is TTU M1-001(T) ( =BAA-2414(T) =DSM 25391(T)). The isolates of Listeria fleischmannii subsp. coloradonensis subsp. nov. differ from the nominate subspecies by the inability to utilize melezitose, turanose and sucrose, and the ability to utilize inositol. The results also demonstrate the utility of whole genome sequencing to facilitate identification of novel taxa within a well-described genus. The genomes of both subspecies of Listeria fleischmannii contained putative enhancin genes; the Listeria fleischmannii subsp

  16. Sequencing and assembly of highly heterozygous genome of Vitis vinifera L. cv Pinot Noir: problems and solutions.

    Science.gov (United States)

    Zharkikh, Andrey; Troggio, Michela; Pruss, Dmitry; Cestaro, Alessandro; Eldrdge, Glenn; Pindo, Massimo; Mitchell, Jeff T; Vezzulli, Silvia; Bhatnagar, Satish; Fontana, Paolo; Viola, Roberto; Gutin, Alexander; Salamini, Francesco; Skolnick, Mark; Velasco, Riccardo

    2008-08-31

    A new approach to sequencing and assembling a highly heterozygous genome, that of grape, species Vitis vinifera cv Pinot Noir, is described. The combining of genome shotgun of paired reads produced by Sanger sequencing and sequencing by synthesis of unpaired reads was shown to be an efficient procedure for decoding a complex genome. About 2 million SNPs and more than a million heterozygous gaps have been identified in the 500 Mb genome of grape. More than 91% of the sequence assembled into 58,611 contigs is now anchored to the 19 linkage groups of V. vinifera.

  17. Using Next-Generation Sequencing to Identify a Mutation in Human MCSU that is Responsible for Type II Xanthinuria

    OpenAIRE

    Yunan Zhou; Xueguang Zhang; Rui Ding; Zuoxiang Li; Quan Hong; Yan Wang; Wei Zheng; Xiaodong Geng; Meng Fan; Guangyan Cai; Xiangmei Chen; Di Wu

    2015-01-01

    Background: Hypouricemia is caused by various diseases and disorders, such as hepatic failure, Fanconi renotubular syndrome, nutritional deficiencies and genetic defects. Genetic defects of the molybdoflavoprotein enzymes induce hypouricemia and xanthinuria. Here, we identified a patient whose plasma and urine uric acid levels were both extremely low and aimed to identify the pathogenic gene and verify its mechanism. Methods: Using next-generation sequencing (NGS), we detected a mutation in t...

  18. Genome sequence of erythromelalgia-related poxvirus identifies it as an ectromelia virus strain.

    Directory of Open Access Journals (Sweden)

    Jorge D Mendez-Rios

    Erythromelalgia is a condition characterized by attacks of burning pain and inflammation in the extremities. An epidemic form of this syndrome occurs in secondary students in rural China and a virus referred to as erythromelalgia-associated poxvirus (ERPV) was reported to have been recovered from throat swabs in 1987. Studies performed at the time suggested that ERPV belongs to the orthopoxvirus genus and has similarities with ectromelia virus, the causative agent of mousepox. We have determined the complete genome sequence of ERPV and demonstrated that it has 99.8% identity to the Naval strain of ectromelia virus and a slightly lower identity to the Moscow strain. Small DNA deletions in the Naval genome that are absent from ERPV may suggest that the sequenced strain of Naval was not the immediate progenitor of ERPV.

  19. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome

    Science.gov (United States)

    Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen

    2016-01-01

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814

  20. Expressed Sequence Tags (ESTs) from Desiccated Tortula ruralis Identify a Large Number of Novel Plant Genes

    OpenAIRE

    Wood, Andrew J.; Duff, R. Joel; Oliver, Melvin J.; Department of Plant Biology, Southern Illinois University-Carbondale; Plant Stress and Water Conservation Unit, Cropping Systems Research Laboratory

    1999-01-01

    The desiccation-tolerant moss Tortula ruralis [Hedw.] Gaerten., Meyer & Scherb. has both a constitutive protection system and an active rehydration-induced recovery mechanism apparently unique to bryophytes. Immediately following rehydration, desiccated T. ruralis gametophytes produce a set of polypeptides whose synthesis is unique to the rehydrated state. We report the construction of a cDNA expression library from the polysomal mRNA of desiccated gametophytes and the single-pass sequencing of...

  1. Epitope Sequences in Dengue Virus NS1 Protein Identified by Monoclonal Antibodies

    Directory of Open Access Journals (Sweden)

    Leticia Barboza Rocha

    2017-10-01

    Dengue nonstructural protein 1 (NS1) is a multi-functional glycoprotein with essential functions both in viral replication and modulation of host innate immune responses. NS1 has been established as a good surrogate marker for infection. In the present study, we generated four anti-NS1 monoclonal antibodies against recombinant NS1 protein from dengue virus serotype 2 (DENV2), which were used to map three NS1 epitopes. The sequence 193AVHADMGYWIESALNDT209 was recognized by monoclonal antibodies 2H5 and 4H1BC, which also cross-reacted with Zika virus (ZIKV) protein. On the other hand, the sequence 25VHTWTEQYKFQPES38 was recognized by mAb 4F6, which did not cross-react with ZIKV. Lastly, a previously unidentified DENV2 NS1-specific epitope, represented by the sequence 127ELHNQTFLIDGPETAEC143, is described in the present study after reaction with mAb 4H2, which also did not cross-react with ZIKV. The selection and characterization of the epitope specificity of anti-NS1 mAbs may contribute to the development of diagnostic tools able to differentiate DENV and ZIKV infections.
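    The epitope labels in this record appear to follow the common notation in which the peptide sequence is flanked by its first and last residue numbers. That reading is an assumption, not something the abstract states; the sketch below, with hypothetical function names, parses the three labels and checks that the span implied by the numbers equals the peptide length.

```python
import re

# Epitope labels as printed in the abstract, keyed by the recognizing antibody.
EPITOPES = {
    "2H5 / 4H1BC": "193AVHADMGYWIESALNDT209",
    "4F6": "25VHTWTEQYKFQPES38",
    "4H2": "127ELHNQTFLIDGPETAEC143",
}

# Assumed layout: start residue number, one-letter peptide, end residue number.
LABEL_PATTERN = re.compile(r"^(\d+)([A-Z]+)(\d+)$")


def parse_epitope(label):
    """Split a label into (start, sequence, end) under the assumed layout."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized epitope label: {label!r}")
    return int(match.group(1)), match.group(2), int(match.group(3))


def is_consistent(label):
    """True when the numbered span has the same length as the peptide."""
    start, sequence, end = parse_epitope(label)
    return end - start + 1 == len(sequence)


if __name__ == "__main__":
    for antibody, label in EPITOPES.items():
        start, sequence, end = parse_epitope(label)
        print(antibody, start, end, len(sequence), is_consistent(label))
```

    All three labels pass the length check (17, 14 and 17 residues), which supports reading the flanking numbers as NS1 residue positions.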

  2. Sequence Analysis of SSR-Flanking Regions Identifies Genome Affinities between Pasture Grass Fungal Endophyte Taxa

    Directory of Open Access Journals (Sweden)

    Eline van Zijll de Jong

    2011-01-01

    Fungal species of the Neotyphodium and Epichloë genera are endophytes of pasture grasses showing complex differences of life-cycle and genetic architecture. Simple sequence repeat (SSR) markers have been developed from endophyte-derived expressed sequence tag (EST) collections. Although SSR array size polymorphisms are appropriate for phenetic analysis to distinguish between taxa, the capacity to resolve phylogenetic relationships is limited by both homoplasy and heteroploidy effects. In contrast, nonrepetitive sequence regions that flank SSRs have been effectively implemented in this study to demonstrate a common evolutionary origin of grass fungal endophytes. Consistent patterns of relationships between specific taxa were apparent across multiple target loci, confirming previous studies of genome evolution based on variation of individual genes. Evidence was obtained for the definition of endophyte taxa not only through genomic affinities but also by relative gene content. Results were compatible with the current view that some asexual Neotyphodium species arose following interspecific hybridisation between sexual Epichloë ancestors. Phylogenetic analysis of SSR-flanking regions, in combination with the results of previous studies with other EST-derived SSR markers, further permitted characterisation of Neotyphodium isolates that could not be assigned to known taxa on the basis of morphological characteristics.

  3. The Candida Genome Database (CGD): incorporation of Assembly 22, systematic identifiers and visualization of high throughput sequencing data.

    Science.gov (United States)

    Skrzypek, Marek S; Binkley, Jonathan; Binkley, Gail; Miyasato, Stuart R; Simison, Matt; Sherlock, Gavin

    2017-01-04

    The Candida Genome Database (CGD, http://www.candidagenome.org/) is a freely available online resource that provides gene, protein and sequence information for multiple Candida species, along with web-based tools for accessing, analyzing and exploring these data. The mission of CGD is to facilitate and accelerate research into Candida pathogenesis and biology, by curating the scientific literature in real time, and connecting literature-derived annotations to the latest version of the genomic sequence and its annotations. Here, we report the incorporation into CGD of Assembly 22, the first chromosome-level, phased diploid assembly of the C. albicans genome, coupled with improvements that we have made to the assembly using additional available sequence data. We also report the creation of systematic identifiers for C. albicans genes and sequence features using a system similar to that adopted by the yeast community over two decades ago. Finally, we describe the incorporation of JBrowse into CGD, which allows online browsing of mapped high throughput sequencing data, and its implementation for several RNA-Seq data sets, as well as the whole genome sequencing data that was used in the construction of Assembly 22. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Towards a Phylogeny for Coffea (Rubiaceae): identifying well-supported lineages based on nuclear and plastid DNA sequences.

    Science.gov (United States)

    Maurin, Olivier; Davis, Aaron P; Chester, Michael; Mvungi, Esther F; Jaufeerally-Fakim, Yasmina; Fay, Michael F

    2007-12-01

    The phylogenetic relationships between species of Coffea and Psilanthus remain poorly understood, owing to low levels of sequence variation recovered in previous studies, coupled with relatively limited species sampling. In this study, the relationships between Coffea and Psilanthus species are assessed based on substantially increased molecular sequence data and greatly improved species sampling. Phylogenetic relationships are assessed using parsimony, with sequence data from four plastid regions [trnL-F intron, trnL-F intergenic spacer (IGS), rpl16 intron and accD-psa1 IGS], and the internal transcribed spacer (ITS) region of nuclear rDNA (ITS 1/5.8S/ITS 2). Supported lineages in Coffea are discussed within the context of geographical correspondence, biogeography, morphology and systematics. Several major lineages with geographical coherence, as identified in previous studies based on smaller data sets, are supported. Other lineages with either geographical or ecological correspondence are recognized for the first time. Coffea subgenus Baracoffea is shown to be monophyletic, but Coffea subgenus Coffea is paraphyletic. Sequence data do not substantiate the monophyly of either Coffea or Psilanthus. Low levels of sequence divergence do not allow detailed resolution of relationships within Coffea, most notably for species of Coffea subgenus Coffea occurring in Madagascar. The origin of C. arabica by recent hybridization between C. canephora and C. eugenioides is supported. Phylogenetic separation resulting from the presence of the Dahomey Gap is inferred based on sequence data from Coffea.

  5. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Directory of Open Access Journals (Sweden)

    Gilbert Greub

    BACKGROUND: With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when the genome contains highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, withholding genome data during the gap closure stage is clearly medically counterproductive. METHODS/PRINCIPAL FINDINGS: We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. CONCLUSIONS/SIGNIFICANCE: This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics, to rapidly identify new immunogenic proteins useful for the future development of specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA-based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  6. Anchoring genome sequence to chromosomes of the central bearded dragon (Pogona vitticeps) enables reconstruction of ancestral squamate macrochromosomes and identifies sequence content of the Z chromosome.

    Science.gov (United States)

    Deakin, Janine E; Edwards, Melanie J; Patel, Hardip; O'Meally, Denis; Lian, Jinmin; Stenhouse, Rachael; Ryan, Sam; Livernois, Alexandra M; Azad, Bhumika; Holleley, Clare E; Li, Qiye; Georges, Arthur

    2016-06-10

    Squamates (lizards and snakes) are a speciose lineage of reptiles displaying considerable karyotypic diversity, particularly among lizards. Understanding the evolution of this diversity requires comparison of genome organisation between species. Although the genomes of several squamate species have now been sequenced, only the green anole lizard has any sequence anchored to chromosomes. There is only limited gene mapping data available for five other squamates. This makes it difficult to reconstruct the events that have led to extant squamate karyotypic diversity. The purpose of this study was to anchor the recently sequenced central bearded dragon (Pogona vitticeps) genome to chromosomes to trace the evolution of squamate chromosomes. Assigning sequence to sex chromosomes was of particular interest for identifying candidate sex determining genes. By using two different approaches to map conserved blocks of genes, we were able to anchor approximately 42 % of the dragon genome sequence to chromosomes. We constructed detailed comparative maps between dragon, anole and chicken genomes, and where possible, made broader comparisons across Squamata using cytogenetic mapping information for five other species. We show that squamate macrochromosomes are relatively well conserved between species, supporting findings from previous molecular cytogenetic studies. Macrochromosome diversity between members of the Toxicofera clade has been generated by intrachromosomal, and a small number of interchromosomal, rearrangements. We reconstructed the ancestral squamate macrochromosomes by drawing upon comparative cytogenetic mapping data from seven squamate species and propose the events leading to the arrangements observed in representative species. In addition, we assigned over 8 Mbp of sequence containing 219 genes to the Z chromosome, providing a list of genes to begin testing as candidate sex determining genes. 
Anchoring of the dragon genome has provided substantial insight into

  7. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.

    Science.gov (United States)

    Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N

    2013-06-03

    Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 of Criollo, and includes 111 Mbp more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
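
    The contig and scaffold N50 values quoted above are summary statistics of an assembly's length distribution. As a minimal sketch of how such a value is commonly computed (the function name `n50` and the toy lengths are hypothetical and are not taken from the Matina 1-6 assembly):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total number of assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths, for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)
```

    With the toy lengths above, the two longest scaffolds already cover half of the 300 total bases, so the N50 is the length of the second one.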

  8. TAPDANCE: An automated tool to identify and annotate transposon insertion CISs and associations between CISs from next generation sequence data

    Directory of Open Access Journals (Sweden)

    Sarver Aaron L

    2012-06-01

    Background: Next generation sequencing approaches applied to the analysis of transposon insertion junction fragments generated in high throughput forward genetic screens have created the need for clear informatics and statistical approaches to deal with the massive amount of data currently being generated. Previous approaches used to (1) map junction fragments within the genome and (2) identify Common Insertion Sites (CISs) within the genome are not practical due to the volume of data generated by current sequencing technologies. Previous approaches applied to this problem also required significant manual annotation. Results: We describe Transposon Annotation Poisson Distribution Association Network Connectivity Environment (TAPDANCE) software, which automates the identification of CISs within transposon junction fragment insertion data. Starting with barcoded sequence data, the software identifies and trims sequences and maps putative genomic sequence to a reference genome using the bowtie short read mapper. Poisson distribution statistics are then applied to assess and rank genomic regions showing significant enrichment for transposon insertion. Novel methods of counting insertions are used to ensure that the results presented have the expected characteristics of informative CISs. A persistent MySQL database is generated and used to keep track of sequences, mappings and common insertion sites. Additionally, associations between phenotypes and CISs are identified using Fisher’s exact test with multiple testing correction. In a case study using previously published data we show that the TAPDANCE software identifies CISs as previously described, prioritizes them based on p-value, allows holistic visualization of the data within genome browser software and identifies relationships present in the structure of the data. Conclusions: The TAPDANCE process is fully automated, performs similarly to previous labor intensive approaches
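
    The abstract states that Poisson statistics are used to rank regions enriched for insertions but does not give the null model. The following is only a sketch under a stated assumption: insertions are taken to fall uniformly across the genome, so the expected count in a window is proportional to its size. The function names and parameters are hypothetical and are not the TAPDANCE implementation.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X < k)."""
    if k <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-lam)  # probability mass at 0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # recurrence: pmf(i + 1) = pmf(i) * lam / (i + 1)
    return max(0.0, 1.0 - cdf)


def window_enrichment_p(observed, total_insertions, window_bp, genome_bp):
    """P-value for seeing at least `observed` insertions in one window.

    Assumption (not from the source): insertions are uniform over the
    genome, so the expected count is total_insertions * window_bp / genome_bp.
    """
    expected = total_insertions * window_bp / genome_bp
    return poisson_sf(observed, expected)
```

    A window with a very small p-value would be a candidate CIS under this simplified model; a real analysis would also correct for the number of windows tested.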

  9. Genome sequence of a novel victorivirus identified in the phytopathogenic fungus Alternaria arborescens.

    Science.gov (United States)

    Komatsu, Ken; Katayama, Yukie; Omatsu, Tsutomu; Mizutani, Tetsuya; Fukuhara, Toshiyuki; Kodama, Motoichiro; Arie, Tsutomu; Teraoka, Tohru; Moriyama, Hiromitsu

    2016-06-01

    Strains of the phytopathogenic fungus Alternaria spp. have been found to contain a variety of double-stranded RNA (dsRNA) elements indicative of mycovirus infection. Here, we report the molecular characterization of a novel dsRNA mycovirus, Alternaria arborescens victorivirus 1 (AaVV1), from A. arborescens, the tomato pathotype of A. alternata. Using next-generation sequencing of dsRNA purified from an A. arborescens strain from the United States of America, we found that the AaVV1 genome is 5203 bp in length and contains two open reading frames (ORF1 and 2) that overlap at the tetranucleotide AUGA. Proteins encoded by ORF1 and ORF2 showed significant similarities to the coat protein (CP) and the RNA-dependent RNA polymerase (RdRp), respectively, of dsRNA mycoviruses of the genus Victorivirus. Pairwise comparisons and phylogenetic analysis of the deduced amino acid sequences of both CP and RdRp indicated that AaVV1 is a member of a distinct species of the genus Victorivirus in the family Totiviridae.

  10. Identifying selection in the within-host evolution of influenza using viral sequence data.

    Directory of Open Access Journals (Sweden)

    Christopher J R Illingworth

    2014-07-01

    The within-host evolution of influenza is a vital component of its epidemiology. A question of particular interest is the role that selection plays in shaping the viral population over the course of a single infection. We here describe a method to measure selection acting upon the influenza virus within an individual host, based upon time-resolved genome sequence data from an infection. Analysing sequence data from a transmission study conducted in pigs, describing part of the haemagglutinin gene (HA1) of an influenza virus, we find signatures of non-neutrality in six of a total of sixteen infections. We find evidence for both positive and negative selection acting upon specific alleles, while in three cases, the data suggest the presence of time-dependent selection. In one infection we observe what is potentially a specific immune response against the virus; a non-synonymous mutation in an epitope region of the virus is found to be under initially positive, then strongly negative selection. Crucially, given the lack of homologous recombination in influenza, our method accounts for linkage disequilibrium between nucleotides at different positions in the haemagglutinin gene, allowing for the analysis of populations in which multiple mutations are present at any given time. Our approach offers a new insight into the dynamics of influenza infection, providing a detailed characterisation of the forces that underlie viral evolution.

  11. Complete genome sequence of an astrovirus identified in a domestic rabbit (Oryctolagus cuniculus) with gastroenteritis

    Directory of Open Access Journals (Sweden)

    Stenglein Mark D

    2012-09-01

    A colony of domestic rabbits in Tennessee, USA, experienced a high-mortality (~90%) outbreak of enterocolitis. The clinical characteristics were one to six days of lethargy, bloating, and diarrhea, followed by death. Heavy intestinal coccidial load was a consistent finding, as was mucoid enteropathy with cecal impaction. Preliminary analysis by electron microscopy revealed the presence of virus-like particles in the stool of one of the affected rabbits. Analysis using the Virochip, a viral detection microarray, suggested the presence of an astrovirus, and follow-up PCR and sequence determination revealed a previously uncharacterized member of that family. Metagenomic sequencing enabled the recovery of the complete viral genome, which contains the characteristic attributes of astrovirus genomes. Attempts to propagate the virus in tissue culture have yet to succeed. Although astroviruses cause gastroenteric disease in other mammals, the pathogenicity of this virus and its relationship to this outbreak remain to be determined. This study therefore defines a viral species and a potential rabbit pathogen.

  12. Next-generation DNA sequencing identifies novel gene variants and pathways involved in specific language impairment

    NARCIS (Netherlands)

    Chen, X.S.; Reader, R.H.; Hoischen, A.; Veltman, J.A.; Simpson, N.H.; Francks, C.; Newbury, D.F.; Fisher, S.E.

    2017-01-01

    A significant proportion of children have unexplained problems acquiring proficient linguistic skills despite adequate intelligence and opportunity. Developmental language disorders are highly heritable with substantial societal impact. Molecular studies have begun to identify candidate loci, but

  13. Failure to Identify Somatic Mutations in Monozygotic Twins Discordant for Schizophrenia by Whole Exome Sequencing

    Directory of Open Access Journals (Sweden)

    Nan Lyu

    2016-01-01

    Conclusions: This study is not alone in the failure to identify pathogenic somatic variations in MZ twins, suggesting that exonic somatic variations are extremely rare. Further efforts are warranted to explore the potential genetic mechanism of SCZ.

  14. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    DEFF Research Database (Denmark)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang

    2015-01-01

    . Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication...... was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10^-11; n(cases) = 98,742 and n(controls) = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified...

  15. Novel noncontiguous duplications identified with a comprehensive mutation analysis in the DMD gene by DMD gene-targeted sequencing.

    Science.gov (United States)

    Xu, Yan; Wang, Huanhuan; Xiao, Bing; Wei, Wei; Liu, Yu; Ye, Hui; Ying, Xiaomin; Chen, Yingwei; Liu, Xiaoqing; Ji, Xing; Sun, Yu

    2018-03-01

    Genomic rearrangements, such as intragenic deletions and duplications, are the most prevalent types of mutation in the DMD gene, and DMD mutations underlie Duchenne muscular dystrophy (DMD) and Becker muscular dystrophy (BMD). Using multiplex ligation dependent probe amplification (MLPA) and DMD gene-targeted sequencing, we performed a molecular characterization of two cases of complex noncontiguous duplication rearrangements that involved inverted duplications. The breakpoint sequences were analyzed to investigate the mechanisms of the rearrangement. The two cases shared the same duplication events (Dup-nml-Dup/inv), and both involved microhomology and small insertions at the breakpoints. Additionally, in case 1, SNP sequencing results indicated that the de novo duplication mutation arose in the allele that originated from the grandfather. This study has identified a novel type of DMD complex rearrangement and provides insight into the molecular basis of this genomic rearrangement. Copyright © 2017. Published by Elsevier B.V.

  16. Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction

    DEFF Research Database (Denmark)

    Do, Ron; Stitziel, Nathan O; Won, Hong-Hee

    2015-01-01

    Myocardial infarction (MI), a leading cause of death around the world, displays a complex pattern of inheritance. When MI occurs early in life, genetic inheritance is a major component to risk. Previously, rare mutations in low-density lipoprotein (LDL) genes have been shown to contribute to MI...... risk in individual families, whereas common variants at more than 45 loci have been associated with MI risk in the population. Here we evaluate how rare mutations contribute to early-onset MI risk in the population. We sequenced the protein-coding regions of 9,793 genomes from patients with MI......-synonymous mutations were at 4.2-fold increased risk for MI; carriers of null alleles at LDLR were at even higher risk (13-fold difference). Approximately 2% of early MI cases harbour a rare, damaging mutation in LDLR; this estimate is similar to one made more than 40 years ago using an analysis of total cholesterol...

  17. Whole transcriptome sequencing identifies increased CXCR2 expression in PNH granulocytes.

    Science.gov (United States)

    Hosokawa, Kohei; Kajigaya, Sachiko; Keyvanfar, Keyvan; Qiao, Wangmin; Xie, Yanling; Biancotto, Angelique; Townsley, Danielle M; Feng, Xingmin; Young, Neal S

    2017-04-01

    The aetiology of paroxysmal nocturnal haemoglobinuria (PNH) is a somatic mutation in the X-linked phosphatidylinositol glycan class A gene (PIGA), resulting in global deficiency of glycosyl phosphatidylinositol-anchored proteins (GPI-APs). This study applied RNA-sequencing to examine functional effects of the PIGA mutation in human granulocytes. CXCR2 expression was increased in GPI-AP- compared to GPI-AP+ granulocytes. Macrophage migration inhibitory factor, a CXCR2 agonist, was significantly higher in plasma of PNH patients. Nuclear factor-κB phosphorylation was upregulated in GPI-AP- compared with GPI-AP+ granulocytes. Our data suggest novel mechanisms in PNH, not obviously predicted by decreased production of the GPI moiety. © 2017 John Wiley & Sons Ltd.

  18. Validation of an Ion Torrent Sequencing Platform for the Detection of Gene Mutations in Biopsy Specimens from Patients with Non-Small-Cell Lung Cancer.

    Directory of Open Access Journals (Sweden)

    Shiro Fujita

    Treatment for patients with advanced non-small cell lung cancer (NSCLC) is often determined by the presence of biomarkers that predict the response to agents targeting specific molecular pathways. Demands for multiplex analysis of the genes involved in the pathogenesis of NSCLC are increasing. We validated the Ion Torrent Personal Genome Machine (PGM) system using the Ion AmpliSeq Cancer Hotspot Panel and compared the results with those obtained using the gold standard methods, conventional PCR and Sanger sequencing. The cycleave PCR method was used to verify the results. The Ion Torrent PGM resulted in a similar level of accuracy in identifying multiple genetic mutations in parallel, compared with conventional PCR and Sanger sequencing; however, the Ion Torrent PGM was superior to the other sequencing methods in terms of increased ease of use, even when taking into account the small amount of DNA that was obtained from formalin-fixed paraffin embedded (FFPE) biopsy specimens.

  19. Validation of an Ion Torrent Sequencing Platform for the Detection of Gene Mutations in Biopsy Specimens from Patients with Non-Small-Cell Lung Cancer.

    Science.gov (United States)

    Fujita, Shiro; Masago, Katsuhiro; Takeshita, Jumpei; Okuda, Chiyuki; Otsuka, Kyoko; Hata, Akito; Kaji, Reiko; Katakami, Nobuyuki; Hirata, Yukio

    2015-01-01

    Treatment for patients with advanced non-small cell lung cancer (NSCLC) is often determined by the presence of biomarkers that predict the response to agents targeting specific molecular pathways. Demands for multiplex analysis of the genes involved in the pathogenesis of NSCLC are increasing. We validated the Ion Torrent Personal Genome Machine (PGM) system using the Ion AmpliSeq Cancer Hotspot Panel and compared the results with those obtained using the gold standard methods, conventional PCR and Sanger sequencing. The cycleave PCR method was used to verify the results. The Ion Torrent PGM resulted in a similar level of accuracy in identifying multiple genetic mutations in parallel, compared with conventional PCR and Sanger sequencing; however, the Ion Torrent PGM was superior to the other sequencing methods in terms of increased ease of use, even when taking into account the small amount of DNA that was obtained from formalin-fixed paraffin embedded (FFPE) biopsy specimens.

  20. Whole-exome sequencing identify a new mutation of MYH7 in a Chinese family with left ventricular noncompaction.

    Science.gov (United States)

    Yang, Jing; Zhu, Meng; Wang, Yao; Hou, Xiaofeng; Wu, Hongping; Wang, Daowu; Shen, Hongbing; Hu, Zhibin; Zou, Jiangang

    2015-03-01

    Left ventricular noncompaction (LVNC) is a genetic cardiomyopathy that results from the failure of myocardial development during embryogenesis. Previous reports show that defects in TAZ, SCN5A, TPM1, YWHAE, MYH7, ACTC1 and TNNT2 are associated with LVNC. Sequencing of individuals using a family-based design is a powerful approach for hereditary disease. In this study, we used whole-exome sequencing to screen for potentially novel causal mutations in a Chinese Han family with LVNC. DNA from 3 individuals belonging to the same family was extracted and sequenced using a standard whole-exome sequencing protocol. The exome sequence data were analyzed using BWA, PICARD and the Genome Analysis Toolkit (GATK v2.8). Non-silent single nucleotide variants (SNVs) were further selected if they were present in both LVNC patients and absent from the healthy control. A web-based tool, Snv Prioritization via the INtegration of Genomic data (SPRING), was used to prioritize the causal SNV by calculating a q-value, which indicates the statistical significance that a variant is causative for a query disease. In the LVNC family, in which the mother and son were affected, a novel single nucleotide variant c.C1492G in exon 15 of MYH7 was identified as the probable causal SNV, with a P-value of 3.45E-05 and a q-value of 4.65E-03 by SPRING. The SNV was predicted to be deleterious by SIFT, PolyPhen2 and MutationTaster. Another 12 SNVs were also identified with P-values less than 0.05 by SPRING. A novel genetic variant in the coding region of the MYH7 gene was identified in a Chinese LVNC family. The results support the previous evidence that MYH7 is a pathogenic gene for LVNC. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. SysPIMP: the web-based systematical platform for identifying human disease-related mutated sequences from mass spectrometry.

    Science.gov (United States)

    Xi, Hong; Park, Jongsun; Ding, Guohui; Lee, Yong-Hwan; Li, Yixue

    2009-01-01

    Some mutations that change a protein's sequence are tightly linked to human diseases because they affect the protein's function, as in sickle cell anemia. To date, several databases, such as PMD, OMIM and HGMD, have been developed, providing useful information about human disease-related mutations. Tandem mass spectrometry (MS) has been used for characterizing proteins in various conditions; however, there is no system in place for finding disease-related mutated proteins within the MS results. Here, a Systematical Platform for Identifying Mutated Proteins (SysPIMP; http://pimp.starflr.info/) was developed to efficiently identify human disease-related mutated proteins within MS results. SysPIMP comprises three layers: (i) a standardized data warehouse, (ii) a pipeline layer for maintaining human disease databases and X!Tandem and BLAST and (iii) a web-based interface. From the OMIM AV part, PMD and SwissProt databases, 35,497 non-redundant human disease-related mutated sequences were collected, with disease information described by OMIM terms. Along with the interfaces to browse sequences archived in SysPIMP, X!Tandem, an open source database-search engine used to identify proteins within MS data, was integrated into SysPIMP to support the detection of potential human disease-related mutants in MS results. In addition, together with the non-redundant disease-related mutated sequences, the original non-mutated sequences are also provided in SysPIMP for comparative research. Based on this system, SysPIMP will be a platform for efficiently and intensively studying human diseases caused by mutation.

  2. Highly Accurate Sequencing of Full-Length Immune Repertoire Amplicons Using Tn5-Enabled and Molecular Identifier-Guided Amplicon Assembly.

    Science.gov (United States)

    Cole, Charles; Volden, Roger; Dharmadhikari, Sumedha; Scelfo-Dalbey, Camille; Vollmers, Christopher

    2016-03-15

    Ab repertoire sequencing is a powerful tool to analyze the adaptive immune system. To sequence entire Ab repertoires, amplicons are created from Ab H chain (IgH) transcripts and sequenced on a high-throughput sequencer. The field of immune repertoire sequencing is growing rapidly and the protocols used are steadily improving; however, thus far, immune repertoire sequencing protocols have not been able to sequence full-length immune repertoires including the entire IgH V region and enough of the IgH C region to identify isotype subtypes. In this study, we present a method that combines Tn5 transposase and molecular identifiers for the highly accurate sequencing of amplicons >500 bp using Illumina short read paired-end sequencing. We then apply this method to Ab H chain amplicons to sequence the first, to our knowledge, highly accurate full-length immune repertoire. Copyright © 2016 by The American Association of Immunologists, Inc.

  3. Integrative analysis of functional genomic annotations and sequencing data to identify rare causal variants via hierarchical modeling

    Directory of Open Access Journals (Sweden)

    Marinela Capanu

    2015-05-01

    Identifying the small number of rare causal variants contributing to disease has been a major focus of investigation in recent years, but represents a formidable statistical challenge due to the rare frequencies with which these variants are observed. In this commentary we draw attention to a formal statistical framework, namely hierarchical modeling, to combine functional genomic annotations with sequencing data with the objective of enhancing our ability to identify rare causal variants. Using simulations we show that in all configurations studied, the hierarchical modeling approach has superior discriminatory ability compared to a recently proposed aggregate measure of deleteriousness, the Combined Annotation-Dependent Depletion (CADD) score, supporting our premise that aggregate functional genomic measures can more accurately identify causal variants when used in conjunction with sequencing data through a hierarchical modeling approach.

  4. Genome-based exome sequencing analysis identifies GYG1, DIS3L ...

    Indian Academy of Sciences (India)

    Myocardial infarction (MI) is a complex disease caused by a combination of genetic and environmental factors. Although genome-wide association studies (GWAS) identified more than 46 risk loci which are associated with coronary artery disease and MI, most of the genetic variability in MI still remains undefined. Here, we ...

  5. Evolution of DNA sequencing.

    Science.gov (United States)

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-03-01

    Sanger and coworkers first introduced DNA sequencing in the 1970s. It principally relied on termination of the growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted in it. Detection of terminated sequences was done radiographically on Polyacrylamide Gel Electrophoresis (PAGE). Improvements to the original Sanger sequencing that have evolved over time include replacement of radiography with fluorescence, use of separate fluorescent markers for each nucleotide, use of capillary electrophoresis instead of polyacrylamide gel electrophoresis and then introduction of capillary array electrophoresis. However, this technique suffered from a few inherent limitations, such as decreased sensitivity for low-level mutant alleles, complexities in analyzing highly polymorphic regions like the Major Histocompatibility Complex (MHC), and the high DNA concentrations required. Several Next Generation Sequencing (NGS) technologies that tend to overcome the limitations of Sanger sequencing have been introduced by Roche, Illumina and other commercial manufacturers, and are reviewed here. Introduction of NGS in clinical research and medical diagnostics is expected to change the entire diagnostic approach. Applications include the study of cancer variants, detection of minimal residual disease, exome sequencing, detection of Single Nucleotide Polymorphisms (SNPs) and their disease association, epigenetic regulation of gene expression and sequencing of microorganism genomes.

  6. Transcriptome Sequencing of Gynostemma pentaphyllum to Identify Genes and Enzymes Involved in Triterpenoid Biosynthesis

    Science.gov (United States)

    Ma, Chengtong; Qian, Jieying; Lan, Xiuwan; Chao, Naixia; Sun, Jian

    2016-01-01

    G. pentaphyllum (Gynostemma pentaphyllum), a creeping herbaceous perennial with many important medicinal properties, is widely distributed in Asia. Gypenosides (triterpenoid saponins), the main effective components of G. pentaphyllum, are well studied. FPS (farnesyl pyrophosphate synthase), SS (squalene synthase), and SE (squalene epoxidase) are the main enzymes involved in the synthesis of triterpenoid saponins. Considering the important medicinal functions of G. pentaphyllum, it is necessary to investigate the transcriptomic information of G. pentaphyllum to facilitate future studies of transcriptional regulation. After sequencing G. pentaphyllum, we obtained 50,654,708 unigenes. Next, we used RPKM (reads per kilobase per million mapped reads) to calculate expression of the unigenes, and we compared our data to five common databases to annotate different aspects of the unigenes. Finally, we noticed that FPS, SS, and SE were differentially expressed according to DESeq. Leaves showed the highest expression of FPS, SS, and SE relative to the other two tissues. Our research provides transcriptomic information of G. pentaphyllum in its natural environment, and we found consistency in unigene expression, enzyme expression (FPS, SS, and SE), and the distribution of gypenoside content in G. pentaphyllum. Our results will enable future related studies of G. pentaphyllum. PMID:28097124
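
    For readers unfamiliar with the RPKM measure mentioned above, a minimal sketch of the standard formula follows. The function name and the example numbers are hypothetical and are not taken from the study.

```python
# RPKM = reads mapped to the feature * 1e9 / (feature length in bp * total mapped reads)
# The 1e9 factor combines "per kilobase" (1e3) and "per million reads" (1e6).

def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length_bp and total_mapped_reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)

if __name__ == "__main__":
    # Hypothetical unigene: 500 reads, 2,000 bp long, 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))  # 25.0
```

    Normalizing by both length and library size is what makes values comparable across unigenes of different lengths and across samples of different sequencing depth.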

  7. Identifying neuronal lineages of Drosophila by sequence analysis of axon tracts

    Science.gov (United States)

    Cardona, Albert; Saalfeld, Stephan; Arganda, Ignacio; Pereanu, Wayne; Schindelin, Johannes; Hartenstein, Volker

    2010-01-01

    The Drosophila brain is formed by an invariant set of lineages, each of which is derived from a unique neural stem cell (neuroblast) and forms a genetic and structural unit of the brain. The task of reconstructing brain circuitry at the level of individual neurons can be made significantly easier by assigning neurons to their respective lineages. In this paper we address the automatization of neuron and lineage identification. We focused on the Drosophila brain lineages at the larval stage when they form easily recognizable secondary axon tracts (SATs) that were previously partially characterized. We now generated an annotated digital database containing all lineage tracts reconstructed from five registered wild-type brains, at higher resolution and including some that were previously not characterized. We developed a method for SAT structural comparisons based on a dynamic programming approach akin to nucleotide sequence alignment, and a machine learning classifier trained on the annotated database of reference SATs. We quantified the stereotypy of SATs by measuring the residual variability of aligned wild-type SATs. Next, we employed our method for the identification of SATs within wild-type larval brains, and found it highly accurate (93–99 %). The method proved highly robust for the identification of lineages in mutant brains, and in brains that differed in developmental time or labeling. We describe for the first time an algorithm that quantifies neuronal projection stereotypy in the Drosophila brain, and use the algorithm for automatic neuron and lineage recognition. PMID:20519528
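
    The abstract describes its comparison method as a dynamic programming approach akin to nucleotide sequence alignment. As a point of reference, here is a minimal sketch of that underlying idea for plain nucleotide strings (Needleman-Wunsch global alignment score). It is not the authors' SAT comparison algorithm; the scoring values and the function name are assumptions chosen for illustration.

```python
# Global alignment score by dynamic programming (Needleman-Wunsch).
# Assumed scoring: +1 match, -1 mismatch, -1 per gap position.

def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]

if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))  # 7
    print(global_alignment_score("ACGT", "ACT"))         # 2
```

    Only two rows of the table are kept, which is sufficient when the score alone is needed; recovering the alignment itself would require storing the full table or traceback pointers.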

  8. Circulating tumor DNA identified by targeted sequencing in advanced-stage non-small cell lung cancer patients.

    Science.gov (United States)

    Xu, Song; Lou, Feng; Wu, Yi; Sun, Da-Qiang; Zhang, Jing-Bo; Chen, Wei; Ye, Hua; Liu, Jing-Hao; Wei, Sen; Zhao, Ming-Yu; Wu, Wen-Jun; Su, Xue-Xia; Shi, Rong; Jones, Lindsey; Huang, Xue F; Chen, Si-Yi; Chen, Jun

    2016-01-28

    Non-small cell lung cancers (NSCLC) have unique mutation patterns, and some of these mutations may be used to predict prognosis or guide patient treatment. Mutation profiling before and during treatment often requires repeated tumor biopsies, which is not always possible. Recently, cell-free, circulating tumor DNA (ctDNA) isolated from blood plasma has been shown to contain genetic mutations representative of those found in the primary tumor tissue DNA (tDNA), and these samples can readily be obtained using non-invasive techniques. However, there are still no standardized methods to identify mutations in ctDNA. In the current study, we used a targeted sequencing approach with a semi-conductor based next-generation sequencing (NGS) platform to identify gene mutations in matched tDNA and ctDNA samples from 42 advanced-stage NSCLC patients from China. We identified driver mutations in matched tDNA and ctDNA in EGFR, KRAS, PIK3CA, and TP53, with an overall concordance of 76%. In conclusion, targeted sequencing of plasma ctDNA may be a feasible option for clinical monitoring of NSCLC in the near future. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  9. Deep sequencing identifies ethnicity-specific bacterial signatures in the oral microbiome.

    Directory of Open Access Journals (Sweden)

    Matthew R Mason

    Oral infections have a strong ethnic predilection, suggesting that ethnicity is a critical determinant of oral microbial colonization. Dental plaque and saliva samples from 192 subjects belonging to four major ethnicities in the United States were analyzed using terminal restriction fragment length polymorphism (t-RFLP) and 16S pyrosequencing. Ethnicity-specific clustering of microbial communities was apparent in saliva and subgingival biofilms, and a machine-learning classifier was capable of identifying an individual's ethnicity from subgingival microbial signatures. The classifier identified African Americans with a 100% sensitivity and 74% specificity and Caucasians with a 50% sensitivity and 91% specificity. The data demonstrate a significant association between ethnic affiliation and the composition of the oral microbiome, to the extent that these microbial signatures appear to be capable of discriminating between ethnicities.
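
    The sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. The sketch below shows those definitions only; the counts in the example are hypothetical and are not the study's data.

```python
# Sensitivity = TP / (TP + FN): fraction of true members of a class that are detected.
# Specificity = TN / (TN + FP): fraction of non-members that are correctly excluded.

def sensitivity(true_pos, false_neg):
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases")
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    if true_neg + false_pos == 0:
        raise ValueError("no negative cases")
    return true_neg / (true_neg + false_pos)

if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    print(f"{sensitivity(10, 0):.0%}")   # 100%
    print(f"{specificity(37, 13):.0%}")  # 74%
```

    A classifier can reach 100% sensitivity for a class while still mislabeling members of other classes as belonging to it, which is what a specificity below 100% reflects.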

  10. Using BAC transgenesis in zebrafish to identify regulatory sequences of the amyloid precursor protein gene in humans

    Directory of Open Access Journals (Sweden)

    Shakes Leighcraft A

    2012-09-01

    Background Non-coding DNA in and around the human Amyloid Precursor Protein (APP) gene that is central to Alzheimer’s disease (AD) shares little sequence similarity with that of appb in zebrafish. Identifying DNA domains regulating expression of the gene in such situations becomes a challenge. Taking advantage of the zebrafish system that allows rapid functional analyses of gene regulatory sequences, we previously showed that two discontinuous DNA domains in zebrafish appb are important for expression of the gene in neurons: an enhancer in intron 1 and sequences 28–31 kb upstream of the gene. Here we identify the putative transcription factor binding sites responsible for this distal cis-acting regulation, and use that information to identify a regulatory region of the human APP gene. Results Functional analyses of intron 1 enhancer mutations in enhancer-trap BACs expressed as transgenes in zebrafish identified putative binding sites of two known transcription factor proteins, E4BP4/NFIL3 and Forkhead, to be required for expression of appb. A cluster of three E4BP4 sites at −31 kb is also shown to be essential for neuron-specific expression, suggesting that the dependence of expression on upstream sequences is mediated by these E4BP4 sites. E4BP4/NFIL3 and XFD1 sites in the intron enhancer and E4BP4/NFIL3 sites at −31 kb specifically and efficiently bind the corresponding zebrafish proteins in vitro. These sites are statistically over-represented in both the zebrafish appb and the human APP genes, although their locations are different. Remarkably, a cluster of four E4BP4 sites in intron 4 of human APP exists in actively transcribing chromatin in a human neuroblastoma cell-line, SHSY5Y, expressing APP as shown using chromatin immunoprecipitation (ChIP) experiments. Thus although the two genes share little sequence conservation, they appear to share the same regulatory logic and are regulated by a similar set of transcription

  11. Deep sequencing of target linkage assay-identified regions in familial breast cancer: methods, analysis pipeline and troubleshooting.

    Directory of Open Access Journals (Sweden)

    Juan Manuel Rosa-Rosa

    BACKGROUND: The classical candidate-gene approach has failed to identify novel breast cancer susceptibility genes. Nowadays, massive parallel sequencing technology allows the development of studies unaffordable a few years ago. However, analysis protocols are not yet sufficiently developed to extract all information from the huge amount of data obtained. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we performed high throughput sequencing in two regions located on chromosomes 3 and 6, recently identified by linkage studies by our group as candidate regions for harbouring breast cancer susceptibility genes. In order to enrich for the coding regions of all described genes located in both candidate regions, a hybrid-selection method on tiling microarrays was performed. CONCLUSIONS/SIGNIFICANCE: We developed an analysis pipeline based on SOAP aligner to identify candidate variants with a high real positive confirmation rate (0.89), with which we identified eight variants considered candidates for functional studies. The results suggest that the present strategy might be a valid second step for identifying high penetrance genes.

  12. Exome sequencing identifies rare deleterious mutations in DNA repair genes FANCC and BLM as potential breast cancer susceptibility alleles.

    Directory of Open Access Journals (Sweden)

    Ella R Thompson

    2012-09-01

    Despite intensive efforts using linkage and candidate gene approaches, the genetic etiology for the majority of families with a multi-generational breast cancer predisposition is unknown. In this study, we used whole-exome sequencing of thirty-three individuals from 15 breast cancer families to identify potential predisposing genes. Our analysis identified families with heterozygous, deleterious mutations in the DNA repair genes FANCC and BLM, which are responsible for the autosomal recessive disorders Fanconi Anemia and Bloom syndrome. In total, screening of all exons in these genes in 438 breast cancer families identified three with truncating mutations in FANCC and two with truncating mutations in BLM. Additional screening of FANCC mutation hotspot exons identified one pathogenic mutation among an additional 957 breast cancer families. Importantly, none of the deleterious mutations were identified among 464 healthy controls and are not reported in the 1,000 Genomes data. Given the rarity of Fanconi Anemia and Bloom syndrome disorders among Caucasian populations, the finding of multiple deleterious mutations in these critical DNA repair genes among high-risk breast cancer families is intriguing and suggestive of a predisposing role. Our data demonstrate the utility of intra-family exome-sequencing approaches to uncover cancer predisposition genes, but highlight the major challenge of definitively validating candidates where the incidence of sporadic disease is high, germline mutations are not fully penetrant, and individual predisposition genes may only account for a tiny proportion of breast cancer families.

  13. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
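
    The closing claim of this abstract (that over 99% of the detected variation could potentially be set aside) can be checked directly from the two figures it reports. The short calculation below uses only those numbers; note that the functional SNP count is stated as approximate in the abstract, so the result is approximate as well.

```python
# Figures quoted in the abstract.
total_snps = 22_342_915    # high confidence SNPs identified
functional_snps = 139_000  # approx. loss-of-function, nonsynonymous or regulatory

functional_fraction = functional_snps / total_snps
remaining_fraction = 1 - functional_fraction

print(f"functional: {functional_fraction:.2%}")  # functional: 0.62%
print(f"remaining:  {remaining_fraction:.2%}")   # remaining:  99.38%
```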

  14. Sequencing of mRNA identifies re-expression of fetal splice variants in cardiac hypertrophy.

    Science.gov (United States)

    Ames, E G; Lawson, M J; Mackey, A J; Holmes, J W

    2013-09-01

    Cardiac hypertrophy has been well-characterized at the level of transcription. During cardiac hypertrophy, genes normally expressed primarily during fetal heart development are re-expressed, and this fetal gene program is believed to be a critical component of the hypertrophic process. Recently, alternative splicing of mRNA transcripts has been shown to be temporally regulated during heart development, leading us to consider whether fetal patterns of splicing also reappear during hypertrophy. We hypothesized that patterns of alternative splicing occurring during heart development are recapitulated during cardiac hypertrophy. Here we present a study of isoform expression during pressure-overload cardiac hypertrophy induced by 10 days of transverse aortic constriction (TAC) in rats and in developing fetal rat hearts compared to sham-operated adult rat hearts, using high-throughput sequencing of poly(A) tail mRNA. We find a striking degree of overlap between the isoforms expressed differentially in fetal and pressure-overloaded hearts compared to control: forty-four percent of the isoforms with significantly altered expression in TAC hearts are also expressed at significantly different levels in fetal hearts compared to control (P < ...). ... hypertrophy and fetal heart development are significantly enriched for genes involved in cytoskeletal organization, RNA processing, developmental processes, and metabolic enzymes. Our data strongly support the concept that mRNA splicing patterns normally associated with heart development recur as part of the hypertrophic response to pressure overload. These findings suggest that cardiac hypertrophy shares post-transcriptional as well as transcriptional regulatory mechanisms with fetal heart development. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Exome sequencing identifies DYNC2H1 mutations as a common cause of asphyxiating thoracic dystrophy (Jeune syndrome) without major polydactyly, renal or retinal involvement

    Science.gov (United States)

    Schmidts, Miriam; Arts, Heleen H; Bongers, Ernie M H F; Yap, Zhimin; Oud, Machteld M; Antony, Dinu; Duijkers, Lonneke; Emes, Richard D; Stalker, Jim; Yntema, Jan-Bart L; Plagnol, Vincent; Hoischen, Alexander; Gilissen, Christian; Forsythe, Elisabeth; Lausch, Ekkehart; Veltman, Joris A; Roeleveld, Nel; Superti-Furga, Andrea; Kutkowska-Kazmierczak, Anna; Kamsteeg, Erik-Jan; Elçioğlu, Nursel; van Maarle, Merel C; Graul-Neumann, Luitgard M; Devriendt, Koenraad; Smithson, Sarah F; Wellesley, Diana; Verbeek, Nienke E; Hennekam, Raoul C M; Kayserili, Hulya; Scambler, Peter J; Beales, Philip L; Knoers, Nine VAM; Roepman, Ronald; Mitchison, Hannah M

    2013-01-01

    Background Jeune asphyxiating thoracic dystrophy (JATD) is a rare, often lethal, recessively inherited chondrodysplasia characterised by shortened ribs and long bones, sometimes accompanied by polydactyly, and renal, liver and retinal disease. Mutations in intraflagellar transport (IFT) genes cause JATD, including the IFT dynein-2 motor subunit gene DYNC2H1. Genetic heterogeneity and the large DYNC2H1 gene size have hindered JATD genetic diagnosis. Aims and methods To determine the contribution to JATD we screened DYNC2H1 in 71 JATD patients combining SNP mapping, Sanger sequencing and exome sequencing. Results and conclusions We detected 34 DYNC2H1 mutations in 29/71 (41%) patients from 19/57 families (33%), showing it as a major cause of JATD especially in Northern European patients. This included 13 early protein termination mutations (nonsense/frameshift, deletion, splice site) but no patients carried these in combination, suggesting the human phenotype is at least partly hypomorphic. In addition, 21 missense mutations were distributed across DYNC2H1 and these showed some clustering to functional domains, especially the ATP motor domain. DYNC2H1 patients largely lacked significant extra-skeletal involvement, demonstrating an important genotype–phenotype correlation in JATD. Significant variability exists in the course and severity of the thoracic phenotype, both between affected siblings with identical DYNC2H1 alleles and among individuals with different alleles, which suggests the DYNC2H1 phenotype might be subject to modifier alleles, non-genetic or epigenetic factors. Assessment of fibroblasts from patients showed accumulation of anterograde IFT proteins in the ciliary tips, confirming defects similar to patients with other retrograde IFT machinery mutations, which may be of undervalued potential for diagnostic purposes. PMID:23456818

  16. Comprehensive mutation analysis by whole-exome sequencing in 41 Chinese families with Leber congenital amaurosis.

    Science.gov (United States)

    Chen, Yabin; Zhang, Qingyan; Shen, Tao; Xiao, Xueshan; Li, Shiqiang; Guan, Liping; Zhang, Jianguo; Zhu, Zhihong; Yin, Ye; Wang, Panfeng; Guo, Xiangming; Wang, Jun; Zhang, Qingjiong

    2013-06-26

    Leber congenital amaurosis (LCA) is a genetically heterogeneous disease with, to date, 19 identified causative genes. Our aim was to evaluate the mutations in all 19 genes in Chinese families with LCA. LCA patients from 41 unrelated Chinese families were enrolled, including 25 previously unanalyzed families and 16 families screened previously by Sanger sequencing, but with no identified mutations. Genetic variations were screened by whole-exome sequencing and then validated using Sanger sequencing. A total of 41 variants predicted to affect protein coding or splicing were detected by whole-exome sequencing, and 40 were confirmed by Sanger sequencing. Bioinformatic and segregation analyses revealed 22 potentially pathogenic variants (17 novel) in 15 probands, comprising 3 of 16 previously analyzed families and 12 of 25 (48%) previously unanalyzed families. In the latter 12 families, mutations were found in CEP290 (three probands); GUCY2D (two probands); and CRB1, CRX, RPE65, IQCB1, LCA5, TULP1, and IMPDH1 (one proband each). Based on the results from 87 previously analyzed probands and 25 new cases, GUCY2D, CRB1, RPGRIP1, CEP290, and CRX were the five most frequently mutated genes, which was similar to the results from studies in Caucasian subjects. Whole-exome sequencing detected mutations in the 19 known LCA genes in approximately half of Chinese families with LCA. These results, together with our previous results, demonstrate the spectrum and frequency of mutations of the 19 genes responsible for LCA in Han Chinese individuals. Whole-exome sequencing is an efficient method for detecting mutations in highly heterogeneous hereditary diseases.

  17. Identifying Genetic Sources of Phenotypic Heterogeneity in Orofacial Clefts by Targeted Sequencing.

    Science.gov (United States)

    Carlson, Jenna C; Taub, Margaret A; Feingold, Eleanor; Beaty, Terri H; Murray, Jeffrey C; Marazita, Mary L; Leslie, Elizabeth J

    2017-07-17

    Orofacial clefts (OFCs), including nonsyndromic cleft lip with or without cleft palate (NSCL/P), are common birth defects. NSCL/P is highly heterogeneous with multiple phenotypic presentations. Two common subtypes of NSCL/P are cleft lip (CL) and cleft lip with cleft palate (CLP) which have different population prevalence. Similarly, NSCL/P can be divided into bilateral and unilateral clefts, with unilateral being the most common. Individuals with unilateral NSCL/P are more likely to be affected on the left side of the upper lip, but right side affection also occurs. Moreover, NSCL/P is twice as common in males as in females. The goal of this study is to discover genetic variants that have different effects in case subgroups. We conducted both common variant and rare variant analyses in 1034 individuals of Asian ancestry with NSCL/P, examining four sources of heterogeneity within CL/P: cleft type, sex, laterality, and side. We identified several regions associated with subtype differentiation: cleft type differences in 8q24 (p = 1.00 × 10⁻⁴), laterality differences in IRF6, a gene previously implicated in wound healing (p = 2.166 × 10⁻⁴), sex differences and side of unilateral CL differences in FGFR2 (p = 3.00 × 10⁻⁴; p = 6.00 × 10⁻⁴), and sex differences in VAX1 (p < 1.00 × 10⁻⁴) among others. Many of the regions associated with phenotypic modification were either adjacent to or overlapping functional elements based on ENCODE chromatin marks and published craniofacial enhancers. We have identified multiple common and rare variants as potential phenotypic modifiers of NSCL/P, and suggest plausible elements responsible for phenotypic heterogeneity, further elucidating the complex genetic architecture of OFCs. Birth Defects Research 109:1030-1038, 2017. © 2017 Wiley Periodicals, Inc.

  18. Sequencing EVC and EVC2 identifies mutations in two-thirds of Ellis-van Creveld syndrome patients.

    Science.gov (United States)

    Tompson, Stuart W J; Ruiz-Perez, Victor L; Blair, Helen J; Barton, Stephanie; Navarro, Victoria; Robson, Joanne L; Wright, Michael J; Goodship, Judith A

    2007-01-01

    Ellis-van Creveld syndrome (EvC) is caused by mutations in EVC and EVC2, genes in a divergent orientation separated by only 2.6 kb. We systematically sought mutations in both genes in a panel of 65 affected individuals to assess the proportion of cases resulting from mutations in each gene. We PCR amplified and sequenced the coding exons of both genes. We investigated mutations that could affect splicing by in vitro splicing assays and cDNA analysis. We have identified EVC mutations in 20 cases (31%); in all of these we have detected the mutation on each allele. We have identified EVC2 mutations in 25 cases (38%); in 22 of these we have isolated a mutation on each allele. The majority of the mutations introduce a premature termination codon. We sequenced the region between the two genes in 10 of the 20 cases in which we had not identified a mutation in either gene, revealing only one SNP that was not a common polymorphism. As we have not identified mutations in either gene in 20 cases (31%) it is possible that there is further genetic heterogeneity.

  19. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes has never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing to conduct the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasting resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  20. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    Science.gov (United States)

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  1. Low-Cost, High-Throughput Sequencing of DNA Assemblies Using a Highly Multiplexed Nextera Process.

    Science.gov (United States)

    Shapland, Elaine B; Holmes, Victor; Reeves, Christopher D; Sorokin, Elena; Durot, Maxime; Platt, Darren; Allen, Christopher; Dean, Jed; Serber, Zach; Newman, Jack; Chandran, Sunil

    2015-07-17

    In recent years, next-generation sequencing (NGS) technology has greatly reduced the cost of sequencing whole genomes, whereas the cost of sequence verification of plasmids via Sanger sequencing has remained high. Consequently, industrial-scale strain engineers either limit the number of designs or take short cuts in quality control. Here, we show that over 4000 plasmids can be completely sequenced in one Illumina MiSeq run for less than $3 each (15× coverage), which is a 20-fold reduction over using Sanger sequencing (2× coverage). We reduced the volume of the Nextera tagmentation reaction by 100-fold and developed an automated workflow to prepare thousands of samples for sequencing. We also developed software to track the samples and associated sequence data and to rapidly identify correctly assembled constructs having the fewest defects. As DNA synthesis and assembly become a centralized commodity, this NGS quality control (QC) process will be essential to groups operating high-throughput pipelines for DNA construction.

  2. Identifying cryptic speciation across groundwater populations: first COI sequences of Bathynellidae (Crustacea, Syncarida)

    Directory of Open Access Journals (Sweden)

    Camacho, A. I.

    2011-06-01

    The biodiversity of groundwater fauna remains poorly known and understood. Groundwater biodiversity studies are strongly affected by habitat inaccessibility and taxonomic crisis. The objective of this work was to investigate levels of genetic divergence across populations of Bathynellacea, a small crustacean group that lives exclusively in groundwater, in order to evaluate the extent of cryptic speciation in morphologically constrained clades. Partial sequences of cytochrome oxidase I (COI) have been obtained for the first time in Bathynellidae. The analyzed specimens of the genus Vejdovskybathynella were obtained from six populations morphologically assignable to a single species; all of them are located in different areas of one of the largest karst systems (110 km of surveyed galleries) known in Spain. The analyses of molecular data demonstrate the presence of three highly divergent genetic units, possibly corresponding to undescribed new species. The results of this study provide the first molecular data that complement morphological knowledge in order to address phylogenetic studies to try to resolve the relations between genera and species of the Bathynellidae family. We conclude that the evolutionary scenario of this special group of subterranean crustaceans cannot be revealed only by using morphological information due to the presence of very old lineages of cryptic species, as has been brought to light with the molecular data obtained here.

  3. Detection of a Distinctive Genomic Signature in Rhabdoid Glioblastoma, A Rare Disease Entity Identified by Whole Exome Sequencing and Whole Transcriptome Sequencing

    Directory of Open Access Journals (Sweden)

    Youngil Koh

    2015-08-01

    Full Text Available We analyzed the genome of a rhabdoid glioblastoma (R-GBM) tumor, a very rare variant of GBM. A surgical specimen of R-GBM from a 20-year-old woman was analyzed using whole exome sequencing (WES), whole transcriptome sequencing (WTS), single nucleotide polymorphism array, and array comparative genomic hybridization. The status of gene expression in R-GBM tissue was compared with that of normal brain tissue and conventional GBM tumor tissue. We identified 23 somatic non-synonymous small nucleotide variants with WES. We identified the BRAF V600E mutation and possible functional changes in the mutated genes, ISL1 and NDRG2. Copy number alteration analysis revealed gains of chromosomes 3, 7, and 9. We found loss of heterozygosity and focal homozygous deletion on 9q21, which includes CDKN2A and CDKN2B. In addition, WTS revealed that CDK6, MET, EZH2, EGFR, and NOTCH1, which are located on chromosomes 7 and 9, were over-expressed, whereas CDKN2A/2B were minimally expressed. Fusion gene analysis showed 14 candidate genes that may be functionally involved in R-GBM, including TWIST2 and UPK3BL. The BRAF V600E mutation, CDKN2A/2B deletion, and EGFR/MET copy number gain were observed. These simultaneous alterations are very rarely found in GBM. Moreover, the NDRG2 mutation was first identified in this study as it has never been reported in GBM. We observed a unique genomic signature in R-GBM compared to conventional GBM, which may provide insight regarding R-GBM as a distinct disease entity among the larger group of GBMs.

  4. Transcriptome Sequencing Identified Genes and Gene Ontologies Associated with Early Freezing Tolerance in Maize

    Science.gov (United States)

    Li, Zhao; Hu, Guanghui; Liu, Xiangfeng; Zhou, Yao; Li, Yu; Zhang, Xu; Yuan, Xiaohui; Zhang, Qian; Yang, Deguang; Wang, Tianyu; Zhang, Zhiwu

    2016-01-01

    Originating in a tropical climate, maize has faced great challenges as cultivation has expanded to the majority of the world's temperate zones. In these zones, frost and cold temperatures are major factors that prevent maize from reaching its full yield potential. Among 30 elite maize inbred lines adapted to northern China, we identified two lines of extreme, but opposite, freezing tolerance levels: highly tolerant and highly sensitive. During the seedling stage of these two lines, we used RNA-seq to measure changes in the maize whole-genome transcriptome before and after freezing treatment. In total, 19,794 genes were expressed, of which 4550 exhibited differential expression due to either treatment (before or after freezing) or line type (tolerant or sensitive). Of the 4550 differentially expressed genes, 948 exhibited differential expression due to treatment within line or between lines under freezing conditions. Analysis of gene ontology found that these 948 genes were significantly enriched for binding functions (DNA binding, ATP binding, and metal ion binding), protein kinase activity, and peptidase activity. Based on their enrichment, literature support, and significant levels of differential expression, 30 of these 948 genes were selected for quantitative real-time PCR (qRT-PCR) validation. The validation confirmed our RNA-seq-based findings, with squared correlation coefficients of 80% and 50% in the tolerant and sensitive lines, respectively. This study provided valuable resources for further studies to enhance understanding of the molecular mechanisms underlying the early freezing response in maize and to enable targeted breeding strategies for developing varieties with superior frost resistance to achieve yield potential. PMID:27774095
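The squared correlation coefficients quoted for the qRT-PCR validation can be illustrated with a minimal sketch. It assumes the comparison is a plain squared Pearson correlation between RNA-seq and qRT-PCR expression changes for the same genes, which the abstract does not spell out. The function name and the fold-change values are hypothetical and are not data from the study.

```python
from statistics import mean


def squared_correlation(xs, ys):
    """Return the squared Pearson correlation (R^2) of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical log2 fold changes for five genes (illustrative values only).
rnaseq_lfc = [2.0, -1.0, 0.5, 3.0, -2.0]
qpcr_lfc = [1.8, -0.7, 0.9, 2.5, -1.6]
print(f"R^2 = {squared_correlation(rnaseq_lfc, qpcr_lfc):.2f}")
```

An R^2 of 0.80 would mean that 80% of the variance in one measurement is explained by a linear relationship with the other.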

  5. 454 sequencing put to the test using the complex genome of barley

    Directory of Open Access Journals (Sweden)

    Keller Beat

    2006-10-01

    Full Text Available Abstract Background: During the past decade, Sanger sequencing has been used to completely sequence hundreds of microbial and a few higher eukaryote genomes. In recent years, a number of alternative technologies became available, among them adaptations of the pyrosequencing procedure (i.e. "454 sequencing"), promising a ~100-fold increase in throughput over Sanger technology, an advancement which is needed to make large and complex genomes more amenable to full genome sequencing at affordable costs. Although several studies have demonstrated its potential usefulness for sequencing small and compact microbial genomes, it was unclear how the new technology would perform in large and highly repetitive genomes such as those of wheat or barley. Results: To study its performance in complex genomes, we used 454 technology to sequence four barley Bacterial Artificial Chromosome (BAC) clones and compared the results to those from ABI-Sanger sequencing. All gene-containing regions were covered efficiently and at high quality with 454 sequencing, whereas repetitive sequences were more problematic with 454 sequencing than with ABI-Sanger sequencing. 454 sequencing provided a much more even coverage of the BAC clones than ABI-Sanger sequencing, resulting in almost complete assembly of all genic sequences even at only 9 to 10-fold coverage. To obtain highly advanced working draft sequences for the BACs, we developed a strategy to assemble large parts of the BAC sequences by combining comparative genomics, detailed repeat analysis and use of low-quality reads from 454 sequencing. Additionally, we describe an approach of including small numbers of ABI-Sanger sequences to produce hybrid assemblies to partly compensate for the short read length of 454 sequences. Conclusion: Our data indicate that 454 pyrosequencing allows rapid and cost-effective sequencing of the gene-containing portions of large and complex genomes and that its combination with ABI-Sanger sequencing

  6. Detection limits of tidal-wetland sequences to identify variable rupture modes of megathrust earthquakes

    Science.gov (United States)

    Shennan, Ian; Garrett, Ed; Barlow, Natasha

    2016-10-01

    Recent paleoseismological studies question whether segment boundaries identified for 20th and 21st century great, >M8, earthquakes persist through multiple earthquake cycles or whether smaller segments with different boundaries rupture and cause significant hazards. The smaller segments may include some currently slipping rather than locked. In this review, we outline general principles regarding indicators of relative sea-level change in tidal wetlands and the conditions in which paleoseismic indicators must be distinct from those resulting from non-seismic processes. We present new evidence from sites across southcentral Alaska to illustrate different detection limits of paleoseismic indicators and consider alternative interpretations for marsh submergence and emergence. We compare predictions of coseismic uplift and subsidence derived from geophysical models of earthquakes with different rupture modes. The spatial patterns of agreement and misfits between model predictions and quantitative reconstructions of coseismic submergence and emergence suggest that no earthquake within the last 4000 years had a pattern of rupture the same as the Mw 9.2 Alaska earthquake in 1964. From the Alaska examples and research from other subduction zones we suggest that, if we want to understand whether a megathrust ruptures in segments of variable length in different earthquakes, we need to be site-specific as to what sort of geologically based criteria eliminate the possibility of a particular rupture mode in different earthquakes. We conclude that coastal paleoseismological studies benefit from a methodological framework that employs rigorous evaluation of five essential criteria and a sixth which may be very robust but only occur at some sites: 1 - lateral extent of peat-mud or mud-peat couplets with sharp contacts; 2 - suddenness of submergence or emergence, and replicated within each site; 3 - amount of vertical motion, quantified with 95% error terms and replicated within each

  7. First case series of emerging Rickettsial neonatal sepsis identified by polymerase chain reaction-based deoxyribonucleic acid sequencing

    Directory of Open Access Journals (Sweden)

    P Aarthi

    2013-01-01

    Full Text Available Purpose: To detect and identify the aetiological agent in the peripheral blood from cases of neonatal sepsis. Materials and Methods: Four neonates from geographically different regions of South India presented with signs of neonatal sepsis, and all the routine clinical and laboratory investigations were performed. Blood culture by BacT/ALERT 3D was negative. To establish the aetiology, polymerase chain reaction (PCR) for the eubacterial genome and subsequent amplification with Gram positive and Gram negative primers were performed, followed by deoxyribonucleic acid (DNA) sequencing. Results: PCR for the detection of the eubacterial genome was positive in all four neonates, and further amplification with designed Gram positive and Gram negative primers revealed the presence of Gram negative bacteria. The amplicons were identified as Orientia tsutsugamushi in three neonates and Coxiella burnetii in the other neonate. Multalin analysis was done to further characterise the variation among the three strains. Conclusion: PCR-based DNA sequencing is a rapid and reliable diagnostic tool to identify the aetiological agents of neonatal sepsis. This is the first case series of emerging Rickettsial neonatal sepsis in India.

  8. Integrating transcriptome and genome re-sequencing data to identify key genes and mutations affecting chicken eggshell qualities.

    Directory of Open Access Journals (Sweden)

    Quan Zhang

    Full Text Available Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes (DEGs) were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate transcriptome and genome re-sequencing to target the genetic variations that decrease eggshell quality. These findings further advance our understanding of eggshell calcification in the chicken uterus.

  9. Structural variation in the chicken genome identified by paired-end next-generation DNA sequencing of reduced representation libraries

    Directory of Open Access Journals (Sweden)

    Okimoto Ron

    2011-02-01

    Full Text Available Abstract Background: Variation within individual genomes ranges from single nucleotide polymorphisms (SNPs) to kilobase, and even megabase, sized structural variants (SVs), such as deletions, insertions, inversions, and more complex rearrangements. Although much is known about the extent of SVs in humans and mice, species in which they exert significant effects on phenotypes, very little is known about the extent of SVs in the 2.5-times smaller and less repetitive genome of the chicken. Results: We identified hundreds of shared and divergent SVs in four commercial chicken lines relative to the reference chicken genome. The majority of SVs were found in intronic and intergenic regions, and we also found SVs in the coding regions. To identify the SVs, we combined high-throughput short read paired-end sequencing of genomic reduced representation libraries (RRLs) of pooled samples from 25 individuals and computational mapping of DNA sequences from a reference genome. Conclusion: We provide a first glimpse of the high abundance of small structural genomic variations in the chicken. Extrapolating our results, we estimate that there are thousands of rearrangements in the chicken genome, the majority of which are located in non-coding regions. We observed that structural variation contributes to genetic differentiation among current domesticated chicken breeds and the Red Jungle Fowl. We expect that, because of their high abundance, SVs might explain phenotypic differences and play a role in the evolution of the chicken genome. Finally, our study exemplifies an efficient and cost-effective approach for identifying structural variation in sequenced genomes.

  10. Multiplexed microsatellite recovery using massively parallel sequencing

    Science.gov (United States)

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
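The closing cost estimate can be checked with simple arithmetic. The sketch below is a reader's check, not the authors' calculation, and it assumes one library per species at a uniform price. The function names are hypothetical. The three figures (1373 organisms, $400 per species, $0.5M) come from the abstract.

```python
def total_cost(n_species, cost_per_species):
    """Total cost in USD when every species costs the same amount."""
    return n_species * cost_per_species


def break_even_cost(budget, n_species):
    """Highest uniform per-species cost (USD) that still fits the budget."""
    return budget / n_species


N_SPECIES = 1373        # listed organisms, from the abstract
MAX_COST = 400          # USD per species, the abstract's upper bound
BUDGET = 500_000        # USD, the abstract's "$0.5M"

print(total_cost(N_SPECIES, MAX_COST))                 # 549200
print(round(break_even_cost(BUDGET, N_SPECIES), 2))    # 364.17
```

At exactly $400 per species the total would be $549,200, so the "under $0.5M" figure holds only if the realized cost is below roughly $364 per species. That is consistent with the abstract's wording of "less than $400".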

  11. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak

    Science.gov (United States)

    Saelens, Joseph W.; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M.; Xet-Mull, Ana M.; Stout, Jason E.; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M.

    2015-01-01

    Summary Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City. PMID:26542222

  12. iCTX-Type: A Sequence-Based Predictor for Identifying the Types of Conotoxins in Targeting Ion Channels

    Directory of Open Access Journals (Sweden)

    Hui Ding

    2014-01-01

    Full Text Available Conotoxins are small disulfide-rich neurotoxic peptides, which can bind to ion channels with very high specificity and modulate their activities. Over the last few decades, conotoxins have been drug candidates for treating chronic pain, epilepsy, spasticity, and cardiovascular diseases. According to their functions and targets, conotoxins are generally categorized into three types: potassium-channel type, sodium-channel type, and calcium-channel type. With the avalanche of peptide sequences generated in the postgenomic age, it is urgent and challenging to develop an automated method for rapidly and accurately identifying the types of conotoxins based on their sequence information alone. To address this challenge, a new predictor, called iCTX-Type, was developed by incorporating the dipeptide occurrence frequencies of a conotoxin sequence into a 400-D (dimensional) general pseudo amino acid composition, followed by a feature optimization procedure to reduce the sample representation from a 400-D to a 50-D vector. The overall success rate achieved by iCTX-Type via a rigorous cross-validation was over 91%, outperforming its counterpart (RBF network). Besides, iCTX-Type is so far the only predictor in this area with its web-server available, and hence is particularly useful for most experimental scientists to get their desired results without the need to follow the complicated mathematics involved.
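The 400-D representation mentioned above follows from the 20 x 20 possible ordered pairs of standard amino acids. A minimal sketch of computing dipeptide occurrence frequencies is given here, assuming plain normalized counts of overlapping pairs. The function name, the ordering of the vector, and the toy input are hypothetical. This is not the iCTX-Type implementation, and the reduction from 400-D to 50-D is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dipeptide_frequencies(sequence):
    """Return a 400-element list of overlapping dipeptide frequencies.

    Pairs containing non-standard characters are skipped (an assumption
    made for this sketch).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy string for illustration only; it is not a real conotoxin sequence.
vector = dipeptide_frequencies("ACCA")
print(len(vector))                       # 400
print(vector[DIPEPTIDES.index("CC")])    # frequency of the "CC" pair
```

A feature vector of this kind has a fixed length regardless of peptide length, which is what allows sequences of different sizes to be fed to the same classifier.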

  13. Massively parallel DNA sequencing successfully identifies new causative mutations in deafness genes in patients with cochlear implantation and EAS.

    Directory of Open Access Journals (Sweden)

    Maiko Miyagawa

    Full Text Available Genetic factors, the most common etiology in severe to profound hearing loss, are one of the key determinants of Cochlear Implantation (CI) and Electric Acoustic Stimulation (EAS) outcomes. Satisfactory auditory performance after receiving a CI/EAS in patients with certain deafness gene mutations indicates that genetic testing would be helpful in predicting CI/EAS outcomes and deciding treatment choices. However, because of the extreme genetic heterogeneity of deafness, clinical application of genetic information still entails difficulties. Target exon sequencing using massively parallel DNA sequencing is a new powerful strategy to discover rare causative genes in Mendelian disorders such as deafness. We used massive sequencing of the exons of 58 target candidate genes to analyze 8 (4 early-onset, 4 late-onset) Japanese CI/EAS patients, who did not have mutations in commonly found genes including GJB2, SLC26A4, or mitochondrial 1555A>G or 3243A>G mutations. We successfully identified four rare causative mutations in the MYO15A, TECTA, TMPRSS3, and ACTG1 genes in four patients who showed relatively good auditory performance with CI including EAS, suggesting that genetic testing may be able to predict the performance after implantation.

  14. First report of an astrovirus type 5 gastroenteritis outbreak in a residential elderly care home identified by sequencing.

    Science.gov (United States)

    Jarchow-Macdonald, Anna A; Halley, Shona; Chandler, Daniel; Gunson, Rory; Shepherd, Samantha J; Parcell, Benjamin J

    2015-12-01

    This is the report of an outbreak of human astrovirus type 5 gastroenteritis that occurred in a residential care home for older people in June 2013 in Tayside, Scotland, and which involved seven staff members and thirteen residents. This type of astrovirus has not been found in Scotland before and is rarely described in the literature. The aim was to use molecular methods such as PCR and sequencing to detect the cause of this gastroenteritis outbreak and to contain the outbreak using public health measures. Following an epidemiological investigation, stool samples were sent for routine virology and microbiology testing at the local microbiology and virology laboratory and were found to be negative. Further testing with real-time PCR and gene sequencing at the West of Scotland Specialist Virology Centre was performed. Data on the epidemiology and the response to the outbreak were collected. All samples had a 99% match to human astrovirus type 5. The use of standard infection control precautions with the addition of transmission-based precautions most likely contained the spread of the virus in this situation. This report illustrates the importance of using PCR and sequencing to identify pathogens such as astrovirus in outbreaks of vomiting and diarrhoea in older people, particularly if routine virology and microbiology tests are negative. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. The genome sequence of Rickettsia felis identifies the first putative conjugative plasmid in an obligate intracellular parasite.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    Full Text Available We sequenced the genome of Rickettsia felis, a flea-associated obligate intracellular alpha-proteobacterium causing spotted fever in humans. Besides a circular chromosome of 1,485,148 bp, R. felis exhibits the first putative conjugative plasmid identified among obligate intracellular bacteria. This plasmid is found in a short (39,263 bp) and a long (62,829 bp) form. R. felis contrasts with previously sequenced Rickettsia in terms of many other features, including a number of transposases, several chromosomal toxin-antitoxin genes, many more spoT genes, and a very large number of ankyrin- and tetratricopeptide-motif-containing genes. Host-invasion-related genes for patatin and RickA were found. Several phenotypes predicted from genome analysis were experimentally tested: conjugative pili and mating were observed, as well as beta-lactamase activity, actin-polymerization-driven mobility, and hemolytic properties. Our study demonstrates that complete genome sequencing is the fastest approach to reveal phenotypic characters of recently cultured obligate intracellular bacteria.

  16. The genome sequence of Rickettsia felis identifies the first putative conjugative plasmid in an obligate intracellular parasite.

    Science.gov (United States)

    Ogata, Hiroyuki; Renesto, Patricia; Audic, Stéphane; Robert, Catherine; Blanc, Guillaume; Fournier, Pierre-Edouard; Parinello, Hugues; Claverie, Jean-Michel; Raoult, Didier

    2005-08-01

    We sequenced the genome of Rickettsia felis, a flea-associated obligate intracellular alpha-proteobacterium causing spotted fever in humans. Besides a circular chromosome of 1,485,148 bp, R. felis exhibits the first putative conjugative plasmid identified among obligate intracellular bacteria. This plasmid is found in a short (39,263 bp) and a long (62,829 bp) form. R. felis contrasts with previously sequenced Rickettsia in terms of many other features, including a number of transposases, several chromosomal toxin-antitoxin genes, many more spoT genes, and a very large number of ankyrin- and tetratricopeptide-motif-containing genes. Host-invasion-related genes for patatin and RickA were found. Several phenotypes predicted from genome analysis were experimentally tested: conjugative pili and mating were observed, as well as beta-lactamase activity, actin-polymerization-driven mobility, and hemolytic properties. Our study demonstrates that complete genome sequencing is the fastest approach to reveal phenotypic characters of recently cultured obligate intracellular bacteria.

  17. Population analysis of the alpha hemoglobin stabilizing protein (AHSP) gene identifies sequence variants that alter expression and function.

    Science.gov (United States)

    dos Santos, Camila O; Zhou, Suiping; Secolin, Rodrigo; Wang, Xiaomei; Cunha, Anderson F; Higgs, Douglas R; Kwiatkowski, Janet L; Thein, Swee Lay; Gallagher, Patrick G; Costa, Fernando F; Weiss, Mitchell J

    2008-02-01

    Alpha-hemoglobin stabilizing protein (AHSP) is a potential modifier of beta-thalassemia by virtue of its ability to detoxify excess free alpha-globin. However, examination of patients with beta-thalassemia from a few geographic regions failed to identify obvious AHSP mutations. We extended these studies by analyzing AHSP gene sequences in 366 anonymous individuals from five different areas of the world. We detected numerous polymorphisms comprising 18 different haplotypes and two rare missense mutations. Two sequence variations produce functional effects in laboratory assays. First, a rare missense mutation in a Brazilian/Mediterranean cohort converts asparagine to isoleucine at position 75 of AHSP protein and impairs its ability to inhibit reactive oxygen species production by alpha-hemoglobin. Second, a high-frequency polymorphism in intron 1 of the AHSP gene (12391 G>A) alters an Oct-1 transcription factor binding site previously shown to be important for optimal gene expression. The 12391A polymorphism impairs Oct-1 binding and inhibits the ability of AHSP regulatory sequences to activate expression of a linked luciferase reporter. Although structural mutations predicted to alter AHSP protein function or ablate its activity are rare, the 12391 G>A SNP is common and represents a potential mechanism through which genetically determined variations in AHSP expression could influence beta-thalassemia.

  18. T cell receptor sequencing of early-stage breast cancer tumors identifies altered clonal structure of the T cell repertoire.

    Science.gov (United States)

    Beausang, John F; Wheeler, Amanda J; Chan, Natalie H; Hanft, Violet R; Dirbas, Frederick M; Jeffrey, Stefanie S; Quake, Stephen R

    2017-11-28

    Tumor-infiltrating T cells play an important role in many cancers, and can improve prognosis and yield therapeutic targets. We characterized T cells infiltrating both breast cancer tumors and the surrounding normal breast tissue to identify T cells specific to each, as well as their abundance in peripheral blood. Using immune profiling of the T cell beta-chain repertoire in 16 patients with early-stage breast cancer, we show that the clonal structure of the tumor is significantly different from adjacent breast tissue, with the tumor containing ∼2.5-fold greater density of T cells and higher clonality compared with normal breast. The clonal structure of T cells in blood and normal breast is more similar than between blood and tumor, and could be used to distinguish tumor from normal breast tissue in 14 of 16 patients. Many T cell sequences overlap between tissue and blood from the same patient, including ∼50% of T cells between tumor and normal breast. Both tumor and normal breast contain high-abundance "enriched" sequences that are absent or of low abundance in the other tissue. Many of these T cells are either not detected or detected with very low frequency in the blood, suggesting the existence of separate compartments of T cells in both tumor and normal breast. Enriched T cell sequences are typically unique to each patient, but a subset is shared between many different patients. We show that many of these are commonly generated sequences, and thus unlikely to play an important role in the tumor microenvironment. Copyright © 2017 the Author(s). Published by PNAS.

  19. Exome sequencing identifies a novel UNC5D mutation in a severe myopic anisometropia family: A case report.

    Science.gov (United States)

    Feng, Lei; Zhou, Daizhan; Zhang, Zhou; He, Lin; Liu, Yun; Yang, Yabo

    2017-06-01

    Severe myopic anisometropia is known to be heritable, but its pathogenesis remains obscure. Here, we present a Chinese family with severe myopic anisometropia in which 5 members are affected. Through exome sequencing, we identified a novel mutation in the UNC5D gene (c.1297C>T, p.R433C), which was predicted to have a damaging effect on protein function and whose site is highly conserved across species. As previously described, the UNC5D gene belongs to the UNC5 protein family and may function in regulating neuronal migration, axon guidance, and cell survival. UNC5D expression has also been localized to the visual areas of the mouse cortex at early postnatal ages. Our data provide the first evidence for involvement of the UNC5D gene in severe myopic anisometropia.

  20. Fast forward genetics to identify mutations causing a high light tolerant phenotype in Chlamydomonas reinhardtii by whole-genome-sequencing.

    Science.gov (United States)

    Schierenbeck, Lisa; Ries, David; Rogge, Kristin; Grewe, Sabrina; Weisshaar, Bernd; Kruse, Olaf

    2015-02-06

    High light tolerance of microalgae is a desired phenotype for efficient cultivation in large scale production systems under fluctuating outdoor conditions. Outdoor cultivation requires the use of either wild-type or non-GMO derived mutant strains due to safety concerns. The identification and molecular characterization of such mutants derived from untagged forward genetics approaches was previously limited by tedious and time-consuming methods such as classical meiotic mapping. The combination of mapping with next generation sequencing technologies offers alternative strategies to identify genes involved in high light adaptation in untagged mutants. We used the model alga Chlamydomonas reinhardtii in a non-GMO mutation strategy without any preceding crossing step or pooled progeny to identify genes involved in the regulatory processes of high light adaptation. To generate high light tolerant mutants, wildtype cells were mutagenized only to a low extent, followed by a stringent selection. We performed whole-genome sequencing of two independent mutants, hit1 and hit2, and the parental wildtype. The availability of a reference genome sequence and the removal of shared background variants between the wildtype strain and each mutant enabled us to identify two single nucleotide polymorphisms within the same gene, Cre02.g085050, hereafter called LRS1 (putative Light Response Signaling protein 1). These two independent single amino acid exchanges are both located in the putative WD40 propeller domain of the corresponding protein LRS1. Both mutants exhibited an increased rate of non-photochemical quenching (NPQ) and an improved resistance against chemically induced reactive oxygen species. In silico analyses revealed homology of LRS1 to the photoregulatory protein COP1 in plants. In this work we identified the nuclear encoded gene LRS1 as an essential factor for high light adaptation in C. reinhardtii. The causative random mutation within this gene was
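The filtering idea described in this abstract, removing variants shared with the parental wildtype and then looking for a gene hit in both independent mutants, can be sketched with set operations. This is a minimal illustration under stated assumptions, not the authors' pipeline. Variants are represented as (chromosome, position, ref, alt) tuples. All coordinates, the gene labels geneA/geneB/geneC, and both function names are hypothetical.

```python
def private_variants(mutant, parent):
    """Variants present in the mutant but absent from the parental strain."""
    return mutant - parent


def genes_hit_in_all(mutant_variant_sets, parent, gene_of):
    """Genes carrying at least one private variant in every mutant."""
    gene_sets = []
    for variants in mutant_variant_sets:
        private = private_variants(variants, parent)
        gene_sets.append({gene_of[v] for v in private if v in gene_of})
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical variant calls (illustrative values only).
wildtype = {("chr1", 100, "A", "G"), ("chr2", 500, "C", "T")}
hit1 = wildtype | {("chr2", 1200, "G", "A"), ("chr3", 50, "T", "C")}
hit2 = wildtype | {("chr2", 1350, "C", "T")}

# Hypothetical annotation mapping each variant to the gene it falls in.
gene_of = {
    ("chr1", 100, "A", "G"): "geneC",
    ("chr2", 1200, "G", "A"): "geneA",
    ("chr2", 1350, "C", "T"): "geneA",
    ("chr3", 50, "T", "C"): "geneB",
}

print(genes_hit_in_all([hit1, hit2], wildtype, gene_of))  # {'geneA'}
```

Requiring the same gene to be hit by different private variants in independent mutants is what makes a candidate stand out from the random mutational background.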

  1. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Bernd Timmermann

    Full Text Available BACKGROUND: Colorectal cancer (CRC), with approximately 1 million cases, is the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope of improving early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large number of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent unaffected normal colonic tissue from microsatellite stable (MSS) and microsatellite instable (MSI) colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. In line with predictions based on MSS and MSI pathomechanisms, we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS, and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the most significant mutations to 359 for MSI and 45 for MSS CRCs with predicted altered protein functions. In both CRCs, MSI and MSS, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A, BMPR1A, a gene in which germline mutations have so far been associated with juvenile polyposis syndrome, and show that the mutations functionally impair the protein function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.

  2. Whole exome sequencing identifies driver mutations in asymptomatic computed tomography-detected lung cancers with normal karyotype.

    Science.gov (United States)

    Belloni, Elena; Veronesi, Giulia; Rotta, Luca; Volorio, Sara; Sardella, Domenico; Bernard, Loris; Pece, Salvatore; Di Fiore, Pier Paolo; Fumagalli, Caterina; Barberis, Massimo; Spaggiari, Lorenzo; Pelicci, Pier Giuseppe; Riva, Laura

    2015-04-01

    The efficacy of curative surgery for lung cancer could be largely improved by non-invasive screening programs, which can detect the disease at early stages. We previously showed that 18% of screening-identified lung cancers demonstrate a normal karyotype and, following high-density genome scanning, can be subdivided into samples with 1) numerous; 2) none; and 3) few copy number alterations. Whole exome sequencing was applied to the two normal karyotype, screening-detected lung cancers, constituting group 2, as well as normal controls. We identified mutations in both tumors, including KEAP1 (commonly mutated in lung cancers) in one, and TP53, PMS1, and MSH3 (well-characterized DNA-repair genes) in the other. The two normal karyotype screening-detected lung tumors displayed a typical lung cancer mutational profile that only next generation sequencing could reveal, which offered an additional contribution to the over-diagnosis bias concept hypothesized within lung cancer screening programs. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Cross-comparison of the genome sequences from human, chimpanzee, Neanderthal and a Denisovan hominin identifies novel potentially compensated mutations

    Directory of Open Access Journals (Sweden)

    Zhang Guojie

    2011-07-01

    The recent publication of the draft genome sequences of the Neanderthal and a ~50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database) and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs) of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met) was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.

  4. Cross-comparison of the genome sequences from human, chimpanzee, Neanderthal and a Denisovan hominin identifies novel potentially compensated mutations.

    Science.gov (United States)

    Zhang, Guojie; Pei, Zhang; Ball, Edward V; Mort, Matthew; Kehrer-Sawatzki, Hildegard; Cooper, David N

    2011-07-01

    The recent publication of the draft genome sequences of the Neanderthal and a ∼50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database) and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs) of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met) was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.

  5. Exome capture sequencing identifies a novel CCM1 mutation in a Chinese family with multiple cerebral cavernous malformations.

    Science.gov (United States)

    Mao, Cheng-Yuan; Yang, Jing; Zhang, Shu-Yu; Luo, Hai-Yang; Song, Bo; Liu, Yu-Tao; Wu, Jun; Sun, Shi-Lei; Yang, Zhi-Hua; Du, Pan; Wang, Yao-He; Shi, Chang-He; Xu, Yu-Ming

    2016-12-01

    Cerebral cavernous malformations (CCMs) are vascular anomalies predominantly in the central nervous system but may include lesions in other tissues, such as the retina, skin and liver. The main clinical manifestations include seizures, hemorrhage, recurrent headaches and focal neurological deficits. Previous studies of familial CCMs (FCCMs) have mainly reported Hispanic and Caucasian cases. Here, we report on FCCMs in a Chinese family further characterized by a novel CCM1 gene mutation. We investigated clinical and neuroradiological features of a Chinese family of 30 members. Furthermore, we used exome capture sequencing to identify the causative gene. The CCM1 mRNA expression level in three patients of the family and 10 wild-type healthy individuals was measured by real-time quantitative polymerase chain reaction (real-time RT-PCR). Brain magnetic resonance imaging demonstrated multiple intracranial lesions in seven members. The clinical manifestation of CCM was found in five of these cases, including recurrent headaches, weakness, hemorrhage and seizures. Moreover, we identified a novel nonsense mutation c.1159G>T (p.E387*) in the CCM1 gene in the pedigree. Based on real-time RT-PCR results, we found that the CCM1 mRNA expression level in the three patients was 35% lower than that in wild-type healthy individuals. Our finding suggests that the novel nonsense mutation c.1159G>T in the CCM1 gene is associated with FCCM, and that CCM1 haploinsufficiency may be the underlying mechanism of CCMs. Furthermore, it also demonstrates that exome capture sequencing is an efficient and direct diagnostic tool to identify causes of genetically heterogeneous diseases.
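
The variant notation in this record can be checked with simple codon arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes cDNA position 1 is the first base of the start codon, and because the abstract does not give the reference codon, it tries both standard glutamate codons. The function names are hypothetical.

```python
# Illustrative check that c.1159G>T is consistent with p.E387*.
# Assumption: cDNA coordinates start at the first base of the start codon.
GLU_CODONS = {"GAA", "GAG"}          # standard codons for glutamate (E)
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (*)

def codon_of(cdna_pos):
    """Return (codon number, position within codon), both 1-based."""
    index = cdna_pos - 1
    return index // 3 + 1, index % 3 + 1

def substitute(codon, pos_in_codon, alt_base):
    """Replace the base at a 1-based position within a codon."""
    return codon[:pos_in_codon - 1] + alt_base + codon[pos_in_codon:]

codon_number, pos_in_codon = codon_of(1159)
# The reference codon is not stated in the abstract, so try both Glu codons.
mutated = {substitute(c, pos_in_codon, "T") for c in GLU_CODONS}

print(codon_number, pos_in_codon)   # codon 387, first base
print(mutated <= STOP_CODONS)       # every G>T result is a stop codon
```

Position 1159 falls on the first base of codon 387, and replacing the leading G of either glutamate codon with T gives a stop codon, which agrees with the reported nonsense change.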

  6. RNA sequencing identifies gene expression profile changes associated with β-estradiol treatment in U2OS osteosarcoma cells

    Directory of Open Access Journals (Sweden)

    Chen B

    2017-07-01

    Bin Chen, Zude Liu, Jidong Zhang, Hantao Wang, Bo Yu. Department of Orthopedic Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, People’s Republic of China. Abstract: This study was conducted to identify gene expression profile changes associated with β-estradiol (E2) treatment in U2OS osteosarcoma cells by high-throughput RNA sequencing (RNA-seq). Two U2OS cell samples treated with E2 (15 µmol/L) and two untreated control U2OS cell samples were subjected to RNA-seq. Differentially expressed genes (DEGs) between the groups were identified, and main biological process enrichment was performed using gene ontology (GO) analysis. A protein–protein interaction (PPI) network was constructed using Cytoscape based on the Human Protein Reference Database. Finally, NFKB1 expression was confirmed by quantitative real-time polymerase chain reaction (qRT-PCR). The map ratios of the four sequenced samples were >65%. In total, 128 upregulated and 92 downregulated DEGs were identified in E2 samples. After GO enrichment, the downregulated DEGs, such as AKT1, were found to be mainly enriched in cell cycle processes, whereas the upregulated DEGs, such as NFKB1, were involved in the regulation of gene expression. Moreover, AKT1 (degree = 117) and NFKB1 (degree = 72) were key nodes with the highest degrees in the PPI network. Similarly, the results of qRT-PCR confirmed that E2 upregulated NFKB1 expression. The results suggest that E2 upregulates the expression of NFKB1, ATF7IP, and HDAC5, all of which are involved in the regulation of gene expression and transcription, but downregulates that of TCF7L2, ALCAM, and AKT, which are involved in Wnt receptor signaling through β-catenin and morphogenesis in U2OS osteosarcoma cells. Keywords: differentially expressed genes, Wnt receptor signaling, β-catenin, protein-protein interaction network

  7. Establishing genomic tools and resources for Guizotia abyssinica (L.f.) Cass.-the development of a library of expressed sequence tags, microsatellite loci, and the sequencing of its chloroplast genome.

    Science.gov (United States)

    Dempewolf, Hannes; Kane, Nolan C; Ostevik, Katherine L; Geleta, Mulatu; Barker, Michael S; Lai, Zhao; Stewart, Megan L; Bekele, Endashaw; Engels, Johannes M M; Cronk, Quentin C B; Rieseberg, Loren H

    2010-11-01

    We present an EST library, chloroplast genome sequence, and nuclear microsatellite markers that were developed for the semi-domesticated oilseed crop noug (Guizotia abyssinica) from Ethiopia. The EST library consists of 25 711 Sanger reads, assembled into 17 538 contigs and singletons, of which 4781 were functionally annotated using the Arabidopsis Information Resource (TAIR). The age distribution of duplicated genes in the EST library shows evidence of two paleopolyploidizations-a pattern that noug shares with several other species in the Heliantheae tribe (Compositae family). From the EST library, we selected 43 microsatellites and then designed and tested primers for their amplification. The number of microsatellite alleles varied between 2 and 10 (average 4.67), and the average observed and expected heterozygosities were 0.49 and 0.54, respectively. The chloroplast genome was sequenced de novo using Illumina's sequencing technology and completed with traditional Sanger sequencing. No large re-arrangements were found between the noug and sunflower chloroplast genomes, but 1.4% of sites have indels and 1.8% show sequence divergence between the two species. We identified 34 tRNAs, 4 rRNA sequences, and 80 coding sequences, including one region (trnH-psbA) with 15% sequence divergence between noug and sunflower that may be particularly useful for phylogeographic studies in noug and its wild relatives. © 2010 Blackwell Publishing Ltd.
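
The observed and expected heterozygosities quoted in this record are standard per-locus summaries. The abstract does not say which estimator was used, so the sketch below assumes the simplest textbook forms: observed heterozygosity as the fraction of heterozygous individuals, and expected heterozygosity as one minus the sum of squared allele frequencies. The function names and example genotypes are hypothetical and are not data from the study.

```python
# Illustrative per-locus heterozygosity calculations (assumed textbook
# definitions; not necessarily the estimator used by the authors).
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2), with p_i the frequency of allele i at the locus."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

# Hypothetical microsatellite genotypes (allele sizes in bp) for four plants.
locus = [(152, 156), (152, 152), (156, 160), (152, 156)]
print(observed_heterozygosity(locus))
print(expected_heterozygosity(locus))
```

Averaging these two quantities over all loci would give figures comparable in kind to the 0.49 and 0.54 reported in the abstract.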

  8. The first endogenous herpesvirus, identified in the tarsier genome, and novel sequences from primate rhadinoviruses and lymphocryptoviruses.

    Directory of Open Access Journals (Sweden)

    Amr Aswad

    2014-06-01

    Herpesviridae is a diverse family of large and complex pathogens whose genomes are extremely difficult to sequence. This is particularly true for clinical samples, and if the virus, host, or both genomes are being sequenced for the first time. Although herpesviruses are known to occasionally integrate in host genomes, and can also be inherited in a Mendelian fashion, they are notably absent from the genomic fossil record comprised of endogenous viral elements (EVEs). Here, we combine paleovirological and metagenomic approaches to both explore the constituent viral diversity of mammalian genomes and search for endogenous herpesviruses. We describe the first endogenous herpesvirus from the genome of the Philippine tarsier, belonging to the Roseolovirus genus, and characterize its highly defective genome that is integrated and flanked by unambiguous host DNA. From a draft assembly of the aye-aye genome, we use bioinformatic tools to reveal over 100,000 bp of a novel rhadinovirus that is the first lemur gammaherpesvirus, closely related to Kaposi's sarcoma-associated virus. We also identify 58 genes of Pan paniscus lymphocryptovirus 1, the bonobo equivalent of human Epstein-Barr virus. For each of the viruses, we postulate gene function via comparative analysis to known viral relatives. Most notably, the evidence from gene content and phylogenetics suggests that the aye-aye sequences represent the most basal known rhadinovirus, and indicates that tumorigenic herpesviruses have been infecting primates since their emergence in the late Cretaceous. Overall, these data show that a genomic fossil record of herpesviruses exists despite their extremely large genomes, and expands the known diversity of Herpesviridae, which will aid the characterization of pathogenesis. Our analytical approach illustrates the benefit of intersecting evolutionary approaches with metagenomics, genetics and paleovirology.

  9. The rolling circle amplification and next generation sequencing ...

    African Journals Online (AJOL)

    Using bioinformatic CLC Genomics 5.5.1 software programs the quality assessment of reads and contig assembly of viral sequences. This was done through de novo and reference-guided assembly. The identity and diversities of the begomoviral sequences were compared with sequences in Sanger sequencing of viral ...

  10. CA88, a nuclear repetitive DNA sequence identified in Schistosoma mansoni, aids in the genotyping of nine Schistosoma species of medical and veterinary importance

    OpenAIRE

    Diana Bahia; Rodrigues,Nilton B; Araújo,Flávio Marcos G; Álvaro José Romanha; Ruiz, Jerônimo C.; Johnston, David A.; Guilherme Oliveira

    2010-01-01

    CA88 is the first long nuclear repetitive DNA sequence identified in the blood fluke, Schistosoma mansoni. The assembled S. mansoni sequence, which contains the CA88 repeat, has 8,887 nucleotides and at least three repeat units of approximately 360 bp. In addition, CA88 also possesses an internal CA microsatellite, identified as SmBr18. Both PCR and BLAST analysis have been used to analyse and confirm the CA88 sequence in other S. mansoni sequences in the public database. PCR-acquired nuclear...

  11. Flavonoid Biosynthesis Genes Putatively Identified in the Aromatic Plant Polygonum minus via Expressed Sequences Tag (EST) Analysis

    Directory of Open Access Journals (Sweden)

    Zamri Zainal

    2012-02-01

    P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients such as flavonoid. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Thus, large-scale sequencing of P. minus cDNA libraries identified 4196 expressed sequences tags (ESTs), which were deposited in dbEST at the National Center for Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs, chalcone synthase, CHS (JG745304), flavonol synthase, FLS (JG705819), and leucoanthocyanidin dioxygenase, LDOX (JG745247), were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.

  12. De novo transcriptome sequencing in Frankliniella occidentalis to identify genes involved in plant virus transmission and insecticide resistance.

    Science.gov (United States)

    Zhang, Zhijun; Zhang, Pengjun; Li, Weidi; Zhang, Jinming; Huang, Fang; Yang, Jian; Bei, Yawei; Lu, Yaobin

    2013-05-01

    The western flower thrips (WFT), Frankliniella occidentalis, a world-wide invasive insect, causes agricultural damage by directly feeding and by indirectly vectoring Tospoviruses, such as Tomato spotted wilt virus (TSWV). We characterized the transcriptome of WFT and analyzed global gene expression of WFT response to TSWV infection using Illumina sequencing platform. We compiled 59,932 unigenes, and identified 36,339 unigenes by similarity analysis against public databases, most of which were annotated using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Within these annotated transcripts, we collected 278 sequences related to insecticide resistance. GO and KEGG analysis of different expression genes between TSWV-infected and non-infected WFT population revealed that TSWV can regulate cellular process and immune response, which might lead to low virus titers in thrips cells and no detrimental effects on F. occidentalis. This data-set not only enriches genomic resource for WFT, but also benefits research into its molecular genetics and functional genomics. Copyright © 2013 Elsevier Inc. All rights reserved.

  13. Exome sequencing identifies a novel MYH7 p.G407C mutation responsible for familial hypertrophic cardiomyopathy.

    Science.gov (United States)

    Guo, Qianqian; Xu, Yuejuan; Wang, Xike; Guo, Ying; Xu, Rang; Sun, Kun; Chen, Sun

    2014-10-01

    Hypertrophic cardiomyopathy (HCM), characterized by myocardial hypertrophy, is the most common cause of sudden cardiac arrest in young individuals. More than 270 mutations have been found to be responsible for familial HCM to date; mutations in MYH7, which encodes the β-myosin heavy chain (β-MHC) and MYBPC3, which encodes the myosin binding protein C, are seen most often. This study aimed to screen a pathogenic mutation causing HCM in a large family and assess its possible impact on the function of the specific protein. Exome sequencing was applied in the proband for searching a novel mutation; segments bearing the specific mutation were analyzed by polymerase chain reaction and direct sequencing. A novel p.G407C mutation in the β-MHC gene (MYH7) was identified to be responsible for familial HCM in this family. The mutation may cause damage to the second structure of the protein despite the fact that patients bearing the mutation may have a relatively benign prognosis in this family. The clinical details of the p.G407C mutation are described for the first time in this study. Our report shows a good genotype-phenotype consistency and makes it possible for genetic counseling in this family.

  14. De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube

    Directory of Open Access Journals (Sweden)

    Domenico Iaria

    2016-01-01

    In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear and the molecular basis underlying this process is still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing its own pistil and in combination pollen/pistil from self-sterile and self-fertile cultivars, have a distinct gene expression profile and many of the differentially expressed sequences between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members of the actin, actin depolymerization factor and fibrin gene families and members of the Ca2+ binding gene family related to the development and polarization of the pollen apical tip. The whole transcriptomic analysis, through the identification of the differentially expressed transcripts set and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.

  15. Mapping and exome sequencing identifies a mutation in the IARS gene as the cause of hereditary perinatal weak calf syndrome.

    Directory of Open Access Journals (Sweden)

    Takashi Hirano

    We identified an IARS (isoleucyl-tRNA synthetase) c.235G>C (p.Val79Leu) substitution as the causative mutation for neonatal weakness with intrauterine growth retardation (perinatal weak calf syndrome). In Japanese Black cattle, the syndrome was frequently found in calves sired by Bull A. Hence, we employed homozygosity mapping and linkage analysis. In order to identify the perinatal weak calf syndrome locus in a 4.04-Mb region of BTA 8, we analysed a paternal half-sibling family with a BovineSNP50 BeadChip and microsatellites. In this critical region, we performed exome sequencing to identify a causative mutation. Three variants were detected as possible candidates for causative mutations that were predicted to disrupt the protein function, including a G>C (p.Val79Leu) mutation in IARS c.235. The IARS c.235G>C mutation was not a homozygous risk allele in the 36 healthy offspring of Bull A. Moreover, the IARS Val79 residue and its flanking regions were evolutionarily and highly conserved. The IARS mutant (Leu79) had decreased aminoacylation activity. Additionally, the homozygous mutation was not found in any of 1526 healthy cattle. Therefore, we concluded that the IARS c.235G>C mutation was the cause of hereditary perinatal weak calf syndrome.

  16. Fine Mapping of a Clubroot Resistance Gene in Chinese Cabbage Using SNP Markers Identified from Bulked Segregant RNA Sequencing

    Directory of Open Access Journals (Sweden)

    Zhen Huang

    2017-08-01

    Clubroot, caused by Plasmodiophora brassicae, is an important disease of canola (Brassica napus) in western Canada and worldwide. In this study, a clubroot resistance gene (Rcr2) was identified and fine mapped in Chinese cabbage cv. “Jazz” using single-nucleotide polymorphism (SNP) markers identified from bulked segregant RNA sequencing (BSR-Seq), and molecular markers were developed for use in marker assisted selection. In total, 203.9 million raw reads were generated from one pooled resistant (R) and one pooled susceptible (S) sample, and >173,000 polymorphic SNP sites were identified between the R and S samples. One significant peak was observed between 22 and 26 Mb of chromosome A03, which had been predicted by BSR-Seq to contain the causal gene Rcr2. There were 490 polymorphic SNP sites identified in the region. A segregating population consisting of 675 plants was analyzed with 15 SNP sites in the region using the Kompetitive Allele Specific PCR method, and Rcr2 was fine mapped between two SNP markers, SNP_A03_32 and SNP_A03_67, at 0.1 and 0.3 cM from Rcr2, respectively. Five SNP markers co-segregated with Rcr2 in this region. Variants were identified in 14 of 36 genes annotated in the Rcr2 target region. The numbers of polymorphic variants differed among the genes. Four genes encode TIR-NBS-LRR proteins, and two of them, Bra019410 and Bra019413, had high numbers of polymorphic variants and so are the most likely candidates for Rcr2.

  17. Novel MicroRNA Involved in Host Response to Avian Pathogenic Escherichia coli Identified by Deep Sequencing and Integration Analysis.

    Science.gov (United States)

    Jia, Xinzheng; Nie, Qinghua; Zhang, Xiquan; Nolan, Lisa K; Lamont, Susan J

    2017-01-01

    Avian pathogenic Escherichia coli (APEC) causes one of the most common bacterial diseases of poultry worldwide. Effective control methods are therefore desirable and will be facilitated by a better understanding of the host response to the pathogen. Currently, microRNAs (miRNAs) involved in host resistance to APEC are unknown. Here, we applied RNA sequencing to explore the changed miRNAs and deregulated genes in the spleen of three groups of broilers: nonchallenged (NC), APEC-challenged with mild pathology (CM), and APEC-challenged with severe pathology (CS). Twenty-seven differentially expressed miRNAs (fold change >1.5; P value APEC infection, which may help to shed light on the roles of these recently identified genetic elements in the mechanisms of host resistance and susceptibility to APEC. Copyright © 2016 Jia et al.

  18. snpTree - a web-server to identify and construct SNP trees from whole genome sequence data

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Kaas, Rolf Sommer; Thomsen, Martin Christen Frølund

    2012-01-01

    Background The advances and decreasing economical cost of whole genome sequencing (WGS), will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic tree from WGS data. Results Here we introduce snpTree, a server for online-automatic SNPs analysis. This tool is composed of different SNPs analysis suites, perl and python scripts. snpTree can...

  19. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Science.gov (United States)

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.

  20. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    DEFF Research Database (Denmark)

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans

    The allelic architecture of schizophrenia (SZ) is likely to be underlined by a combination of multiple common and rare variants. Genome-wide association studies (GWAS) and large-scale consortia meta-analysis of GWAS have successfully been applied in the search for common variants affecting the risk... influencing susceptibility to schizophrenia using whole genome sequencing of Faroese case-control samples. We will conduct association testing based on IBD information (IBD association testing) by analysing genome-wide association of shared IBD segments identified by the method implemented in GERMLINE and clustered using the DASH algorithm. Genomic regions with evidence for shared ancestral polymorphisms and/or genetic linkage co-segregation will thus be prioritized. We hypothesize greater degree of IBD sharing of rare haplotypes in cases compared to controls for regions harbouring disease susceptibility...

  1. Sequencing illustrates the transcriptional response of Legionella pneumophila during infection and identifies seventy novel small non-coding RNAs.

    LENUS (Irish Health Repository)

    Weissenmayer, Barbara A

    2011-01-01

    Second generation sequencing has prompted a number of groups to re-interrogate the transcriptomes of several bacterial and archaeal species. One of the central findings has been the identification of complex networks of small non-coding RNAs that play central roles in transcriptional regulation in all growth conditions and for the pathogen's interaction with and survival within host cells. Legionella pneumophila is a gram-negative facultative intracellular human pathogen with a distinct biphasic lifestyle. One of its primary environmental hosts is the free-living amoeba Acanthamoeba castellanii and its infection by L. pneumophila mimics that seen in human macrophages. Here we present analysis of strand specific sequencing of the transcriptional response of L. pneumophila during exponential and post-exponential broth growth and during the replicative and transmissive phase of infection inside A. castellanii. We extend previous microarray based studies as well as uncovering evidence of a complex regulatory architecture underpinned by numerous non-coding RNAs. Over seventy new non-coding RNAs could be identified; many of them appear to be strain specific and in configurations not previously reported. We discover a family of non-coding RNAs preferentially expressed during infection conditions and identify a second copy of 6S RNA in L. pneumophila. We show that the newly discovered putative 6S RNA as well as a number of other non-coding RNAs show evidence for antisense transcription. The nature and extent of the non-coding RNAs and their expression patterns suggests that these may well play central roles in the regulation of Legionella spp. specific traits and offer clues as to how L. pneumophila adapts to its intracellular niche. The expression profiles outlined in the study have been deposited into Genbank's Gene Expression Omnibus (GEO) database under the series accession GSE27232.

  2. Massively parallel signature sequencing and bioinformatics analysis identifies up-regulation of TGFBI and SOX4 in human glioblastoma.

    Directory of Open Access Journals (Sweden)

    Biaoyang Lin

    BACKGROUND: A comprehensive network-based understanding of molecular pathways abnormally altered in glioblastoma multiforme (GBM) is essential for developing effective therapeutic approaches for this deadly disease. METHODOLOGY/PRINCIPAL FINDINGS: Applying a next generation sequencing technology, massively parallel signature sequencing (MPSS), we identified a total of 4535 genes that are differentially expressed between normal brain and GBM tissue. The expression changes of three up-regulated genes, CHI3L1, CHI3L2, and FOXM1, and two down-regulated genes, neurogranin and L1CAM, were confirmed by quantitative PCR. Pathway analysis revealed that TGF-beta pathway related genes were significantly up-regulated in GBM tumor samples. An integrative pathway analysis of the TGF-beta signaling network identified two alternative TGF-beta signaling pathways mediated by SOX4 (sex determining region Y-box 4) and TGFBI (Transforming growth factor beta induced). Quantitative RT-PCR and immunohistochemistry staining demonstrated that SOX4 and TGFBI expression is elevated in GBM tissues compared with normal brain tissues at both the RNA and protein levels. In vitro functional studies confirmed that TGFBI and SOX4 expression is increased by TGF-beta stimulation and decreased by a specific inhibitor of TGF-beta receptor 1 kinase. CONCLUSIONS/SIGNIFICANCE: Our MPSS database for GBM and normal brain tissues provides a useful resource for the scientific community. The identification of non-SMAD mediated TGF-beta signaling pathways acting through SOX4 and TGFBI (GENE ID:7045) in GBM indicates that these alternative pathways should be considered, in addition to the canonical SMAD mediated pathway, in the development of new therapeutic strategies targeting TGF-beta signaling in GBM. Finally, the construction of an extended TGF-beta signaling network with overlaid gene expression changes between GBM and normal brain extends our understanding of the biology of GBM.

  3. 46,XY Gonadal Dysgenesis due to a Homozygous Mutation in Desert Hedgehog (DHH) Identified by Exome Sequencing.

    Science.gov (United States)

    Werner, Ralf; Merz, Hartmut; Birnbaum, Wiebke; Marshall, Louise; Schröder, Tatjana; Reiz, Benedikt; Kavran, Jennifer M; Bäumer, Tobias; Capetian, Philipp; Hiort, Olaf

    2015-07-01

    46,XY disorders of sex development (DSD) comprise a heterogeneous group of congenital conditions. Mutations in a variety of genes can affect gonadal development or androgen biosynthesis/action and thereby influence the development of the internal and external genital organs. The objective of the study was to identify the genetic cause in two 46,XY sisters of a consanguineous family with DSD and gonadal tumor formation. We used a next-generation sequencing approach by exome sequencing. Electrophysiological and high-resolution ultrasound examination of peripheral nerves as well as histopathological examination of the gonads were performed. We identified a novel homozygous R124Q mutation in the desert hedgehog gene (DHH), which alters a conserved residue among the three mammalian Hedgehog ligands sonic hedgehog, Indian hedgehog, and desert hedgehog. No other relevant mutations in DSD-related genes were encountered. The gonads of one patient showed partial gonadal dysgenesis with loss of Leydig cells in tubular areas with seminoma in situ and a hyperplasia of Leydig cell-like cells expressing CYP17A1 in more dysgenetic parts of the gonad. In addition, both patients suffer from a polyneuropathy. High-resolution ultrasound revealed a structural change of peripheral nerve structure that fits well to a minifascicle formation of peripheral nerves. Mutations in DHH play a role in 46,XY gonadal dysgenesis and are associated with seminoma formation and a neuropathy with minifascicle formation. Gonadal dysgenesis in these cases may be due to impairment of Sertoli cell-Leydig cell interaction during gonadal development.

  4. Next generation sequencing of Apis mellifera syriaca identifies genes for Varroa resistance and beneficial bee keeping traits.

    Science.gov (United States)

    Haddad, Nizar; Mahmud Batainh, Ahmed; Suleiman Migdadi, Osama; Saini, Deepti; Krishnamurthy, Venkatesh; Parameswaran, Sriram; Alhamuri, Zaid

    2016-08-01

    Apis mellifera syriaca exhibits a high degree of tolerance to pests and pathogens including varroa mites. This native honey bee subspecies of Jordan expresses behavioral adaptations to high temperature and dry seasons typical of the region. However, persistent honey bee imports of commercial breeder lines are endangering the local honey bee population. This study reports the use of next-generation sequencing (NGS) technology to study the A. m. syriaca genome and to identify genetic factors possibly contributing toward mite resistance and other favorable traits. We obtained a total of 46.2 million raw reads by applying NGS to sequence A. m. syriaca and used an extensive bioinformatics approach to identify several candidate genes for Varroa mite resistance and for the behavioral and immune responses characteristic of these bees. As a part of characterizing the functional regulation of molecular genetic pathways, we have mapped the pathway genes potentially involved using information from Drosophila melanogaster and present possible functional changes implicated in responses to Varroa destructor mite infestation. We performed in-depth functional annotation to identify ∼600 relevant candidates; genes involved in pathways such as microbial recognition and phagocytosis, peptidoglycan recognition protein family, Gram negative binding protein family, phagocytosis receptors, serpins, Toll signaling pathway, Imd pathway, Tnf, JAK-STAT and MAPK pathway, haematopoiesis and cellular response pathways, antiviral, RNAi pathway, stress factors, etc. were selected. Finally, we have cataloged function-specific polymorphisms between A. mellifera and A. m. syriaca that could give better understanding of varroa mite resistance mechanisms and assist in breeding. We have identified immune related embryonic development (Cactus, Relish, dorsal, Ank2, baz), Varroa hygiene (NorpA2, Zasp, LanA, gasp, impl3) and Varroa resistance (Pug, pcmt, elk, elf3-s10, Dscam2, Dhc64C, gro

  5. Figaro: a novel statistical method for vector sequence removal

    Science.gov (United States)

    White, James Robert; Roberts, Michael; Yorke, James A.; Pop, Mihai

    2009-01-01

    Motivation: Sequences produced by automated Sanger sequencing machines frequently contain fragments of the cloning vector on their ends. Software tools currently available for identifying and removing the vector sequence require knowledge of the vector sequence, specific splice sites and any adapter sequences used in the experiment, information often omitted from public databases. Furthermore, the clipping coordinates themselves are missing or incorrectly reported. As an example, within the ~1.24 billion shotgun sequences deposited in the NCBI Trace Archive, as many as ~735 million (~60%) lack vector clipping information. Correct clipping information is essential to scientists attempting to validate, improve and even finish the increasingly large number of genomes released at a ‘draft’ quality level. Results: We present here Figaro, a novel software tool for identifying and removing the vector from raw sequence data without prior knowledge of the vector sequence. The vector sequence is automatically inferred by analyzing the frequency of occurrence of short oligo-nucleotides using Poisson statistics. We show that Figaro achieves 99.98% sensitivity when tested on ~1.5 million shotgun reads from Drosophila pseudoobscura. We further explore the impact of accurate vector trimming on the quality of whole-genome assemblies by re-assembling two bacterial genomes from shotgun sequences deposited in the Trace Archive. Designed as a module in large computational pipelines, Figaro is fast, lightweight and flexible. Availability: Figaro is released under an open-source license through the AMOS package (http://amos.sourceforge.net/Figaro). PMID:18202027
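    The abstract above describes inferring vector sequence from the frequency of short oligonucleotides using Poisson statistics. The following is a minimal sketch of that general idea only, not Figaro's actual algorithm: it assumes that vector-derived k-mers pile up in the first few bases of reads, estimates a background rate from the rest of each read, and flags k-mers whose observed count near the read start is improbably high under a Poisson model. The function names, the `prefix_len` window, the add-one pseudocount and the `alpha` threshold are all illustrative assumptions.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)  # guard against tiny negative rounding error


def overrepresented_prefix_kmers(reads, k=8, prefix_len=20, alpha=1e-6):
    """Return {kmer: p_value} for k-mers enriched at read starts.

    Hypothetical helper: k-mers lying entirely within the first `prefix_len`
    bases are compared with the rate of the same k-mer elsewhere in the reads.
    """
    prefix_counts, background = Counter(), Counter()
    n_prefix = n_background = 0
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if i + k <= prefix_len:
                prefix_counts[kmer] += 1
                n_prefix += 1
            else:
                background[kmer] += 1
                n_background += 1

    hits = {}
    for kmer, observed in prefix_counts.items():
        # Add-one pseudocount so unseen background k-mers get a nonzero rate.
        rate = (background[kmer] + 1) / (n_background + 1)
        expected = rate * n_prefix  # Poisson mean under "no vector" assumption
        p_value = poisson_sf(observed, expected)
        if p_value < alpha:
            hits[kmer] = p_value
    return hits
```

    In a real pipeline the flagged k-mers would still have to be assembled into a clipping coordinate per read; that step is outside this sketch.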

  6. RNA sequencing of Populus x canadensis roots identifies key molecular mechanisms underlying physiological adaption to excess zinc.

    Science.gov (United States)

    Ariani, Andrea; Di Baccio, Daniela; Romeo, Stefania; Lombardi, Lara; Andreucci, Andrea; Lux, Alexander; Horner, David Stephen; Sebastiani, Luca

    2015-01-01

    Populus x canadensis clone I-214 exhibits a general indicator phenotype in response to excess Zn, and a higher metal uptake in roots than in shoots with a reduced translocation to aerial parts under hydroponic conditions. This physiological adaptation seems mainly regulated by roots, although the molecular mechanisms that underlie these processes are still poorly understood. Here, differential expression analysis using RNA-sequencing technology was used to identify the molecular mechanisms involved in the response to excess Zn in root. In order to maximize specificity of detection of differentially expressed (DE) genes, we consider the intersection of genes identified by three distinct statistical approaches (61 up- and 19 down-regulated) and validate them by RT-qPCR, yielding an agreement of 93% between the two experimental techniques. Gene Ontology (GO) terms related to oxidation-reduction processes, transport and cellular iron ion homeostasis were enriched among DE genes, highlighting the importance of metal homeostasis in adaptation to excess Zn by P. x canadensis clone I-214. We identified the up-regulation of two Populus metal transporters (ZIP2 and NRAMP1) probably involved in metal uptake, and the down-regulation of a NAS4 gene involved in metal translocation. We identified also four Fe-homeostasis transcription factors (two bHLH38 genes, FIT and BTS) that were differentially expressed, probably for reducing Zn-induced Fe-deficiency. In particular, we suggest that the down-regulation of FIT transcription factor could be a mechanism to cope with Zn-induced Fe-deficiency in Populus. These results provide insight into the molecular mechanisms involved in adaption to excess Zn in Populus spp., but could also constitute a starting point for the identification and characterization of molecular markers or biotechnological targets for possible improvement of phytoremediation performances of poplar trees.
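    The abstract above keeps only genes called differentially expressed by all three statistical approaches. A minimal sketch of that consensus step is a set intersection; the abstract does not name the three methods, so the method labels and gene identifiers below are hypothetical placeholders.

```python
def consensus_calls(*call_sets):
    """Return the items present in every input collection (set intersection)."""
    if not call_sets:
        return set()
    return set.intersection(*(set(calls) for calls in call_sets))


# Hypothetical DE gene calls from three unnamed methods (illustrative IDs only).
method_a = {"gene1", "gene2", "gene3", "gene5"}
method_b = {"gene2", "gene3", "gene4", "gene5"}
method_c = {"gene0", "gene2", "gene5"}

consensus = consensus_calls(method_a, method_b, method_c)
print(sorted(consensus))
```

    Requiring agreement across methods trades sensitivity for specificity: a gene missed by any one method is dropped, which is consistent with the stated aim of maximizing specificity.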

  7. RNA sequencing of Populus x canadensis roots identifies key molecular mechanisms underlying physiological adaption to excess zinc.

    Directory of Open Access Journals (Sweden)

    Andrea Ariani

    Full Text Available Populus x canadensis clone I-214 exhibits a general indicator phenotype in response to excess Zn, and a higher metal uptake in roots than in shoots with a reduced translocation to aerial parts under hydroponic conditions. This physiological adaptation seems mainly regulated by roots, although the molecular mechanisms that underlie these processes are still poorly understood. Here, differential expression analysis using RNA-sequencing technology was used to identify the molecular mechanisms involved in the response to excess Zn in root. In order to maximize specificity of detection of differentially expressed (DE) genes, we consider the intersection of genes identified by three distinct statistical approaches (61 up- and 19 down-regulated) and validate them by RT-qPCR, yielding an agreement of 93% between the two experimental techniques. Gene Ontology (GO) terms related to oxidation-reduction processes, transport and cellular iron ion homeostasis were enriched among DE genes, highlighting the importance of metal homeostasis in adaptation to excess Zn by P. x canadensis clone I-214. We identified the up-regulation of two Populus metal transporters (ZIP2 and NRAMP1) probably involved in metal uptake, and the down-regulation of a NAS4 gene involved in metal translocation. We identified also four Fe-homeostasis transcription factors (two bHLH38 genes, FIT and BTS) that were differentially expressed, probably for reducing Zn-induced Fe-deficiency. In particular, we suggest that the down-regulation of FIT transcription factor could be a mechanism to cope with Zn-induced Fe-deficiency in Populus. These results provide insight into the molecular mechanisms involved in adaption to excess Zn in Populus spp., but could also constitute a starting point for the identification and characterization of molecular markers or biotechnological targets for possible improvement of phytoremediation performances of poplar trees.

  8. Exome sequencing identifies novel compound heterozygous mutations in SPG11 that cause autosomal recessive hereditary spastic paraplegia.

    Science.gov (United States)

    Zhao, Wei; Zhu, Qing-Yan; Zhang, Jia-Tang; Liu, Hui; Wang, Li-Juan; Chen, Zhi-Qiang; Guan, Li-Ping; Huang, Xu-Sheng; Yang, Ling; Yu, Sheng-Yuan

    2013-12-15

    Hereditary spastic paraplegia (HSP) is a neurodegenerative disease characterized by progressive weakness and spasticity of the lower limbs, in complicated forms, with additional neurological signs. To identify the genotype and characterize the phenotype in a Chinese HSP family, ten subjects from the family were examined through detailed clinical evaluations, auxiliary examinations and genetic tests. Using a combined approach of whole-exome sequencing and candidate mutation validation, we identified novel compound heterozygous mutations in the SPG11 gene of the patients as follows: a nonsense mutation c.6856C>T (p.R2286X) in exon 38 and a deletion mutation c.2863delG (p.Glu955Lysfs*8) in exon 16. Both mutations co-segregated with the phenotype in this family and were absent in 100 normal Chinese individuals. Our finding suggests that the novel compound heterozygous mutations in SPG11 are associated with HSP. We were able to assess the future risk of HSP in healthy younger family members using genetic detection, and provide prenatal diagnoses for the family members. Furthermore, to some extent, this new finding enriches the information on SPG11 and may provide a new basis for the genetic diagnosis of HSP. © 2013.

  9. Epidemiological study on the penicillin resistance of clinical Streptococcus pneumoniae isolates identified as the common sequence types.

    Science.gov (United States)

    Gao, Wei; Shi, Wei; Chen, Chang-hui; Wen, De-nian; Tian, Jin; Yao, Kai-hu

    2016-10-20

    There are some limitations in the current interpretation of the penicillin resistance mechanism of clinical Streptococcus pneumoniae isolates at the strain level. To explore the possibility of studying the mechanism based on the sequence types (STs) of this bacterium, 488 isolates collected in Beijing from 1997-2014 and 88 isolates collected in Youyang County, Chongqing and Zhongjiang County, Sichuan in 2015 were analyzed by penicillin minimum inhibitory concentration (MIC) distribution and annual distribution. The results showed that the penicillin MICs of all isolates covered by a given ST in Beijing have a defined range, either penicillin MIC penicillin MICs in the first few years after it was identified. The penicillin MIC of isolates identified as common STs and collected in Youyang County, Chongqing and Zhongjiang County, Sichuan, including ST271, ST320 and ST81, was around 0.25~2 mg/L (≥0.25 mg/L). Our study revealed the epidemiological distribution of penicillin MICs of the given STs determined in clinical S. pneumoniae isolates, suggesting that it is reasonable to research the penicillin resistance mechanism based on the STs of this bacterium.

  10. Targeted Next-Generation Sequencing Identifies a Recurrent Mutation in MCPH1 Associating with Hereditary Breast Cancer Susceptibility.

    Directory of Open Access Journals (Sweden)

    Tuomo Mantere

    2016-01-01

    Full Text Available Breast cancer is strongly influenced by hereditary risk factors, a majority of which still remain unknown. Here, we performed a targeted next-generation sequencing of 796 genes implicated in DNA repair in 189 Finnish breast cancer cases with indication of hereditary disease susceptibility and focused the analysis on protein truncating mutations. A recurrent heterozygous mutation (c.904_916del, p.Arg304ValfsTer3) was identified in the early DNA damage response gene MCPH1, significantly associating with breast cancer susceptibility both in familial (5/145, 3.4%, P = 0.003, OR 8.3) and unselected cases (16/1150, 1.4%, P = 0.016, OR 3.3). A total of 21 mutation positive families were identified, of which one-third exhibited also brain tumors and/or sarcomas (P = 0.0007). Mutation carriers exhibited significant increase in genomic instability assessed by cytogenetic analysis for spontaneous chromosomal rearrangements in peripheral blood lymphocytes (P = 0.0007), suggesting an effect for MCPH1 haploinsufficiency on cancer susceptibility. Furthermore, 40% of the mutation carrier tumors exhibited loss of the wild-type allele. These findings collectively provide strong evidence for MCPH1 being a novel breast cancer susceptibility gene, which warrants further investigations in other populations.

  11. TSSer: an automated method to identify transcription start sites in prokaryotic genomes from differential RNA sequencing data.

    Science.gov (United States)

    Jorjani, Hadi; Zavolan, Mihaela

    2014-04-01

    Accurate identification of transcription start sites (TSSs) is an essential step in the analysis of transcription regulatory networks. In higher eukaryotes, the capped analysis of gene expression technology enabled comprehensive annotation of TSSs in genomes such as those of mice and humans. In bacteria, an equivalent approach, termed differential RNA sequencing (dRNA-seq), has recently been proposed, but the application of this approach to a large number of genomes is hindered by the paucity of computational analysis methods. With few exceptions, when the method has been used, annotation of TSSs has been largely done manually. In this work, we present a computational method called 'TSSer' that enables the automatic inference of TSSs from dRNA-seq data. The method rests on a probabilistic framework for identifying both genomic positions that are preferentially enriched in the dRNA-seq data as well as preferentially captured relative to neighboring genomic regions. Evaluating our approach for TSS calling on several publicly available datasets, we find that TSSer achieves high consistency with the curated lists of annotated TSSs, but identifies many additional TSSs. Therefore, TSSer can accelerate genome-wide identification of TSSs in bacterial genomes and can aid in further characterization of bacterial transcription regulatory networks. TSSer is freely available under GPL license at http://www.clipz.unibas.ch/TSSer/index.php

  12. Using Next-Generation Sequencing to Identify a Mutation in Human MCSU that is Responsible for Type II Xanthinuria

    Directory of Open Access Journals (Sweden)

    Yunan Zhou

    2015-04-01

    Full Text Available Background: Hypouricemia is caused by various diseases and disorders, such as hepatic failure, Fanconi renotubular syndrome, nutritional deficiencies and genetic defects. Genetic defects of the molybdoflavoprotein enzymes induce hypouricemia and xanthinuria. Here, we identified a patient whose plasma and urine uric acid levels were both extremely low and aimed to identify the pathogenic gene and verify its mechanism. Methods: Using next-generation sequencing (NGS), we detected a mutation in the human molybdenum cofactor sulfurase (MCSU) gene that may cause hypouricemia. We cultured L02 cells, knocked down MCSU with RNAi, and then detected the uric acid and MCSU concentrations, xanthine oxidase (XOD) and xanthine dehydrogenase (XDH) activity levels, and xanthine/hypoxanthine concentrations in cell lysates and culture supernatants. Results: The NGS results showed that the patient had a mutation in the human MCSU gene. The in vitro study showed that RNAi of MCSU caused the uric acid, human MCSU concentrations, the XOD and XDH activity levels among cellular proteins and culture supernatants to be extremely low relative to those of the control. However, the xanthine/hypoxanthine concentrations were much higher than those of the control. Conclusions: We strongly confirmed the pathogenicity of the human MCSU gene.

  13. Using Next-Generation Sequencing to Identify a Mutation in Human MCSU that is Responsible for Type II Xanthinuria.

    Science.gov (United States)

    Zhou, Yunan; Zhang, Xueguang; Ding, Rui; Li, Zuoxiang; Hong, Quan; Wang, Yan; Zheng, Wei; Geng, Xiaodong; Fan, Meng; Cai, Guangyan; Chen, Xiangmei; Wu, Di

    2015-01-01

    Hypouricemia is caused by various diseases and disorders, such as hepatic failure, Fanconi renotubular syndrome, nutritional deficiencies and genetic defects. Genetic defects of the molybdoflavoprotein enzymes induce hypouricemia and xanthinuria. Here, we identified a patient whose plasma and urine uric acid levels were both extremely low and aimed to identify the pathogenic gene and verify its mechanism. Using next-generation sequencing (NGS), we detected a mutation in the human molybdenum cofactor sulfurase (MCSU) gene that may cause hypouricemia. We cultured L02 cells, knocked down MCSU with RNAi, and then detected the uric acid and MCSU concentrations, xanthine oxidase (XOD) and xanthine dehydrogenase (XDH) activity levels, and xanthine/hypoxanthine concentrations in cell lysates and culture supernatants. The NGS results showed that the patient had a mutation in the human MCSU gene. The in vitro study showed that RNAi of MCSU caused the uric acid, human MCSU concentrations, the XOD and XDH activity levels among cellular proteins and culture supernatants to be extremely low relative to those of the control. However, the xanthine/hypoxanthine concentrations were much higher than those of the control. We strongly confirmed the pathogenicity of the human MCSU gene. © 2015 S. Karger AG, Basel.

  14. Whole-genome sequencing overcomes pseudogene homology to diagnose autosomal dominant polycystic kidney disease.

    Science.gov (United States)

    Mallawaarachchi, Amali C; Hort, Yvonne; Cowley, Mark J; McCabe, Mark J; Minoche, André; Dinger, Marcel E; Shine, John; Furlong, Timothy J

    2016-11-01

    Autosomal dominant polycystic kidney disease (ADPKD) is the most common monogenic kidney disorder and is due to disease-causing variants in PKD1 or PKD2. Strong genotype-phenotype correlation exists although diagnostic sequencing is not part of routine clinical practice. This is because PKD1 bears 97.7% sequence similarity with six pseudogenes, requiring laborious and error-prone long-range PCR and Sanger sequencing to overcome. We hypothesised that whole-genome sequencing (WGS) would be able to overcome the problem of this sequence homology, because of 150 bp, paired-end reads and avoidance of capture bias that arises from targeted sequencing. We prospectively recruited a cohort of 28 unique pedigrees with ADPKD phenotype. Standard DNA extraction, library preparation and WGS were performed using Illumina HiSeq X and variants were classified following standard guidelines. Molecular diagnosis was made in 24 patients (86%), with 100% variant confirmation by current gold standard of long-range PCR and Sanger sequencing. We demonstrated unique alignment of sequencing reads over the pseudogene-homologous region. In addition to identifying function-affecting single-nucleotide variants and indels, we identified single- and multi-exon deletions affecting PKD1 and PKD2, which would have been challenging to identify using exome sequencing. We report the first use of WGS to diagnose ADPKD. This method overcomes pseudogene homology, provides uniform coverage, detects all variant types in a single test and is less labour-intensive than current techniques. This technique is translatable to a diagnostic setting, allows clinicians to make better-informed management decisions and has implications for other disease groups that are challenged by regions of confounding sequence homology.

  15. Synteny between Arabidopsis thaliana and rice at the genome level: a tool to identify conservation in the ongoing rice genome sequencing project

    OpenAIRE

    Salse, Jerome; Piégu, Benoît; Cooke, Richard; Delseny, Michel

    2002-01-01

    BLASTX alignment between 189.5 Mb of rice genomic sequence and translated Arabidopsis thaliana annotated coding sequences (CDS) identified 60 syntenic regions involving 4-22 rice orthologs covering ≤3.2 cM (centiMorgan). Most regions are

  16. Deep sequencing reveals the complete genome and evidence for transcriptional activity of the first virus-like sequences identified in Aristotelia chilensis (Maqui Berry).

    Science.gov (United States)

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-04-03

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  17. Sequencing Cucumber (Cucumis Sativus L.) Chloroplast Genomes Identifies Differences Between Chilling-Tolerant and-Susceptible Cucumber Lines

    Science.gov (United States)

    Complete sequencing of cucumber chloroplast (cp)DNA was facilitated by the development of 414 consensus chloroplast sequencing primers (CCSPs) from conserved cpDNA sequences of Arabidopsis (Arabidopsis thaliana L.), spinach (Spinacia oleracea L.), and tobacco (Nicotiana tabacum L.) cpDNAs, using deg...

  18. De Novo Transcriptome Sequencing in Passiflora edulis Sims to Identify Genes and Signaling Pathways Involved in Cold Tolerance

    Directory of Open Access Journals (Sweden)

    Sian Liu

    2017-11-01

    Full Text Available The passion fruit (Passiflora edulis Sims), also known as the purple granadilla, is widely cultivated as the new darling of the fruit market throughout southern China. This exotic and perennial climber is adapted to warm and humid climates, and thus is generally intolerant of cold. There is limited information about gene regulation and signaling pathways related to the cold stress response in this species. In this study, two transcriptome libraries (KEDU_AP vs. GX_AP) were constructed from the aerial parts of cold-tolerant and cold-susceptible varieties of P. edulis, respectively. Overall, 126,284,018 clean reads were obtained, and 86,880 unigenes with a mean size of 1449 bp were assembled. Of these, there were 64,067 (73.74%) unigenes with significant similarity to publicly available plant protein sequences. Expression profiles were generated, and 3045 genes were found to be significantly differentially expressed between the KEDU_AP and GX_AP libraries, including 1075 (35.3%) up-regulated and 1970 (64.7%) down-regulated. These included 36 genes in enriched pathways of plant hormone signal transduction, and 56 genes encoding putative transcription factors. Six genes involved in the ICE1–CBF–COR pathway were induced in the cold-tolerant variety, and their expression levels were further verified using quantitative real-time PCR. This report is the first to identify genes and signaling pathways involved in cold tolerance using high-throughput transcriptome sequencing in P. edulis. These findings may provide useful insights into the molecular mechanisms regulating cold tolerance and genetic breeding in Passiflora spp.

  19. Whole-exome sequencing to identify genetic risk variants underlying inhibitor development in severe hemophilia A patients.

    Science.gov (United States)

    Gorski, Marcin M; Blighe, Kevin; Lotta, Luca A; Pappalardo, Emanuela; Garagiola, Isabella; Mancini, Ilaria; Mancuso, Maria Elisa; Fasulo, Maria Rosaria; Santagostino, Elena; Peyvandi, Flora

    2016-06-09

    The development of neutralizing antibodies (inhibitors) against coagulation factor VIII (FVIII) is the most problematic and costly complication of FVIII replacement therapy that affects up to 30% of previously untreated patients with severe hemophilia A. The development of inhibitors is a multifactorial complication involving environmental and genetic factors. Among the latter, F8 gene mutations, ethnicity, family history of inhibitors, and polymorphisms affecting genes involved in the immune response have been previously investigated. To identify novel genetic elements underlying the risk of inhibitor development in patients with severe hemophilia A, we applied whole-exome sequencing (WES) and data analysis in a selected group of 26 Italian patients with (n = 17) and without (n = 9) inhibitors. WES revealed several rare, damaging variants in immunoregulatory genes as novel candidate mutations. A case-control association analysis using Cochran-Armitage and Fisher's exact statistical tests identified 1364 statistically significant variants. Hierarchical clustering of these genetic variants showed 2 distinct patterns of homozygous variants with a protective or harmful role in inhibitor development. When looking solely at coding variants, a total of 28 nonsynonymous variants were identified and replicated in 53 inhibitor-positive and 174 inhibitor-negative Italian severe hemophilia A patients using a TaqMan genotyping assay. The genotyping results revealed 10 variants showing estimated odds ratios in the same direction as in the discovery phase and confirmed the association of the rs3754689 missense variant (OR 0.58; 95% CI 0.36-0.94; P = .028) in a highly conserved haplotype region surrounding the LCT locus on chromosome 2q21 with inhibitor development. © 2016 by The American Society of Hematology.
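    The reported odds ratio, confidence interval and P value for rs3754689 can be checked for internal consistency. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state; under that assumption the standard error of log(OR) is recovered from the interval width and converted to a z statistic and a two-sided normal P value. The function name is illustrative.

```python
import math


def wald_check(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), z and a two-sided normal p-value from an OR and its 95% CI.

    Assumes the interval was built as exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p_two_sided


# Values as reported in the abstract: OR 0.58; 95% CI 0.36-0.94.
se, z, p = wald_check(0.58, 0.36, 0.94)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

    The recovered P value comes out near the reported .028 rather than matching it exactly, which is expected given rounding of the published OR and interval and the possibility that the authors used a different test.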

  20. Identification of Alternative Splicing and Fusion Transcripts in Non-Small Cell Lung Cancer by RNA Sequencing.

    Science.gov (United States)

    Hong, Yoonki; Kim, Woo Jin; Bang, Chi Young; Lee, Jae Cheol; Oh, Yeon-Mok

    2016-04-01

    Lung cancer is the most common cause of cancer-related death. Alterations in gene sequence, structure, and expression have an important role in the pathogenesis of lung cancer. Fusion genes and alternative splicing of cancer-related genes have the potential to be oncogenic. In the current study, we performed RNA-sequencing (RNA-seq) to investigate potential fusion genes and alternative splicing in non-small cell lung cancer. RNA was isolated from lung tissues obtained from 86 subjects with lung cancer. The RNA samples from lung cancer and normal tissues were processed with RNA-seq using the HiSeq 2000 system. Fusion genes were evaluated using DeFuse and ChimeraScan. Candidate fusion transcripts were validated by Sanger sequencing. Alternative splicing was analyzed using multivariate analysis of transcript sequencing and validated using quantitative real time polymerase chain reaction. RNA-seq data identified oncogenic fusion genes EML4-ALK and SLC34A2-ROS1 in three of 86 normal-cancer paired samples. Nine distinct fusion transcripts were selected using DeFuse and ChimeraScan, of which four were validated by Sanger sequencing. In 33 squamous cell carcinomas, 29 tumor specific skipped exon events and six mutually exclusive exon events were identified. ITGB4 and PYCR1 were top genes that showed significant tumor specific splice variants. In conclusion, RNA-seq data identified novel potential fusion transcripts and splice variants. Further evaluation of their functional significance in the pathogenesis of lung cancer is required.

  1. Mutational analysis using Sanger and next generation sequencing in sporadic spindle cell hemangiomas: A study of 19 cases

    NARCIS (Netherlands)

    Broek, R.W. ten; Bekers, E.M.; Leng, W.W.J. de; Strengman, E.; Tops, B.B.J.; Kutzner, H.; Leeuwis, J.W.; Gorp, J.M. van; Creytens, D.H.; Mentzel, T.; Diest, P.J. van; Eijkelenboom, A.; Flucke, U.

    2017-01-01

    Spindle cell hemangioma (SCH) is a distinct vascular soft-tissue lesion characterized by cavernous blood vessels and a spindle cell component mainly occurring in the distal extremities of young adults. The majority of cases harbor heterozygous mutations in IDH1/2 sporadically or rarely in

  2. Homozygosity mapping and targeted Sanger sequencing reveal genetic defects underlying inherited retinal disease in families from Pakistan

    NARCIS (Netherlands)

    Maria, M.; Ajmal, M.; Azam, M.; Waheed, N.K.; Siddiqui, S.N.; Mustafa, B.; Ayub, H.; Ali, L.; Ahmad, S.; Micheal, S.; Hussain, A.; Shah, S.T.; Ali, S.H.; Ahmed, W.; Khan, Y.M.; Hollander, A.I. den; Haer-Wigman, L.; Collin, R.W.J.; Khan, M.I.; Qamar, R.; Cremers, F.P.M.

    2015-01-01

    BACKGROUND: Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost and time effective

  3. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  4. Clinical Use of Next-Generation Sequencing in the Diagnosis of Wilson’s Disease

    Directory of Open Access Journals (Sweden)

    Dániel Németh

    2016-01-01

    Objective. Wilson’s disease (WD) is a disorder of copper metabolism which is fatal without treatment. The great number of disease-causing ATP7B gene mutations and the variable clinical presentation of WD may cause a real diagnostic challenge. The emergence of next-generation sequencing (NGS) provides a time-saving, cost-effective method for full sequencing of the whole ATP7B gene compared to traditional Sanger sequencing. This is the first report on the clinical use of NGS to examine the ATP7B gene. Materials and Methods. We used the Ion Torrent Personal Genome Machine in four heterozygous patients for the identification of the other mutations and also in two patients with no known mutation. One patient with acute-on-chronic liver failure was a candidate for acute liver transplantation. The results were validated by Sanger sequencing. Results. In each case, the diagnosis of Wilson’s disease was confirmed by identifying the mutations in both alleles within 48 hours. One novel mutation (p.Ala1270Ile) was found beyond the eight other known ones. The rapid detection of the mutations made possible the prompt diagnosis of WD in a patient with acute liver failure. Conclusions. According to our results, we found next-generation sequencing a very useful, reliable, time-saving, and cost-effective method for diagnosing Wilson’s disease in selected cases.

  5. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    Directory of Open Access Journals (Sweden)

    Fabio Marroni

    2012-06-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potential. They showed that pooled NGS can provide results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the variations in study design and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled next generation sequencing can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity and Tajima’s D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
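
    As an illustration of the indexes named in this abstract, the following is a minimal sketch of nucleotide diversity and Tajima's D computed from per-site allele frequencies, using the standard textbook formulas. It is not the review's own procedure: the function names are hypothetical, `n` is assumed to be the number of chromosomes in the pool, and the corrections for sequencing depth and sampling noise that pooled data usually require are deliberately left out.

```python
import math


def nucleotide_diversity(freqs, n):
    """Sum of per-site heterozygosity 2p(1-p), with the n/(n-1) sample correction."""
    return sum(2.0 * p * (1.0 - p) for p in freqs) * n / (n - 1)


def tajimas_d(freqs, n):
    """Tajima's D from allele frequencies at segregating sites.

    freqs: frequency of one allele at each segregating site (0 < p < 1).
    n:     number of sampled chromosomes (assumed known; must be >= 4,
           because the variance term is zero for smaller samples).
    """
    s = len(freqs)  # number of segregating sites
    if s == 0 or n < 4:
        raise ValueError("need at least one segregating site and n >= 4")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    pi = nucleotide_diversity(freqs, n)   # diversity from pairwise differences
    theta_w = s / a1                      # Watterson's estimator
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

    With these definitions, an excess of rare alleles gives a negative D and an excess of intermediate-frequency alleles gives a positive D.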

  6. [Screening for genetic mutations in hyperphenylalaninemia using Ion Torrent PGM sequencing].

    Science.gov (United States)

    Cao, Yanyan; Qu, Yujin; Song, Fang; Bai, Jinli; Jin, Yuwei; Wang, Hong

    2015-02-01

    To establish a hyperphenylalaninemia related genes screening method using Ion Torrent Personal Genome Machine (PGM) for early detection and differential diagnosis of hyperphenylalaninemia (HPA). Three children with known HPA mutations and a healthy control were used for setting up the method. Ten children with HPA with known mutations were recruited for validating the method. Ion Ampliseq PCR was used to amplify the 5' and 3' untranslated region, coding sequence, and flanking introns of PAH, GCH1, PTS, QDPR, and PCBD1 genes. After the enrichment with the Ion OneTouch system, the products were sequenced by PGM. Data from the PGM were processed with Torrent Suite v2.2 software package. All variations were confirmed by Sanger sequencing. For the 4 samples, the PGM output was 94.22 Mb, with approximately 99.5% of reads mapping to the target regions. Among these samples, we detected 74 variations (28 positions) including 6 known mutations. Compared with database and results of Sanger sequencing, 55 (18 positions) polymorphisms and 13 (4 positions) false positive calls were confirmed. For the 10 samples, all the known mutations were successfully identified. Ion Torrent PGM sequencing is suitable for screening genetic mutation underlying HPA from the perspective of metabolic pathways, which can meet the clinical demand for individualized diagnosis and treatment.

  7. CA88, a nuclear repetitive DNA sequence identified in Schistosoma mansoni, aids in the genotyping of nine Schistosoma species of medical and veterinary importance.

    Science.gov (United States)

    Bahia, Diana; Rodrigues, Nilton B; Araújo, Flávio Marcos G; Romanha, Alvaro José; Ruiz, Jerônimo C; Johnston, David A; Oliveira, Guilherme

    2010-07-01

    CA88 is the first long nuclear repetitive DNA sequence identified in the blood fluke, Schistosoma mansoni. The assembled S. mansoni sequence, which contains the CA88 repeat, has 8,887 nucleotides and at least three repeat units of approximately 360 bp. In addition, CA88 also possesses an internal CA microsatellite, identified as SmBr18. Both PCR and BLAST analysis have been used to analyse and confirm the CA88 sequence in other S. mansoni sequences in the public database. PCR-acquired nuclear repetitive DNA sequence profiles from nine Schistosoma species were used to classify this organism into four genotypes. Included among the nine species analysed were five sequences of both African and Asian lineages that are known to infect humans. Within these genotypes, three of them refer to recognised species groups. A panel of four microsatellite loci, including SmBr18 and three previously published loci, has been used to characterise the nine Schistosoma species. Each species has been identified and classified based on its CA88 DNA fingerprint profile. Furthermore, microsatellite sequences and intra-specific variation have also been observed within the nine Schistosoma species sequences. Taken together, these results support the use of these markers in studying the population dynamics of Schistosoma isolates from endemic areas and also provide new methods for investigating the relationships between different populations of parasites. In addition, these data also indicate that Schistosoma margrebowiei is not a sister taxon to Schistosoma mattheei, prompting a new designation to a basal clade.

  8. Patterns of oligonucleotide sequences in viral and host cell RNA identify mediators of the host innate immune system.

    Directory of Open Access Journals (Sweden)

    Benjamin D Greenbaum

    The innate immune response provides a first line of defense against pathogens by targeting generic differential features that are present in foreign organisms but not in the host. These innate responses generate selection forces acting both in pathogens and hosts that further determine their co-evolution. Here we analyze the nucleic acid sequence fingerprints of these selection forces acting in parallel on both host innate immune genes and ssRNA viral genomes. We do this by identifying dinucleotide biases in the coding regions of innate immune response genes in plasmacytoid dendritic cells, and then use this signal to identify other significant host innate immune genes. The persistence of these biases in the orthologous groups of genes in humans and chickens is also examined. We then compare the significant motifs in highly expressed genes of the innate immune system to those in ssRNA viruses and study the evolution of these motifs in the H1N1 influenza genome. We argue that the significant under-represented motif pattern of CpG in an AU context--which is found in both the ssRNA viruses and innate genes, and has decreased throughout the history of H1N1 influenza replication in humans--is immunostimulatory and has been selected against during the co-evolution of viruses and host innate immune genes. This shows how differences in host immune biology can drive the evolution of viruses that jump into species with different immune priorities than the original host.
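
    To make the idea of a dinucleotide bias concrete, here is a minimal sketch of a simple observed/expected odds ratio for one dinucleotide. This is a generic measure and only an assumption about how such a bias could be scored; the abstract does not specify the statistic, and the authors' analysis of coding regions may use a different null model. The function name is hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc="CG"):
    """Observed frequency of `dinuc` divided by the product of its base frequencies.

    Values below 1 indicate under-representation relative to what the
    single-base composition alone would predict.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")

    base_counts = Counter(seq)
    # Count every overlapping pair of adjacent bases.
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))

    f_pair = pair_counts[dinuc] / (len(seq) - 1)
    f_first = base_counts[dinuc[0]] / len(seq)
    f_second = base_counts[dinuc[1]] / len(seq)
    if f_first == 0 or f_second == 0:
        return float("nan")  # ratio is undefined when a base is absent
    return f_pair / (f_first * f_second)
```

    Applying the function to successive isolates of a viral genome would show whether the ratio drifts over time, which is the kind of trend the abstract reports for CpG in H1N1.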

  9. snpTree--a web-server to identify and construct SNP trees from whole genome sequence data.

    Science.gov (United States)

    Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M

    2012-01-01

    Advances in whole genome sequencing (WGS) and its decreasing cost will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs, including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic trees from WGS data. Here we introduce snpTree, a server for online automatic SNP analysis. This tool is composed of different SNP analysis suites and Perl and Python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA, while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on position on the reference genome, and a tree is constructed from the concatenated SNPs using FastTree and a Perl script. The online server was implemented with HTML, Java and Python scripts. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the last case the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. The snpTree server is an easy-to-use option for rapid, standardised and automatic SNP analysis in epidemiological studies, also for users with limited bioinformatic experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
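
    The step in which SNPs are concatenated by reference position can be sketched as follows. This is not snpTree's code: the function names and the input layout (a reference string with 0-based positions and one dictionary of called SNPs per isolate) are assumptions made for illustration, and isolates without a call at a position are assumed to carry the reference base.

```python
def concatenate_snps(reference, snp_calls):
    """Build one pseudo-sequence per isolate from all variable positions.

    reference: reference genome as a string (0-based positions).
    snp_calls: {isolate_name: {position: alternative_base}}.
    Returns (sorted positions, {isolate_name: concatenated SNP string}).
    """
    # Every position at which at least one isolate differs from the reference.
    positions = sorted({pos for calls in snp_calls.values() for pos in calls})

    alignment = {}
    for isolate, calls in snp_calls.items():
        # Fall back to the reference base where this isolate has no SNP call.
        alignment[isolate] = "".join(
            calls.get(pos, reference[pos]) for pos in positions
        )
    return positions, alignment


def to_fasta(alignment):
    """Format the concatenated SNPs as FASTA text for a tree-building program."""
    return "".join(f">{name}\n{seq}\n" for name, seq in sorted(alignment.items()))
```

    The resulting FASTA text is the kind of input a tree builder such as FastTree takes; how snpTree itself passes the data along is not described in the abstract.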

  10. TCR repertoire sequencing identifies synovial Treg cell clonotypes in the bloodstream during active inflammation in human arthritis.

    Science.gov (United States)

    Rossetti, Maura; Spreafico, Roberto; Consolaro, Alessandro; Leong, Jing Yao; Chua, Camillus; Massa, Margherita; Saidin, Suzan; Magni-Manzoni, Silvia; Arkachaisri, Thaschawee; Wallace, Carol A; Gattorno, Marco; Martini, Alberto; Lovell, Daniel J; Albani, Salvatore

    2017-02-01

    The imbalance between effector and regulatory T (Treg) cells is crucial in the pathogenesis of autoimmune arthritis. Immune responses are often investigated in the blood because of its accessibility, but circulating lymphocytes are not representative of those found in inflamed tissues. This disconnect hinders our understanding of the mechanisms underlying disease. Our goal was to identify Treg cells implicated in autoimmunity at the inflamed joints, and also readily detectable in the blood upon recirculation. We compared Treg cells of patients with juvenile idiopathic arthritis responding or not to therapy by using: (i) T cell receptor (TCR) sequencing, to identify clonotypes shared between blood and synovial fluid; (ii) FOXP3 Treg cell-specific demethylated region DNA methylation assays, to investigate their stability and (iii) flow cytometry and suppression assays to probe their tolerogenic functions. We found a subset of synovial Treg cells that recirculated into the bloodstream of patients with juvenile idiopathic and adult rheumatoid arthritis. These inflammation-associated (ia)Treg cells, but not other blood Treg cells, expanded during active disease and proliferated in response to their cognate antigens. Despite the typical inflammatory-skewed balance of immune mechanisms in arthritis, iaTreg cells were stably committed to the regulatory lineage and fully suppressive. A fraction of iaTreg clonotypes were in common with pathogenic effector T cells. Using an innovative antigen-agnostic approach, we uncovered a population of bona fide synovial Treg cells readily accessible from the blood and selectively expanding during active disease, paving the way to non-invasive diagnostics and better understanding of the pathogenesis of autoimmunity. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  11. De novo transcriptome sequencing in Bixa orellana to identify genes involved in methylerythritol phosphate, carotenoid and bixin biosynthesis.

    Science.gov (United States)

    Cárdenas-Conejo, Yair; Carballo-Uicab, Víctor; Lieberman, Meric; Aguilar-Espinosa, Margarita; Comai, Luca; Rivera-Madrid, Renata

    2015-10-28

    Bixin or annatto is a commercially important natural orange-red pigment derived from lycopene that is produced and stored in seeds of Bixa orellana L. An enzymatic pathway for bixin biosynthesis was inferred from homology of putative proteins encoded by differentially expressed seed cDNAs. Some activities were later validated in a heterologous system. Nevertheless, much of the pathway remains to be clarified. For example, it is essential to identify the methylerythritol phosphate (MEP) and carotenoid pathway genes. In order to investigate the MEP, carotenoid, and bixin pathway genes, total RNA from young leaves and two different developmental stages of seeds from B. orellana was used for the construction of indexed mRNA libraries, sequenced on the Illumina HiSeq 2500 platform and assembled de novo using Velvet, CLC Genomics Workbench and CAP3 software. A total of 52,549 contigs were obtained with an average length of 1,924 bp. Two phylogenetic analyses of inferred proteins, in one case encoded by thirteen general, single-copy cDNAs, in the other from carotenoid and MEP cDNAs, indicated that B. orellana is closely related to sister Malvales species cacao and cotton. Using homology, we identified 7 and 14 core gene products from the MEP and carotenoid pathways, respectively. Surprisingly, previously defined bixin pathway cDNAs were not present in our transcriptome. Here we propose a new set of gene products involved in the bixin pathway. The identification and qRT-PCR quantification of cDNAs involved in annatto production suggest a hypothetical model for bixin biosynthesis that involves coordinated activation of some MEP, carotenoid and bixin pathway genes. These findings provide a better understanding of the mechanisms regulating these pathways and will facilitate the genetic improvement of B. orellana.

  12. Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots

    Science.gov (United States)

    Franta, Zdeněk; Vogel, Heiko; Lehmann, Rüdiger; Rupp, Oliver; Goesmann, Alexander; Vilcinskas, Andreas

    2016-01-01

    Lucilia sericata larvae are used as an alternative treatment for recalcitrant and chronic wounds. Their excretions/secretions contain molecules that facilitate tissue debridement, disinfect, or accelerate wound healing and have therefore been recognized as a potential source of novel therapeutic compounds. Among the substances present in excretions/secretions various peptidase activities promoting the wound healing processes have been detected but the peptidases responsible for these activities remain mostly unidentified. To explore these enzymes we applied next generation sequencing to analyze the transcriptomes of different maggot tissues (salivary glands, gut, and crop) associated with the production of excretions/secretions and/or with digestion as well as the rest of the larval body. As a result we obtained more than 123.8 million paired-end reads, which were assembled de novo using Trinity and Oases assemblers, yielding 41,421 contigs with an N50 contig length of 2.22 kb and a total length of 67.79 Mb. BLASTp analysis against the MEROPS database identified 1729 contigs in 577 clusters encoding five peptidase classes (serine, cysteine, aspartic, threonine, and metallopeptidases), which were assigned to 26 clans, 48 families, and 185 peptidase species. The individual enzymes were differentially expressed among maggot tissues and included peptidase activities related to the therapeutic effects of maggot excretions/secretions. PMID:27119084

  13. A quantitative high-resolution genetic profile rapidly identifies sequence determinants of hepatitis C viral fitness and drug sensitivity.

    Directory of Open Access Journals (Sweden)

    Hangfei Qi

    2014-04-01

    Widely used chemical genetic screens have greatly facilitated the identification of many antiviral agents. However, the regions of interaction and inhibitory mechanisms of many therapeutic candidates have yet to be elucidated. Previous chemical screens identified Daclatasvir (BMS-790052) as a potent nonstructural protein 5A (NS5A) inhibitor for Hepatitis C virus (HCV) infection with an unclear inhibitory mechanism. Here we have developed a quantitative high-resolution genetic (qHRG) approach to systematically map the drug-protein interactions between Daclatasvir and NS5A and profile genetic barriers to Daclatasvir resistance. We implemented saturation mutagenesis in combination with next-generation sequencing technology to systematically quantify the effect of every possible amino acid substitution in the drug-targeted region (domain IA) of NS5A on replication fitness and sensitivity to Daclatasvir. This enabled determination of the residues governing drug-protein interactions. The relative fitness and drug sensitivity profiles also provide a comprehensive reference of the genetic barriers for all possible single amino acid changes during viral evolution, which we utilized to predict clinical outcomes using mathematical models. We envision that this high-resolution profiling methodology will be useful for next-generation drug development to select drugs with higher fitness costs to resistance, and also for informing the rational use of drugs based on viral variant spectra from patients.
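
    One common way to turn sequencing read counts from a mutant library into a relative fitness score is to compare each variant's enrichment with that of the wild type. The sketch below illustrates that general idea only; the abstract does not give the authors' formula, so the definition, the function name and the variant labels are all assumptions.

```python
def relative_fitness(counts_before, counts_after, wild_type="WT"):
    """Enrichment of each variant across selection, normalised to the wild type.

    counts_before / counts_after: {variant_label: read_count} for the library
    before and after a round of replication (hypothetical input layout).
    A value of 1.0 means the variant replicated as well as the wild type.
    """
    wt_ratio = counts_after[wild_type] / counts_before[wild_type]

    fitness = {}
    for variant, before in counts_before.items():
        if before == 0:
            continue  # no input coverage, so enrichment cannot be estimated
        enrichment = counts_after.get(variant, 0) / before
        fitness[variant] = enrichment / wt_ratio
    return fitness
```

    Running the same calculation on a library passaged with and without drug, and comparing the two scores per variant, would give a drug-sensitivity profile in the same spirit.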

  14. Whole exome sequencing identifies de novo heterozygous CAV1 mutations associated with a novel neonatal onset lipodystrophy syndrome.

    Science.gov (United States)

    Garg, Abhimanyu; Kircher, Martin; Del Campo, Miguel; Amato, R Stephen; Agarwal, Anil K

    2015-08-01

    Despite remarkable progress in identifying causal genes for many types of genetic lipodystrophies in the last decade, the molecular basis of many extremely rare lipodystrophy patients with distinctive phenotypes remains unclear. We conducted whole exome sequencing of the parents and probands from six pedigrees with neonatal onset of generalized loss of subcutaneous fat with additional distinctive phenotypic features and report de novo heterozygous null mutations, c.424C>T (p.Q142*) and c.479_480delTT (p.F160*), in CAV1 in a 7-year-old male and a 3-year-old female of European origin, respectively. Both the patients had generalized fat loss, thin mottled skin and progeroid features at birth. The male patient had cataracts requiring extraction at age 30 months and the female patient had pulmonary arterial hypertension. Dermal fibroblasts of the female patient revealed negligible CAV1 immunofluorescence staining compared to control but there were no differences in the number and morphology of caveolae upon electron microscopy examination. Based upon the similarities in the clinical features of these two patients, previous reports of CAV1 mutations in patients with lipodystrophies and pulmonary hypertension, and similar features seen in CAV1 null mice, we conclude that these variants are the most likely cause of one subtype of neonatal onset generalized lipodystrophy syndrome. © 2015 Wiley Periodicals, Inc.

  15. Treatment inferred from mutations identified using massive parallel sequencing leads to clinical benefit in some heavily pretreated cancer patients.

    Science.gov (United States)

    Zick, Aviad; Peretz, Tamar; Lotem, Michal; Hubert, Ayala; Katz, Daniela; Temper, Mark; Rottenberg, Yakir; Uziely, Beatrice; Nechushtan, Hovav; Meirovitz, Amichai; Sonnenblick, Amir; Sapir, Eli; Edelman, David; Goldberg, Yael; Lossos, Alexander; Rosenberg, Shai; Fried, Iris; Finklstein, Ruth; Pikarsky, Eli; Goldshmidt, Hanoch

    2017-05-01

    Molecular portraits of numerous tumors have flooded oncologists with vast amounts of data. In parallel, effective inhibitors of central pathways have shown great clinical benefit. Together, this promises potential clinical benefits to otherwise end-stage cancer patients. Here, we report a clinical service offering mutation detection of archived samples using the ion Ampliseq cancer panel coupled with clinical consultation. A multidisciplinary think tank consisting of oncologists, molecular-biologists, genetic counselors, and pathologists discussed 67 heavily pretreated, advanced cancer patient cases, taking into account mutations identified using ion Ampliseq cancer panel, medical history, and relevant literature. The team generated a treatment plan, targeting specific mutations, for 41 out of 64 cases. Three patients died before results were available. For 32 patients, the treating oncologists chose not to include the panel recommendation in the treatment plan for various reasons. Nine patients were treated as recommended by the panel, 5 with clinical benefit, and 4 with disease progression. This study suggests that routine use of massive parallel tumor sequencing is feasible and can judiciously affect treatment decisions when coupled with multidisciplinary team-based decision making. Administration of personalized based therapies at an earlier stage of disease, expansion of genetic alterations examined, and increased availability of targeted therapies may lead to further improvement in the clinical outcome of metastatic cancer patients.

  16. Next generation sequencing identifies mutations in Atonal homolog 7 (ATOH7) in families with global eye developmental defects

    Science.gov (United States)

    Khan, Kamron; Logan, Clare V.; McKibbin, Martin; Sheridan, Eamonn; Elçioglu, Nursel H.; Yenice, Ozlem; Parry, David A.; Fernandez-Fuentes, Narcis; Abdelhamed, Zakia I.A.; Al-Maskari, Ahmed; Poulter, James A.; Mohamed, Moin D.; Carr, Ian M.; Morgan, Joanne E.; Jafri, Hussain; Raashid, Yasmin; Taylor, Graham R.; Johnson, Colin A.; Inglehearn, Chris F.; Toomes, Carmel; Ali, Manir

    2012-01-01

    The atonal homolog 7 (ATOH7) gene encodes a transcription factor involved in determining the fate of retinal progenitor cells and is particularly required for optic nerve and ganglion cell development. Using a combination of autozygosity mapping and next generation sequencing, we have identified homozygous mutations in this gene, p.E49V and p.P18RfsX69, in two consanguineous families diagnosed with multiple ocular developmental defects, including severe vitreoretinal dysplasia, optic nerve hypoplasia, persistent fetal vasculature, microphthalmia, congenital cataracts, microcornea, corneal opacity and nystagmus. Most of these clinical features overlap with defects in the Norrin/β-catenin signalling pathway that is characterized by dysgenesis of the retinal and hyaloid vasculature. Our findings document Mendelian mutations within ATOH7 and imply a role for this molecule in the development of structures at the front as well as the back of the eye. This work also provides further insights into the function of ATOH7, especially its importance in retinal vascular development and hyaloid regression. PMID:22068589

  17. Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots.

    Science.gov (United States)

    Franta, Zdeněk; Vogel, Heiko; Lehmann, Rüdiger; Rupp, Oliver; Goesmann, Alexander; Vilcinskas, Andreas

    2016-01-01

    Lucilia sericata larvae are used as an alternative treatment for recalcitrant and chronic wounds. Their excretions/secretions contain molecules that facilitate tissue debridement, disinfect, or accelerate wound healing and have therefore been recognized as a potential source of novel therapeutic compounds. Among the substances present in excretions/secretions various peptidase activities promoting the wound healing processes have been detected but the peptidases responsible for these activities remain mostly unidentified. To explore these enzymes we applied next generation sequencing to analyze the transcriptomes of different maggot tissues (salivary glands, gut, and crop) associated with the production of excretions/secretions and/or with digestion as well as the rest of the larval body. As a result we obtained more than 123.8 million paired-end reads, which were assembled de novo using Trinity and Oases assemblers, yielding 41,421 contigs with an N50 contig length of 2.22 kb and a total length of 67.79 Mb. BLASTp analysis against the MEROPS database identified 1729 contigs in 577 clusters encoding five peptidase classes (serine, cysteine, aspartic, threonine, and metallopeptidases), which were assigned to 26 clans, 48 families, and 185 peptidase species. The individual enzymes were differentially expressed among maggot tissues and included peptidase activities related to the therapeutic effects of maggot excretions/secretions.

  18. Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots

    Directory of Open Access Journals (Sweden)

    Zdeněk Franta

    2016-01-01

    Lucilia sericata larvae are used as an alternative treatment for recalcitrant and chronic wounds. Their excretions/secretions contain molecules that facilitate tissue debridement, disinfect, or accelerate wound healing and have therefore been recognized as a potential source of novel therapeutic compounds. Among the substances present in excretions/secretions various peptidase activities promoting the wound healing processes have been detected but the peptidases responsible for these activities remain mostly unidentified. To explore these enzymes we applied next generation sequencing to analyze the transcriptomes of different maggot tissues (salivary glands, gut, and crop) associated with the production of excretions/secretions and/or with digestion as well as the rest of the larval body. As a result we obtained more than 123.8 million paired-end reads, which were assembled de novo using Trinity and Oases assemblers, yielding 41,421 contigs with an N50 contig length of 2.22 kb and a total length of 67.79 Mb. BLASTp analysis against the MEROPS database identified 1729 contigs in 577 clusters encoding five peptidase classes (serine, cysteine, aspartic, threonine, and metallopeptidases), which were assigned to 26 clans, 48 families, and 185 peptidase species. The individual enzymes were differentially expressed among maggot tissues and included peptidase activities related to the therapeutic effects of maggot excretions/secretions.

  19. Deep sequencing and in silico analyses identify MYB-regulated gene networks and signaling pathways in pancreatic cancer.

    Science.gov (United States)

    Azim, Shafquat; Zubair, Haseeb; Srivastava, Sanjeev K; Bhardwaj, Arun; Zubair, Asif; Ahmad, Aamir; Singh, Seema; Khushman, Moh'd; Singh, Ajay P

    2016-06-29

    We have recently demonstrated that the transcription factor MYB can modulate several cancer-associated phenotypes in pancreatic cancer. In order to understand the molecular basis of these MYB-associated changes, we conducted deep-sequencing of transcriptome of MYB-overexpressing and -silenced pancreatic cancer cells, followed by in silico pathway analysis. We identified significant modulation of 774 genes upon MYB-silencing (p MYB-silenced pancreatic cancer cells exhibiting suppression of EGFR and NF-κB. Decreased expression of EGFR and RELA was validated by both qPCR and immunoblotting and they were both shown to be under direct transcriptional control of MYB. These observations were further confirmed in a converse approach wherein MYB was overexpressed ectopically in a MYB-null pancreatic cancer cell line. Our findings thus suggest that MYB potentially regulates growth and genomic stability of pancreatic cancer cells via targeting complex gene networks and signaling pathways. Further in-depth functional studies are warranted to fully understand MYB signaling in pancreatic cancer.

  20. Automatically Identifying Fusion Events between GLUT4 Storage Vesicles and the Plasma Membrane in TIRF Microscopy Image Sequences.

    Science.gov (United States)

    Wu, Jian; Xu, Yingke; Feng, Zhouyan; Zheng, Xiaoxiang

    2015-01-01

    Quantitative analysis of the dynamic behavior of membrane-bound secretory vesicles has proven to be important in biological research. This paper proposes a novel approach to automatically identify the elusive fusion events between VAMP2-pHluorin labeled GLUT4 storage vesicles (GSVs) and the plasma membrane. Differencing is used to detect the initiation of fusion events, by modified forward subtraction of consecutive frames in the TIRFM image sequence. Spatially connected pixels in difference images brighter than a specified adaptive threshold are grouped into a distinct fusion spot. The vesicles are located at the intensity-weighted centroid of their fusion spots. To reveal the true in vivo nature of a fusion event, 2D Gaussian fitting for the fusion spot is used to derive the intensity-weighted centroid and the spot size during the fusion process. The fusion event and its termination can be determined according to the change of spot size. The method is evaluated on real experiment data with ground truth annotated by expert cell biologists. The evaluation results show that it can achieve relatively high accuracy, comparing favorably to manual analysis, in a small fraction of the time.
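
    The detection steps described above (forward subtraction, thresholding, grouping connected pixels, intensity-weighted centroid) can be sketched as follows. This is a simplified illustration, not the paper's implementation: the threshold rule (mean plus `k` standard deviations of the difference image), the 4-connectivity and all names are assumptions, and the 2D Gaussian fitting stage is omitted.

```python
from collections import deque


def detect_fusion_spots(prev_frame, next_frame, k=3.0):
    """Find bright spots that appear between two consecutive frames.

    Frames are 2D lists of pixel intensities with identical dimensions.
    Returns one dict per spot with its intensity-weighted centroid (row, col)
    and its size in pixels.
    """
    rows, cols = len(prev_frame), len(prev_frame[0])

    # Forward subtraction: keep only increases in intensity.
    diff = [[max(next_frame[r][c] - prev_frame[r][c], 0.0) for c in range(cols)]
            for r in range(rows)]

    # Assumed adaptive threshold: mean + k * standard deviation of the difference.
    values = [v for row in diff for v in row]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    threshold = mean + k * std

    seen = [[False] * cols for _ in range(rows)]
    spots = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or diff[r][c] <= threshold:
                continue
            # Breadth-first search over 4-connected pixels above the threshold.
            queue = deque([(r, c)])
            seen[r][c] = True
            total = weighted_row = weighted_col = 0.0
            size = 0
            while queue:
                y, x = queue.popleft()
                w = diff[y][x]
                total += w
                weighted_row += w * y
                weighted_col += w * x
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and diff[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            spots.append({"centroid": (weighted_row / total, weighted_col / total),
                          "size": size})
    return spots
```

    Tracking how the `size` of a spot changes over the following frames is one way to follow the course of a fusion event, in line with the spot-size criterion the abstract mentions.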

  1. Automatically Identifying Fusion Events between GLUT4 Storage Vesicles and the Plasma Membrane in TIRF Microscopy Image Sequences

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2015-01-01

    Quantitative analysis of the dynamic behavior of membrane-bound secretory vesicles has proven to be important in biological research. This paper proposes a novel approach to automatically identify the elusive fusion events between VAMP2-pHluorin labeled GLUT4 storage vesicles (GSVs) and the plasma membrane. Differentiation is implemented to detect the initiation of fusion events by modified forward subtraction of consecutive frames in the TIRFM image sequence. Spatially connected pixels in difference images brighter than a specified adaptive threshold are grouped into a distinct fusion spot. The vesicles are located at the intensity-weighted centroid of their fusion spots. To reveal the true in vivo nature of a fusion event, 2D Gaussian fitting for the fusion spot is used to derive the intensity-weighted centroid and the spot size during the fusion process. The fusion event and its termination can be determined according to the change of spot size. The method is evaluated on real experimental data with ground truth annotated by expert cell biologists. The evaluation results show that it achieves relatively high accuracy, comparing favorably to manual analysis, in a small fraction of the time.

  2. In situ sequencing identifies TMPRSS2-ERG fusion transcripts, somatic point mutations and gene expression levels in prostate cancers.

    Science.gov (United States)

    Kiflemariam, Sara; Mignardi, Marco; Ali, Muhammad Akhtar; Bergh, Anders; Nilsson, Mats; Sjöblom, Tobias

    2014-10-01

    Translocations contribute to the genesis and progression of epithelial tumours and in particular to prostate cancer development. To better understand the contribution of fusion transcripts and visualize the clonal composition of multifocal tumours, we have developed a technology for multiplex in situ detection and identification of expressed fusion transcripts. When compared to immunohistochemistry, TMPRSS2-ERG fusion-negative and fusion-positive prostate tumours were correctly classified. The most prevalent TMPRSS2-ERG fusion variants were visualized, identified, and quantitated in human prostate cancer tissues, and the ratio of the variant fusion transcripts could for the first time be directly determined by in situ sequencing. Further, we demonstrate concurrent in situ detection of gene expression, point mutations, and gene fusions of the prostate cancer relevant targets AMACR, AR, TP53, and TMPRSS2-ERG. This unified approach to in situ analyses of somatic mutations can empower studies of intra-tumoural heterogeneity and future tissue-based diagnostics of mutations and translocations. Copyright © 2014 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

  3. High throughput sequencing identifies an imprinted gene, Grb10, associated with the pluripotency state in nuclear transfer embryonic stem cells.

    Science.gov (United States)

    Li, Hui; Gao, Shuai; Huang, Hua; Liu, Wenqiang; Huang, Huanwei; Liu, Xiaoyu; Gao, Yawei; Le, Rongrong; Kou, Xiaochen; Zhao, Yanhong; Kou, Zhaohui; Li, Jia; Wang, Hong; Zhang, Yu; Wang, Hailin; Cai, Tao; Sun, Qingyuan; Gao, Shaorong; Han, Zhiming

    2017-07-18

    Somatic cell nuclear transfer and transcription factor mediated reprogramming are two widely used techniques for somatic cell reprogramming. Both fully reprogrammed nuclear transfer embryonic stem cells and induced pluripotent stem cells hold potential for regenerative medicine, and evaluation of the stem cell pluripotency state is crucial for these applications. Previous reports have shown that the Dlk1-Dio3 region is associated with pluripotency in induced pluripotent stem cells and the incomplete somatic cell reprogramming causes abnormally elevated levels of genomic 5-methylcytosine in induced pluripotent stem cells compared to nuclear transfer embryonic stem cells and embryonic stem cells. In this study, we compared pluripotency associated genes Rian and Gtl2 in the Dlk1-Dio3 region in exactly syngeneic nuclear transfer embryonic stem cells and induced pluripotent stem cells with same genomic insertion. We also assessed 5-methylcytosine and 5-hydroxymethylcytosine levels and performed high-throughput sequencing in these cells. Our results showed that Rian and Gtl2 in the Dlk1-Dio3 region related to pluripotency in induced pluripotent stem cells did not correlate with the genes in nuclear transfer embryonic stem cells, and no significant difference in 5-methylcytosine and 5-hydroxymethylcytosine levels were observed between fully and partially reprogrammed nuclear transfer embryonic stem cells and induced pluripotent stem cells. Through syngeneic comparison, our study identifies for the first time that Grb10 is associated with the pluripotency state in nuclear transfer embryonic stem cells.

  4. Sequencing Centers Panel at SFAF

    Energy Technology Data Exchange (ETDEWEB)

    Schilkey, Faye [NCGR; Ali, Johar [OICR; Grafham, Darren [Wellcome Trust Sanger Institute; Muzny, Donna [Baylor College of Medicine; Fulton, Bob [Washington University; Fitzgerald, Mike [Broad Institute; Hostetler, Jessica [J. Craig Venter Institute; Daum, Chris [DOE Joint Genome Institute]

    2010-06-02

    From left to right: Faye Schilkey of NCGR, Johar Ali of OICR, Darren Grafham of Wellcome Trust Sanger Institute, Donna Muzny of the Baylor College of Medicine, Bob Fulton of Washington University, Mike Fitzgerald of the Broad Institute, Jessica Hostetler of the J. Craig Venter Institute and Chris Daum of the DOE Joint Genome Institute discuss sequencing technologies, applications and pipelines on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  5. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Directory of Open Access Journals (Sweden)

    Karin Soares Cunha

    2016-12-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutation types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.
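
    The percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract (10 of 11 unrelated individuals, 10 mutations in total); the function names are hypothetical.

```python
def detection_rate(detected, tested):
    """Percentage of tested individuals in whom a mutation was found."""
    return 100.0 * detected / tested


def class_counts(total_mutations, shares_percent):
    """Convert percentage shares of mutation classes back into counts."""
    return {name: round(total_mutations * pct / 100)
            for name, pct in shares_percent.items()}


rate = detection_rate(10, 11)  # about 90.9, reported as 91%
counts = class_counts(10, {"splicing": 50, "frameshift": 30, "missense": 20})
```

    The class shares translate to five splicing, three frameshift and two missense mutations, which together account for all ten mutations reported.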

  6. Avian resistance to Campylobacter jejuni colonization is associated with an intestinal immunogene expression signature identified by mRNA sequencing.

    Directory of Open Access Journals (Sweden)

    Sarah Connell

    Campylobacter jejuni is the most common cause of human bacterial gastroenteritis and is associated with several post-infectious manifestations, including onset of the autoimmune neuropathy Guillain-Barré syndrome, causing significant morbidity and mortality. Poorly-cooked chicken meat is the most frequent source of infection as C. jejuni colonizes the avian intestine in a commensal relationship. However, not all chickens are equally colonized and resistance seems to be genetically determined. We hypothesize that differences in immune response may contribute to variation in colonization levels between susceptible and resistant birds. Using high-throughput sequencing in an avian infection model, we investigate gene expression associated with resistance or susceptibility to colonization of the gastrointestinal tract with C. jejuni and find that gut related immune mechanisms are critical for regulating colonization. Amongst a single population of 300 4-week old chickens, there was clear segregation in levels of C. jejuni colonization 48 hours post-exposure. RNAseq analysis of caecal tissue from 14 C. jejuni-susceptible and 14 C. jejuni-resistant birds generated over 363 million short mRNA sequences which were investigated to identify 219 differentially expressed genes. Significantly higher expression of genes involved in the innate immune response, cytokine signaling, B cell and T cell activation and immunoglobulin production, as well as the renin-angiotensin system was observed in resistant birds, suggesting an early active immune response to C. jejuni. Lower expression of these genes in colonized birds suggests suppression or inhibition of a clearing immune response thus facilitating commensal colonization and generating vectors for zoonotic transmission. This study describes biological processes regulating C. jejuni colonization of the avian intestine and gives insight into the differential immune mechanisms incited in response to commensal

  7. Sequence analysis of cereal sucrose synthase genes and isolation ...

    African Journals Online (AJOL)

    SERVER

    2007-10-18

    Oct 18, 2007 ... comparative analysis of grass genomes and as a source of beneficial genes for agriculture. Recent studies have shown that ... sequencing of sucrose synthase gene fragment from sorghum using primers designed at their ... Sequencing was carried out by Sanger dideoxy DNA sequencing method. Results.

  8. Next Generation Sequencing at the University of Chicago Genomics Core

    Energy Technology Data Exchange (ETDEWEB)

    Faber, Pieter [University of Chicago]

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  9. Genetic diagnosis of Duchenne/Becker muscular dystrophy using next-generation sequencing: validation analysis of DMD mutations.

    Science.gov (United States)

    Okubo, Mariko; Minami, Narihiro; Goto, Kanako; Goto, Yuichi; Noguchi, Satoru; Mitsuhashi, Satomi; Nishino, Ichizo

    2016-06-01

    Duchenne and Becker muscular dystrophies (DMD/BMD) are the most common inherited neuromuscular disease. The genetic diagnosis is not easily made because of the large size of the dystrophin gene, complex mutational spectrum and high number of tests patients undergo for diagnosis. Multiplex ligation-dependent probe amplification (MLPA) has been used as the initial diagnostic test of choice. Although MLPA can diagnose 70% of DMD/BMD patients having deletions/duplications, the remaining 30% of patients with small mutations require further analysis, such as Sanger sequencing. We applied a high-throughput method using Ion Torrent next-generation sequencing technology and diagnosed 92% of patients with DMD/BMD in a single analysis. We designed a multiplex primer pool for DMD and sequenced 67 cases having different mutations: 37 with deletions/duplications and 30 with small mutations or short insertions/deletions in DMD, using an Ion PGM sequencer. The results were compared with those from MLPA or Sanger sequencing. All deletions were detected. In contrast, 50% of duplications were correctly identified compared with the MLPA method. Small insertions in consecutive bases could not be detected. We estimated that Ion Torrent sequencing could diagnose ~92% of DMD/BMD patients according to the mutational spectrum of our cohort. Our results clearly indicate that this method is suitable for routine clinical practice providing novel insights into comprehensive genetic information for future molecular therapy.

  10. Detection of common sequence variations of familial hypercholesterolemia in Taiwan using DNA mass spectrometry.

    Science.gov (United States)

    Chiou, Kuan-Rau; Charng, Min-Ji

    Familial hypercholesterolemia (FH) is a heterogeneous autosomal dominant disease. The genetic heterogeneity of FH requires low-cost, high-throughput, and rapid mutation detection technology to efficiently integrate genetic screening into clinical practice. The aims of the study were to customize the MassARRAY assay to (1) establish an FH mutation assay panel, comprising known point mutations located on FH-causing genes and (2) test the feasibility of the assay for screening FH patients residing in Taiwan who fit the clinical criteria of FH diagnosis. We designed a custom Agena iPLEX assay to detect 68 point mutations on FH-causing genes. First, the assay performance was verified by analyzing 180 previously sequenced subjects (120 with point mutations and 60 healthy controls), with the results being compared with those of Sanger DNA sequencing. Second, a blind study was carried out on 185 FH probands (44 definite FH and 141 probable/possible FH). In the first part of this study, only 1 discrepancy was found between the Agena iPLEX and Sanger sequencing genotyping results. In the blind study, a total of 62 probands with mutations were identified by both techniques. Five mutations were detected by Sanger sequencing assay only. The detection sensitivity and specificity rates of Agena iPLEX were 92.5% and 100%, respectively, in the blind study. The hands-on time for the Agena iPLEX assay was around 1 day. The custom-designed Agena iPLEX assay has high specificity and sensitivity for FH genetic screening. Considering its low cost, rapidity, and flexibility, the assay has great potential to be incorporated into FH screening in Taiwan. Copyright © 2017 National Lipid Association. Published by Elsevier Inc. All rights reserved.
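
    The sensitivity reported for the blind study follows directly from the counts in the abstract: 62 probands with mutations were found by both methods and 5 by Sanger sequencing only. The sketch below treats Sanger sequencing as the reference standard. The true-negative count of 118 (185 probands minus 67 Sanger-positive) is an assumption made for illustration; the abstract does not state it.

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-positive cases that the assay also detects, in percent."""
    return 100.0 * true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of reference-negative cases that the assay also calls negative, in percent."""
    return 100.0 * true_neg / (true_neg + false_pos)


# Counts from the blind study: 62 detected by both assays, 5 by Sanger only.
iplex_sensitivity = sensitivity(true_pos=62, false_neg=5)

# Assumed: 185 - 67 = 118 Sanger-negative probands and no iPLEX-only calls.
iplex_specificity = specificity(true_neg=118, false_pos=0)
```

    With these inputs the sensitivity is 62/67, which rounds to the reported 92.5%, and a specificity of 100% corresponds to zero false-positive calls.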

  11. Sequence analysis of cytoplasmic mRNA-binding proteins of Xenopus oocytes identifies a family of RNA-binding proteins.

    Science.gov (United States)

    Murray, M T; Schiller, D L; Franke, W W

    1992-01-01

    Storage of maternal mRNAs as nontranslated ribonucleoprotein (RNP) complexes is an adaptive strategy in various vertebrate and invertebrate oocytes, for rapid translational recruitment during embryonic development. Previously, we showed that Xenopus laevis oocytes have a soluble cytoplasmic pool of mRNA-binding proteins and particles competent for messenger RNP assembly in vitro. Here we report the isolation of cDNAs for the most abundant messenger RNPs, the 54- and 56-kDa polypeptide (p54/p56) components of the approximately 6S mRNA-binding particle, from an ovarian expression library. The nucleotide sequence of p56 cDNA is almost identical to that recently reported for the putative Xenopus transcription factor FRG Y2. p54 and p56 are highly homologous and are smaller than expected by SDS/PAGE (36 kDa and 37 kDa) due to anomalous electrophoretic mobility. They lack the "RNP consensus motif" but contain four arginine-rich "basic/aromatic islands" that are similar to the RNA-binding domain of bacteriophage mRNA antiterminator proteins and of tat protein of human immunodeficiency virus. The basic/aromatic regions and a second conspicuous 100-amino acid "domain C" of p54 and p56 are conserved in the following DNA-binding proteins: human proteins dpbA, dpbB, and YB-1, rat protein EFIA, and Xenopus protein FRG Y1, all reported to bind to DNA; domain C is homologous to the major Escherichia coli cold-stress-response protein reportedly involved in translational control. Antibodies raised against a peptide of domain C have identified similar proteins in Xenopus somatic cells and in some mammalian cells and tissues. We conclude that p54 and p56 define a family of RNA-binding proteins, at least some of which may be involved in translational regulation.

  12. Understanding PRRSV infection in porcine lung based on genome-wide transcriptome response identified by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Shuqi Xiao

    Porcine reproductive and respiratory syndrome (PRRS) has been one of the most economically important diseases affecting the swine industry worldwide and causes great economic losses each year. PRRS virus (PRRSV) replicates mainly in porcine alveolar macrophages (PAMs) and dendritic cells (DCs) and develops persistent infections, antibody-dependent enhancement (ADE), interstitial pneumonia and immunosuppression. But the molecular mechanisms of PRRSV infection are still poorly understood. Here we report on the first genome-wide host transcriptional responses to classical North American type PRRSV (N-PRRSV) strain CH 1a infection using Solexa/Illumina's digital gene expression (DGE) system, a tag-based high-throughput transcriptome sequencing method, and analyse systematically the relationship between pulmonary gene expression profiles after N-PRRSV infection and infection pathology. Our results suggest that N-PRRSV appeared to utilize multiple strategies for its replication and spread in infected pigs, including subverting host innate immune response, inducing an anti-apoptotic and anti-inflammatory state as well as developing ADE. Upregulated expression of virus-induced pro-inflammatory cytokines, chemokines, adhesion molecules and inflammatory enzymes, as well as inflammatory cells, antibodies and complement activation, were likely to result in the development of inflammatory responses during N-PRRSV infection processes. N-PRRSV-induced immunosuppression might be mediated by apoptosis of infected cells, which caused depletion of immune cells and induced an anti-inflammatory cytokine response in which they were unable to eradicate the primary infection. Our systems analysis will aid understanding of the molecular pathogenesis of N-PRRSV infection, the development of novel antiviral therapies and the identification of genetic components for swine resistance/susceptibility to PRRS.

  13. Identification of Genomic Insertion and Flanking Sequence of G2-EPSPS and GAT Transgenes in Soybean Using Whole Genome Sequencing Method

    Science.gov (United States)

    Guo, Bingfu; Guo, Yong; Hong, Huilong; Qiu, Li-Juan

    2016-01-01

    Molecular characterization of sequence flanking exogenous fragment insertion is essential for safety assessment and labeling of genetically modified organism (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. More than 22.4 Gb sequence data (∼21 × coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundaries of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767–50543792 and Chr17: 7980527–7980541 in these two transgenic lines. Identification of genomic insertion sites of G2-EPSPS and GAT transgenes will facilitate the utilization of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS was a cost-effective and rapid method for identifying sites of T-DNA insertions and flanking sequences in soybean. PMID:27462336
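
    The relation between sequencing yield and the quoted coverage can be sketched as depth = total bases / genome size. The soybean genome size of about 1.1 Gb used below is an assumption (it is not given in the abstract), and the function names are hypothetical.

```python
def coverage_depth(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def implied_genome_size(total_bases, depth):
    """Genome size that a stated yield and depth would imply."""
    return total_bases / depth


SOYBEAN_GENOME_BP = 1.1e9  # assumed approximate genome size, not from the abstract

depth = coverage_depth(22.4e9, SOYBEAN_GENOME_BP)  # roughly 20x
implied = implied_genome_size(22.4e9, 21)          # roughly 1.07e9 bp
```

    Under this assumption, 22.4 Gb corresponds to roughly 20× depth, and the stated ∼21× implies a genome of about 1.07 Gb, so the two figures in the abstract are mutually consistent to within rounding.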

  14. Identification of genomic insertion and flanking sequence of G2-EPSPS and GAT transgenes in soybean using whole genome sequencing method

    Directory of Open Access Journals (Sweden)

    Bingfu Guo

    2016-07-01

    Molecular characterization of sequences flanking exogenous fragment insertions is essential for safety assessment and labeling of genetically modified organisms (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. About 21 Gb sequence data (~21× coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundary of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of the genomic insertion site of the G2-EPSPS and GAT transgenes will facilitate the use of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS is a cost-effective and rapid method of identifying sites of T-DNA insertions and flanking sequences in soybean.

  15. Next generation sequencing uncovers a missense mutation in COL4A1 as the cause of familial retinal arteriolar tortuosity.

    Science.gov (United States)

    Zenteno, Juan C; Crespí, Jaume; Buentello-Volante, Beatriz; Buil, Jose A; Bassaganyas, Francisca; Vela-Segarra, Jose I; Diaz-Cascajosa, Jesus; Marieges, Maria T

    2014-11-01

    Our aim was to determine the molecular cause of autosomal dominant familial retinal arteriolar tortuosity (FRAT) in a family with three affected subjects. Ophthalmologic evaluation included determination of best-corrected visual acuity (BCVA), slit-lamp and dilated fundus inspection, applanation tonometry, fundus photography, and fluorescein retinal angiography (FA). Molecular methods included whole exome sequencing analysis and Sanger sequencing validation of putative causal mutation in DNA from affected individuals. Typical signs of familial retinal arteriolar tortuosity were observed in all three patients. Exome sequencing identified a heterozygous c.1528G > A (p.Gly510Arg) mutation in COL4A1. Sanger sequencing confirmed that all three patients harbored the same pathogenetic mutation in COL4A1. The p.Gly510Arg variant in COL4A1 was absent in DNA from an available unaffected daughter, from a set of control alleles, and from publicly available databases. The molecular basis of familial retinal arteriolar tortuosity was identified for the first time, thus expanding the human phenotypes linked to COL4A1 mutations. Interestingly, the COL4A1 p.Gly510Arg mutation has been previously identified in a family with HANAC (Hereditary Angiopathy with Nephropathy, Aneurysm and Cramps), a multisystemic disease featuring retinal arteriolar tortuosity. No cerebral, neurologic, renal, cardiac or vascular anomalies were recognized in the pedigree described here. These data indicate that identical mutations in COL4A1 can originate both eye-restricted and systemic phenotypes.

  16. Genome-wide sequencing to identify the cause of hereditary cancer syndromes: with examples from familial pancreatic cancer

    Science.gov (United States)

    Roberts, Nicholas J.; Klein, Alison P.

    2013-01-01

    Advances in our understanding of the human genome and next-generation technologies have facilitated the use of genome-wide sequencing to decipher the genetic basis of Mendelian disease and hereditary cancer syndromes. The application of genome-wide sequencing in hereditary cancer syndromes has had mixed success, in part, due to complex nature of the underlying genetic architecture. In this review we discuss the use of genome-wide sequencing in both Mendelian diseases and hereditary cancer syndromes, highlighting the potential and challenges of this approach using familial pancreatic cancer as an example. PMID:23196058

  17. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
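
    The first two DSAP steps (cleanup and clustering) can be illustrated with a small sketch that works on the tab-delimited "sequence, copy number" input the abstract describes. This is not DSAP's code: the adaptor sequence is a made-up placeholder, the trimming rule (cut at the first adaptor occurrence) is an assumption, and the function names are hypothetical.

```python
from collections import Counter

ADAPTOR = "TCGTATGCCG"  # hypothetical 3' adaptor fragment, for illustration only


def parse_tags(text):
    """Parse tab-delimited 'sequence<TAB>count' lines into (sequence, count) pairs."""
    pairs = []
    for line in text.strip().splitlines():
        seq, count = line.split("\t")
        pairs.append((seq.upper(), int(count)))
    return pairs


def trim_adaptor(seq, adaptor=ADAPTOR):
    """Cut the read at the first occurrence of the adaptor, if present."""
    idx = seq.find(adaptor)
    return seq if idx == -1 else seq[:idx]


def is_low_complexity(seq):
    """True for empty reads or single-nucleotide runs (poly-A/T/C/G/N)."""
    return len(set(seq)) <= 1


def clean_and_cluster(pairs):
    """Cleanup followed by clustering: identical cleaned tags are merged
    and their copy numbers summed."""
    clusters = Counter()
    for seq, count in pairs:
        cleaned = trim_adaptor(seq)
        if is_low_complexity(cleaned):
            continue
        clusters[cleaned] += count
    return clusters
```

    The resulting clusters, each a unique cleaned sequence with its total read count, are what the later matching steps against Rfam and miRBase would take as input.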

  18. Advantage of Whole Exome Sequencing over Allele-Specific and Targeted Segment Sequencing in Detection of Novel TULP1 Mutation in Leber Congenital Amaurosis.

    Science.gov (United States)

    Guo, Yiran; Prokudin, Ivan; Yu, Cong; Liang, Jinlong; Xie, Yi; Flaherty, Maree; Tian, Lifeng; Crofts, Stephanie; Wang, Fengxiang; Snyder, James; Donaldson, Craig; Abdel-Magid, Nada; Vazquez, Lyam; Keating, Brendan; Hakonarson, Hakon; Wang, Jun; Jamieson, Robyn V

    2015-01-01

    Leber congenital amaurosis (LCA) is a severe form of retinal dystrophy with marked underlying genetic heterogeneity. Until recently, allele-specific assays and Sanger sequencing of targeted segments were the only available approaches for attempted genetic diagnosis in this condition. A broader next-generation sequencing (NGS) strategy, such as whole exome sequencing, provides an improved molecular genetic diagnostic capacity for patients with these conditions. In a child with LCA, an allele-specific assay analyzing 135 known LCA-causing variations, followed by targeted segment sequencing of 61 regions in 14 causative genes was performed. Subsequently, exome sequencing was undertaken in the proband, unaffected consanguineous parents and two unaffected siblings. Bioinformatic analysis used two independent pipelines, BWA-GATK and SOAP, followed by Annovar and SnpEff to annotate the variants. No disease-causing variants were found using the allele-specific or targeted segment Sanger sequencing assays. Analysis of variants in the exome sequence data revealed a novel homozygous nonsense mutation (c.1081C > T, p.Arg361*) in TULP1, a gene with roles in photoreceptor function where mutations were previously shown to cause LCA and retinitis pigmentosa. The identified homozygous variant was the top candidate using both bioinformatic pipelines. This study highlights the value of the broad sequencing strategy of exome sequencing for disease gene identification in LCA, over other existing methods. NGS is particularly beneficial in LCA where there are a large number of causative disease genes, few distinguishing clinical features for precise candidate disease gene selection, and few mutation hotspots in any of the known disease genes.

  19. Multiple viral infections in Agaricus bisporus - Characterisation of 18 unique RNA viruses and 8 ORFans identified by deep sequencing

    OpenAIRE

    Deakin, Gregory; Dobbs, Edward; Julie M. Bennett; Ian M Jones; Grogan, Helen M.; Burton, Kerry S.

    2017-01-01

    Thirty unique non-host RNAs were sequenced in the cultivated fungus, Agaricus bisporus, comprising 18 viruses each encoding an RdRp domain with an additional 8 ORFans (non-host RNAs with no similarity to known sequences). Two viruses were multipartite with component RNAs showing correlative abundances and common 3′ motifs. The viruses, all positive sense single-stranded, were classified into diverse orders/families. Multiple infections of Agaricus may represent a diverse, dynamic and interact...

  20. EnD-Seq and AppEnD: sequencing 3′ ends to identify nontemplated tails and degradation intermediates

    Science.gov (United States)

    Welch, Joshua D.; Slevin, Michael K.; Tatomer, Deirdre C.; Duronio, Robert J.

    2015-01-01

    Existing methods for detecting RNA intermediates resulting from exonuclease degradation are low-throughput and laborious. In addition, mapping the 3′ ends of RNA molecules to the genome after high-throughput sequencing is challenging, particularly if the 3′ ends contain post-transcriptional modifications. To address these problems, we developed EnD-Seq, a high-throughput sequencing protocol that preserves the 3′ end of RNA molecules, and AppEnD, a computational method for analyzing high-throughput sequencing data. Together these allow determination of the 3′ ends of RNA molecules, including nontemplated additions. Applying EnD-Seq and AppEnD to histone mRNAs revealed that a significant fraction of cytoplasmic histone mRNAs end in one or two uridines, which have replaced the 1–2 nt at the 3′ end of mature histone mRNA maintaining the length of the histone transcripts. Histone mRNAs in fly embryos and ovaries show the same pattern, but with different tail nucleotide compositions. We increase the sensitivity of EnD-Seq by using cDNA priming to specifically enrich low-abundance tails of known sequence composition allowing identification of degradation intermediates. In addition, we show the broad applicability of our computational approach by using AppEnD to gain insight into 3′ additions from diverse types of sequencing data, including data from small capped RNA sequencing and some alternative polyadenylation protocols. PMID:26015596

  1. Opa-typing can identify epidemiologically distinct subgroups within Neisseria gonorrhoeae multi-antigen sequence type (NG-MAST) clusters

    Science.gov (United States)

    MORRIS, A. K.; PALMER, H. M.; YOUNG, H.

    2008-01-01

    SUMMARY A collection of 106 Neisseria gonorrhoeae ciprofloxacin-resistant isolates were typed using Neisseria gonorrhoeae multi-antigen sequence typing (NG-MAST). Opa-typing was performed on 74 isolates which had non-unique sequence types to determine if further discrimination could be achieved and if so whether this had any epidemiological basis. The 74 isolates were separated into 12 sequence types and 20 opa-types (OT). Seven opa-type clusters were congruent with the sequence types and five sequence types could be subdivided by opa-typing. These results demonstrate that opa-typing can add a further level of discrimination compared with NG-MAST. The surveillance data for isolates in the largest sequence type cluster (ST 147) indicated that two major subdivisions OT 1 and OT 2 differed epidemiologically by patients' sexual preference and geographical location. ST 147 is a common strain that has been isolated in several countries since 1999; our results suggest that it has diverged into at least two epidemiologically discrete forms. PMID:18241521

  2. Using sequence data to identify alternative routes and risk of infection: a case-study of campylobacter in Scotland

    Directory of Open Access Journals (Sweden)

    Bessell Paul R

    2012-04-01

    Background: Genetic typing data are a potentially powerful resource for determining how infection is acquired. In this paper, MLST typing was used to distinguish the routes and risks of infection of humans with Campylobacter jejuni from poultry and ruminant sources. Methods: C. jejuni samples from animal and environmental sources and from reported human cases confirmed between June 2005 and September 2006 were typed using MLST. The STRUCTURE software was used to assign the specific sequence types of the sporadic human cases to a particular source. We then used mixed case-case logistic regression analysis to compare the risk factors for being infected with C. jejuni from different sources. Results: A total of 1,599 (46.3%) cases were assigned to poultry, 1,070 (31.0%) to ruminant and 67 (1.9%) to wild bird sources; the remaining 715 (20.7%) did not have a source that could be assigned with a probability of greater than 0.95. Compared to ruminant sources, cases attributed to poultry sources were typically among adults (odds ratio (OR) = 1.497, 95% confidence intervals (CIs) = 1.211, 1.852), not among males (OR = 0.834, 95% CIs = 0.712, 0.977), in areas with population density of greater than 500 people/km2 (OR = 1.213, 95% CIs = 1.030, 1.431), reported in the winter (OR = 1.272, 95% CIs = 1.067, 1.517) and had undertaken recent overseas travel (OR = 1.618, 95% CIs = 1.056, 2.481). The poultry assigned strains had a similar epidemiology to the unassigned strains, with the exception of a significantly higher likelihood of reporting overseas travel in unassigned strains. Conclusions: Rather than estimate relative risks for acquiring infection, our analyses show that individuals who acquire C. jejuni infection from different sources have different associated risk factors. By enhancing our ability to identify at-risk groups and the times at which these groups are likely to be at risk, this work allows public health messages to be targeted more effectively. The
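
    The source-attribution figures in this record can be checked with a short sketch. The case counts and the 0.95 probability cut-off come from the abstract; the function names and the example probability dictionaries are hypothetical, and the assignment rule shown is only a plausible reading of "assigned with a probability of greater than 0.95", not the STRUCTURE software itself.

```python
# Case counts per attributed source, as reported in the abstract.
COUNTS = {"poultry": 1599, "ruminant": 1070, "wild bird": 67, "unassigned": 715}


def percentages(counts: dict[str, int]) -> dict[str, float]:
    """Share of all cases per source, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {source: round(100 * n / total, 1) for source, n in counts.items()}


def assign_source(probabilities: dict[str, float], threshold: float = 0.95) -> str:
    """Return the most probable source, or 'unassigned' if it does not
    exceed the threshold (hypothetical rule based on the abstract's wording)."""
    source, p = max(probabilities.items(), key=lambda item: item[1])
    return source if p > threshold else "unassigned"


print(sum(COUNTS.values()))   # total number of typed human cases
print(percentages(COUNTS))    # shares to compare with the reported percentages
print(assign_source({"poultry": 0.97, "ruminant": 0.02, "wild bird": 0.01}))
```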

  3. Whole genome sequencing identifies a deletion in protein phosphatase 2A that affects its stability and localization in Chlamydomonas reinhardtii.

    Directory of Open Access Journals (Sweden)

    Huawen Lin

    Whole genome sequencing is a powerful tool in the discovery of single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) among mutant strains, which simplifies forward genetics approaches. However, identification of the causative mutation among a large number of non-causative SNPs in a mutant strain remains a big challenge. In the unicellular biflagellate green alga Chlamydomonas reinhardtii, we generated a SNP/indel library that contains over 2 million polymorphisms from four wild-type strains, one highly polymorphic strain that is frequently used in meiotic mapping, ten mutant strains that have flagellar assembly or motility defects, and one mutant strain, imp3, which has a mating defect. A comparison of polymorphisms in the imp3 strain and the other 15 strains allowed us to identify a deletion of the last three amino acids, Y313F314L315, in a protein phosphatase 2A catalytic subunit (PP2A3) in the imp3 strain. Introduction of a wild-type HA-tagged PP2A3 rescues the mutant phenotype, but mutant HA-PP2A3 at Y313 or L315 fails to rescue. Our immunoprecipitation results indicate that the Y313, L315, or YFLΔ mutations do not affect the binding of PP2A3 to the scaffold subunit, PP2A-2r. In contrast, the Y313, L315, or YFLΔ mutations affect both the stability and the localization of PP2A3. The PP2A3 protein is less abundant in these mutants and fails to accumulate in the basal body area as observed in transformants with either wild-type HA-PP2A3 or a HA-PP2A3 with a V310T change. The accumulation of HA-PP2A3 in the basal body region disappears in mated dikaryons, which suggests that the localization of PP2A3 may be essential to the mating process. Overall, our results demonstrate that the terminal YFL tail of PP2A3 is important in the regulation of Chlamydomonas mating.

  4. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Directory of Open Access Journals (Sweden)

    Nathan D. Olson

    2015-03-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.
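
    A variant copy ratio at one variable position can be illustrated with a minimal maximum likelihood sketch. This is not the authors' method; it assumes a simple binomial read-count model, a known gene copy number passed in by the caller, and a hypothetical per-base error rate of 1%. The function name and all example numbers are illustrative only.

```python
from math import log


def ml_variant_copies(variant_reads: int, total_reads: int,
                      gene_copies: int, error: float = 0.01) -> int:
    """Number of gene copies (0..gene_copies) carrying the variant that
    maximizes a binomial likelihood of the observed read counts.

    Assumptions (hypothetical): reads sample all copies evenly, and a
    symmetric error rate keeps the expected variant fraction off 0 and 1.
    """
    best_k, best_loglik = 0, float("-inf")
    for k in range(gene_copies + 1):
        true_fraction = k / gene_copies
        # Expected observed variant fraction after mixing in base errors.
        p = true_fraction * (1 - error) + (1 - true_fraction) * error
        loglik = (variant_reads * log(p)
                  + (total_reads - variant_reads) * log(1 - p))
        if loglik > best_loglik:
            best_k, best_loglik = k, loglik
    return best_k


# Illustrative numbers only: 29 variant reads out of 100 at a position
# in a gene assumed to be present in 7 copies.
print(ml_variant_copies(29, 100, 7))
```

    With deeper sequencing the likelihood becomes more sharply peaked, which is in line with the abstract's observation that reproducibility improved with sequencing depth.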

  5. CA88, a nuclear repetitive DNA sequence identified in Schistosoma mansoni, aids in the genotyping of nine Schistosoma species of medical and veterinary importance

    Directory of Open Access Journals (Sweden)

    Diana Bahia

    2010-07-01

    CA88 is the first long nuclear repetitive DNA sequence identified in the blood fluke, Schistosoma mansoni. The assembled S. mansoni sequence, which contains the CA88 repeat, has 8,887 nucleotides and at least three repeat units of approximately 360 bp. In addition, CA88 also possesses an internal CA microsatellite, identified as SmBr18. Both PCR and BLAST analysis have been used to analyse and confirm the CA88 sequence in other S. mansoni sequences in the public database. PCR-acquired nuclear repetitive DNA sequence profiles from nine Schistosoma species were used to classify this organism into four genotypes. Included among the nine species analysed were five sequences of both African and Asian lineages that are known to infect humans. Within these genotypes, three of them refer to recognised species groups. A panel of four microsatellite loci, including SmBr18 and three previously published loci, has been used to characterise the nine Schistosoma species. Each species has been identified and classified based on its CA88 DNA fingerprint profile. Furthermore, microsatellite sequences and intra-specific variation have also been observed within the nine Schistosoma species sequences. Taken together, these results support the use of these markers in studying the population dynamics of Schistosoma isolates from endemic areas and also provide new methods for investigating the relationships between different populations of parasites. In addition, these data also indicate that Schistosoma margrebowiei is not a sister taxon to Schistosoma mattheei, prompting a new designation to a basal clade.

  6. Deep Ion Torrent sequencing identifies soil fungal community shifts after frequent prescribed fires in a southeastern US forest ecosystem.

    Science.gov (United States)

    Brown, Shawn P; Callaham, Mac A; Oliver, Alena K; Jumpponen, Ari

    2013-12-01

    Prescribed burning is a common management tool to control fuel loads, ground vegetation, and facilitate desirable game species. We evaluated soil fungal community responses to long-term prescribed fire treatments in a loblolly pine forest on the Piedmont of Georgia and utilized deep Internal Transcribed Spacer Region 1 (ITS1) amplicon sequencing afforded by the recent Ion Torrent Personal Genome Machine (PGM). These deep sequence data (19,000 + reads per sample after subsampling) indicate that frequent fires (3-year fire interval) shift soil fungus communities, whereas infrequent fires (6-year fire interval) permit system resetting to a state similar to that without prescribed fire. Furthermore, in nonmetric multidimensional scaling analyses, primarily ectomycorrhizal taxa were correlated with axes associated with long fire intervals, whereas soil saprobes tended to be correlated with the frequent fire recurrence. We conclude that (1) multiplexed Ion Torrent PGM analyses allow deep cost effective sequencing of fungal communities but may suffer from short read lengths and inconsistent sequence quality adjacent to the sequencing adaptor; (2) frequent prescribed fires elicit a shift in soil fungal communities; and (3) such shifts do not occur when fire intervals are longer. Our results emphasize the general responsiveness of these forests to management, and the importance of fire return intervals in meeting management objectives. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  7. Linkage and whole genome sequencing identify a locus on 6q25-26 for formal thought disorder and implicate MEF2A regulation

    DEFF Research Database (Denmark)

    Thygesen, Johan Hilge; Zambach, Sine Katharina; Ingason, Andrés

    2015-01-01

    analysis of the implicated region using phased microsatellite and SNP genotypes. Whole genome sequencing (N=3) was used in the attempt to identify causative variants in the linkage region. Linkage analysis of formal thought disorder resulted in a single peak at chromosome 6(q26-q27) centred on marker D6S...... disorder index score (P=4.9 × 10(-5)) and qualitatively severe forms of thought disturbances. Whole genome sequencing identified a novel nucleotide deletion (chr6:164377205 AG>A, hg18) predicted to disrupt the potential binding of the transcription factor MEF2A. The MEF2A binding site is located between...

  8. Determination of the whole-genome consensus sequence of the prototype DS-1 rotavirus using sequence-independent genome amplification and 454® pyrosequencing.

    Science.gov (United States)

    Mlera, Luwanika; Jere, Khuzwayo C; van Dijk, Alberdina A; O'Neill, Hester G

    2011-08-01

    The prototype DS-1 rotavirus strain is characterised by a short electropherotype and G2P[4] serotype specificity. Following sequence-independent genome amplification and 454® pyrosequencing of genomic cDNA, differences between the newly determined consensus sequence and GenBank sequences were observed in 10 of the 11 genome segments. Only the consensus sequence of genome segment 1 was identical to sequences deposited in GenBank. A novel isoleucine at position 397 in a hydrophobic region of VP4 is described. An additional 7 N-terminal amino acids were found in NSP1. For genome segment 10 the first 34 and last 30 nucleotides of the 5' and 3'-terminal ends, respectively, were identified. Genome segment 11 was found to be 821 bp long, which is 148 bp longer than the full length genome segment 11 sequence reported previously. This paper reports the first complete consensus genome sequence for the tissue culture adapted DS-1 strain free from cloning bias and the limitations of Sanger sequencing. Sequence differences in previous publications reporting on DS-1 rotavirus genome segment sequencing were identified and discussed. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Deep Sequencing of Cell-Free Peripheral Blood DNA as a Reliable Method for Confirming the Diagnosis of Myelodysplastic Syndrome.

    Science.gov (United States)

    Albitar, Ferras; Ma, Wanlong; Diep, Kevin; De Dios, Ivan; Agersborg, Sally; Thangavelu, Maya; Brodie, Steve; Albitar, Maher

    2016-07-01

    Demonstrating the presence of myelodysplastic syndrome (MDS)-specific molecular abnormalities can aid in diagnosis and patient management. We explored the potential of using peripheral blood (PB) cell-free DNA (cf-DNA) and next-generation sequencing (NGS). We performed NGS on a panel of 14 target genes using total nucleic acid extracted from the plasma of 16 patients, all of whom had confirmed diagnoses for early MDS with blasts DNA from the same patients was sequenced using conventional Sanger sequencing and NGS. Deep sequencing of the cf-DNA identified one or more mutated gene(s), confirming the diagnosis of MDS in all cases. Five samples (31%) showed abnormalities in cf-DNA by NGS that were not detected by Sanger sequencing on cellular PB DNA. NGS of PB cell DNA showed the same findings as those of cf-DNA in four of five patients, but failed to show a mutation in the RUNX1 gene that was detected in one patient's cf-DNA. Mutant allele frequency was significantly higher in cf-DNA compared with cellular DNA (p = 0.008). These data suggest that cf-DNA when analyzed using NGS is a reliable approach for detecting molecular abnormalities in MDS and should be used to determine if bone marrow aspiration and biopsy are necessary.

  10. VirusSeq: software to identify viruses and their integration sites using next-generation sequencing of human cancer tissue.

    Science.gov (United States)

    Chen, Yunxin; Yao, Hui; Thompson, Erika J; Tannir, Nizar M; Weinstein, John N; Su, Xiaoping

    2013-01-15

    We developed a new algorithmic method, VirusSeq, for detecting known viruses and their integration sites in the human genome using next-generation sequencing data. We evaluated VirusSeq on whole-transcriptome sequencing (RNA-Seq) data of 256 human cancer samples from The Cancer Genome Atlas. Using these data, we showed that VirusSeq accurately detects the known viruses and their integration sites with high sensitivity and specificity. VirusSeq can also perform this function using whole-genome sequencing data of human tissue. VirusSeq has been implemented in PERL and is available at http://odin.mdacc.tmc.edu/∼xsu1/VirusSeq.html. Contact: xsu1@mdanderson.org. Supplementary data are available at Bioinformatics online.

  11. Novel TNS3-MAP3K3 and ZFPM2-ELF5 fusion genes identified by RNA sequencing in multicystic mesothelioma with t(7;17)(p12;q23) and t(8;11)(q23;p13).

    Science.gov (United States)

    Panagopoulos, Ioannis; Gorunova, Ludmila; Davidson, Ben; Heim, Sverre

    2015-02-28

    Multicystic mesothelioma is a rare disease of unknown etiology and pathogenesis. Nothing has been known about the cytogenetic and molecular genetic features of these tumors. Here we present the first cytogenetically analyzed multicystic mesothelioma with the karyotype 46,XX,t(7;17)(p13;q23),t(8;11)(q23;p13). RNA-sequencing showed that the t(7;17)(p13;q23) generated a chimeric TNS3-MAP3K3 gene, which codes for a chimeric protein kinase, as well as the reciprocal MAP3K3-TNS3 in which the region of TNS3 coding for the SH2_Tensin_like region and the tensin phosphotyrosine-binding domain is under the control of the MAP3K3 promoter. The other translocation, t(8;11)(q23;p13), generated a chimeric ZFPM2-ELF5 gene which codes for a chimeric transcription factor in which the first 40 amino acids of ELF5 are replaced by the first 100 amino acids of ZFPM2. RT-PCR together with Sanger sequencing verified the presence of the above-mentioned fusion transcripts. The finding of acquired clonal chromosome abnormalities in cells cultured from the lesion and the presence of the TNS3-MAP3K3 chimeric protein kinase and the ZFPM2-ELF5 chimeric transcription factor confirm the neoplastic nature of multicystic mesothelioma. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  12. Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

    Directory of Open Access Journals (Sweden)

    Ramesh Reddy

    Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.

  13. Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

    Science.gov (United States)

    Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

    2014-01-01

    Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.

  14. Ancient mtDNA sequences in the human nuclear genome: A potential source of errors in identifying pathogenic mutations

    OpenAIRE

    Wallace, Douglas C.; Stugard, Carol; Murdock, Deborah; Schurr, Theodore; Brown, Michael D

    1997-01-01

    Nuclear-localized mtDNA pseudogenes might explain a recent report describing a heteroplasmic mtDNA molecule containing five linked missense mutations dispersed over the contiguous mtDNA CO1 and CO2 genes in Alzheimer’s disease (AD) patients. To test this hypothesis, we have used the PCR primers utilized in the original report to amplify CO1 and CO2 sequences from two independent ρ° (mtDNA-less) cell lines. CO1 and CO2 sequences amplified from both of the ρ° cells, ...

  15. MicroRNA of the fifth-instar posterior silk gland of silkworm identified by Solexa sequencing

    Directory of Open Access Journals (Sweden)

    Jisheng Li

    2014-12-01

    No studies have focused specifically on the microRNA (miRNA) in the fifth-instar posterior silk gland of Bombyx mori. Here, using next-generation sequencing, we acquired 93.2 million processed reads from 10 small RNA libraries. In this paper, we describe in detail how we generated our deep sequencing dataset, which was recently published in BMC Genomics. Our findings substantially enrich the silkworm miRNA repository and may help reveal miRNA functions in the process of silk production.

  16. RNA-Sequencing Analysis of Messenger RNA/MicroRNA in a Rabbit Aneurysm Model Identifies Pathways and Genes of Interest.

    Science.gov (United States)

    Holcomb, M; Ding, Y-H; Dai, D; McDonald, R J; McDonald, J S; Kallmes, D F; Kadirvel, R

    2015-09-01

    Rabbit aneurysm models are used for the testing of embolization devices and elucidating the mechanisms of human intracranial aneurysm growth and healing. We used RNA-sequencing technology to identify genes relevant to induced rabbit aneurysm biology and to identify genes and pathways of potential clinical interest. This process included sequencing microRNAs, which are important regulatory noncoding RNAs. Elastase-induced saccular aneurysms were created at the origin of the right common carotid artery in 6 rabbits. Messenger RNA and microRNA were isolated from the aneurysm and from the control left common carotid artery at 12 weeks and processed by using RNA-sequencing technology. The results from RNA sequencing were analyzed by using the Ingenuity Pathway Analysis tool. A total of 9396 genes were analyzed by using RNA sequencing, 648 (6.9%) of which were found to be significantly differentially expressed between the aneurysms and control tissues (P 2 or rabbit aneurysms revealed differential regulation of some key pathways, including inflammation and antigen presentation. ANKRD1 and TACR1 were identified as genes of interest in the regulation of matrix metalloproteinases. © 2015 by American Journal of Neuroradiology.

  17. Complete Genome Sequence of Porcine Sapelovirus Strain USA/IA33375/2015 Identified in the United States

    OpenAIRE

    Chen, Qi; Zheng, Ying; Guo, Baoqing; Zhang, Jianqiang; Yoon, Kyoung-Jin; Karen M Harmon; Main, Rodger G.; Li, Ganwu

    2016-01-01

    The complete genome of sapelovirus A, formerly known as porcine sapelovirus (PSV), from a diarrheic pig was sequenced for the first time in the United States (designated PSV USA/IA33375/2015). It shares 87.8% to 83.9% nucleotide identities with other reported PSV strains globally and is most closely related to Asia PSV strains.
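
    Nucleotide identity figures such as those quoted above are typically computed from pairwise alignments. A minimal sketch follows, assuming the two sequences are already aligned to equal length and that gapped columns are excluded from the comparison; the function name and the toy sequences are hypothetical, and other tools may treat gaps differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical bases between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped columns
        compared += 1
        matches += base_a == base_b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: 7 of 8 aligned positions match.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```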

  18. Whole Genome Sequencing of High-Risk Families to Identify New Mutational Mechanisms of Breast Cancer Predisposition

    Science.gov (United States)

    2015-12-01

    principal discipline(s) of the project? Our approach integrated whole genome sequencing with experimental biology and with application and development of...pathogenicity of genetic variants. Bioinformatics. 31:761-763. 13 Shihab HA, Rogers MF, Gough J, Mort M, Cooper DN, Day IN, Gaunt TR, Campbell C. (2015

  19. Genome-Wide Association Study with Sequence Variants Identifies Candidate Genes for Mastitis Resistance in Dairy Cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Guldbrandtsen, Bernt; Bendixen, Christian

    in the six QTL regions were imputed in 5,193 genotyped bulls using Beagle v3.3. After imputation and quality control there were on average 4,600 markers/Mbp. Association analyses with imputed sequence data were repeated for the targeted regions. Functional annotations of the variants were done using Variant...

  20. Isolation of Canine parvovirus with a view to identify the prevalent serotype on the basis of partial sequence analysis

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2015-01-01

    Aim: The aim of this study was to isolate Canine parvovirus (CPV) from suspected dogs on Madin-Darby canine kidney (MDCK) cell line and its confirmation by polymerase chain reaction (PCR) and nested PCR (NPCR). Further, VP2 gene of the CPV isolates was amplified and sequenced to determine prevailing antigenic type. Materials and Methods: A total of 60 rectal swabs were collected from dogs showing signs of gastroenteritis, processed and subjected to isolation in MDCK cell line. The samples showing cytopathic effects (CPE) were confirmed by PCR and NPCR. These samples were subjected to PCR for amplification of VP2 gene of CPV, sequenced and analyzed to study the prevailing antigenic types of CPV. Results: Out of the 60 samples subjected to isolation in MDCK cell line, five samples showed CPE in the form of rounding of cells, clumping of cells and finally detachment of the cells. When these samples and the two commercially available vaccines were subjected to PCR for amplification of VP2 gene, a 1710 bp product was amplified. The sequence analysis revealed that the vaccines belonged to the CPV-2 type and the samples were of CPV-2b type. Conclusion: It can be concluded from the present study that out of a total of 60 samples, 5 samples exhibited CPE as observed in MDCK cell line. Sequence analysis of the VP2 gene among the samples and vaccine strains revealed that samples belonged to CPV-2b type and vaccines belonged to CPV-2.

  1. The thermal stability of oligonucleotide duplexes is sequence independent in tetraalkylammonium salt solutions: application to identifying recombinant DNA clones.

    OpenAIRE

    Jacobs, K. A.; Rudersdorf, R; Neill, S D; Dougherty, J P; Brown, E L; Fritsch, E F

    1988-01-01

    In solutions of tetraalkylammonium salts the melting temperature of oligonucleotide duplexes is independent of nucleotide sequence and thus GC content. Data quantitating the destabilizing effects of various mismatches in these solvents are also presented. The results are in accord with theories on DNA melting and establish conditions under which oligonucleotides can be used as hybridization probes with predictable and controllable specificity.

  2. Sequence specificity between interacting and non-interacting homologs identifies interface residues--a homodimer and monomer use case

    NARCIS (Netherlands)

    Hou, Q.; Dutilh, B.E.; Huynen, M.A.; Heringa, J.; Feenstra, K.A.

    2015-01-01

    BACKGROUND: Protein families participating in protein-protein interactions may contain sub-families that have different binding characteristics, ranging from tight binding to showing no interaction at all. Composition differences at the sequence level in these sub-families are often decisive to

  3. Sequence specificity between interacting and non-interacting homologs identifies interface residues - a homodimer and monomer use case

    NARCIS (Netherlands)

    Hou, Qingzhen; Dutilh, Bas E; Huynen, Martijn A; Heringa, Jaap; Feenstra, K Anton

    2015-01-01

    BACKGROUND: Protein families participating in protein-protein interactions may contain sub-families that have different binding characteristics, ranging from tight binding to showing no interaction at all. Composition differences at the sequence level in these sub-families are often decisive to

  4. Exome sequencing identifies putative drivers of progression of transient myeloproliferative disorder to AMKL in infants with Down syndrome

    NARCIS (Netherlands)

    Nikolaev, S.I.; Santoni, F.; Vannier, A.; Falconnet, E.; Giarin, E.; Basso, G.; Hoischen, A.; Veltman, J.A.; Groet, J.; Nizetic, D.; Antonarakis, S.E.

    2013-01-01

    Some neonates with Down syndrome (DS) are diagnosed with self-regressing transient myeloproliferative disorder (TMD), and 20% to 30% of those progress to acute megakaryoblastic leukemia (AMKL). We performed exome sequencing in 7 TMD/AMKL cases and copy-number analysis in these and 10 additional

  5. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes.

    Science.gov (United States)

    Wang, Ruijia; Nambiar, Ram; Zheng, Dinghai; Tian, Bin

    2018-01-04

    PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which had a limited amount and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3' region extraction and deep sequencing (3'READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3' ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PAS across mammals sheds light on significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
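
    The problem of internal A-rich sequences mentioned in this record is often handled with a simple sequence filter on the genomic region downstream of a candidate site. The sketch below illustrates that general heuristic only; it is not the 3'READS method or PolyA_DB's pipeline, and the function name, the 10-nt window and the thresholds are hypothetical choices.

```python
def is_internal_priming_candidate(downstream: str, window: int = 10,
                                  min_a: int = 7, min_run: int = 6) -> bool:
    """Flag a candidate cleavage site whose downstream genomic sequence is
    A-rich, which could mimic a poly(A) tail during oligo(dT) priming.

    Thresholds are illustrative assumptions, not published defaults.
    """
    region = downstream[:window].upper()
    has_long_run = "A" * min_run in region   # uninterrupted stretch of A
    is_a_rich = region.count("A") >= min_a   # high overall A content
    return has_long_run or is_a_rich


print(is_internal_priming_candidate("AAAAAAAAGC"))  # A-rich downstream region
print(is_internal_priming_candidate("GCTAGCTAGC"))  # mixed downstream region
```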

  6. Mutation profiling of adenoid cystic carcinomas from multiple anatomical sites identifies mutations in the RAS pathway, but no KIT mutations

    Science.gov (United States)

    Wetterskog, Daniel; Wilkerson, Paul M; Rodrigues, Daniel N; Lambros, Maryou B; Fritchie, Karen; Andersson, Mattias K; Natrajan, Rachael; Gauthier, Arnaud; Di Palma, Silvana; Shousha, Sami; Gatalica, Zoran; Töpfer, Chantal; Vukovic, Vesna; A’Hern, Roger; Weigelt, Britta; Vincent-Salomon, Anne; Stenman, Göran; Rubin, Brian P; Reis-Filho, Jorge S

    2016-01-01

    Aims The majority of adenoid cystic carcinomas (AdCCs), regardless of anatomical site, harbour the MYB–NFIB fusion gene. The aim of this study was to characterize the repertoire of somatic genetic events affecting known cancer genes in AdCCs. Methods and results DNA was extracted from 13 microdissected breast AdCCs, and subjected to a mutation survey using the Sequenom OncoCarta Panel v1.0. Genes found to be mutated in any of the breast AdCCs and genes related to the same canonical molecular pathways, as well as KIT, a proto-oncogene whose protein product is expressed in AdCCs, were sequenced in an additional 68 AdCCs from various anatomical sites by Sanger sequencing. Using the Sequenom MassARRAY platform and Sanger sequencing, mutations in BRAF and HRAS were identified in three and one cases, respectively (breast, and head and neck). KIT, which has previously been reported to be mutated in AdCCs, was also investigated, but no mutations were identified. Conclusions Our results demonstrate that mutations in genes pertaining to the canonical RAS pathway are found in a minority of AdCCs, and that activating KIT mutations are either absent or remarkably rare in these cancers, and unlikely to constitute a driver and therapeutic target for patients with AdCC. PMID:23398044

  7. Memory Efficient Sequence Analysis Using Compressed Data Structures (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    Energy Technology Data Exchange (ETDEWEB)

    Simpson, Jared

    2011-10-13

    Wellcome Trust Sanger Institute's Jared Simpson on Memory efficient sequence analysis using compressed data structures at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  8. Tn5401 disruption of the spo0F gene, identified by direct chromosomal sequencing, results in CryIIIA overproduction in Bacillus thuringiensis.

    OpenAIRE

    Malvar, T; Baum, J A

    1994-01-01

    The Bacillus thuringiensis spo0F gene was identified by chromosomal DNA sequencing of sporulation mutants derived from a B. thuringiensis transposon insertion library. A spo0F defect in B. thuringiensis, which was suppressed by multicopy hknA or kinA, resulted in the overproduction of the CryIIIA insecticidal crystal protein.

  9. THE USE OF INTER SIMPLE SEQUENCE REPEATS (ISSR) IN DISTINGUISHING NEIGHBORING DOUGLAS-FIR TREES AS A MEANS TO IDENTIFYING TREE ROOTS WITH ABOVE-GROUND BIOMASS

    Science.gov (United States)

    We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...

  10. Segmental overgrowth syndrome due to an activating PIK3CA mutation identified in affected muscle tissue by exome sequencing

    DEFF Research Database (Denmark)

    Rasmussen, Maria; Sunde, Lone; Weigert, Karen Petra

    2014-01-01

    Mosaic PIK3CA mutations have been described in an increasing number of overgrowth syndromes. We describe a patient with a previously unreported segmental overgrowth syndrome with the mutation PIK3CA c.3140A>G (p.His1047Arg) in affected tissue, diagnosed by exome sequencing. This PIK3CA-associated......-associated segmental overgrowth syndrome overlaps with CLOVES syndrome and fibroadipose hyperplasia but is distinct from each of these entities....

  11. Single site suppressors of a fission yeast temperature-sensitive mutant in cdc48 identified by whole genome sequencing.

    Science.gov (United States)

    Marinova, Irina N; Engelbrecht, Jacob; Ewald, Adrian; Langholm, Lasse L; Holmberg, Christian; Kragelund, Birthe B; Gordon, Colin; Nielsen, Olaf; Hartmann-Petersen, Rasmus

    2015-01-01

    The protein called p97 in mammals and Cdc48 in budding and fission yeast is a homo-hexameric, ring-shaped, ubiquitin-dependent ATPase complex involved in a range of cellular functions, including protein degradation, vesicle fusion, DNA repair, and cell division. The cdc48+ gene is essential for viability in fission yeast, and point mutations in the human orthologue have been linked to disease. To analyze the function of p97/Cdc48 further, we performed a screen for cold-sensitive suppressors of the temperature-sensitive cdc48-353 fission yeast strain. In total, 29 independent pseudo-revertants that had lost the temperature-sensitive growth defect of the cdc48-353 strain were isolated. Of these, 28 had instead acquired a cold-sensitive phenotype. Since the suppressors were all spontaneous mutants, and not the result of mutagenesis induced by chemicals or UV irradiation, we reasoned that the genome sequences of the 29 independent cdc48-353 suppressors were most likely identical, with the exception of the acquired suppressor mutations. This prompted us to test if a whole genome sequencing approach would allow us to map the mutations. Indeed, genome sequencing unambiguously revealed that the cold-sensitive suppressors were all second site intragenic cdc48 mutants. Projecting these onto the Cdc48 structure revealed that while the original temperature-sensitive G338D mutation is positioned near the central pore in the hexameric ring, the suppressor mutations locate to subunit-subunit and inter-domain boundaries. This suggests that Cdc48-353 is structurally compromised at the restrictive temperature, but re-established in the suppressor mutants. The last suppressor was an extragenic frameshift mutation in the ufd1 gene, which encodes a known Cdc48 co-factor. In conclusion, we show, using a novel whole genome sequencing approach, that Cdc48-353 is structurally compromised at the restrictive temperature, but stabilized in the suppressors.
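
    The reasoning in this abstract (suppressor genomes should match the parent strain except for the acquired mutations) can be illustrated as a set difference over variant calls. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the tuple layout, and all variant calls below are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical layout: (chromosome, position, reference base, alternate base).
Variant = Tuple[str, int, str, str]


def acquired_variants(parent: Set[Variant], derived: Set[Variant]) -> Set[Variant]:
    """Return variants called in the derived strain but absent from its parent."""
    return derived - parent


# Made-up calls for illustration only; these are not data from the study.
parent_calls = {("chrI", 1200, "G", "A"), ("chrII", 560, "C", "T")}
suppressor_calls = parent_calls | {("chrI", 1350, "T", "C")}

# Only the variant that arose after the parent strain remains.
print(sorted(acquired_variants(parent_calls, suppressor_calls)))
```

    In practice the two call sets would come from a variant caller run on each genome; the subtraction step itself stays the same.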

  12. Genomic sequencing identifies a few mutations driving the independent origin of primary liver tumors in a chronic hepatitis murine model.

    Science.gov (United States)

    Yang, Zuyu; Jia, Mingming; Liu, Guojing; Hao, Huaining; Chen, Li; Li, Guanghao; Liu, Sixue; Li, Yawei; Wu, Chung-I; Lu, Xuemei; Wang, Shengdian

    2017-01-01

    With the development of high-throughput genomic analysis, sequencing a mouse primary cancer model provides a new opportunity to understand fundamental mechanisms of tumorigenesis and progression. Here, we characterized the genomic variations in a hepatitis-related primary hepatocellular carcinoma (HCC) mouse model. A total of 12 tumor sections and four adjacent non-tumor tissues from four mice were used for whole exome and/or whole genome sequencing and validation of genotyping. The functions of the mutated genes in tumorigenesis were studied by analyzing their mutation frequency and expression in clinical HCC samples. A total of 46 single nucleotide variations (SNVs) were detected within coding regions. All SNVs were only validated in the sequencing samples, except the Hras mutation, which was shared by three tumors in the M1 mouse. However, the mutated allele frequency varied from high (0.4) to low (0.1), and low frequency (0.1-0.2) mutations existed in almost every tumor. Together with a diploid karyotype and an equal distribution pattern of these SNVs within the tumor, these results suggest the existence of subclones within tumors. A total of 26 mutated genes were mapped to 17 terms describing different molecular and cellular functions. All 41 human homologs of the mutated genes were mutated in the clinical samples, and some mutations were associated with clinical outcomes, suggesting a high probability of cancer driver genes in the spontaneous tumors of the mouse model. Genomic sequencing shows that a few mutations can drive the independent origin of primary liver tumors and reveals high heterogeneity among tumors in the early stage of hepatitis-related primary hepatocellular carcinoma.
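
    The allele frequencies quoted in this abstract can be related to the fraction of tumor cells carrying a mutation. The sketch below assumes a heterozygous mutation at a diploid locus, no copy-number change, and no contamination by normal cells; none of these assumptions is stated in the abstract, and the function name is hypothetical.

```python
def mutated_cell_fraction(vaf: float) -> float:
    """Estimate the fraction of cells carrying a heterozygous mutation.

    Illustrative assumptions: diploid locus, one mutated copy per carrying
    cell, and a pure tumor sample, so the fraction is twice the allele frequency.
    """
    if not 0.0 <= vaf <= 0.5:
        raise ValueError("a heterozygous diploid allele frequency lies in [0, 0.5]")
    return 2.0 * vaf


# Frequencies spanning the range reported in the abstract (0.1 to 0.4).
for vaf in (0.4, 0.2, 0.1):
    print(f"allele frequency {vaf:.1f} -> about {mutated_cell_fraction(vaf):.0%} of cells")
```

    Under these assumptions, a mutation at frequency 0.1 would be present in roughly one cell in five, which is consistent with the abstract's suggestion of subclones.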

  13. Genomic sequencing identifies a few mutations driving the independent origin of primary liver tumors in a chronic hepatitis murine model.

    Directory of Open Access Journals (Sweden)

    Zuyu Yang

    With the development of high-throughput genomic analysis, sequencing a mouse primary cancer model provides a new opportunity to understand fundamental mechanisms of tumorigenesis and progression. Here, we characterized the genomic variations in a hepatitis-related primary hepatocellular carcinoma (HCC) mouse model. A total of 12 tumor sections and four adjacent non-tumor tissues from four mice were used for whole exome and/or whole genome sequencing and validation of genotyping. The functions of the mutated genes in tumorigenesis were studied by analyzing their mutation frequency and expression in clinical HCC samples. A total of 46 single nucleotide variations (SNVs) were detected within coding regions. All SNVs were only validated in the sequencing samples, except the Hras mutation, which was shared by three tumors in the M1 mouse. However, the mutated allele frequency varied from high (0.4) to low (0.1), and low frequency (0.1-0.2) mutations existed in almost every tumor. Together with a diploid karyotype and an equal distribution pattern of these SNVs within the tumor, these results suggest the existence of subclones within tumors. A total of 26 mutated genes were mapped to 17 terms describing different molecular and cellular functions. All 41 human homologs of the mutated genes were mutated in the clinical samples, and some mutations were associated with clinical outcomes, suggesting a high probability of cancer driver genes in the spontaneous tumors of the mouse model. Genomic sequencing shows that a few mutations can drive the independent origin of primary liver tumors and reveals high heterogeneity among tumors in the early stage of hepatitis-related primary hepatocellular carcinoma.

  14. Multiple viral infections in Agaricus bisporus - Characterisation of 18 unique RNA viruses and 8 ORFans identified by deep sequencing.

    Science.gov (United States)

    Deakin, Gregory; Dobbs, Edward; Bennett, Julie M; Jones, Ian M; Grogan, Helen M; Burton, Kerry S

    2017-05-26

    Thirty unique non-host RNAs were sequenced in the cultivated fungus, Agaricus bisporus, comprising 18 viruses each encoding an RdRp domain with an additional 8 ORFans (non-host RNAs with no similarity to known sequences). Two viruses were multipartite with component RNAs showing correlative abundances and common 3' motifs. The viruses, all positive sense single-stranded, were classified into diverse orders/families. Multiple infections of Agaricus may represent a diverse, dynamic and interactive viral ecosystem with sequence variability ranging over 2 orders of magnitude and evidence of recombination, horizontal gene transfer and variable fragment numbers. Large numbers of viral RNAs were detected in multiple Agaricus samples; up to 24 in samples symptomatic for disease and 8-17 in asymptomatic samples, suggesting adaptive strategies for co-existence. The viral composition of growing cultures was dynamic, with evidence of gains and losses depending on the environment and included new hypothetical viruses when compared with the current transcriptome and EST databases. As the non-cellular transmission of mycoviruses is rare, the founding infections may be ancient, preserved in wild Agaricus populations, which act as reservoirs for subsequent cell-to-cell infection when host populations are expanded massively through fungiculture.

  15. Comparative sequence analysis of enteroaggregative Escherichia coli heat-stable enterotoxin 1 identified in Korean and Japanese Escherichia coli strains.

    Science.gov (United States)

    Seo, Dong Joo; Choi, SunKeum; Jeon, Su Been; Jeong, Suntak; Park, Hyunkyung; Lee, Bog-Hieu; Kim, Geun-Bae; Yang, Soo-Jin; Nishikawa, Yoshikazu; Choi, Changsun

    2017-02-21

    The aim of this study was to compare the sequence of the astA gene found in 8 Korean and 11 Japanese Escherichia coli isolates. Conventional PCR was used to amplify the astA gene from chromosomal and plasmid DNA prepared from each isolate with commercial DNA extraction kits. Cloning of the PCR products, sequence analysis, and pulsed-field gel electrophoresis (PFGE) were performed sequentially. An identical copy of astA in each isolate was found for 8 Korean and 8 Japanese E. coli strains isolated from bovine, porcine, and healthy human carriers. Among these, 1 Korean and 4 Japanese isolates carried a stop mutation at residue 16. Three Japanese outbreak strains (V199, V638, and 96-127-23) carried multiple clones of the astA gene with multiple amino acid changes at residues 11, 16, 20, 23, 30, 33, and 34. Compared with the non-diarrheal isolates, clonal diversity and sequence variations of the astA gene in outbreak isolates may be associated with the virulence potential of EAST1. Copyright © 2016 Elsevier B.V. All rights reserved.

  16. High-throughput amplicon sequencing and stream benthic bacteria: identifying the best taxonomic level for multiple-stressor research

    Science.gov (United States)

    Salis, R. K.; Bruder, A.; Piggott, J. J.; Summerfield, T. C.; Matthaei, C. D.

    2017-03-01

    Disentangling the individual and interactive effects of multiple stressors on microbial communities is a key challenge to our understanding and management of ecosystems. Advances in molecular techniques allow studying microbial communities in situ and with high taxonomic resolution. However, the taxonomic level which provides the best trade-off between our ability to detect multiple-stressor effects versus the goal of studying entire communities remains unknown. We used outdoor mesocosms simulating small streams to investigate the effects of four agricultural stressors (nutrient enrichment, the nitrification inhibitor dicyandiamide (DCD), fine sediment and flow velocity reduction) on stream bacteria (phyla, orders, genera, and species represented by Operational Taxonomic Units with 97% sequence similarity). Community composition was assessed using amplicon sequencing (16S rRNA gene, V3-V4 region). DCD was the most pervasive stressor, affecting evenness and most abundant taxa, followed by sediment and flow velocity. Stressor pervasiveness was similar across taxonomic levels and lower levels did not perform better in detecting stressor effects. Community coverage decreased from 96% of all sequences for abundant phyla to 28% for species. Order-level responses were generally representative of responses of corresponding genera and species, suggesting that this level may represent the best compromise between stressor sensitivity and coverage of bacterial communities.

  17. Whole genome sequencing indicates Corynebacterium jeikeium comprises 4 separate genomospecies and identifies a dominant genomospecies among clinical isolates

    Science.gov (United States)

    Salipante, Stephen J.; Sengupta, Dhruba J.; Cummings, Lisa A.; Robinson, Aaron; Kurosawa, Kyoko; Hoogestraat, Daniel R.; Cookson, Brad T.

    2014-01-01

    Corynebacterium jeikeium is an opportunistic pathogen which has been noted for significant genomic diversity. The population structure within this species remains poorly understood. Here we explore the relationships among fifteen clinical isolates of C. jeikeium (reference strains K411 and ATCC 43734, and 13 primary isolates collected over a period of 7 years) through genetic, genomic, and phenotypic studies. We report a high degree of divergence among strains based on 16S ribosomal RNA (rRNA) gene and rpoB gene sequence analysis, supporting the presence of genetically distinct subgroups. Whole genome sequencing indicates genomic-level dissimilarity among subgroups, which qualify as 4 separate and distinct Corynebacterium species based on an Average Nucleotide Identity (ANIb) threshold of susceptibilities and metabolic profiles characterize two of these genomospecies, allowing their differentiation from others through routine laboratory testing. The remaining genomospecies can be classified through a biphasic approach integrating phenotypic testing and rpoB gene sequencing. The genomospecies predominantly recovered from patient specimens does not include either of the existing C. jeikeium reference strains, implying that studies of this pathogen would benefit from examination of representatives from the primary disease-causing group. The clinically dominant genomospecies also has the smallest genome size and gene repertoire, suggesting the possibility of increased virulence relative to the other genomospecies. The ability to classify isolates to one of the four C. jeikeium genomospecies in a clinical context provides diagnostic information for tailoring antimicrobial therapy and may aid in identification of species-specific disease associations. PMID:25116839

  18. [Leber hereditary optic neuropathy: Usefulness of next generation sequencing to study mitochondrial mutations on apparent homoplasmy].

    Science.gov (United States)

    Carrasco Salas, Pilar; Palma Milla, Carmen; López Montiel, Javier; Benito, Carmen; Franco Freire, Sara; López Siles, Juan

    2016-02-19

    Leber hereditary optic neuropathy is characterized by acute and subacute visual loss, produced by mitochondrial DNA mutations. The molecular study of a family with only one affected member is presented. In the index case and in her mother, the mitochondrial mutation m.11778G>A in the MT-ND4 gene was detected in the heteroplasmic state. The index case's sister, without ocular manifestations, asked for genetic counseling. The study of this mutation by Sanger sequencing identified it in an apparent homoplasmic state. However, next-generation sequencing (NGS) showed that the mutation was actually in a heteroplasmic state. For genetic counseling, verifying that a mutation is in the homoplasmic state is very important. We have observed that NGS allows us to discriminate between high levels of heteroplasmy and homoplasmy, meaning that it is a useful technique for the analysis of apparent homoplasmic results obtained with less sensitive techniques, such as Sanger sequencing. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.
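
    The distinction drawn in this abstract between high heteroplasmy and homoplasmy can be expressed as a ratio of read counts at the variant position. The sketch below is illustrative only: the function names, the 0.99 cutoff, and the read counts are assumptions, not values from the study.

```python
def heteroplasmy_level(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a mitochondrial position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def classify(level: float, homoplasmy_cutoff: float = 0.99) -> str:
    """Label a variant fraction; the cutoff is an assumed, adjustable value."""
    if level >= homoplasmy_cutoff:
        return "apparent homoplasmy"
    if level > 0.0:
        return "heteroplasmy"
    return "not detected"


# Hypothetical deep-sequencing counts: 4700 variant reads out of 5000.
level = heteroplasmy_level(4700, 5000)
print(f"{level:.1%} variant reads -> {classify(level)}")
```

    A small wild-type fraction like the one in this made-up example is the kind of signal that deep read counts can resolve and that a Sanger chromatogram, per the abstract, may miss.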

  19. Exome Sequence Analysis of 14 Families With High Myopia.

    Science.gov (United States)

    Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L

    2017-04-01

    To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.

  20. Fast clinical molecular diagnosis of hyperphenylalaninemia using next-generation sequencing-based on a custom AmpliSeq™ panel and Ion Torrent PGM sequencing.

    Science.gov (United States)

    Cao, Yan-yan; Qu, Yu-jin; Song, Fang; Zhang, Ting; Bai, Jin-li; Jin, Yu-wei; Wang, Hong

    2014-12-01

    Hyperphenylalaninemia (HPA) can be classified into phenylketonuria (PKU) and tetrahydrobiopterin deficiency (BH4D), according to the defect of enzyme activity, both of which vary substantially in severity, treatment, and prognosis of the disease. To set up a fast and comprehensive assay in order to achieve early etiological diagnosis and differential diagnosis for children with HPA, we designed a custom AmpliSeq™ panel for the sequencing of coding DNA sequence (CDS), flanking introns, 5' untranslated region (UTR) and 3' UTR from five HPA-causing genes (PAH, PTS, QDPR, GCH1, and PCBD1) using the Ion Torrent Personal Genome Machine (PGM) Sequencer. A standard group of 15 samples with previously known DNA sequences and a test group of 37 HPA patients with unknown mutations were used for assay validation and application, respectively. All variations were confirmed by Sanger sequencing. In the standard group, all the known mutations were detected and were consistent with the results of previous Sanger sequencing. In the test group, we identified mutations in 71 of 74 alleles, with a mutation detection rate of 95.9%. We also found a frameshift deletion p.Ile25Metfs*13 in PAH that was previously unreported. In addition, 1 of 37 in the test group was inconsistent with either the molecular diagnosis or clinical diagnosis by traditional differential methods. In conclusion, our comprehensive assay based on a custom AmpliSeq™ panel and Ion Torrent PGM sequencing offers wider coverage and higher throughput, and is much faster and more efficient than the traditional molecular detection method for HPA patients; it could meet the medical need for individualized diagnosis and treatment. Copyright © 2014 Elsevier Inc. All rights reserved.
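
    The detection rate reported in this abstract follows from simple arithmetic, shown here as a short check. It assumes two alleles per patient, which matches the 74 alleles quoted for 37 patients; the function name is hypothetical.

```python
def detection_rate(detected_alleles: int, total_alleles: int) -> float:
    """Fraction of alleles in which a mutation was identified."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return detected_alleles / total_alleles


patients = 37
total_alleles = patients * 2  # assumes two alleles per patient
rate = detection_rate(71, total_alleles)
print(f"71/{total_alleles} alleles = {rate:.1%}")  # prints "71/74 alleles = 95.9%"
```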

  1. Changes in the six most common sequence types of Neisseria gonorrhoeae, including ST4378, identified by surveillance of antimicrobial resistance in northern Taiwan from 2006 to 2013.

    Science.gov (United States)

    Cheng, Ching-Wai; Li, Lan-Hui; Su, Chen-Yi; Li, Shu-Ying; Yen, Muh-Yong

    2016-10-01

    There has been no longitudinal study of drug susceptibility in Neisseria gonorrhoeae in Taiwan since 2006. We collected 1090 gonococcal isolates from Taipei City Hospital, Taiwan from April 2006 to August 2013. We used a disk diffusion assay to determine the susceptibility to five antibiotics and an E-test to determine the minimum inhibitory concentrations for cefixime and ceftriaxone in isolates with resistance. Neisseria gonorrhoeae multi-antigen sequence typing (NG-MAST) and DNA sequencing of the por and tbpB genes were used to identify sequence types. Among the 1090 isolates, the resistances to penicillin, ciprofloxacin, cefpodoxime, cefixime, and ceftriaxone were 61.01%, 83.39%, 9.63%, 6.70%, and 2.39%, respectively. The highest minimum inhibitory concentrations of cefixime and ceftriaxone were 0.19 mg/L and 0.50 mg/L, respectively. There were 327 sequence types. The four most common sequence types in homosexuals were ST4378, ST359, ST4654, and ST547; the two most common sequence types in heterosexuals were ST421 and ST419. Each of these sequence types had more than 25 isolates. There were significant differences in the sequence types in patients with different sexual orientations (p < 0.001). Oral cefixime or ceftriaxone injections were used as first-line drugs for the treatment of gonorrhea from 2006 to 2013 because gonorrhea isolates had low minimum inhibitory concentrations for these two drugs. The abrupt emergence of ST4378 (closely related to the notorious ST1407) since 2009 is a cause for alarm. Changes in sexual behavior, including an increase in sexual activity without the use of condoms, may have contributed to the peak in gonorrhea in 2010. Further molecular epidemiological investigations are required. Copyright © 2014. Published by Elsevier B.V.

  2. Sequence variability of Chrysanthemum stunt viroid in different chrysanthemum cultivars

    Science.gov (United States)

    Yoon, Ju-Yeon; Choi, Seung-Kook

    2017-01-01

    Viroids are the smallest infectious agents, and their genomes consist of a short single strand of RNA that does not encode any protein. Chrysanthemum stunt viroid (CSVd), a member of the family Pospiviroidae, causes chrysanthemum stunt disease. Here, we report the genomic variations of CSVd to understand the sequence variability of CSVd in different chrysanthemum cultivars. We randomly sampled 36 different chrysanthemum cultivars and examined the infection of CSVd in each cultivar by reverse transcription polymerase chain reaction (RT-PCR). Eleven cultivars were infected by CSVd. Cloning followed by Sanger sequencing successfully identified a total of 271 CSVd genomes derived from 12 plants from 11 cultivars. They were further classified into 105 CSVd variants. Each single chrysanthemum plant had a different set of CSVd variants. Moreover, different single plants from the same cultivar had different sets of CSVd variants but identical consensus genome sequences. A phylogenetic tree using 12 consensus genome sequences revealed three groups of CSVd genomes, while six different groups were defined by the phylogenetic analysis using 105 variants. Based on the consensus CSVd genome, by combining all variant sequences, we identified 99 single-nucleotide variations (SNVs) as well as three nucleotide positions showing high mutation rates. Although 99 SNVs were identified, most CSVd genomes in this study were derived from variant 1, which is identical to known CSVd SK1 showing pathogenicity. PMID:28149699
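
    The idea of deriving a consensus genome from cloned variants and then listing the positions where variants differ from it can be sketched as follows. This is an illustration under assumptions, not the authors' method: it expects pre-aligned sequences of equal length, uses a simple majority vote per column, and runs on made-up toy sequences with hypothetical function names.

```python
from collections import Counter
from typing import List


def consensus(seqs: List[str]) -> str:
    """Majority base at each column of pre-aligned, equal-length sequences."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def variable_positions(seqs: List[str], reference: str) -> List[int]:
    """1-based positions where at least one sequence differs from the reference."""
    return [
        i + 1
        for i, col in enumerate(zip(*seqs))
        if any(base != reference[i] for base in col)
    ]


# Toy clones for illustration only; these are not CSVd sequences.
clones = ["ACGT", "ACGT", "ACAT", "TCGT"]
cons = consensus(clones)
print(cons, variable_positions(clones, cons))
```

    With these toy clones the consensus is ACGT and positions 1 and 3 are variable. Real viroid clones would first need a multiple sequence alignment, since insertions and deletions break the equal-length assumption.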

  3. Single site suppressors of a fission yeast temperature-sensitive mutant in cdc48 identified by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Irina N Marinova

    The protein called p97 in mammals and Cdc48 in budding and fission yeast is a homo-hexameric, ring-shaped, ubiquitin-dependent ATPase complex involved in a range of cellular functions, including protein degradation, vesicle fusion, DNA repair, and cell division. The cdc48+ gene is essential for viability in fission yeast, and point mutations in the human orthologue have been linked to disease. To analyze the function of p97/Cdc48 further, we performed a screen for cold-sensitive suppressors of the temperature-sensitive cdc48-353 fission yeast strain. In total, 29 independent pseudo-revertants that had lost the temperature-sensitive growth defect of the cdc48-353 strain were isolated. Of these, 28 had instead acquired a cold-sensitive phenotype. Since the suppressors were all spontaneous mutants, and not the result of mutagenesis induced by chemicals or UV irradiation, we reasoned that the genome sequences of the 29 independent cdc48-353 suppressors were most likely identical, with the exception of the acquired suppressor mutations. This prompted us to test if a whole genome sequencing approach would allow us to map the mutations. Indeed, genome sequencing unambiguously revealed that the cold-sensitive suppressors were all second site intragenic cdc48 mutants. Projecting these onto the Cdc48 structure revealed that while the original temperature-sensitive G338D mutation is positioned near the central pore in the hexameric ring, the suppressor mutations locate to subunit-subunit and inter-domain boundaries. This suggests that Cdc48-353 is structurally compromised at the restrictive temperature, but re-established in the suppressor mutants. The last suppressor was an extragenic frameshift mutation in the ufd1 gene, which encodes a known Cdc48 co-factor. In conclusion, we show, using a novel whole genome sequencing approach, that Cdc48-353 is structurally compromised at the restrictive temperature, but stabilized in the suppressors.

  4. Melon Transcriptome Characterization: Simple Sequence Repeats and Single Nucleotide Polymorphisms Discovery for High Throughput Genotyping across the Species

    Directory of Open Access Journals (Sweden)

    José Miguel Blanca

    2011-07-01

    Melon (Cucumis melo L.) ranks among the highest-valued fruit crops worldwide. Some genomic tools are available for this crop, including a Sanger transcriptome. We report the generation of 689,054 high-quality expressed sequence tags (ESTs) from two 454 sequencing runs, using normalized and nonnormalized complementary DNA (cDNA) libraries prepared from four genotypes belonging to the two subspecies and the main commercial types. 454 ESTs were combined with the available Sanger ESTs and de novo assembled into 53,252 unigenes. Over 63% of the unigenes were functionally annotated with Gene Ontology (GO) terms and 21% had known orthologs of Arabidopsis thaliana (L.) Heynh. Annotation distribution followed tendencies similar to those reported for Arabidopsis, suggesting that the dataset represents a fairly complete melon transcriptome. Furthermore, we identified a set of 3298 unigenes with microsatellite motifs and 14,417 sequences with single nucleotide variants, of which 11,655 single nucleotide polymorphisms met criteria for use with high-throughput genotyping platforms, and 453 could be detected as cleaved amplified polymorphic sequences (CAPS). A set of markers was validated, 90% of them being polymorphic in a number of variable accessions. This transcriptome provides an invaluable new tool for biological research, more so when it includes transcripts not described previously. It is being used for genome annotation and has provided a large collection of markers that will allow speeding up the process of breeding new melon varieties.

  5. A novel CRX mutation by whole-exome sequencing in an autosomal dominant cone-rod dystrophy pedigree

    Directory of Open Access Journals (Sweden)

    Qin-Kang Lu

    2015-12-01

    AIM: To identify the disease-causing gene mutation in a Chinese pedigree with autosomal dominant cone-rod dystrophy (adCORD). METHODS: A southern Chinese adCORD pedigree including 9 affected individuals was studied. Whole-exome sequencing (WES), coupling the Agilent whole-exome capture system to the Illumina HiSeq 2000 DNA sequencing platform, was used to search for the specific gene mutation in 3 affected family members and 1 unaffected member. After a suggested variant was found through the data analysis, the putative mutation was validated by Sanger DNA sequencing of samples from all available family members. RESULTS: The results of both WES and Sanger sequencing revealed a novel nonsense mutation c.C766T (p.Q256X) within exon 5 of the CRX gene which was pathogenic for adCORD in this family. The mutation could affect photoreceptor-specific gene expression with a dominant-negative effect and resulted in loss of the OTX tail, thus the mutant protein occupies the CRX-binding site in target promoters without establishing an interaction and, consequently, may block transactivation. CONCLUSION: All modes of Mendelian inheritance in CORD have been observed, and genetic heterogeneity is a hallmark of CORD. Therefore, conventional genetic diagnosis of CORD would be time-consuming and labor-intensive. Our study indicated the robustness and cost-effectiveness of WES in the genetic diagnosis of CORD.

  6. SMRT Gate: A method for validation of synthetic constructs on Pacific Biosciences sequencing platforms.

    Science.gov (United States)

    D'Amore, Rosalinda; Johnson, James; Haldenby, Sam; Hall, Neil; Hughes, Margaret; Joynson, Ryan; Kenny, John G; Patron, Nicola; Hertz-Fowler, Christiane; Hall, Anthony

    2017-07-01

    Current DNA assembly methods are prone to sequence errors, requiring rigorous quality control (QC) to identify incorrect assemblies or synthesized constructs. Such errors can lead to misinterpretation of phenotypes. Because of this intrinsic problem, routine QC analysis is generally performed on three or more clones using a combination of restriction endonuclease assays, colony PCR, and Sanger sequencing. However, as new automation methods emerge that enable high-throughput assembly, QC using these techniques has become a major bottleneck. Here, we describe a quick and affordable methodology for the QC of synthetic constructs. Our method involves a one-pot digestion-ligation DNA assembly reaction, based on the Golden Gate assembly methodology, that is coupled with Pacific Biosciences' Single Molecule, Real-Time (PacBio SMRT) sequencing technology.

  7. Identification of DNA lesions using a third base pair for amplification and nanopore sequencing

    Science.gov (United States)

    Riedl, Jan; Ding, Yun; Fleming, Aaron M.; Burrows, Cynthia J.

    2015-01-01

    Damage to the genome is implicated in the progression of cancer and stress-induced diseases. DNA lesions exist in low levels, and cannot be amplified by standard PCR because they are frequently strong blocks to polymerases. Here, we describe a method for PCR amplification of lesion-containing DNA in which the site and identity could be marked, copied and sequenced. Critical for this method is installation of either the dNaM or d5SICS nucleotides at the lesion site after processing via the base excision repair process. These marker nucleotides constitute an unnatural base pair, allowing large quantities of marked DNA to be made by PCR amplification. Sanger sequencing confirms the potential for this method to locate lesions by marking, amplifying and sequencing a lesion in the KRAS gene. Detection using the α-hemolysin nanopore is also developed to analyse the markers in individual DNA strands with the potential to identify multiple lesions per strand. PMID:26542210

  8. Sequence and organization of the complete mitochondrial genome of the marsh tit Poecile palustris (Aves: Paridae).

    Science.gov (United States)

    Day, John C; Broughton, Richard K; Hinsley, Shelley A

    2016-09-01

    The complete mitochondrial genome of the marsh tit Poecile palustris (Linnaeus, 1758) was sequenced using a combined Illumina and Sanger sequencing approach. Using the known sequence of Poecile atricapillus Linnaeus, 1766 (Paridae), homologous NGS reads were identified and assembled. The genome is 16,824 bp in length and includes 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a control region. The gene order resembles the standard avian gene order. The base composition of the genome is A (29.15%), T (22.50%), C (33.61%) and G (14.73%). The control region between tRNA(Glu) and tRNA(Phe) is composed of 1240 bp with no obvious repetitive motifs.
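
    Base composition figures like those in this abstract are obtained by counting each nucleotide and dividing by the sequence length. The sketch below shows the calculation on a made-up toy sequence (not the marsh tit genome) and then checks that the reported percentages and gene counts add up; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumed convention.

```python
from collections import Counter
from typing import Dict


def base_composition(seq: str) -> Dict[str, float]:
    """Percentage of A, C, G and T in a sequence, ignoring other symbols."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Toy sequence for illustration only.
print(base_composition("AACCCGGTTT"))

# Consistency check on the figures quoted in the abstract.
reported = {"A": 29.15, "T": 22.50, "C": 33.61, "G": 14.73}
print(round(sum(reported.values()), 2))  # 99.99; the remainder is rounding
print(13 + 2 + 22)  # 37 genes: protein-coding, rRNA and tRNA
```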

  9. Accurate Breakpoint Mapping in Apparently Balanced Translocation Families with Discordant Phenotypes Using Whole Genome Mate-Pair Sequencing

    DEFF Research Database (Denmark)

    Aristidou, Constantia; Koufaris, Costas; Theodosiou, Athina

    2017-01-01

    Familial apparently balanced translocations (ABTs) segregating with discordant phenotypes are extremely challenging for interpretation and counseling due to the scarcity of publications and lack of routine techniques for quick investigation. Recently, next generation sequencing has emerged......-MPS) was applied to map the breakpoints in nine two-way ABT carriers from four families. Translocation breakpoints and patient-specific structural variants were validated by Sanger sequencing and quantitative Real Time PCR, respectively. Identical sequencing patterns and breakpoints were identified in affected......