WorldWideScience

Sample records for sanger sequencing identifies

  1. Comparison of the Equine Reference Sequence with Its Sanger Source Data and New Illumina Reads.

    Directory of Open Access Journals (Sweden)

    Jovan Rebolledo-Mendez

    Full Text Available The reference assembly for the domestic horse, EquCab2, published in 2009, was built using approximately 30 million Sanger reads from a Thoroughbred mare named Twilight. Contiguity in the assembly was facilitated using nearly 315 thousand BAC end sequences from Twilight's half brother Bravo. Since then, it has served as the foundation for many genome-wide analyses that include not only the modern horse, but ancient horses and other equid species as well. As data mapped to this reference has accumulated, consistent variation between mapped datasets and the reference, in terms of regions with no read coverage, single nucleotide variants, and small insertions/deletions, has become apparent. In many cases, it is not clear whether these differences are the result of true sequence variation between the research subjects' and Twilight's genome or due to errors in the reference. EquCab2 is regarded as "The Twilight Assembly." The objective of this study was to identify inconsistencies between the EquCab2 assembly and the source Twilight Sanger data used to build it. To that end, the original Sanger and BAC end reads have been mapped back to this equine reference and assessed with the addition of approximately 40X coverage of new Illumina Paired-End sequence data. The resulting mapped datasets identify those regions with low Sanger read coverage, as well as variation in genomic content that is not consistent with either the original Twilight Sanger data or the new genomic sequence data generated from Twilight on the Illumina platform. As the haploid EquCab2 reference assembly was created using Sanger reads derived largely from a single individual, the vast majority of variation detected in a mapped dataset composed of those same Sanger reads should be heterozygous. In contrast, homozygous variations would represent either errors in the reference or contributions from Bravo's BAC end sequences. Our analysis identifies 720,843 homozygous discrepancies

  2. Comparison of base composition analysis and Sanger sequencing of mitochondrial DNA for four U.S. population groups.

    Science.gov (United States)

    Kiesler, Kevin M; Coble, Michael D; Hall, Thomas A; Vallone, Peter M

    2014-01-01

    A set of 711 samples from four U.S. population groups was analyzed using a novel mass spectrometry based method for mitochondrial DNA (mtDNA) base composition profiling. Comparison of the mass spectrometry results with Sanger sequencing derived data yielded a concordance rate of 99.97%. Length heteroplasmy was identified in 46% of samples and point heteroplasmy was observed in 6.6% of samples in the combined mass spectral and Sanger data set. Using discrimination capacity as a metric, Sanger sequencing of the full control region had the highest discriminatory power, followed by the mass spectrometry base composition method, which was more discriminating than Sanger sequencing of just the hypervariable regions. This trend is in agreement with the number of nucleotides covered by each of the three assays. Published by Elsevier Ireland Ltd.

  3. Homozygosity mapping and targeted Sanger sequencing reveal genetic defects underlying inherited retinal disease in families from Pakistan.

    Directory of Open Access Journals (Sweden)

    Maleeha Maria

    Full Text Available Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost- and time-effective manner to screen frequent population-specific genetic variations associated with diseases such as inherited retinal disease (IRD). We genetically screened 13 families from a cohort of 81 Pakistani IRD families diagnosed with Leber congenital amaurosis (LCA), retinitis pigmentosa (RP), congenital stationary night blindness (CSNB), or cone dystrophy (CD). We employed genome-wide single nucleotide polymorphism (SNP) array analysis to identify homozygous regions shared by affected individuals and performed Sanger sequencing of IRD-associated genes located in the sizeable homozygous regions. In addition, based on population-specific mutation data, we performed targeted Sanger sequencing (TSS) of frequent variants in AIPL1, CEP290, CRB1, GUCY2D, LCA5, RPGRIP1 and TULP1, in probands from 28 LCA families. Homozygosity mapping and Sanger sequencing of IRD-associated genes revealed the underlying mutations in 10 families. TSS revealed causative variants in three families. In these 13 families, four novel mutations were identified in CNGA1, CNGB1, GUCY2D, and RPGRIP1. Homozygosity mapping and TSS revealed the underlying genetic cause in 13 IRD families, which is useful for genetic counseling as well as therapeutic interventions that are likely to become available in the near future.

  4. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and ...

  5. Rapid Sanger sequencing of the 16S rRNA gene for identification of some common pathogens.

    Directory of Open Access Journals (Sweden)

    Linxiang Chen

    Full Text Available Conventional Sanger sequencing remains time-consuming and laborious. In this study, we developed a rapid, improved 16S rRNA sequencing protocol for pathogen identification using a new combination of SYBR Green I real-time PCR and Sanger sequencing with FTA® cards. To compare the sequencing quality of this method with that of conventional Sanger sequencing, 12 strains previously identified by biochemical tests were targeted: 4 Pseudomonas aeruginosa, 4 Staphylococcus aureus and 4 Escherichia coli (1 reference strain and 3 clinical strains of each). Additionally, to validate the sequencing results and bacterial identification, an expanded set of 90 clinical strains, 30 of each of the three species, was processed as just described. The results showed that although statistical differences (P<0.05) were found in sequencing quality between the two methods, their identification results were all correct and consistent. The workload, the time consumption and the cost per batch were respectively light versus heavy, 8 h versus 11 h and $420 versus $400. In the 90 clinical strains, all of the Pseudomonas aeruginosa and Staphylococcus aureus strains were correctly identified, but only 26.7% of the Escherichia coli strains were recognized as Escherichia coli, while 33.3% were identified as Shigella sonnei and 40% as Shigella dysenteriae. The protocol described here is a rapid, reliable, stable and convenient method for 16S rRNA sequencing, and can be used for Pseudomonas aeruginosa and Staphylococcus aureus identification, yet it is not completely suitable for discriminating Escherichia coli and Shigella strains.

  6. Insights into bacterioplankton community structure from Sundarbans mangrove ecoregion using Sanger and Illumina MiSeq sequencing approaches: A comparative analysis

    Directory of Open Access Journals (Sweden)

    Anwesha Ghosh

    2017-03-01

    Full Text Available Next generation sequencing using platforms such as Illumina MiSeq provides a deeper insight into the structure and function of bacterioplankton communities in coastal ecosystems compared to traditional molecular techniques such as the clone library approach, which incorporates Sanger sequencing. In this study, the structure of bacterioplankton communities was investigated from two stations of the Sundarbans mangrove ecoregion using both Sanger and Illumina MiSeq sequencing approaches. The Illumina MiSeq data is available under the BioProject ID PRJNA35180 and Sanger sequencing data under accession numbers KX014101-KX014140 (Stn1) and KX014372-KX014410 (Stn3). Proteobacteria-, Firmicutes- and Bacteroidetes-like sequences retrieved from both approaches appeared to be abundant in the studied ecosystem. The Illumina MiSeq data (2.1 GB) provided a deeper insight into the structure of bacterioplankton communities and revealed the presence of bacterial phyla such as Actinobacteria, Cyanobacteria, Tenericutes and Verrucomicrobia, which were not recovered based on Sanger sequencing. A comparative analysis of bacterioplankton communities from both stations highlighted the presence of genera that appear in both stations and genera that occur exclusively in either station. However, both the Sanger sequencing and Illumina MiSeq data were coherent at broader taxonomic levels. Pseudomonas, Devosia, Hyphomonas and Erythrobacter-like sequences were the abundant bacterial genera found in the studied ecosystem. Both the sequencing methods showed broad coherence although, as expected, the Illumina MiSeq data helped identify rarer bacterioplankton groups and also showed the presence of unassigned OTUs, indicating possible presence of novel bacterioplankton from the studied mangrove ecosystem.

  7. Comparing whole-genome sequencing with Sanger sequencing for spa typing of methicillin-resistant Staphylococcus aureus.

    Science.gov (United States)

    Bartels, Mette Damkjær; Petersen, Andreas; Worning, Peder; Nielsen, Jesper Boye; Larner-Svensson, Hanna; Johansen, Helle Krogh; Andersen, Leif Percival; Jarløv, Jens Otto; Boye, Kit; Larsen, Anders Rhod; Westh, Henrik

    2014-12-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an in-house analysis pipeline determines the spa types. Due to national surveillance, all MRSA isolates are sent to Statens Serum Institut, where the spa type is determined by PCR and Sanger sequencing. The purpose of this study was to evaluate the reliability of the spa types obtained by 150-bp paired-end Illumina WGS. MRSA isolates from new MRSA patients in 2013 (n = 699) in the capital region of Denmark were included. We found a 97% agreement between spa types obtained by the two methods. All isolates achieved a spa type by both methods. Nineteen isolates differed in spa types by the two methods, in most cases due to the lack of 24-bp repeats in the whole-genome-sequenced isolates. These related but incorrect spa types should have no consequence in outbreak investigations, since all epidemiologically linked isolates, regardless of spa type, will be included in the single nucleotide polymorphism (SNP) analysis. This will reveal the close relatedness of the spa types. In conclusion, our data show that WGS is a reliable method to determine the spa type of MRSA. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
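
    The 97% agreement quoted above can be reproduced from the two counts the abstract gives (699 isolates, 19 discordant). The following is only a minimal sketch of that arithmetic; the variable names are ours and are not taken from the study's pipeline.

```python
# Counts taken from the abstract; everything else here is illustrative.
total_isolates = 699   # MRSA isolates typed by both WGS and Sanger sequencing
discordant = 19        # isolates whose spa type differed between the methods

concordant = total_isolates - discordant
agreement_pct = 100 * concordant / total_isolates

print(f"{concordant}/{total_isolates} isolates agree ({agreement_pct:.1f}%)")
```

    The result rounds to the 97% stated in the abstract.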

  8. Diagnosis of Fanconi Anemia: Mutation Analysis by Multiplex Ligation-Dependent Probe Amplification and PCR-Based Sanger Sequencing

    Science.gov (United States)

    Gille, Johan J. P.; Floor, Karijn; Kerkhoven, Lianne; Ameziane, Najim; Joenje, Hans; de Winter, Johan P.

    2012-01-01

    Fanconi anemia (FA) is a rare inherited disease characterized by developmental defects, short stature, bone marrow failure, and a high risk of malignancies. FA is heterogeneous: 15 genetic subtypes have been distinguished so far. A clinical diagnosis of FA needs to be confirmed by testing cells for sensitivity to cross-linking agents in a chromosomal breakage test. As a second step, DNA testing can be employed to elucidate the genetic subtype of the patient and to identify the familial mutations. This knowledge allows preimplantation genetic diagnosis (PGD) and enables prenatal DNA testing in future pregnancies. Although simultaneous testing of all FA genes by next generation sequencing will be possible in the near future, this technique will not be available immediately for all laboratories. In addition, in populations with strong founder mutations, a limited test using Sanger sequencing and MLPA will be a cost-effective alternative. We describe a strategy and optimized conditions for the screening of FANCA, FANCB, FANCC, FANCE, FANCF, and FANCG and present the results obtained in a cohort of 54 patients referred to our diagnostic service since 2008. In addition, the follow up with respect to genetic counseling and carrier screening in the families is discussed. PMID:22778927

  9. Diagnosis of Fanconi Anemia: Mutation Analysis by Multiplex Ligation-Dependent Probe Amplification and PCR-Based Sanger Sequencing

    Directory of Open Access Journals (Sweden)

    Johan J. P. Gille

    2012-01-01

    Full Text Available Fanconi anemia (FA) is a rare inherited disease characterized by developmental defects, short stature, bone marrow failure, and a high risk of malignancies. FA is heterogeneous: 15 genetic subtypes have been distinguished so far. A clinical diagnosis of FA needs to be confirmed by testing cells for sensitivity to cross-linking agents in a chromosomal breakage test. As a second step, DNA testing can be employed to elucidate the genetic subtype of the patient and to identify the familial mutations. This knowledge allows preimplantation genetic diagnosis (PGD) and enables prenatal DNA testing in future pregnancies. Although simultaneous testing of all FA genes by next generation sequencing will be possible in the near future, this technique will not be available immediately for all laboratories. In addition, in populations with strong founder mutations, a limited test using Sanger sequencing and MLPA will be a cost-effective alternative. We describe a strategy and optimized conditions for the screening of FANCA, FANCB, FANCC, FANCE, FANCF, and FANCG and present the results obtained in a cohort of 54 patients referred to our diagnostic service since 2008. In addition, the follow up with respect to genetic counseling and carrier screening in the families is discussed.

  10. Identification of novel BRCA founder mutations in Middle Eastern breast cancer patients using capture and Sanger sequencing analysis.

    Science.gov (United States)

    Bu, Rong; Siraj, Abdul K; Al-Obaisi, Khadija A S; Beg, Shaham; Al Hazmi, Mohsen; Ajarim, Dahish; Tulbah, Asma; Al-Dayel, Fouad; Al-Kuraya, Khawla S

    2016-09-01

    Ethnic differences of breast cancer genomics have prompted us to investigate the spectra of BRCA1 and BRCA2 mutations in different populations. The prevalence and effect of BRCA 1 and BRCA 2 mutations in Middle Eastern population is not fully explored. To characterize the prevalence of BRCA mutations in Middle Eastern breast cancer patients, BRCA mutation screening was performed in 818 unselected breast cancer patients using Capture and/or Sanger sequencing. 19 short tandem repeat (STR) markers were used for founder mutation analysis. In our study, nine different types of deleterious mutation were identified in 28 (3.4%) cases, 25 (89.3%) cases in BRCA 1 and 3 (10.7%) cases in BRCA 2. Seven recurrent mutations identified accounted for 92.9% (26/28) of all the mutant cases. Haplotype analysis was performed to confirm c.1140 dupG and c.4136_4137delCT mutations as novel putative founder mutation, accounting for 46.4% (13/28) of all BRCA mutant cases and 1.6% (13/818) of all the breast cancer cases, respectively. Moreover, BRCA 1 mutation was significantly associated with BRCA 1 protein expression loss (p = 0.0005). Our finding revealed that a substantial number of BRCA mutations were identified in clinically high risk breast cancer from Middle East region. Identification of the mutation spectrum, prevalence and founder effect in Middle Eastern population facilitates genetic counseling, risk assessment and development of cost-effective screening strategy. © 2016 UICC.
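
    The percentages in this abstract all derive from a handful of case counts, so they can be cross-checked directly. The sketch below recomputes them from the counts as stated; the helper `pct` and the variable names are our own and purely illustrative.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts as stated in the abstract.
total_patients = 818    # unselected breast cancer patients screened
mutant_cases = 28       # cases with a deleterious BRCA mutation
brca1_cases = 25
brca2_cases = 3
recurrent_cases = 26    # cases explained by the seven recurrent mutations
founder_cases = 13      # cases carrying the two putative founder mutations

print(pct(mutant_cases, total_patients))    # share of all patients with a mutation
print(pct(brca1_cases, mutant_cases))       # BRCA1 share of mutant cases
print(pct(brca2_cases, mutant_cases))       # BRCA2 share of mutant cases
print(pct(recurrent_cases, mutant_cases))   # recurrent mutations among mutant cases
print(pct(founder_cases, mutant_cases))     # founder mutations among mutant cases
print(pct(founder_cases, total_patients))   # founder mutations among all patients
```

    Each value matches the figure reported in the abstract (3.4%, 89.3%, 10.7%, 92.9%, 46.4% and 1.6%).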

  11. KRAS mutation detection in colorectal cancer by a commercially available gene chip array compares well with Sanger sequencing.

    Science.gov (United States)

    French, Deborah; Smith, Andrew; Powers, Martin P; Wu, Alan H B

    2011-08-17

    Binding of a ligand to the epidermal growth factor receptor (EGFR) stimulates various intracellular signaling pathways resulting in cell cycle progression, proliferation, angiogenesis and apoptosis inhibition. KRAS is involved in signaling pathways including RAF/MAPK and PI3K and mutations in this gene result in constitutive activation of these pathways, independent of EGFR activation. Seven mutations in codons 12 and 13 of KRAS comprise around 95% of the observed human mutations, rendering monoclonal antibodies against EGFR (e.g. cetuximab and panitumumab) useless in treatment of colorectal cancer. KRAS mutation testing by two different methodologies was compared; Sanger sequencing and AutoGenomics INFINITI® assay, on DNA extracted from colorectal cancers. Out of 29 colorectal tumor samples tested, 28 were concordant between the two methodologies for the KRAS mutations that were detected in both assays with the INFINITI® assay detecting a mutation in one sample that was indeterminate by Sanger sequencing and a third methodology; single nucleotide primer extension. This study indicates the utility of the AutoGenomics INFINITI® methodology in a clinical laboratory setting where technical expertise or access to equipment for DNA sequencing does not exist. Copyright © 2011 Elsevier B.V. All rights reserved.

  12. Very high resolution single pass HLA genotyping using amplicon sequencing on the 454 next generation DNA sequencers: Comparison with Sanger sequencing.

    Science.gov (United States)

    Yamamoto, F; Höglund, B; Fernandez-Vina, M; Tyan, D; Rastrou, M; Williams, T; Moonsamy, P; Goodridge, D; Anderson, M; Erlich, H A; Holcomb, C L

    2015-12-01

    Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.

  13. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications.

    Directory of Open Access Journals (Sweden)

    Himabindu Kudapa

    Full Text Available A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs), has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean - resulted in 27,771 TACs common to all three legumes, indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using the transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups was validated on five chickpea genotypes and showed 20% polymorphism with an average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding
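
    The abstract reports an N50 length without defining it. Under the usual definition, N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy contig lengths are our own, not CaTA v2 data.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (standard definition)."""
    ordered = sorted(contig_lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")

# Hypothetical toy assembly, not taken from the study.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))
```

    In the toy example the two 8 bp contigs already cover half of the 32 bp total, so the N50 is 8.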

  14. A comparison of parallel pyrosequencing and sanger clone-based sequencing and its impact on the characterization of the genetic diversity of HIV-1.

    Directory of Open Access Journals (Sweden)

    Binhua Liang

    Full Text Available BACKGROUND: Pyrosequencing technology has the potential to rapidly sequence HIV-1 viral quasispecies without requiring the traditional approach of cloning. In this study, we investigated the utility of ultra-deep pyrosequencing to characterize genetic diversity of the HIV-1 gag quasispecies and assessed the possible contribution of pyrosequencing technology in studying HIV-1 biology and evolution. METHODOLOGY/PRINCIPAL FINDINGS: HIV-1 gag gene was amplified from 96 patients using nested PCR. The PCR products were cloned and sequenced using capillary based Sanger fluorescent dideoxy termination sequencing. The same PCR products were also directly sequenced using the 454 pyrosequencing technology. The two sequencing methods were evaluated for their ability to characterize quasispecies variation, and to reveal sites under host immune pressure for their putative functional significance. A total of 14,034 variations were identified by 454 pyrosequencing versus 3,632 variations by Sanger clone-based (SCB) sequencing. 11,050 of these variations were detected only by pyrosequencing. These undetected variations were located in the HIV-1 Gag region which is known to contain putative cytotoxic T lymphocyte (CTL) and neutralizing antibody epitopes, and sites related to virus assembly and packaging. Analysis of the positively selected sites derived by the two sequencing methods identified several differences. All of them were located within the CTL epitope regions. CONCLUSIONS/SIGNIFICANCE: Ultra-deep pyrosequencing has proven to be a powerful tool for characterization of HIV-1 genetic diversity with enhanced sensitivity, efficiency, and accuracy. It also improved reliability of downstream evolutionary and functional analysis of HIV-1 quasispecies.
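
    The three variation counts in this abstract imply how the two call sets overlap. The sketch below works through that set arithmetic. It rests on our own assumption that every pyrosequencing variation not flagged as pyrosequencing-only was also found by SCB sequencing; the abstract does not state the overlap explicitly, so the derived numbers are an inference and not reported results.

```python
# Counts stated in the abstract.
pyro_total = 14_034    # variations identified by 454 pyrosequencing
sanger_total = 3_632   # variations identified by Sanger clone-based sequencing
pyro_only = 11_050     # variations detected only by pyrosequencing

# Derived under the assumption described above (not reported in the abstract).
shared = pyro_total - pyro_only       # found by both methods
sanger_only = sanger_total - shared   # found only by Sanger clone-based sequencing
union = pyro_only + shared + sanger_only

print(f"shared={shared}, sanger_only={sanger_only}, union={union}")
```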

  15. Barcoding the food chain: from Sanger to high-throughput sequencing.

    Science.gov (United States)

    Littlefair, Joanne E; Clare, Elizabeth L

    2016-11-01

    Society faces the complex challenge of supporting biodiversity and ecosystem functioning, while ensuring food security by providing safe traceable food through an ever-more-complex global food chain. The increase in human mobility brings the added threat of pests, parasites, and invaders that further complicate our agro-industrial efforts. DNA barcoding technologies allow researchers to identify both individual species, and, when combined with universal primers and high-throughput sequencing techniques, the diversity within mixed samples (metabarcoding). These tools are already being employed to detect market substitutions, trace pests through the forensic evaluation of trace "environmental DNA", and to track parasitic infections in livestock. The potential of DNA barcoding to contribute to increased security of the food chain is clear, but challenges remain in regulation and the need for validation of experimental analysis. Here, we present an overview of the current uses and challenges of applied DNA barcoding in agriculture, from agro-ecosystems within farmland to the kitchen table.

  16. Screening for duplications, deletions and a common intronic mutation detects 35% of second mutations in patients with USH2A monoallelic mutations on Sanger sequencing.

    Science.gov (United States)

    Steele-Stallard, Heather B; Le Quesne Stabej, Polona; Lenassi, Eva; Luxon, Linda M; Claustres, Mireille; Roux, Anne-Francoise; Webster, Andrew R; Bitner-Glindzicz, Maria

    2013-08-08

    Usher Syndrome is the leading cause of inherited deaf-blindness. It is divided into three subtypes, of which the most common is Usher type 2, and the USH2A gene accounts for 75-80% of cases. Despite recent sequencing strategies, in our cohort a significant proportion of individuals with Usher type 2 have just one heterozygous disease-causing mutation in USH2A, or no convincing disease-causing mutations across nine Usher genes. The purpose of this study was to improve the molecular diagnosis in these families by screening USH2A for duplications, heterozygous deletions and a common pathogenic deep intronic variant USH2A: c.7595-2144A>G. Forty-nine Usher type 2 or atypical Usher families who had missing mutations (mono-allelic USH2A or no mutations following Sanger sequencing of nine Usher genes) were screened for duplications/deletions using the USH2A SALSA MLPA reagent kit (MRC-Holland). Identification of USH2A: c.7595-2144A>G was achieved by Sanger sequencing. Mutations were confirmed by a combination of reverse transcription PCR using RNA extracted from nasal epithelial cells or fibroblasts, and by array comparative genomic hybridisation with sequencing across the genomic breakpoints. Eight mutations were identified in 23 Usher type 2 families (35%) with one previously identified heterozygous disease-causing mutation in USH2A. These consisted of five heterozygous deletions, one duplication, and two heterozygous instances of the pathogenic variant USH2A: c.7595-2144A>G. No variants were found in the 15 Usher type 2 families with no previously identified disease-causing mutations. In 11 atypical families, none of whom had any previously identified convincing disease-causing mutations, the mutation USH2A: c.7595-2144A>G was identified in a heterozygous state in one family. All five deletions and the heterozygous duplication we report here are novel. This is the first time that a duplication in USH2A has been reported as a cause of Usher syndrome. 
We found that 8 of

  17. Comparison of three human papillomavirus DNA detection methods: Next generation sequencing, multiplex-PCR and nested-PCR followed by Sanger based sequencing.

    Science.gov (United States)

    da Fonseca, Allex Jardim; Galvão, Renata Silva; Miranda, Angelica Espinosa; Ferreira, Luiz Carlos de Lima; Chen, Zigui

    2016-05-01

    To compare the diagnostic performance for HPV infection using three laboratory techniques. Ninety-five cervicovaginal samples were randomly selected; each was tested for HPV DNA and genotypes using 3 methods in parallel: Multiplex-PCR, Nested-PCR followed by Sanger sequencing, and Next-Generation Sequencing (NGS) with two assays (NGS-A1, NGS-A2). The study was approved by the Brazilian National IRB (CONEP protocol 16,800). The prevalence of HPV by the NGS assays was higher than that using the Multiplex-PCR (64.2% vs. 45.2%, respectively; P = 0.001) and the Nested-PCR (64.2% vs. 49.5%, respectively; P = 0.003). NGS also showed better performance in detecting high-risk HPV (HR-HPV) and HPV16. There was a weak interobserver agreement between the results of Multiplex-PCR and Nested-PCR in relation to NGS for the diagnosis of HPV infection, and a moderate correlation for HR-HPV detection. Both NGS assays showed a strong correlation for detection of HPVs (k = 0.86), HR-HPVs (k = 0.91), HPV16 (k = 0.92) and HPV18 (k = 0.91). NGS is more sensitive than the traditional Sanger sequencing and the Multiplex PCR to genotype HPVs, with promising ability to detect multiple infections, and may have the potential to establish an alternative method for the diagnosis and genotyping of HPV. © 2015 Wiley Periodicals, Inc.
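
    The k values above appear to be Cohen's kappa statistics, which measure agreement between two assays beyond what chance alone would produce. The abstract does not give the underlying 2x2 tables, so the sketch below uses made-up counts purely to show how such a value is computed; neither the function name nor the counts come from the study.

```python
def cohens_kappa(both_pos, a_only, b_only, both_neg):
    """Cohen's kappa for two binary assays, from the four cells of a 2x2 table."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n    # proportion of samples where assays agree
    a_pos = (both_pos + a_only) / n         # positive rate of assay A
    b_pos = (both_pos + b_only) / n         # positive rate of assay B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # agreement expected by chance
    return (observed - expected) / (1 - expected)

# Hypothetical counts for illustration only (not data from the study).
kappa = cohens_kappa(both_pos=40, a_only=5, b_only=5, both_neg=50)
print(round(kappa, 3))
```

    Note that the formula is undefined when chance agreement equals 1 (for example, when both assays call every sample positive), so real data would need a guard for that case.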

  18. Discovery of novel MHC-class I alleles and haplotypes in Filipino cynomolgus macaques (Macaca fascicularis) by pyrosequencing and Sanger sequencing: Mafa-class I polymorphism.

    Science.gov (United States)

    Shiina, Takashi; Yamada, Yukiho; Aarnink, Alice; Suzuki, Shingo; Masuya, Anri; Ito, Sayaka; Ido, Daisuke; Yamanaka, Hisashi; Iwatani, Chizuru; Tsuchiya, Hideaki; Ishigaki, Hirohito; Itoh, Yasushi; Ogasawara, Kazumasa; Kulski, Jerzy K; Blancher, Antoine

    2015-10-01

    Although the low polymorphism of the major histocompatibility complex (MHC) transplantation genes in the Filipino cynomolgus macaque (Macaca fascicularis) is expected to have important implications in the selection and breeding of animals for medical research, detailed polymorphism information is still lacking for many of the duplicated class I genes. To better elucidate the degree and types of MHC polymorphisms and haplotypes in the Filipino macaque population, we genotyped 127 unrelated animals by the Sanger sequencing method and high-resolution pyrosequencing and identified 112 different alleles, 28 at cynomolgus macaque MHC (Mafa)-A, 54 at Mafa-B, 12 at Mafa-I, 11 at Mafa-E, and seven at Mafa-F alleles, of which 56 were newly described. Of them, the newly discovered Mafa-A8*01:01 lineage allele had low nucleotide similarities (Filipino macaque population would identify these and other high-frequency Mafa-class I haplotypes that could be used as MHC control animals for the benefit of biomedical research.

  19. Sanger sequencing as a first-line approach for molecular diagnosis of Andersen-Tawil syndrome [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Armando Totomoch-Serra

    2017-06-01

    Full Text Available In 1977, Frederick Sanger developed a new method for DNA sequencing based on the chain termination method, now known as the Sanger sequencing method (SSM). Recently, massive parallel sequencing, better known as next-generation sequencing (NGS), is replacing the SSM for detecting mutations in cardiovascular diseases with a genetic background. The present opinion article wants to remark that “targeted” SSM is still effective as a first-line approach for the molecular diagnosis of some specific conditions, as is the case for Andersen-Tawil syndrome (ATS). ATS is described as a rare multisystemic autosomal dominant channelopathy syndrome caused mainly by a heterozygous mutation in the KCNJ2 gene. KCNJ2 has particular characteristics that make it attractive for “directed” SSM. KCNJ2 has a sequence of 17,510 base pairs (bp) and a short coding region with two exons (exon 1=166 bp and exon 2=5220 bp); half of the mutations are located in the C-terminal cytosolic domain, a mutational hotspot has been described in residue Arg218, and this gene explains the phenotype in 60% of ATS cases that fulfill all the clinical criteria of the disease. In order to increase the diagnosis of ATS we urge cardiologists to search for facial and muscular abnormalities in subjects with frequent ventricular arrhythmias (especially bigeminy) and prominent U waves on the electrocardiogram.

  20. Comparison of cobas HCV GT against Versant HCV Genotype 2.0 (LiPA) with confirmation by Sanger sequencing.

    Science.gov (United States)

    Yusrina, Falah; Chua, Cui Wen; Lee, Chun Kiat; Chiu, Lily; Png, Tracy Si-Yu; Khoo, Mui Joo; Yan, Gabriel; Lee, Guan Huei; Yan, Benedict; Lee, Hong Kai

    2018-05-01

    Correct identification of infecting hepatitis C virus (HCV) genotype is helpful for targeted antiviral therapy. Here, we compared the HCV genotyping performance of the cobas HCV GT assay against the Versant HCV Genotype 2.0 (LiPA) assay, using 97 archived serum samples. In the event of discrepant or indeterminate results produced by either assay, the core and NS5B regions were sequenced. Of the 97 samples tested by the cobas, 25 (26%) were deemed indeterminate. Sequencing analyses confirmed 21 (84%) of the 25 samples as genotype 6 viruses with either subtype 6m, 6n, 6v, 6xa, or unknown subtype. Of the 97 samples tested by the LiPA, 13 (13%) were deemed indeterminate. Seven (7%) were assigned genotype 1, with unavailable/inconclusive results from the core region of the LiPA. Notably, the 7 samples were later found to be either genotype 3 or 6 by sequencing analyses. Moreover, 1 sample was assigned genotype 4 by the LiPA (cobas: indeterminate) but was later found to be genotype 3 by sequencing analyses, highlighting the LiPA's limitation in assigning the correct genotype. The cobas showed similar or slightly higher accuracy (100%; 95% CI 94-100%) compared to the LiPA (99%; 95% CI 92-100%). Twenty-six percent of the 97 samples tested by the cobas had indeterminate results, mainly due to its limitation in identifying genotype 6 other than subtypes 6a and 6b. This presents a significant assay limitation in Southeast Asia, where genotype 6 infection is highly prevalent. Copyright © 2018 Elsevier B.V. All rights reserved.
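
    The percentages quoted in this abstract follow directly from the sample counts it reports. The short Python sketch below recomputes them as a sanity check; the helper name `percent` and the variable names are illustrative and not taken from the study. It does not attempt to reproduce the accuracy confidence intervals, because the abstract does not state the denominators behind them.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)

# Counts as reported in the abstract.
total_samples = 97
cobas_indeterminate = 25
cobas_indeterminate_genotype6 = 21
lipa_indeterminate = 13
lipa_assigned_genotype1 = 7

print(percent(cobas_indeterminate, total_samples))                  # cobas indeterminate rate
print(percent(cobas_indeterminate_genotype6, cobas_indeterminate))  # share confirmed as genotype 6
print(percent(lipa_indeterminate, total_samples))                   # LiPA indeterminate rate
print(percent(lipa_assigned_genotype1, total_samples))              # LiPA genotype 1 assignments
```

    Run as written, the four values match the 26%, 84%, 13% and 7% given in the text.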

  1. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons

    Science.gov (United States)

    Haas, Brian J.; Gevers, Dirk; Earl, Ashlee M.; Feldgarden, Mike; Ward, Doyle V.; Giannoukos, Georgia; Ciulla, Dawn; Tabbaa, Diana; Highlander, Sarah K.; Sodergren, Erica; Methé, Barbara; DeSantis, Todd Z.; Petrosino, Joseph F.; Knight, Rob; Birren, Bruce W.

    2011-01-01

    Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys. PMID:21212162
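
    To make the idea of a chimera concrete, here is a toy Python sketch. It is not the Chimera Slayer algorithm, whose details the abstract does not give. It assumes pre-aligned, equal-length sequences and a hypothetical `min_gain` threshold, and it flags a query when a "left part from one parent, right part from another" model explains it better than any single parent does.

```python
from itertools import permutations

def mismatches(a, b):
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_two_parent_model(query, parents):
    """Try every ordered parent pair and breakpoint; return the split with
    the fewest mismatches as (mismatches, breakpoint, left_name, right_name)."""
    best = None
    for (left_name, left_seq), (right_name, right_seq) in permutations(parents.items(), 2):
        for bp in range(1, len(query)):
            d = (mismatches(query[:bp], left_seq[:bp])
                 + mismatches(query[bp:], right_seq[bp:]))
            if best is None or d < best[0]:
                best = (d, bp, left_name, right_name)
    return best

def looks_chimeric(query, parents, min_gain=3):
    """Flag the query if the two-parent model beats the best single parent
    by at least min_gain mismatches (min_gain is an illustrative threshold)."""
    single = min(mismatches(query, seq) for seq in parents.values())
    d, bp, left, right = best_two_parent_model(query, parents)
    return (single - d) >= min_gain, (bp, left, right)

# Two made-up parent sequences that differ at every fourth position.
parents = {"A": "ACGT" * 6, "B": "ACGA" * 6}
chimera = parents["A"][:12] + parents["B"][12:]

print(looks_chimeric(chimera, parents))       # hybrid of A and B
print(looks_chimeric(parents["A"], parents))  # genuine parent sequence
```

    In this toy setting the hybrid is flagged and the genuine parent is not. Real tools must additionally handle alignment, sequencing error, and large reference databases.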

  2. Highly sensitive KRAS mutation detection from formalin-fixed paraffin-embedded biopsies and circulating tumour cells using wild-type blocking polymerase chain reaction and Sanger sequencing.

    Science.gov (United States)

    Huang, Meggie Mo Chao; Leong, Sai Mun; Chua, Hui Wen; Tucker, Steven; Cheong, Wai Chye; Chiu, Lily; Li, Mo-Huang; Koay, Evelyn Siew-Chuan

    2014-08-01

    Among patients with colorectal cancer (CRC), KRAS mutations were reported to occur in 30-51% of all cases. CRC patients with KRAS mutations were reported to be non-responsive to anti-epidermal growth factor receptor (EGFR) monoclonal antibody (MoAb) treatment in many clinical trials. Hence, accurate detection of KRAS mutations would be critical in guiding the use of anti-EGFR MoAb therapies in CRC. In this study, we carried out a detailed investigation of the efficacy of a wild-type (WT) blocking real-time polymerase chain reaction (PCR), employing WT KRAS locked nucleic acid blockers, and Sanger sequencing, for KRAS mutation detection in rare cells. Analyses were first conducted on cell lines to optimize the assay protocol, which was subsequently applied to peripheral blood and tissue samples from patients with CRC. The optimized assay provided superior sensitivity, enabling detection of as few as two cells with mutated KRAS in a background of 10^4 WT cells (0.02%). The feasibility of this assay was further investigated to assess the KRAS status of 45 colorectal tissue samples, which had been tested previously using a conventional PCR sequencing approach. The analysis showed a mutational discordance between these two methods in 4 of 18 WT cases. Our results present a simple, effective, and robust method for KRAS mutation detection in both paraffin-embedded tissues and circulating tumour cells, at single-cell level. The method greatly enhances the detection sensitivity and alleviates the need to exhaustively remove co-enriched contaminating lymphocytes.

  3. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Background: Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results: Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons, resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions: The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.

  4. Exome sequencing identifies SUCO mutations in mesial temporal lobe epilepsy.

    Science.gov (United States)

    Sha, Zhiqiang; Sha, Longze; Li, Wenting; Dou, Wanchen; Shen, Yan; Wu, Liwen; Xu, Qi

    2015-03-30

    Mesial temporal lobe epilepsy (mTLE) is the main type and most common medically intractable form of epilepsy. Severity of disease-based stratified samples may help identify new disease-associated mutant genes. We analyzed mRNA expression profiles from patient hippocampal tissue. Three of the seven patients had severe mTLE with generalized-onset convulsions and consciousness loss that occurred over many years. We found that compared with other groups, patients with severe mTLE were classified into a distinct group. Whole-exome sequencing and Sanger sequencing validation in all seven patients identified three novel SUN domain-containing ossification factor (SUCO) mutations in severely affected patients. Furthermore, SUCO knock down significantly reduced dendritic length in vitro. Our results indicate that mTLE defects may affect neuronal development, and suggest that neurons have abnormal development due to lack of SUCO, which may be a generalized-onset epilepsy-related gene. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  5. Novel Genetic Variants of Sporadic Atrial Septal Defect (ASD) in a Chinese Population Identified by Whole-Exome Sequencing (WES).

    Science.gov (United States)

    Liu, Yong; Cao, Yu; Li, Yaxiong; Lei, Dongyun; Li, Lin; Hou, Zong Liu; Han, Shen; Meng, Mingyao; Shi, Jianlin; Zhang, Yayong; Wang, Yi; Niu, Zhaoyi; Xie, Yanhua; Xiao, Benshan; Wang, Yuanfei; Li, Xiao; Yang, Lirong; Wang, Wenju; Jiang, Lihong

    2018-03-05

    BACKGROUND Recently, mutations in several genes have been described to be associated with sporadic ASD, but some genetic variants remain to be identified. The aim of this study was to use whole-exome sequencing (WES) combined with bioinformatics analysis to identify novel genetic variants in cases of sporadic congenital ASD, followed by validation by Sanger sequencing. MATERIAL AND METHODS Five Han patients with secundum ASD were recruited, and their tissue samples were analyzed by WES, followed by verification by Sanger sequencing of tissue and blood samples. Further evaluation using blood samples included 452 additional patients with sporadic secundum ASD (212 male and 240 female patients) and 519 healthy subjects (252 male and 267 female subjects) for further verification by a multiplexed MassARRAY system. Bioinformatic analyses were performed to identify novel genetic variants associated with sporadic ASD. RESULTS From five patients with sporadic ASD, a total of 181,762 genomic variants in 33 exon loci, validated by Sanger sequencing, were selected and underwent MassARRAY analysis in 452 patients with ASD and 519 healthy subjects. Three loci with high mutation frequencies, the 138665410 FOXL2 gene variant, the 23862952 MYH6 gene variant, and the 71098693 HYDIN gene variant were found to be significantly associated with sporadic ASD (PASD (PASD, and supported the use of WES and bioinformatics analysis to identify disease-associated mutations.

  6. Electrostatic Potential Maps and Natural Bond Orbital Analysis: Visualization and Conceptualization of Reactivity in Sanger's Reagent

    Science.gov (United States)

    Mottishaw, Jeffery D.; Erck, Adam R.; Kramer, Jordan H.; Sun, Haoran; Koppang, Miles

    2015-01-01

    Frederick Sanger's early work on protein sequencing through the use of colorimetric labeling combined with liquid chromatography involves an important nucleophilic aromatic substitution (S[subscript N]Ar) reaction in which the N-terminus of a protein is tagged with Sanger's reagent. Understanding the inherent differences between this S[subscript…

  7. Spectrum of benzo[a]pyrene-induced mutations in the Pig-a gene of L5178YTk+/- cells identified with next generation sequencing.

    Science.gov (United States)

    Revollo, Javier; Wang, Yiying; McKinzie, Page; Dad, Azra; Pearce, Mason; Heflich, Robert H; Dobrovolsky, Vasily N

    2017-12-01

    We used Sanger sequencing and next generation sequencing (NGS) for analysis of mutations in the endogenous X-linked Pig-a gene of clonally expanded L5178YTk+/- cells. The clones developed from single cells that were sorted on a flow cytometer based upon the expression pattern of the GPI-anchored marker, CD90, on their surface. CD90-deficient and CD90-proficient cells were sorted from untreated cultures, and CD90-deficient cells were sorted from cultures treated with benzo[a]pyrene (B[a]P). Pig-a mutations were identified in all clones developed from CD90-deficient cells; no Pig-a mutations were found in clones of CD90-proficient cells. The spectrum of B[a]P-induced Pig-a mutations was dominated by basepair substitutions, small insertions and deletions at G:C, or at sequences rich in G:C content. We observed high concordance between Pig-a mutations determined by Sanger sequencing and by NGS, but NGS was able to identify mutations in samples that were difficult to analyze by Sanger sequencing (e.g., mixtures of two mutant clones). Overall, the NGS method is a cost- and labor-efficient, high-throughput approach for analysis of a large number of mutant clones. Published by Elsevier B.V.

  8. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  9. Deep sequencing of uveal melanoma identifies a recurrent mutation in PLCB4

    DEFF Research Database (Denmark)

    Johansson, Peter; Aoude, Lauren G; Wadt, Karin

    2016-01-01

    Next generation sequencing of uveal melanoma (UM) samples has identified a number of recurrent oncogenic or loss-of-function mutations in key driver genes including: GNAQ, GNA11, EIF1AX, SF3B1 and BAP1. To search for additional driver mutations in this tumor type we carried out whole......, instead, a BRCA mutation signature predominated. In addition to mutations in the known UM driver genes, we found a recurrent mutation in PLCB4 (c.G1888T, p.D630Y, NM_000933), which was validated using Sanger sequencing. The identical mutation was also found in published UM sequence data (1 of 56 tumors......-genome or whole-exome sequencing of 28 tumors or primary cell lines. These samples have a low mutation burden, with a mean of 10.6 protein changing mutations per sample (range 0 to 53). As expected for these sun-shielded melanomas the mutation spectrum was not consistent with an ultraviolet radiation signature...

  10. Whole exome sequencing identifies novel mutation in eight Chinese children with isolated tetralogy of Fallot.

    Science.gov (United States)

    Liu, Lin; Wang, Hong-Dan; Cui, Cun-Ying; Qin, Yun-Yun; Fan, Tai-Bing; Peng, Bang-Tian; Zhang, Lian-Zhong; Wang, Cheng-Zeng

    2017-12-05

    Tetralogy of Fallot is the most common cyanotic congenital heart disease. However, its pathogenesis remains to be clarified. The purpose of this study was to identify the genetic variants in Tetralogy of Fallot by whole exome sequencing. Whole exome sequencing was performed among eight small families with Tetralogy of Fallot. Differential single nucleotide polymorphisms and small InDels were found by alignment within families and between families and then were verified by Sanger sequencing. Tetralogy of Fallot-related genes were determined by analysis using Gene Ontology/pathway, Online Mendelian Inheritance in Man, PubMed and other databases. A total of sixteen differential single nucleotide polymorphism loci and eight differential small InDels were discovered. The sixteen differential single nucleotide polymorphism loci were located on Chr 1, 2, 4, 5, 11, 12, 15, 22 and X. Among the sixteen single nucleotide polymorphism loci, six have not been reported. The eight differential small InDels were located on Chr 2, 4, 9, 12, 17, 19 and X; of these eight, two have not been reported. Analysis using Gene Ontology/pathway, Online Mendelian Inheritance in Man, PubMed and other databases revealed that PEX5, NACA, ATXN2, CELA1, PCDHB4 and CTBP1 were associated with Tetralogy of Fallot. Our findings identify PEX5, NACA, ATXN2, CELA1, PCDHB4 and CTBP1 mutations as underlying genetic causes of isolated tetralogy of Fallot.

  11. Genetic mapping and exome sequencing identify variants associated with five novel diseases.

    Directory of Open Access Journals (Sweden)

    Erik G Puffenberger

    The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

  12. Exome sequencing identifies CTSK mutations in patients originally diagnosed as intermediate osteopetrosis

    Science.gov (United States)

    Pangrazio, Alessandra; Puddu, Alessandro; Oppo, Manuela; Valentini, Maria; Zammataro, Luca; Vellodi, Ashok; Gener, Blanca; Llano-Rivas, Isabel; Raza, Jamal; Atta, Irum; Vezzoni, Paolo; Superti-Furga, Andrea; Villa, Anna; Sobacchi, Cristina

    2014-01-01

    Autosomal Recessive Osteopetrosis is a genetic disorder characterized by increased bone density due to lack of resorption by the osteoclasts. Genetic studies have widely unraveled the molecular basis of the most severe forms, while cases of intermediate severity are more difficult to characterize, probably because of a large heterogeneity. Here, we describe the use of exome sequencing in the molecular diagnosis of 2 siblings initially thought to be affected by “intermediate osteopetrosis”, which identified a homozygous mutation in the CTSK gene. Prompted by this finding, we tested by Sanger sequencing 25 additional patients addressed to us for recessive osteopetrosis and found CTSK mutations in 4 of them. In retrospect, their clinical and radiographic features were found to be compatible with, but not typical for, Pycnodysostosis. We sought to identify modifier genes that might have played a role in the clinical manifestation of the disease in these patients, but our results were not informative. In conclusion, we underline the difficulties of differential diagnosis in some patients whose clinical appearance does not fit the classical malignant or benign picture and recommend that CTSK gene be included in the molecular diagnosis of high bone density conditions. PMID:24269275

  13. Exome sequencing identifies CTSK mutations in patients originally diagnosed as intermediate osteopetrosis.

    Science.gov (United States)

    Pangrazio, Alessandra; Puddu, Alessandro; Oppo, Manuela; Valentini, Maria; Zammataro, Luca; Vellodi, Ashok; Gener, Blanca; Llano-Rivas, Isabel; Raza, Jamal; Atta, Irum; Vezzoni, Paolo; Superti-Furga, Andrea; Villa, Anna; Sobacchi, Cristina

    2014-02-01

    Autosomal Recessive Osteopetrosis is a genetic disorder characterized by increased bone density due to lack of resorption by the osteoclasts. Genetic studies have widely unraveled the molecular basis of the most severe forms, while cases of intermediate severity are more difficult to characterize, probably because of a large heterogeneity. Here, we describe the use of exome sequencing in the molecular diagnosis of 2 siblings initially thought to be affected by "intermediate osteopetrosis", which identified a homozygous mutation in the CTSK gene. Prompted by this finding, we tested by Sanger sequencing 25 additional patients addressed to us for recessive osteopetrosis and found CTSK mutations in 4 of them. In retrospect, their clinical and radiographic features were found to be compatible with, but not typical for, Pycnodysostosis. We sought to identify modifier genes that might have played a role in the clinical manifestation of the disease in these patients, but our results were not informative. In conclusion, we underline the difficulties of differential diagnosis in some patients whose clinical appearance does not fit the classical malignant or benign picture and recommend that CTSK gene be included in the molecular diagnosis of high bone density conditions. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  14. Whole-genome sequencing identifies recurrent somatic NOTCH2 mutations in splenic marginal zone lymphoma.

    Science.gov (United States)

    Kiel, Mark J; Velusamy, Thirunavukkarasu; Betz, Bryan L; Zhao, Lili; Weigelin, Helmut G; Chiang, Mark Y; Huebner-Chan, David R; Bailey, Nathanael G; Yang, David T; Bhagat, Govind; Miranda, Roberto N; Bahler, David W; Medeiros, L Jeffrey; Lim, Megan S; Elenitoba-Johnson, Kojo S J

    2012-08-27

    Splenic marginal zone lymphoma (SMZL), the most common primary lymphoma of spleen, is poorly understood at the genetic level. In this study, using whole-genome DNA sequencing (WGS) and confirmation by Sanger sequencing, we observed mutations identified in several genes not previously known to be recurrently altered in SMZL. In particular, we identified recurrent somatic gain-of-function mutations in NOTCH2, a gene encoding a protein required for marginal zone B cell development, in 25 of 99 (∼25%) cases of SMZL and in 1 of 19 (∼5%) cases of nonsplenic MZLs. These mutations clustered near the C-terminal proline/glutamate/serine/threonine (PEST)-rich domain, resulting in protein truncation or, rarely, were nonsynonymous substitutions affecting the extracellular heterodimerization domain (HD). NOTCH2 mutations were not present in other B cell lymphomas and leukemias, such as chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL; n = 15), mantle cell lymphoma (MCL; n = 15), low-grade follicular lymphoma (FL; n = 44), hairy cell leukemia (HCL; n = 15), and reactive lymphoid hyperplasia (n = 14). NOTCH2 mutations were associated with adverse clinical outcomes (relapse, histological transformation, and/or death) among SMZL patients (P = 0.002). These results suggest that NOTCH2 mutations play a role in the pathogenesis and progression of SMZL and are associated with a poor prognosis.

  15. Whole-exome sequencing identifies novel MPL and JAK2 mutations in triple-negative myeloproliferative neoplasms.

    Science.gov (United States)

    Milosevic Feenstra, Jelena D; Nivarthi, Harini; Gisslinger, Heinz; Leroy, Emilie; Rumi, Elisa; Chachoua, Ilyas; Bagienski, Klaudia; Kubesova, Blanka; Pietra, Daniela; Gisslinger, Bettina; Milanesi, Chiara; Jäger, Roland; Chen, Doris; Berg, Tiina; Schalling, Martin; Schuster, Michael; Bock, Christoph; Constantinescu, Stefan N; Cazzola, Mario; Kralovics, Robert

    2016-01-21

    Essential thrombocythemia (ET) and primary myelofibrosis (PMF) are chronic diseases characterized by clonal hematopoiesis and hyperproliferation of terminally differentiated myeloid cells. The disease is driven by somatic mutations in exon 9 of CALR or exon 10 of MPL or JAK2-V617F in >90% of the cases, whereas the remaining cases are termed "triple negative." We aimed to identify the disease-causing mutations in the triple-negative cases of ET and PMF by applying whole-exome sequencing (WES) on paired tumor and control samples from 8 patients. We found evidence of clonal hematopoiesis in 5 of 8 studied cases based on clonality analysis and presence of somatic genetic aberrations. WES identified somatic mutations in 3 of 8 cases. We did not detect any novel recurrent somatic mutations. In 3 patients with clonal hematopoiesis analyzed by WES, we identified a somatic MPL-S204P, a germline MPL-V285E mutation, and a germline JAK2-G571S variant. We performed Sanger sequencing of the entire coding region of MPL in 62, and of JAK2 in 49 additional triple-negative cases of ET or PMF. New somatic (T119I, S204F, E230G, Y591D) and 1 germline (R321W) MPL mutations were detected. All of the identified MPL mutations were gain-of-function when analyzed in functional assays. JAK2 variants were identified in 5 of 57 triple-negative cases analyzed by WES and Sanger sequencing combined. We could demonstrate that JAK2-V625F and JAK2-F556V are gain-of-function mutations. Our results suggest that triple-negative cases of ET and PMF do not represent a homogenous disease entity. Cases with polyclonal hematopoiesis might represent hereditary disorders. © 2016 by The American Society of Hematology.

  16. Targeted next-generation sequencing analysis identifies novel mutations in families with severe familial exudative vitreoretinopathy

    Science.gov (United States)

    Huang, Xiao-Yan; Zhuang, Hong; Wu, Ji-Hong; Li, Jian-Kang; Hu, Fang-Yuan; Zheng, Yu; Tellier, Laurent Christian Asker M.; Zhang, Sheng-Hai; Gao, Feng-Juan; Zhang, Jian-Guo

    2017-01-01

    Purpose: Familial exudative vitreoretinopathy (FEVR) is a genetically and clinically heterogeneous disease, characterized by failure of vascular development of the peripheral retina. The symptoms of FEVR vary widely among patients in the same family, and even between the two eyes of a given patient. This study was designed to identify the genetic defect in a patient cohort of ten Chinese families with a definitive diagnosis of FEVR. Methods: To identify the causative gene, next-generation sequencing (NGS)-based target capture sequencing was performed. Segregation analysis of the candidate variant was performed in additional family members by using Sanger sequencing and quantitative real-time PCR (QPCR). Results: Of the cohort of ten FEVR families, six pathogenic variants were identified, including four novel and two known heterozygous mutations. Of the variants identified, four were missense variants, and two were novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del]. The two novel heterozygous deletion mutations were not observed in the control subjects and could give rise to a relatively severe FEVR phenotype, which could be explained by the protein function prediction. Conclusions: We identified two novel heterozygous deletion mutations [LRP5, c.4053 DelC (p.Ile1351IlefsX88); TSPAN12, EX8Del] using targeted NGS as a causative mutation for FEVR. These genetic deletion variations exhibit a severe form of FEVR, with tractional retinal detachments compared with other known point mutations. The data further enrich the mutation spectrum of FEVR and enhance our understanding of genotype–phenotype correlations to provide useful information for disease diagnosis, prognosis, and effective genetic counseling. PMID:28867931

  17. Genome-wide linkage, exome sequencing and functional analyses identify ABCB6 as the pathogenic gene of dyschromatosis universalis hereditaria.

    Directory of Open Access Journals (Sweden)

    Hong Liu

    As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently, when ABCB6 was reported as a causative gene of DUH. We performed a genome-wide linkage scan using the Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations, and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and immunohistochemistry were performed to verify the expression of the pathogenic gene, and zebrafish was used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Genome-wide linkage (assuming an autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and were not present in the 1000 Genomes Project and dbSNP databases or in 1,516 unrelated healthy Chinese controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them. Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma.
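
    Variant names such as c.1358C>T; p.Ala453Val pair a coding-DNA position with a protein position. For a simple exonic substitution, the two are linked by the three-bases-per-codon rule, because coding positions are numbered from the A of the ATG start codon. The Python sketch below illustrates that relationship; the function names are illustrative, and the sketch deliberately ignores intronic offsets, UTR positions, and frameshifts.

```python
def codon_number(cdna_pos):
    """1-based codon (amino acid) index for a 1-based coding-DNA position."""
    if cdna_pos < 1:
        raise ValueError("coding positions start at 1")
    return (cdna_pos + 2) // 3

def position_in_codon(cdna_pos):
    """Which base of its codon (1, 2 or 3) a coding-DNA position occupies."""
    if cdna_pos < 1:
        raise ValueError("coding positions start at 1")
    return (cdna_pos - 1) % 3 + 1

# Coding positions of the two ABCB6 variants named in the abstract.
for pos in (1358, 964):
    print(pos, codon_number(pos), position_in_codon(pos))
```

    The computed codon numbers, 453 and 322, agree with the residue numbers in the protein-level names reported above.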

  18. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    Science.gov (United States)

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  19. Genomic Aberrations in Crizotinib Resistant Lung Adenocarcinoma Samples Identified by Transcriptome Sequencing.

    Directory of Open Access Journals (Sweden)

    Ali Saber

    Full Text Available ALK-break positive non-small cell lung cancer (NSCLC) patients initially respond to crizotinib, but resistance occurs inevitably. In this study we aimed to identify fusion genes in crizotinib resistant tumor samples. Re-biopsies of three patients were subjected to paired-end RNA sequencing to identify fusion genes using deFuse and EricScript. The IGV browser was used to determine presence of known resistance-associated mutations. Sanger sequencing was used to validate fusion genes and digital droplet PCR to validate mutations. ALK fusion genes were detected in all three patients with EML4 being the fusion partner. One patient had no additional fusion genes. Another patient had one additional fusion gene, but without a predicted open reading frame (ORF). The third patient had three additional fusion genes, of which two were derived from the same chromosomal region as the EML4-ALK. A predicted ORF was identified only in the CLIP4-VSNL1 fusion product. The fusion genes validated in the post-treatment sample were also present in the biopsy before crizotinib. ALK mutations (p.C1156Y and p.G1269A) detected in the re-biopsies of two patients were not detected in pre-treatment biopsies. In conclusion, fusion genes identified in our study are unlikely to be involved in crizotinib resistance based on presence in pre-treatment biopsies. The detection of ALK mutations in post-treatment tumor samples of two patients underlines their role in crizotinib resistance.

  20. Whole exome sequencing identifies RAI1 mutation in a morbidly obese child diagnosed with ROHHAD syndrome.

    Science.gov (United States)

    Thaker, Vidhu V; Esteves, Kristyn M; Towne, Meghan C; Brownstein, Catherine A; James, Philip M; Crowley, Laura; Hirschhorn, Joel N; Elsea, Sarah H; Beggs, Alan H; Picker, Jonathan; Agrawal, Pankaj B

    2015-05-01

    The current obesity epidemic is attributed to complex interactions between genetic and environmental factors. However, a limited number of cases, especially those with early-onset severe obesity, are linked to single gene defects. Rapid-onset obesity with hypothalamic dysfunction, hypoventilation and autonomic dysregulation (ROHHAD) is one of the syndromes that presents with abrupt-onset extreme weight gain with an unknown genetic basis. Objective: To identify the underlying genetic etiology in a child with morbid early-onset obesity, hypoventilation, and autonomic and behavioral disturbances who was clinically diagnosed with ROHHAD syndrome. Design/Setting/Intervention: The index patient was evaluated at an academic medical center. Whole-exome sequencing was performed on the proband and his parents. Genetic variants were validated by Sanger sequencing. We identified a novel de novo nonsense mutation, c.3265 C>T (p.R1089X), in the retinoic acid-induced 1 (RAI1) gene in the proband. Mutations in the RAI1 gene are known to cause Smith-Magenis syndrome (SMS). On further evaluation, his clinical features were not typical of either SMS or ROHHAD syndrome. This study identifies a de novo RAI1 mutation in a child with morbid obesity and a clinical diagnosis of ROHHAD syndrome. Although extreme early-onset obesity, autonomic disturbances, and hypoventilation are present in ROHHAD, several of the clinical findings are consistent with SMS. This case highlights the challenges in the diagnosis of ROHHAD syndrome and its potential overlap with SMS. We also propose RAI1 as a candidate gene for children with morbid obesity.

  1. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  2. Whole-exome sequencing identifies USH2A mutations in a pseudo-dominant Usher syndrome family.

    Science.gov (United States)

    Zheng, Sui-Lian; Zhang, Hong-Liang; Lin, Zhen-Lang; Kang, Qian-Yan

    2015-10-01

    Usher syndrome (USH) is an autosomal recessive (AR) multi-sensory degenerative disorder leading to deaf-blindness. USH is clinically subdivided into three subclasses, and 10 genes have been identified thus far. Clinical and genetic heterogeneities in USH make a precise diagnosis difficult. A dominant‑like USH family in successive generations was identified, and the present study aimed to determine the genetic predisposition of this family. Whole‑exome sequencing was performed in two affected patients and an unaffected relative. Systematic data were analyzed by bioinformatic analysis to remove the candidate mutations via step‑wise filtering. Direct Sanger sequencing and co‑segregation analysis were performed in the pedigree. One novel and two known mutations in the USH2A gene were identified, and were further confirmed by direct sequencing and co‑segregation analysis. The affected mother carried compound mutations in the USH2A gene, while the unaffected father carried a heterozygous mutation. The present study demonstrates that whole‑exome sequencing is a robust approach for the molecular diagnosis of disorders with high levels of genetic heterogeneity.

  3. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Directory of Open Access Journals (Sweden)

    Ram Vinay Pandey

    Full Text Available Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  4. MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

    Science.gov (United States)

    Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2016-01-01

    Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.

  5. Exome sequencing identifies mutations in ABCD1 and DACH2 in two brothers with a distinct phenotype

    OpenAIRE

    Zhang, Yanliang; Liu, Yanhui; Li, Ya; Duan, Yong; Zhang, Keyun; Wang, Junwang; Dai, Yong

    2014-01-01

    Background: We report on two brothers with a distinct syndromic phenotype and explore the potential pathogenic cause. Methods: Cytogenetic tests and exome sequencing were performed on the two brothers and their parents. Variants detected by exome sequencing were validated by Sanger sequencing. Results: The main phenotype of the two brothers included congenital language disorder, growth retardation, intellectual disability, difficulty in standing and walking, and urinary and fecal incontinence. T...

  6. Exome Sequencing Identified a Recessive RDH12 Mutation in a Family with Severe Early-Onset Retinitis Pigmentosa

    Directory of Open Access Journals (Sweden)

    Bo Gong

    2015-01-01

    Full Text Available Retinitis pigmentosa (RP) is the most important hereditary retinal disease caused by progressive degeneration of the photoreceptor cells. This study aimed to identify gene mutations responsible for autosomal recessive retinitis pigmentosa (arRP) in a Chinese family using next-generation sequencing technology. A Chinese family with 7 members including two individuals affected with severe early-onset RP was studied. All patients underwent a complete ophthalmic examination. Exome sequencing was performed on a single RP patient (the proband of this family), followed by direct Sanger sequencing of other family members and normal controls to confirm the causal mutations. A homozygous mutation c.437Tidentified as being related to the phenotype of this arRP family. This homozygous mutation was detected in the two affected patients, but was not present in other family members or in 600 normal controls. Another three normal members in the family were found to carry this missense mutation in the heterozygous state. Our results emphasize the importance of c.437T

  7. Identifying structural variants using linked-read sequencing data.

    Science.gov (United States)

    Elyanow, Rebecca; Wu, Hsin-Ta; Raphael, Benjamin J

    2017-11-03

    Structural variation, including large deletions, duplications, inversions, translocations, and other rearrangements, is common in human and cancer genomes. A number of methods have been developed to identify structural variants from Illumina short-read sequencing data. However, reliable identification of structural variants remains challenging because many variants have breakpoints in repetitive regions of the genome and thus are difficult to identify with short reads. The recently developed linked-read sequencing technology from 10X Genomics combines a novel barcoding strategy with Illumina sequencing. This technology labels all reads that originate from a small number (~5-10) of DNA molecules ~50 Kbp in length with the same molecular barcode. These barcoded reads contain long-range sequence information that is advantageous for identification of structural variants. We present Novel Adjacency Identification with Barcoded Reads (NAIBR), an algorithm to identify structural variants in linked-read sequencing data. NAIBR predicts novel adjacencies in an individual genome resulting from structural variants using a probabilistic model that combines multiple signals in barcoded reads. We show that NAIBR outperforms several existing methods for structural variant identification, including two recent methods that also analyze linked-reads, on simulated sequencing data and 10X whole-genome sequencing data from the NA12878 human genome and the HCC1954 breast cancer cell line. Several of the novel somatic structural variants identified in HCC1954 overlap known cancer genes. Software is available at compbio.cs.brown.edu/software. Contact: braphael@princeton.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.

  8. Whole-Exome Sequencing Identifies One De Novo Variant in the FGD6 Gene in a Thai Family with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Chuphong Thongnak

    2018-01-01

    Full Text Available Autism spectrum disorder (ASD) has a strong genetic basis, although the genetics of autism is complex and unclear. Genetic testing such as microarray or sequencing has been widely used to identify autism markers, but it is unsuccessful in several cases. The objective of this study is to identify causative variants of autism in two Thai families by using the whole-exome sequencing technique. Whole-exome sequencing was performed with autism-affected children from two unrelated families. Each sample was sequenced on the SOLiD 5500xl Genetic Analyzer system, followed by a combined bioinformatics pipeline including annotation and filtering to identify candidate variants. Candidate variants were validated, and the segregation study with other family members was performed using Sanger sequencing. This study identified a possible causative variant for ASD, c.2951G>A, in the FGD6 gene. We demonstrated the potential of whole-exome sequencing and a bioinformatics filtering procedure to identify genetic variants associated with ASD. These techniques could be useful in identifying possible causative ASD variants, especially in cases in which variants cannot be identified by other techniques.

  9. Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A

    OpenAIRE

    Regina Stoltenburg; Beate Strehlitz

    2018-01-01

    New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aur...

  10. Exome sequencing identifies ZNF644 mutations in high myopia.

    Directory of Open Access Journals (Sweden)

    Yi Shi

    2011-06-01

    Full Text Available Myopia is the most common ocular disorder worldwide, and high myopia in particular is one of the leading causes of blindness. Genetic factors play a critical role in the development of myopia, especially high myopia. Recently, the exome sequencing approach has been successfully used for the disease gene identification of Mendelian disorders. Here we show a successful application of exome sequencing to identify a gene for an autosomal dominant disorder, and we have identified a gene potentially responsible for high myopia in a monogenic form. We captured exomes of two affected individuals from a Han Chinese family with high myopia and performed sequencing analysis by a second-generation sequencer with a mean coverage of 30× and sufficient depth to call variants at ∼97% of each targeted exome. The shared genetic variants of these two affected individuals in the family being studied were filtered against the 1000 Genomes Project and the dbSNP131 database. A mutation A672G in zinc finger protein 644 isoform 1 (ZNF644) was identified as being related to the phenotype of this family. After we performed sequencing analysis of the exons in the ZNF644 gene in 300 sporadic cases of high myopia, we identified an additional five mutations (I587V, R680G, C699Y, 3'UTR+12 C>G, and 3'UTR+592 G>A) in 11 different patients. All these mutations were absent in 600 normal controls. The ZNF644 gene was expressed in human retinal and retinal pigment epithelium (RPE). Given that ZNF644 is predicted to be a transcription factor that may regulate genes involved in eye development, mutation may cause the axial elongation of eyeball found in high myopia patients. Our results suggest that ZNF644 might be a causal gene for high myopia in a monogenic form.
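
    The filtering step described in this abstract (keep the variants shared by the affected individuals, then discard those already catalogued in public variation databases) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Variant` layout, the function name `shared_novel_variants`, and every coordinate in the example are hypothetical.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, position, ref and alt allele."""
    chrom: str
    pos: int
    ref: str
    alt: str


def shared_novel_variants(affected: Iterable[Set[Variant]],
                          known: Set[Variant]) -> Set[Variant]:
    """Return variants present in every affected individual and absent from `known`."""
    call_sets = list(affected)
    if not call_sets:
        return set()
    shared = set.intersection(*call_sets)  # variants common to all affected individuals
    return shared - known                  # drop variants already listed in the databases


# Made-up example data (coordinates are illustrative only).
patient_a = {Variant("1", 100, "A", "G"), Variant("2", 200, "C", "T"), Variant("3", 300, "G", "A")}
patient_b = {Variant("1", 100, "A", "G"), Variant("2", 200, "C", "T"), Variant("4", 400, "T", "C")}
database = {Variant("2", 200, "C", "T")}   # stands in for a public variant catalogue

candidates = shared_novel_variants([patient_a, patient_b], database)
```

    In practice the call sets would be read from variant files and the database comparison would also consider allele frequency, but the set logic stays the same.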

  11. Exome sequencing identifies compound heterozygous mutations in CYP4V2 in a pedigree with retinitis pigmentosa.

    Directory of Open Access Journals (Sweden)

    Yun Wang

    Full Text Available Retinitis pigmentosa (RP) is a heterogeneous group of progressive retinal degenerations characterized by pigmentation and atrophy in the mid-periphery of the retina. Twenty-two subjects from a four-generation Chinese family with RP and thin cornea, congenital cataract and high myopia are reported in this study. All family members underwent complete ophthalmologic examinations. Patients of the family presented with bone spicule-shaped pigment deposits in the retina, retinal vascular attenuation, retinal and choroidal dystrophy, as well as punctate opacity of the lens, reduced cornea thickness and high myopia. Peripheral venous blood was obtained from all patients and their family members for genetic analysis. After mutation analysis in a few known RP candidate genes, exome sequencing was used to analyze the exomes of 3 patients (III2, III4, III6) and the unaffected mother (II2). A total of 34,693 variations shared by the 3 patients were subjected to several filtering steps against existing variation databases. Identified variations were verified in the remaining family members by PCR and Sanger sequencing. Compound heterozygous c.802-8_810del17insGC and c.1091-2A>G mutations of the CYP4V2 gene, known as genetic defects for Bietti crystalline corneoretinal dystrophy, were identified as causative mutations for RP in this family.

  12. Experience of targeted Usher exome sequencing as a clinical test

    Science.gov (United States)

    Besnard, Thomas; García-García, Gema; Baux, David; Vaché, Christel; Faugère, Valérie; Larrieu, Lise; Léonard, Susana; Millan, Jose M; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2014-01-01

    We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient as it offers analysis of all relevant genes which is laborious to reach with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant in two additional patients. This work can be considered as pilot for implementing NGS for genetically heterogeneous diseases in clinical service. PMID:24498627

  13. Targeted/exome sequencing identified mutations in ten Chinese patients diagnosed with Noonan syndrome and related disorders

    Directory of Open Access Journals (Sweden)

    Shanshan Xu

    2017-10-01

    Full Text Available Background: Noonan syndrome (NS) and Noonan syndrome with multiple lentigines (NSML) are autosomal dominant developmental disorders. NS and NSML are caused by abnormalities in genes that encode proteins related to the RAS-MAPK pathway, including PTPN11, RAF1, BRAF, and MAP2K. In this study, we diagnosed ten NS or NSML patients via targeted sequencing or whole exome sequencing (TS/WES). Methods: TS/WES was performed to identify mutations in ten Chinese patients who exhibited the following manifestations: potential facial dysmorphisms, short stature, congenital heart defects, and developmental delay. Sanger sequencing was used to confirm the suspected pathological variants in the patients and their family members. Results: TS/WES revealed three mutations in the PTPN11 gene, three mutations in the RAF1 gene, and four mutations in the BRAF gene in the NS and NSML patients who were previously diagnosed based on the abovementioned clinical features. All the identified mutations were determined to be de novo mutations. However, two patients who carried the same mutation in the RAF1 gene presented different clinical features. One patient with multiple lentigines was diagnosed with NSML, while the other patient without lentigines was diagnosed with NS. In addition, a patient who carried a hotspot mutation in the BRAF gene was diagnosed with NS instead of cardiofaciocutaneous syndrome (CFCS). Conclusions: TS/WES has emerged as a useful tool for definitive diagnosis and accurate genetic counseling of atypical cases. In this study, we analyzed ten Chinese patients diagnosed with NS and related disorders and identified their corresponding PTPN11, RAF1, and BRAF mutations. Among the target genes, BRAF showed the same degree of correlation with NS incidence as that of PTPN11 or RAF1.

  14. MetaPhinder-Identifying Bacteriophage Sequences in Metagenomic Data Sets

    DEFF Research Database (Denmark)

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole

    2016-01-01

    ... and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder....

  15. Pooled Enrichment Sequencing Identifies Diversity and Evolutionary Pressures at NLR Resistance Genes within a Wild Tomato Population.

    Science.gov (United States)

    Stam, Remco; Scheikl, Daniela; Tellier, Aurélien

    2016-06-02

    Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeats containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling ten individuals, to specifically sequence NLR genes in a resource and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are accurately verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
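
    The πN/πS ratio mentioned in this abstract compares nucleotide diversity at nonsynonymous sites with that at synonymous sites. The Python sketch below shows one common textbook way to estimate per-site diversity from allele frequencies and form the ratio. It is an illustration only and not the method used in the study: the function names, the biallelic-site simplification, the assumption of 20 sampled chromosomes (ten diploid individuals), and the example frequencies are all assumptions, and corrections specific to pooled read data are left out.

```python
from typing import Optional, Sequence


def nucleotide_diversity(alt_freqs: Sequence[float],
                         n_chromosomes: int,
                         n_sites: int) -> float:
    """Per-site nucleotide diversity from alternate-allele frequencies at biallelic sites.

    Sums the expected heterozygosity 2p(1-p) over polymorphic sites, applies the
    n/(n-1) finite-sample correction, and divides by the number of sites surveyed.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    correction = n_chromosomes / (n_chromosomes - 1)
    heterozygosity = sum(2.0 * p * (1.0 - p) for p in alt_freqs)
    return correction * heterozygosity / n_sites


def pi_n_over_pi_s(nonsyn_freqs: Sequence[float], n_nonsyn_sites: int,
                   syn_freqs: Sequence[float], n_syn_sites: int,
                   n_chromosomes: int = 20) -> Optional[float]:
    """Ratio of nonsynonymous to synonymous diversity; None if synonymous diversity is zero."""
    pi_n = nucleotide_diversity(nonsyn_freqs, n_chromosomes, n_nonsyn_sites)
    pi_s = nucleotide_diversity(syn_freqs, n_chromosomes, n_syn_sites)
    if pi_s == 0.0:
        return None
    return pi_n / pi_s


# Made-up frequencies for one hypothetical gene.
ratio = pi_n_over_pi_s(nonsyn_freqs=[0.5], n_nonsyn_sites=2,
                       syn_freqs=[0.5], n_syn_sites=1)
```

    A ratio well below 1 is usually read as a sign of purifying selection, while values near or above 1 can point to relaxed constraint or to balancing or positive selection, which is why a wide spread of ratios across genes is informative.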

  16. Pooled Enrichment Sequencing Identifies Diversity and Evolutionary Pressures at NLR Resistance Genes within a Wild Tomato Population

    Science.gov (United States)

    Stam, Remco; Scheikl, Daniela; Tellier, Aurélien

    2016-01-01

    Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeats containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling ten individuals, to specifically sequence NLR genes in a resource and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are accurately verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat. PMID:27189991

  17. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma

    Energy Technology Data Exchange (ETDEWEB)

    Krauthammer, Michael; Kong, Yong; Ha, Byung Hak; Evans, Perry; Bacchiocchi, Antonella; McCusker, James P.; Cheng, Elaine; Davis, Matthew J.; Goh, Gerald; Choi, Murim; Ariyan, Stephan; Narayan, Deepak; Dutton-Regester, Ken; Capatana, Ana; Holman, Edna C.; Bosenberg, Marcus; Sznol, Mario; Kluger, Harriet M.; Brash, Douglas E.; Stern, David F.; Materin, Miguel A.; Lo, Roger S.; Mane, Shrikant; Ma, Shuangge; Kidd, Kenneth K.; Hayward, Nicholas K.; Lifton, Richard P.; Schlessinger, Joseph; Boggon, Titus J.; Halaban, Ruth (Yale-MED); (UCLA); (Queens)

    2012-10-11

    We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1(P29S)) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1(P29S) showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.

  18. Somatic mutations in histiocytic sarcoma identified by next generation sequencing.

    Science.gov (United States)

    Liu, Qingqing; Tomaszewicz, Keith; Hutchinson, Lloyd; Hornick, Jason L; Woda, Bruce; Yu, Hongbo

    2016-08-01

    Histiocytic sarcoma is a rare malignant neoplasm of presumed hematopoietic origin showing morphologic and immunophenotypic evidence of histiocytic differentiation. The importance of somatic mutations in the pathogenesis or disease progression of histiocytic sarcoma was largely unknown. To identify somatic mutations in histiocytic sarcoma, we studied 5 histiocytic sarcomas [3 female and 2 male patients; mean age 54.8 (20-72); anatomic sites include lymph node, uterus, and pleura] and matched normal tissues from each patient as germ line controls. Somatic mutations in 50 "Hotspot" oncogenes and tumor suppressor genes were examined using next generation sequencing. Three (out of five) histiocytic sarcoma cases carried somatic mutations in BRAF. Among them, G464V [variant frequency (VF) of 43.6 %] and G466R (VF of 29.6 %), located at the P loop, potentially interfere with the hydrophobic interaction between the P and activating loops and ultimately with activation of BRAF. Also detected was the BRAF somatic mutation N581S (VF of 7.4 %), which was located at the catalytic loop of the BRAF kinase domain: its role in modifying kinase activity was unclear. A similar mutational analysis was also performed on nine acute monocytic/monoblastic leukemia cases, which did not identify any BRAF somatic mutations. Our study detected several BRAF mutations in histiocytic sarcomas, which may be important in understanding the tumorigenesis of this rare neoplasm and providing mechanisms for potential therapeutic opportunities.

  19. Novel mutations in CRB1 gene identified in a Chinese pedigree with retinitis pigmentosa by targeted capture and next generation sequencing

    Science.gov (United States)

    Lo, David; Weng, Jingning; Liu, Xiaohong; Yang, Juhua; He, Fen; Wang, Yun; Liu, Xuyang

    2016-01-01

    PURPOSE: To detect the disease-causing gene in a Chinese pedigree with autosomal-recessive retinitis pigmentosa (ARRP). METHODS: All subjects in this family underwent a complete ophthalmic examination. Targeted-capture next generation sequencing (NGS) was performed on the proband to detect variants. All variants were verified in the remaining family members by PCR amplification and Sanger sequencing. RESULTS: All the affected subjects in this pedigree were diagnosed with retinitis pigmentosa (RP). The compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations in the Crumbs homolog 1 (CRB1) gene were identified in all the affected patients but not in the unaffected individuals in this family. These mutations were inherited from their parents, respectively. CONCLUSION: The novel compound heterozygous mutations in CRB1 were identified in a Chinese pedigree with ARRP using targeted-capture next generation sequencing. After evaluating the significant heredity and impaired protein function, the compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations are the causal mutations of early onset ARRP in this pedigree. To the best of our knowledge, there is no previous report regarding these compound mutations. PMID:27806333

  20. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    Science.gov (United States)

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  1. Simple sequence repeat (SSR) markers are effective for identifying ...

    African Journals Online (AJOL)

    DNA was extracted from newly formed leaves and amplified using 21 simple sequence repeat (SSR) markers (NH001c, NH002b, NH005b, NH007b, NH008b, NH009b, NH011b, NH013b, NH012a, NH014a, NH015a, NH017a, KA4b, KA5, KA14, KA16, KB16, KU10, BGA35, BGT23b and HGA8b). The data was analyzed by ...

  2. Exome Sequencing Identifies a Novel LMNA Splice-Site Mutation and Multigenic Heterozygosity of Potential Modifiers in a Family with Sick Sinus Syndrome, Dilated Cardiomyopathy, and Sudden Cardiac Death.

    Directory of Open Access Journals (Sweden)

    Michael V Zaragoza

    The goals are to understand the primary genetic mechanisms that cause Sick Sinus Syndrome and to identify potential modifiers that may result in intrafamilial variability within a multigenerational family. The proband is a 63-year-old male with a family history of individuals (>10) with sinus node dysfunction, ventricular arrhythmia, cardiomyopathy, heart failure, and sudden death. We used exome sequencing of a single individual to identify a novel LMNA mutation and demonstrated the importance of Sanger validation and family studies when evaluating candidates. After initial single-gene studies were negative, we conducted exome sequencing for the proband, which produced 9 gigabases of sequencing data. Bioinformatics analysis showed 94% of the reads mapped to the reference and identified 128,563 unique variants with 108,795 (85%) located in 16,319 genes of 19,056 target genes. We discovered multiple variants in known arrhythmia, cardiomyopathy, or ion channel associated genes that may serve as potential modifiers in disease expression. To identify candidate mutations, we focused on ~2,000 variants located in 237 genes of 283 known arrhythmia, cardiomyopathy, or ion channel associated genes. We filtered the candidates to 41 variants in 33 genes using zygosity, protein impact, database searches, and clinical association. Only 21 of 41 (51%) variants were validated by Sanger sequencing. We selected nine confirmed variants with minor allele frequencies [...] G, a novel heterozygous splice-site mutation as the primary mutation with rare or novel variants in HCN4, MYBPC3, PKP4, TMPO, TTN, DMPK and KCNJ10 as potential modifiers and a mechanism consistent with haploinsufficiency.

  3. NFATC3-PLA2G15 Fusion Transcript Identified by RNA Sequencing Promotes Tumor Invasion and Proliferation in Colorectal Cancer Cell Lines.

    Science.gov (United States)

    Jang, Jee-Eun; Kim, Hwang-Phill; Han, Sae-Won; Jang, Hoon; Lee, Si-Hyun; Song, Sang-Hyun; Bang, Duhee; Kim, Tae-You

    2018-06-14

    This study was designed to identify novel fusion transcripts (FTs) and their functional significance in colorectal cancer lines. We performed paired-end RNA sequencing of 28 colorectal cancer (CRC) cell lines. FT candidates were identified using TopHat-fusion, ChimeraScan, and FusionMap tools and further experimental validation was conducted through reverse transcription-polymerase chain reaction and Sanger sequencing. FT was depleted in a human CRC line and the effects on cell proliferation, cell migration, and cell invasion were analyzed. 1,380 FT candidates were detected through bioinformatics filtering. We selected 6 candidate FTs, including 4 inter-chromosomal and 2 intra-chromosomal FTs, and each FT was found in at least 1 of the 28 cell lines. Moreover, when we tested 19 pairs of CRC tumor and adjacent normal tissue samples, NFATC3-PLA2G15 FT was found in 2. Knockdown of NFATC3-PLA2G15 using siRNA reduced mRNA expression of epithelial-mesenchymal transition (EMT) markers such as vimentin, twist, and fibronectin and increased mesenchymal-epithelial transition markers of E-cadherin, claudin-1, and FOXC2 in the colo-320 cell line harboring NFATC3-PLA2G15 FT. The NFATC3-PLA2G15 knockdown also inhibited invasion, colony formation capacity, and cell proliferation. These results suggest that NFATC3-PLA2G15 FTs may contribute to tumor progression by enhancing invasion by EMT and proliferation.

  4. Transcriptome sequencing in prostate cancer identifies inter-tumor heterogeneity

    Directory of Open Access Journals (Sweden)

    Janet Mendonca

    2015-06-01

    Given the dearth of gene mutations in prostate cancer [1],[2], it is likely that genomic rearrangements play a significant role in the evolution of prostate cancer. However, in the search for recurrent genomic alterations, "private alterations" have received less attention. Such alterations may provide insights into the evolution, behavior, and clinical outcome of an individual tumor. In a recent report in "Genome Biology", Wyatt et al. [3] define unique alterations in a cohort of high-risk prostate cancer patients with a lethal phenotype. Utilizing a transcriptome sequencing approach, they observe high inter-tumor heterogeneity; however, the genes altered distill into three distinct cancer-relevant pathways. Their analysis reveals the presence of several non-ETS fusions, which may contribute to the phenotype of individual tumors and have significance for disease progression.

  5. Potential of DNA sequences to identify zoanthids (Cnidaria: Zoantharia).

    Science.gov (United States)

    Sinniger, Frederic; Reimer, James D; Pawlowski, Jan

    2008-12-01

    The order Zoantharia is known for its chaotic taxonomy and difficult morphological identification. One method that potentially could help for examining such troublesome taxa is DNA barcoding, which identifies species using standard molecular markers. The mitochondrial cytochrome oxidase subunit I (COI) has been utilized to great success in groups such as birds and insects; however, its applicability in many other groups is controversial. Recently, some studies have suggested that barcoding is not applicable to anthozoans. Here, we examine the use of COI and mitochondrial 16S ribosomal DNA for zoanthid identification. Despite the absence of a clear barcoding gap, our results show that for most of 54 zoanthid samples, both markers could separate samples to the species, or species group, level, particularly when easily accessible ecological or distributional data were included. Additionally, we have used the short V5 region of mt 16S rDNA to identify eight old (13 to 50 years old) museum samples. We discuss advantages and disadvantages of COI and mt 16S rDNA as barcodes for Zoantharia, and recommend that either one or both of these markers be considered for zoanthid identification in the future.
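    The "barcoding gap" mentioned in this abstract is commonly understood as the separation between the largest within-species distance and the smallest between-species distance for a marker. A minimal sketch of that check follows, assuming pre-aligned sequences of equal length and an uncorrected p-distance; the function names `p_distance` and `barcoding_gap` are hypothetical, and the abstract does not say which distance measure the authors used.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(samples):
    """Return (max intraspecific distance, min interspecific distance).

    `samples` is a list of (species_label, aligned_sequence) pairs.
    A gap exists when the second value exceeds the first.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(samples, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not inter:
        raise ValueError("need at least two species to compare")
    return max(intra, default=0.0), min(inter)


# Toy data, invented for illustration only.
toy = [
    ("sp_A", "ACGTACGT"),
    ("sp_A", "ACGTACGA"),
    ("sp_B", "TTGTACGT"),
    ("sp_B", "TTGTACGT"),
]
max_intra, min_inter = barcoding_gap(toy)
has_gap = min_inter > max_intra
```

    When the two distance ranges overlap, `has_gap` is false and identification has to lean on additional evidence, which is consistent with the abstract's use of ecological or distributional data.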

  6. Whole Exome Sequencing Identified a Novel Heterozygous Mutation in HMBS Gene in a Chinese Patient With Acute Intermittent Porphyria With Rare Type of Mild Anemia

    Directory of Open Access Journals (Sweden)

    Yongjiang Zheng

    2018-04-01

    Acute intermittent porphyria (AIP) is a rare hereditary metabolic disease with an autosomal dominant mode of inheritance. Germline mutations of the HMBS gene cause AIP. Mutation of the HMBS gene results in partial deficiency of the heme biosynthetic enzyme hydroxymethylbilane synthase. AIP is clinically manifested with abdominal pain, vomiting, and neurological complaints. Additionally, an extreme phenotypic heterogeneity has been reported in AIP patients with mutations in the HMBS gene. Here, we investigated a Chinese patient with AIP. The proband is a 28-year-old Chinese male who presented with severe stomach ache, constipation, nausea and depression. The proband’s father and mother are unaffected. The proband’s blood sample was collected and genomic DNA was extracted. Whole exome sequencing and Sanger sequencing identified a novel heterozygous single nucleotide deletion (c.809delC) in exon 12 of the HMBS gene in the proband. This mutation leads to a frameshift followed by formation of a truncated (p.Ala270Valfs∗2) HMBS protein of 272 amino acids, compared with the wild-type HMBS protein of 361 amino acids. This mutation was not found in the proband’s unaffected parents or in 100 healthy controls. According to the variant interpretation guidelines of the American College of Medical Genetics and Genomics (ACMG), this variant is classified as “likely pathogenic”. Our findings expand the mutational spectrum of HMBS-related AIP, which is significant for screening and genetic diagnosis of AIP.

  7. Identifying Statistical Dependence in Genomic Sequences via Mutual Information Estimates

    Directory of Open Access Journals (Sweden)

    Wojciech Szpankowski

    2007-12-01

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. First, they are used for the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5′ untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's combined DNA index system (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling.
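    To make the central quantity concrete, the following is a minimal sketch of a plug-in (empirical frequency) estimate of mutual information between two equal-length symbol sequences. It is an illustration only: the function name `mutual_information` is hypothetical, and the paper's own estimator and its threshold function for significance are not reproduced here.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from position-paired symbols."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    px = Counter(xs)               # marginal counts for X
    py = Counter(ys)               # marginal counts for Y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Identical sequences over four equiprobable symbols share 2 bits;
# independent pairings share none.
identical = mutual_information("ACGT" * 5, "ACGT" * 5)
independent = mutual_information("AACC", "ACAC")
```

    Plug-in estimates are biased upward on short sequences, which is one reason a significance threshold of the kind the abstract describes is needed before calling two segments dependent.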

  8. Somatic loss of function mutations in neurofibromin 1 and MYC associated factor X genes identified by exome-wide sequencing in a wild-type GIST case

    International Nuclear Information System (INIS)

    Belinsky, Martin G.; Rink, Lori; Cai, Kathy Q.; Capuzzi, Stephen J.; Hoang, Yen; Chien, Jeremy; Godwin, Andrew K.; Mehren, Margaret von

    2015-01-01

    Approximately 10–15 % of gastrointestinal stromal tumors (GISTs) lack gain of function mutations in the KIT and platelet-derived growth factor receptor alpha (PDGFRA) genes. An alternate mechanism of oncogenesis through loss of function of the succinate-dehydrogenase (SDH) enzyme complex has been identified for a subset of these “wild type” GISTs. Paired tumor and normal DNA from an SDH-intact wild-type GIST case was subjected to whole exome sequencing to identify the pathogenic mechanism(s) in this tumor. Selected findings were further investigated in panels of GIST tumors through Sanger DNA sequencing, quantitative real-time PCR, and immunohistochemical approaches. A hemizygous frameshift mutation (p.His2261Leufs*4), in the neurofibromin 1 (NF1) gene was identified in the patient’s GIST; however, no germline NF1 mutation was found. A somatic frameshift mutation (p.Lys54Argfs*31) in the MYC associated factor X (MAX) gene was also identified. Immunohistochemical analysis for MAX on a large panel of GISTs identified loss of MAX expression in the MAX-mutated GIST and in a subset of mainly KIT-mutated tumors. This study suggests that inactivating NF1 mutations outside the context of neurofibromatosis may be the oncogenic mechanism for a subset of sporadic GIST. In addition, loss of function mutation of the MAX gene was identified for the first time in GIST, and a broader role for MAX in GIST progression was suggested. The online version of this article (doi:10.1186/s12885-015-1872-y) contains supplementary material, which is available to authorized users

  9. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    Science.gov (United States)

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
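    The feature the abstract calls most informative, stop codons inside the presumed translation of a homologous DNA region, can be illustrated with a short sketch. This is not Spurio's code: the tool works from tblastn matches and a Gaussian process model, whereas the hypothetical helper `count_internal_stops` below only counts in-frame stop codons in a single DNA string, assuming the three standard stop codons.

```python
# Stop codons of the standard genetic code (assumed here for illustration).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_internal_stops(dna, frame=0):
    """Count in-frame stop codons, ignoring one terminal stop if present."""
    seq = dna.upper()
    # Split into complete codons starting at the requested frame offset.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # a stop at the very end is expected
    return sum(codon in STOP_CODONS for codon in codons)


intact = count_internal_stops("ATGGGGTGA")        # ATG GGG TGA
degraded = count_internal_stops("ATGTAAGGGTGA")   # ATG TAA GGG TGA
```

    A region whose homologs repeatedly show internal stops is unlikely to encode a real protein, which is the intuition behind using this signal as a feature.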

  10. Identifying Corneal Infections in Formalin-Fixed Specimens Using Next Generation Sequencing.

    Science.gov (United States)

    Li, Zhigang; Breitwieser, Florian P; Lu, Jennifer; Jun, Albert S; Asnaghi, Laura; Salzberg, Steven L; Eberhart, Charles G

    2018-01-01

    We test the ability of next-generation sequencing, combined with computational analysis, to identify a range of organisms causing infectious keratitis. This retrospective study evaluated 16 cases of infectious keratitis and four control corneas in formalin-fixed tissues from the pathology laboratory. Infectious cases also were analyzed in the microbiology laboratory using culture, polymerase chain reaction, and direct staining. Classified sequence reads were analyzed with two different metagenomics classification engines, Kraken and Centrifuge, and visualized using the Pavian software tool. Sequencing generated 20 to 46 million reads per sample. On average, 96% of the reads were classified as human, 0.3% corresponded to known vectors or contaminant sequences, 1.7% represented microbial sequences, and 2.4% could not be classified. The two computational strategies successfully identified the fungal, bacterial, and amoebal pathogens in most patients, including all four bacterial and mycobacterial cases, five of six fungal cases, three of three Acanthamoeba cases, and one of three herpetic keratitis cases. In several cases, additional potential pathogens also were identified. In one case with cytomegalovirus identified by Kraken and Centrifuge, the virus was confirmed by direct testing, while two where Staphylococcus aureus or cytomegalovirus were identified by Centrifuge but not Kraken could not be confirmed. Confirmation was not attempted for an additional three potential pathogens identified by Kraken and 11 identified by Centrifuge. Next generation sequencing combined with computational analysis can identify a wide range of pathogens in formalin-fixed corneal specimens, with potential applications in clinical diagnostics and research.

  11. [Molecular and prenatal diagnosis of a family with Fanconi anemia by next generation sequencing].

    Science.gov (United States)

    Gong, Zhuwen; Yu, Yongguo; Zhang, Qigang; Gu, Xuefan

    2015-04-01

    To provide prenatal diagnosis for a pregnant woman who had given birth to a child with Fanconi anemia with combined next-generation sequencing (NGS) and Sanger sequencing. For the affected child, potential mutations of the FANCA gene were analyzed with NGS. Suspected mutation was verified with Sanger sequencing. For prenatal diagnosis, genomic DNA was extracted from cultured fetal amniotic fluid cells and subjected to analysis of the same mutations. A low-frequency frameshifting mutation c.989_995del7 (p.H330LfsX2, inherited from his father) and a truncating mutation c.3971C>T (p.P1324L, inherited from his mother) have been identified in the affected child and considered to be pathogenic. The two mutations were subsequently verified by Sanger sequencing. Upon prenatal diagnosis, the fetus was found to carry two mutations. The combined next-generation sequencing and Sanger sequencing can reduce the time for diagnosis and identify subtypes of Fanconi anemia and the mutational sites, which has enabled reliable prenatal diagnosis of this disease.

  12. Targeted next-generation sequencing identifies a novel nonsense mutation in SPTB for hereditary spherocytosis: A case report of a Korean family.

    Science.gov (United States)

    Shin, Soyoung; Jang, Woori; Kim, Myungshin; Kim, Yonggoo; Park, Suk Young; Park, Joonhong; Yang, Young Jun

    2018-01-01

    Hereditary spherocytosis (HS) is an inherited disorder characterized by the presence of spherical-shaped red blood cells (RBCs) on the peripheral blood (PB) smear. To date, a number of mutations in 5 genes have been identified, and mutations in the SPTB gene account for about 20% of patients. A 65-year-old female had been diagnosed with hemolytic anemia 30 years ago, based on a history of persistent anemia and hyperbilirubinemia for several years. She received RBC transfusions several times and underwent a cholecystectomy roughly 20 years ago. Round, densely staining spherical-shaped erythrocytes (spherocytes) were frequently found on the PB smear. Numerous spherocytes were frequently found in the PB smears of symptomatic family members, her 3rd son and his 2 grandchildren. One heterozygous mutation of SPTB was identified by targeted next-generation sequencing (NGS). The nonsense mutation, c.1956G>A (p.Trp652*), in exon 13 was confirmed by Sanger sequencing and thus the proband was diagnosed with HS. The proband underwent a splenectomy due to transfusion-refractory anemia and splenomegaly. After the splenectomy, her hemoglobin level improved to normal range (14.1 g/dL) and her bilirubin levels decreased dramatically (total bilirubin 1.9 mg/dL; direct bilirubin 0.6 mg/dL). We suggest that NGS of causative genes could be a useful diagnostic tool for the genetically heterogeneous RBC membrane disorders, especially in cases with a mild or atypical clinical manifestation. Copyright © 2017 The Authors. Published by Wolters Kluwer Health, Inc. All rights reserved.

  13. Whole-exome sequencing identifies novel compound heterozygous mutations in USH2A in Spanish patients with autosomal recessive retinitis pigmentosa.

    Science.gov (United States)

    Méndez-Vidal, Cristina; González-Del Pozo, María; Vela-Boza, Alicia; Santoyo-López, Javier; López-Domingo, Francisco J; Vázquez-Marouschek, Carmen; Dopazo, Joaquin; Borrego, Salud; Antiñolo, Guillermo

    2013-01-01

    Retinitis pigmentosa (RP) is an inherited retinal dystrophy characterized by extreme genetic and clinical heterogeneity. Thus, the diagnosis is not always easily performed due to phenotypic and genetic overlap. Current clinical practices have focused on the systematic evaluation of a set of known genes for each phenotype, but this approach may fail in patients with inaccurate diagnosis or infrequent genetic cause. In the present study, we investigated the genetic cause of autosomal recessive RP (arRP) in a Spanish family in which the causal mutation has not yet been identified with primer extension technology and resequencing. We designed a whole-exome sequencing (WES)-based approach using NimbleGen SeqCap EZ Exome V3 sample preparation kit and the SOLiD 5500xl next-generation sequencing platform. We sequenced the exomes of both unaffected parents and two affected siblings. Exome analysis resulted in the identification of 43,204 variants in the index patient. All variants passing filter criteria were validated with Sanger sequencing to confirm familial segregation and absence in the control population. In silico prediction tools were used to determine mutational impact on protein function and the structure of the identified variants. Novel Usher syndrome type 2A (USH2A) compound heterozygous mutations, c.4325T>C (p.F1442S) and c.15188T>G (p.L5063R), located in exons 20 and 70, respectively, were identified as probable causative mutations for RP in this family. Family segregation of the variants showed the presence of both mutations in all affected members and in two siblings who were apparently asymptomatic at the time of family ascertainment. Clinical reassessment confirmed the diagnosis of RP in these patients. Using WES, we identified two heterozygous novel mutations in USH2A as the most likely disease-causing variants in a Spanish family diagnosed with arRP in which the cause of the disease had not yet been identified with commonly used techniques. Our data

  14. ATRX mutation in two adult brothers with non-specific moderate intellectual disability identified by exome sequencing.

    Science.gov (United States)

    Moncini, S; Bedeschi, M F; Castronovo, P; Crippa, M; Calvello, M; Garghentino, R R; Scuvera, G; Finelli, P; Venturin, M

    2013-12-01

    In this report, we describe two adult brothers affected by moderate non-specific intellectual disability (ID). They showed minor facial anomalies, not clearly ascribable to any specific syndromic patterns, microcephaly, brachydactyly and broad toes. Both brothers presented seizures. Karyotype, subtelomeric and FMR1 analysis were normal in both cases. We performed array-CGH analysis that revealed no copy-number variations potentially associated with ID. Subsequent exome sequence analysis allowed the identification of the ATRX c.109C>T (p.R37X) mutation in both the affected brothers. Sanger sequencing confirmed the presence of the mutation in the brothers and showed that the mother is a healthy carrier. Mutations in the ATRX gene cause the X-linked alpha thalassemia/mental retardation (ATR-X) syndrome (MIM #301040), a severe clinical condition usually associated with profound ID, facial dysmorphism and alpha thalassemia. However, the syndrome is clinically heterogeneous and some mutations, including the c.109C>T, are associated with a broad phenotypic spectrum, with patients displaying a less severe phenotype with only mild-moderate ID. In the case presented here, exome sequencing provided an effective strategy to achieve the molecular diagnosis of ATR-X syndrome, which otherwise would have been difficult to consider due to the mild non-specific phenotype and the absence of a family history with typical severe cases.

  15. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    Science.gov (United States)

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
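    The k-mer sequence features that a classifier of this kind consumes can be sketched as a fixed-length frequency vector. The example below is an assumption-laden illustration, not the kmer-SVM implementation: the helper names `kmer_counts` and `kmer_vector` are hypothetical, only plain overlapping k-mers over A/C/G/T are counted, and the SVM training step itself is not shown.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(seq, k):
    """Frequency vector over all 4**k DNA k-mers, in lexicographic order.

    k-mers containing symbols other than A, C, G, T (for example N)
    are ignored rather than guessed at.
    """
    counts = kmer_counts(seq, k)
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in vocab)
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


counts = kmer_counts("ACGTAC", 2)
vector = kmer_vector("ACGTAC", 2)
```

    Each sequence maps to a vector of the same length regardless of how long the sequence is, which is what lets bound and unbound regions be compared by a standard classifier.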

  16. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    OpenAIRE

    Fabio Marroni; Sara Pinosio; Michele Morgante

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potentiality. They showed that pooled NGS can provide results in excellent agreement with those obt...

  17. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    OpenAIRE

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by indiv...

  18. Exome sequencing and genetic testing for MODY.

    Directory of Open Access Journals (Sweden)

    Stefan Johansson

    Genetic testing for monogenic diabetes is important for patient care. Given the extensive genetic and clinical heterogeneity of diabetes, exome sequencing might provide additional diagnostic potential when standard Sanger sequencing-based diagnostics is inconclusive. The aim of the study was to examine the performance of exome sequencing for a molecular diagnosis of MODY in patients who have undergone conventional diagnostic sequencing of candidate genes with negative results. We performed exome enrichment followed by high-throughput sequencing in nine patients with suspected MODY. They were Sanger sequencing-negative for mutations in the HNF1A, HNF4A, GCK, HNF1B and INS genes. We excluded common, non-coding and synonymous gene variants, and performed in-depth analysis on filtered sequence variants in a pre-defined set of 111 genes implicated in glucose metabolism. On average, we obtained 45X median coverage of the entire targeted exome and found 199 rare coding variants per individual. We identified 0-4 rare non-synonymous and nonsense variants per individual in our a priori list of 111 candidate genes. Three of the variants were considered pathogenic (in ABCC8, HNF4A and PPARG, respectively); thus exome sequencing led to a genetic diagnosis in at least three of the nine patients. Approximately 91% of known heterozygous SNPs in the target exomes were detected, but we also found low coverage in some key diabetes genes using our current exome sequencing approach. Novel variants in the genes ARAP1, GLIS3, MADD, NOTCH2 and WFS1 need further investigation to reveal their possible role in diabetes. Our results demonstrate that exome sequencing can improve molecular diagnostics of MODY when used as a complement to Sanger sequencing. However, improvements will be needed, especially concerning coverage, before the full potential of exome sequencing can be realized.

  19. Increased genetic diversity and prevalence of co-infection with Trypanosoma spp. in koalas (Phascolarctos cinereus) and their ticks identified using next-generation sequencing (NGS).

    Directory of Open Access Journals (Sweden)

    Amanda D Barbosa

    Infections with Trypanosoma spp. have been associated with poor health and decreased survival of koalas (Phascolarctos cinereus), particularly in the presence of concurrent pathogens such as Chlamydia and koala retrovirus. The present study describes the application of a next-generation sequencing (NGS)-based assay to characterise the prevalence and genetic diversity of trypanosome communities in koalas and two native species of ticks (Ixodes holocyclus and I. tasmani) removed from koala hosts. Among 168 koalas tested, 32.2% (95% CI: 25.2-39.8%) were positive for at least one Trypanosoma sp. Previously described Trypanosoma spp. from koalas were identified, including T. irwini (32.1%, 95% CI: 25.2-39.8%), T. gilletti (25%, 95% CI: 18.7-32.3%), T. copemani (27.4%, 95% CI: 20.8-34.8%) and T. vegrandis (10.1%, 95% CI: 6.0-15.7%). Trypanosoma noyesi was detected for the first time in koalas, although at a low prevalence (0.6% 95% CI: 0-3.3%), and a novel species (Trypanosoma sp. AB-2017) was identified at a prevalence of 4.8% (95% CI: 2.1-9.2%). Mixed infections with up to five species were present in 27.4% (95% CI: 21-35%) of the koalas, which was significantly higher than the prevalence of single infections 4.8% (95% CI: 2-9%). Overall, a considerably higher proportion (79.7%) of the Trypanosoma sequences isolated from koala blood samples were identified as T. irwini, suggesting this is the dominant species. Co-infections involving T. gilletti, T. irwini, T. copemani, T. vegrandis and Trypanosoma sp. AB-2017 were also detected in ticks, with T. gilletti and T. copemani being the dominant species within the invertebrate hosts. Direct Sanger sequencing of Trypanosoma 18S rRNA gene amplicons was also performed and results revealed that this method was only able to identify the genotypes with greater amount of reads (according to NGS) within koala samples, which highlights the advantages of NGS in detecting mixed infections. The present study provides new insights

  20. Increased genetic diversity and prevalence of co-infection with Trypanosoma spp. in koalas (Phascolarctos cinereus) and their ticks identified using next-generation sequencing (NGS).

    Science.gov (United States)

    Barbosa, Amanda D; Gofton, Alexander W; Paparini, Andrea; Codello, Annachiara; Greay, Telleasha; Gillett, Amber; Warren, Kristin; Irwin, Peter; Ryan, Una

    2017-01-01

    Infections with Trypanosoma spp. have been associated with poor health and decreased survival of koalas (Phascolarctos cinereus), particularly in the presence of concurrent pathogens such as Chlamydia and koala retrovirus. The present study describes the application of a next-generation sequencing (NGS)-based assay to characterise the prevalence and genetic diversity of trypanosome communities in koalas and two native species of ticks (Ixodes holocyclus and I. tasmani) removed from koala hosts. Among 168 koalas tested, 32.2% (95% CI: 25.2-39.8%) were positive for at least one Trypanosoma sp. Previously described Trypanosoma spp. from koalas were identified, including T. irwini (32.1%, 95% CI: 25.2-39.8%), T. gilletti (25%, 95% CI: 18.7-32.3%), T. copemani (27.4%, 95% CI: 20.8-34.8%) and T. vegrandis (10.1%, 95% CI: 6.0-15.7%). Trypanosoma noyesi was detected for the first time in koalas, although at a low prevalence (0.6% 95% CI: 0-3.3%), and a novel species (Trypanosoma sp. AB-2017) was identified at a prevalence of 4.8% (95% CI: 2.1-9.2%). Mixed infections with up to five species were present in 27.4% (95% CI: 21-35%) of the koalas, which was significantly higher than the prevalence of single infections 4.8% (95% CI: 2-9%). Overall, a considerably higher proportion (79.7%) of the Trypanosoma sequences isolated from koala blood samples were identified as T. irwini, suggesting this is the dominant species. Co-infections involving T. gilletti, T. irwini, T. copemani, T. vegrandis and Trypanosoma sp. AB-2017 were also detected in ticks, with T. gilletti and T. copemani being the dominant species within the invertebrate hosts. Direct Sanger sequencing of Trypanosoma 18S rRNA gene amplicons was also performed and results revealed that this method was only able to identify the genotypes with greater amount of reads (according to NGS) within koala samples, which highlights the advantages of NGS in detecting mixed infections. 
The present study provides new insights on the
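
The abstract above reports prevalences with 95% confidence intervals but does not say how the intervals were computed. As a minimal sketch, assuming an exact (Clopper-Pearson) binomial interval and assuming that the 0.6% prevalence corresponds to 1 positive koala out of 168 (a count inferred from the percentage, not stated in the record), the calculation could look like this. The function names are illustrative only.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f, lo=0.0, hi=1.0, iters=100):
    """Bisection for the root of an increasing function f on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Assumed count: 1 positive out of 168 animals (about 0.6%).
low, high = clopper_pearson(1, 168)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions the interval is close to the 0-3.3% reported for the rarest species, which suggests (but does not prove) that an exact method was used.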

  1. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    Energy Technology Data Exchange (ETDEWEB)

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  2. Identifying recombinants in human and primate immunodeficiency virus sequence alignments using quartet scanning

    Directory of Open Access Journals (Sweden)

    Martin Darren P

    2009-04-01

    Full Text Available Abstract Background Recombination has a profound impact on the evolution of viruses, but characterizing recombination patterns in molecular sequences remains a challenging endeavor. Despite its importance in molecular evolutionary studies, identifying the sequences that exhibit such patterns has received comparatively less attention in the recombination detection framework. Here, we extend a quartet-mapping based recombination detection method to enable identification of recombinant sequences without prior specification of either query or reference sequences. Through simulations we evaluate different recombinant identification statistics and significance tests. We compare the quartet approach with triplet-based methods that employ additional heuristic tests to identify parental and recombinant sequences. Results Analysis of phylogenetic simulations reveals that identifying the descendants of relatively old recombination events is a challenging task for all methods available, and that quartet scanning performs relatively well compared to the triplet-based methods. The use of quartet scanning is further demonstrated by analyzing both well-established and putative HIV-1 recombinant strains. In agreement with recent findings, we provide evidence that the presumed circulating recombinant CRF02_AG is a 'pure' lineage, whereas the presumed parental lineage subtype G has a recombinant origin. We also demonstrate HIV-1 intrasubtype recombination, confirm the hybrid origin of SIV in chimpanzees and further disentangle the recombinant history of SIV lineages in a primate immunodeficiency virus data set. Conclusion Quartet scanning makes a valuable addition to triplet-based methods for identifying recombinant sequences without prior specification of either query or reference sequences. The new method is available in the VisRD v.3.0 package http://www.cmp.uea.ac.uk/~vlm/visrd.

  3. Novel ZEB2-BCL11B Fusion Gene Identified by RNA-Sequencing in Acute Myeloid Leukemia with t(2;14)(q22;q32).

    Directory of Open Access Journals (Sweden)

    Synne Torkildsen

    Full Text Available RNA-sequencing of a case of acute myeloid leukemia with the bone marrow karyotype 46,XY,t(2;14)(q22;q32)[5]/47,XY,idem,+?4,del(6)(q13q21)[cp6]/46,XY[4] showed that the t(2;14) generated a ZEB2-BCL11B chimera in which exon 2 of ZEB2 (nucleotide 595 in the sequence with accession number NM_014795.3) was fused to exon 2 of BCL11B (nucleotide 554 in the sequence with accession number NM_022898.2). RT-PCR together with Sanger sequencing verified the presence of the above-mentioned fusion transcript. All functional domains of BCL11B are retained in the chimeric protein. Abnormal expression of BCL11B coding regions subjected to control by the ZEB2 promoter seems to be the leukemogenic mechanism behind the translocation.

  4. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    Directory of Open Access Journals (Sweden)

    Kyriakos Tsangaras

    2015-11-01

    Full Text Available Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals.

  5. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data.

    Science.gov (United States)

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E; Greenwood, Alex D

    2015-11-24

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals.

  6. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    Science.gov (United States)

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E.; Greenwood, Alex D.

    2015-01-01

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed including a recombinant ERV that contained one LTR belonging to each group indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggests UrsusERV is a recent cross species transmission from an unknown reservoir and places the viral group among the youngest of ERVs identified in mammals. PMID:26610552

  7. Natural history bycatch: a pipeline for identifying metagenomic sequences in RADseq data

    Directory of Open Access Journals (Sweden)

    Iris Holmes

    2018-04-01

    Full Text Available Background Reduced representation genomic datasets are increasingly becoming available from a variety of organisms. These datasets do not target specific genes, and so may contain sequences from parasites and other organisms present in the target tissue sample. In this paper, we demonstrate that (1) RADseq datasets can be used for exploratory analysis of tissue-specific metagenomes, and (2) tissue collections house complete metagenomic communities, which can be investigated and quantified by a variety of techniques. Methods We present an exploratory method for mining metagenomic “bycatch” sequences from a range of host tissue types. We use a combination of the pyRAD assembly pipeline, NCBI’s blastn software, and custom R scripts to isolate metagenomic sequences from RADseq type datasets. Results When we focus on sequences that align with existing references in NCBI’s GenBank, we find that between three and five percent of identifiable double-digest restriction site associated DNA (ddRAD) sequences from host tissue samples are from phyla known to contain blood parasites. In addition to tissue samples, we examine ddRAD sequences from metagenomic DNA extracted from snake and lizard hind-gut samples. We find that the sequences recovered from these samples match with expected bacterial and eukaryotic gut microbiome phyla. Discussion Our results suggest that (1) museum tissue banks originally collected for host DNA archiving are also preserving valuable parasite and microbiome communities, (2) publicly available RADseq datasets may include metagenomic sequences that could be explored, and (3) restriction site approaches are a useful exploratory technique to identify microbiome lineages that could be missed by primer-based approaches.

  8. Exome sequencing identifies mutations in ABCD1 and DACH2 in two brothers with a distinct phenotype.

    Science.gov (United States)

    Zhang, Yanliang; Liu, Yanhui; Li, Ya; Duan, Yong; Zhang, Keyun; Wang, Junwang; Dai, Yong

    2014-09-19

    We report on two brothers with a distinct syndromic phenotype and explore the potential pathogenic cause. Cytogenetic tests and exome sequencing were performed on the two brothers and their parents. Variants detected by exome sequencing were validated by Sanger sequencing. The main phenotype of the two brothers included congenital language disorder, growth retardation, intellectual disability, difficulty in standing and walking, and urinary and fecal incontinence. To the best of our knowledge, no similar phenotype has been reported previously. No abnormalities were detected by G-banding chromosome analysis or array comparative genomic hybridization. However, exome sequencing revealed novel mutations in the ATP-binding cassette, sub-family D member 1 (ABCD1) and Dachshund homolog 2 (DACH2) genes in both brothers. The ABCD1 mutation was a missense mutation c.1126G > C in exon 3 leading to a p.E376Q substitution. The DACH2 mutation was also a missense mutation c.1069A > T in exon 6, leading to a p.S357C substitution. The mother was an asymptomatic heterozygous carrier. Plasma levels of very-long-chain fatty acids were increased in both brothers, suggesting a diagnosis of adrenoleukodystrophy (ALD); however, their phenotype was not compatible with any reported forms of ALD. DACH2 plays an important role in the regulation of brain and limb development, suggesting that this mutation may be involved in the phenotype of the two brothers. The distinct phenotype demonstrated by these two brothers might represent a new form of ALD or a new syndrome. The combination of mutations in ABCD1 and DACH2 provides a plausible mechanism for this phenotype.

  9. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Science.gov (United States)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  10. Exome Sequencing Fails to Identify the Genetic Cause of Aicardi Syndrome.

    Science.gov (United States)

    Lund, Caroline; Striano, Pasquale; Sorte, Hanne Sørmo; Parisi, Pasquale; Iacomino, Michele; Sheng, Ying; Vigeland, Magnus D; Øye, Anne-Marte; Møller, Rikke Steensbjerre; Selmer, Kaja K; Zara, Federico

    2016-09-01

    Aicardi syndrome (AS) is a well-characterized neurodevelopmental disorder with an unknown etiology. In this study, we performed whole-exome sequencing in 11 female patients with the diagnosis of AS, in order to identify the disease-causing gene. In particular, we focused on detecting variants in the X chromosome, including the analysis of variants with a low number of sequencing reads, in case of somatic mosaicism. For 2 of the patients, we also sequenced the exome of the parents to search for de novo mutations. We did not identify any genetic variants likely to be damaging. Only one single missense variant was identified by the de novo analyses of the 2 trios, and this was considered benign. The failure to identify a disease gene in this study may be due to technical limitations of our study design, including the possibility that the genetic aberration leading to AS is situated in a non-exonic region or that the mutation is somatic and not detectable by our approach. Alternatively, it is possible that AS is genetically heterogeneous and that 11 patients are not sufficient to reveal the causative genes. Future studies of AS should consider designs where also non-exonic regions are explored and apply a sequencing depth so that also low-grade somatic mosaicism can be detected.

  11. Exome Sequencing Fails to Identify the Genetic Cause of Aicardi Syndrome

    DEFF Research Database (Denmark)

    Lund, Caroline; Striano, Pasquale; Sorte, Hanne Sørmo

    2016-01-01

    Aicardi syndrome (AS) is a well-characterized neurodevelopmental disorder with an unknown etiology. In this study, we performed whole-exome sequencing in 11 female patients with the diagnosis of AS, in order to identify the disease-causing gene. In particular, we focused on detecting variants in ...

  12. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    OpenAIRE

    Hu, H.; Haas, S.A.; Chelly, J.; Van Esch, H.; Raynaud, M.; de Brouwer, A.P.M.; Weinert, S.; Froyen, G.; Frints, S.G.M.; Laumonnier, F.; Zemojtel, T.; Love, M.I.; Richard, H.; Emde, A.K.; Bienek, M.

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of ...

  13. A Novel Prosthetic Joint Infection Pathogen, Mycoplasma salivarium, Identified by Metagenomic Shotgun Sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio; Greenwood-Quaintance, Kerryl E; Chia, Nicholas; Abdel, Matthew P; Steckelberg, James M; Osmon, Douglas R; Patel, Robin

    2017-07-15

    Defining the microbial etiology of culture-negative prosthetic joint infection (PJI) can be challenging. Metagenomic shotgun sequencing is a new tool to identify organisms undetected by conventional methods. We present a case where metagenomics was used to identify Mycoplasma salivarium as a novel PJI pathogen in a patient with hypogammaglobulinemia. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.

  14. SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations.

    Directory of Open Access Journals (Sweden)

    Steven N Hart

    Full Text Available BACKGROUND: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual's genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. RESULTS: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch's key features are (1) not requiring secondary (or exhaustive primary) alignment, (2) portability into established sequencing workflows, and (3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. CONCLUSIONS: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.
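
The abstract above mentions soft-clipped bases from split reads as one signal for locating breakpoints. The following is a minimal sketch of how soft clips can be read from a SAM CIGAR string and turned into candidate breakpoint coordinates. It is not SoftSearch's implementation: the function names, the `min_clip` threshold and the example alignments are assumptions made for illustration, and the discordant read-pair evidence that the abstract also describes is not modelled.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def parse_cigar(cigar):
    """Return the CIGAR string as a list of (length, op) tuples."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def soft_clips(cigar):
    """Return (left, right) soft-clip lengths, ignoring outer hard clips."""
    ops = [o for o in parse_cigar(cigar) if o[1] != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def candidate_breakpoints(pos, cigar, min_clip=5):
    """Reference positions (1-based) next to a soft clip of >= min_clip bases.

    `pos` is the leftmost aligned position, as in the SAM POS field.
    """
    left, right = soft_clips(cigar)
    ref_len = sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)
    points = []
    if left >= min_clip:
        points.append(pos)                 # clip precedes the first aligned base
    if right >= min_clip:
        points.append(pos + ref_len - 1)   # clip follows the last aligned base
    return points


# Hypothetical alignments, for illustration only.
print(candidate_breakpoints(1000, "20S80M"))       # left clip
print(candidate_breakpoints(500, "60M5D20M20S"))   # right clip
print(candidate_breakpoints(42, "100M"))           # no clip
```

In practice a caller would require several reads to support the same coordinate before reporting it; that aggregation step is left out here to keep the sketch short.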

  15. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    Science.gov (United States)

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
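
The distinction drawn above, between duplicates called from mapping coordinates alone and duplicates called from coordinates plus UMI, can be sketched in a few lines. This is an illustrative toy, not the authors' pipeline: the `Read` record, its field names and the example reads are assumptions, and real tools typically also tolerate sequencing errors in the UMI, which this sketch does not.

```python
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    pos: int      # leftmost mapped position
    strand: str   # "+" or "-"
    umi: str      # unique molecular identifier taken from the adapter


def dedup(reads, use_umi):
    """Keep the first read seen for each duplicate key.

    With use_umi=False, reads sharing coordinates are treated as duplicates.
    With use_umi=True, they must also share the same UMI.
    """
    seen = set()
    kept = []
    for r in reads:
        key = (r.chrom, r.pos, r.strand) + ((r.umi,) if use_umi else ())
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


# Hypothetical reads: r1 and r2 are PCR copies of one molecule, while r3 is
# an independent molecule that happens to map to the same coordinates.
reads = [
    Read("r1", "chr1", 100, "+", "AAAC"),
    Read("r2", "chr1", 100, "+", "AAAC"),
    Read("r3", "chr1", 100, "+", "GGTT"),
    Read("r4", "chr2", 50, "-", "AAAC"),
]

print([r.name for r in dedup(reads, use_umi=False)])
print([r.name for r in dedup(reads, use_umi=True)])
```

Coordinate-only deduplication discards r3 along with r2, which is the kind of bias the abstract describes; adding the UMI to the key retains r3 as a distinct molecule.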

  16. Unique Trichomonas vaginalis gene sequences identified in multinational regions of Northwest China.

    Science.gov (United States)

    Liu, Jun; Feng, Meng; Wang, Xiaolan; Fu, Yongfeng; Ma, Cailing; Cheng, Xunjia

    2017-07-24

    Trichomonas vaginalis (T. vaginalis) is a flagellated protozoan parasite that infects humans worldwide. This study determined the sequence of the 18S ribosomal RNA gene of T. vaginalis infecting both females and males in Xinjiang, China. Samples from 73 females and 28 males were collected and confirmed for infection with T. vaginalis, and a total of 110 sequences were identified when the T. vaginalis 18S ribosomal RNA gene was sequenced. These sequences were used to prepare a phylogenetic network. The rooted network comprised three large clades and several independent branches. Most of the Xinjiang sequences were in one group. Preliminary results suggest that Xinjiang T. vaginalis isolates might be genetically unique, as indicated by the sequence of their 18S ribosomal RNA gene. The low migration rate of local people in this province may contribute to the genetic conservation of T. vaginalis. The unique genetic feature of our isolates may suggest a different clinical presentation of trichomoniasis, including metronidazole susceptibility, T. vaginalis virus or Mycoplasma co-infection characteristics. The transmission and evolution of Xinjiang T. vaginalis is of interest and should be studied further. More attention should be given to T. vaginalis infection in both females and males in Xinjiang.

  17. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    Science.gov (United States)

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.

  18. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Full Text Available Abstract Background The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results SeqAnt (Sequence Annotator) is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.

  19. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    Science.gov (United States)

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. 
Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. 
Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. 
Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. 
Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. 
David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  20. HIV-1 envelope sequence-based diversity measures for identifying recent infections.

    Directory of Open Access Journals (Sweden)

    Alexis Kafando

    Full Text Available Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A total of 597 diagnostic samples obtained in 2013 and 2015 from recently and chronically HIV-1 infected individuals were selected. From the selected samples, 249 (134 from recent versus 115 from chronic infections) env coding regions, including V1-C5 of gp120 and the gp41 ectodomain of HIV-1, were successfully amplified and sequenced by next generation sequencing (NGS) using the Illumina MiSeq platform. The ability of the four sequence-based diversity measures to correctly identify recent HIV infections was evaluated using the frequency distribution curves, median and interquartile range and area under the curve (AUC) of the receiver operating characteristic (ROC). Comparing the median and interquartile range and evaluating the frequency distribution curves associated with the 4 sequence-based diversity measures, we observed that the percent diversity, number of haplotypes and Shannon entropy demonstrated significant potential to discriminate recent from chronic infections (p<0.0001). Using the AUC of ROC analysis, only the Shannon entropy measure within three HIV-1 env segments could accurately identify recent infections at a satisfactory level. The env segments were gp120 C2_1 (AUC = 0.806), gp120 C2_3 (AUC = 0.805) and gp120 V3 (AUC = 0.812). Our results clearly indicate that the Shannon entropy measure represents a useful tool for predicting HIV-1 infection recency.
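
    As an illustration of the Shannon entropy measure named in this abstract, the sketch below computes H = -sum(p_i * ln p_i) over the haplotype frequencies observed in one sample. The function name, the choice of the natural logarithm, and the toy read sets are assumptions made for illustration; the study's actual pipeline, segment definitions and decision thresholds are not given in this record.

```python
import math
from collections import Counter


def shannon_entropy(haplotypes):
    """Shannon entropy (natural log) of haplotype frequencies in one sample.

    `haplotypes` is a list of sequence strings, one per read, all covering
    the same env segment (hypothetical input format).
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Toy data (invented): a near-clonal sample versus a diversified one.
recent_like = ["ACGT"] * 98 + ["ACGA"] * 2
chronic_like = ["ACGT"] * 40 + ["ACGA"] * 30 + ["TCGA"] * 20 + ["TCGT"] * 10

print(shannon_entropy(recent_like) < shannon_entropy(chronic_like))
```

    A low-entropy sample is what one would expect shortly after infection, when the viral population is still close to clonal; any cutoff separating recent from chronic infection would have to be calibrated on labeled samples, as the AUC analysis in the abstract implies.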

  1. BLAT2DOLite: An Online System for Identifying Significant Relationships between Genetic Sequences and Diseases.

    Directory of Open Access Journals (Sweden)

    Liang Cheng

    Full Text Available Diseases that are significantly related to sequences could play an important role in understanding the functions of those sequences. In this paper, we introduce BLAT2DOLite, an online system for annotating human genes and diseases and identifying the significant relationships between sequences and diseases. Currently, BLAT2DOLite integrates the Entrez Gene database and Disease Ontology Lite (DOLite), which contain gene loci and the relationships between genes and diseases. It uses the hypergeometric test to calculate P-values between genes and the diseases of DOLite. The system can be accessed from: http://123.59.132.21:8080/BLAT2DOLite. The corresponding web service is described in: http://123.59.132.21:8080/BLAT2DOLite/BLAT2DOLiteIDMappingPort?wsdl.
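
    The hypergeometric test mentioned in this abstract can be sketched as a one-sided enrichment P-value. This is a minimal illustration, not the BLAT2DOLite implementation: the function name and the four parameters (background size, disease-annotated genes, genes hit by the query sequences, and their overlap) are assumptions, and the example counts are invented.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """Upper-tail probability P(X >= k) for a hypergeometric variable.

    N: genes in the background (hypothetical, e.g. all annotated genes)
    K: background genes annotated to the disease
    n: genes matched by the query sequences
    k: matched genes that are also annotated to the disease
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Invented counts: 8 of 50 matched genes fall in a 200-gene disease set
# drawn from a 20,000-gene background.
print(hypergeom_pvalue(20000, 200, 50, 8))
```

    In practice one such P-value is computed per disease term, so a multiple-testing correction would normally follow; the record does not say which correction, if any, the system applies.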

  2. Shirky and Sanger, or the costs of crowdsourcing

    Directory of Open Access Journals (Sweden)

    Mathieu O'Neil

    2010-03-01

    Full Text Available Online knowledge production sites do not rely on isolated experts but on collaborative processes, on the wisdom of the group or “crowd”. Some authors have argued that it is possible to combine traditional or credentialled expertise with collective production; others believe that traditional expertise's focus on correctness has been superseded by the affordances of digital networking, such as re-use and verifiability. This paper examines the costs of two kinds of “crowdsourced” encyclopedic projects: Citizendium, based on the work of credentialled and identified experts, faces a recruitment deficit; in contrast Wikipedia has proved wildly popular, but anti-credentialism and anonymity result in uncertainty, irresponsibility, the development of cliques and the growing importance of pseudo-legal competencies for conflict resolution. Finally the paper reflects on the wider social implications of focusing on what experts are rather than on what they are for.

  3. Novel expressed sequences identified in a model of androgen independent prostate cancer

    Directory of Open Access Journals (Sweden)

    Jones Steven JM

    2007-01-01

    Full Text Available Abstract Background Prostate cancer is the most frequently diagnosed cancer in American men, and few effective treatment options are available to patients who develop hormone-refractory prostate cancer. The molecular changes that occur to allow prostate cells to proliferate in the absence of androgens are not fully understood. Results Subtractive hybridization experiments performed with samples from an in vivo model of hormonal progression identified 25 expressed sequences representing novel human transcripts. Intriguingly, these 25 sequences have small open-reading frames and are not highly conserved through evolution, suggesting many of these novel expressed sequences may be derived from untranslated regions of novel transcripts or from non-coding transcripts. Examination of a large metalibrary of human Serial Analysis of Gene Expression (SAGE) tags demonstrated that only three of these novel sequences had been previously detected. RT-PCR experiments confirmed that the 6 sequences tested were expressed in specific human tissues, as well as in clinical samples of prostate cancer. Further RT-PCR experiments for five of these fragments indicated they originated from large untranslated regions of unannotated transcripts. Conclusion This study underlines the value of using complementary techniques in the annotation of the human genome. The tissue-specific expression of 4 of the 6 clones tested indicates the expression of these novel transcripts is tightly regulated, and future work will determine the possible role(s) these novel transcripts may play in the progression of prostate cancer.

  4. Functional brain activation differences in stuttering identified with a rapid fMRI sequence

    Science.gov (United States)

    Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.

    2011-01-01

    The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech motor and auditory brain activity in children who stutter closer to the age at which recovery from stuttering is documented. Rapid sequences may be preferred for individuals or populations who do not tolerate long scanning sessions. In this report, we document the application of a picture naming and phoneme monitoring task in three minute fMRI sequences with adults who stutter (AWS). If relevant brain differences are found in AWS with these approaches that conform to previous reports, then these approaches can be extended to younger populations. Pairwise contrasts of brain BOLD activity between AWS and normally fluent adults indicated the AWS showed higher BOLD activity in the right inferior frontal gyrus (IFG), right temporal lobe and sensorimotor cortices during picture naming and higher activity in the right IFG during phoneme monitoring. The right lateralized pattern of BOLD activity together with higher activity in sensorimotor cortices is consistent with previous reports, which indicates rapid fMRI sequences can be considered for investigating stuttering in younger participants. PMID:22133409

  5. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections.

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    Full Text Available Shot-gun next generation sequencing (NGS) on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads) on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools constructed each from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2), PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18). The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18) previous studies reported their first occurrence much later (from 5 to more than 10 years) than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets not originally derived from metagenomics experiments for viral metagenomics analyses in a livestock species.

  6. Molecular defects identified by whole exome sequencing in a child with Fanconi anemia.

    Science.gov (United States)

    Zheng, Zhaojing; Geng, Juan; Yao, Ru-En; Li, Caihua; Ying, Daming; Shen, Yongnian; Ying, Lei; Yu, Yongguo; Fu, Qihua

    2013-11-10

    Fanconi anemia is a rare genetic disease characterized by bone marrow failure, multiple congenital malformations, and an increased susceptibility to malignancy. At least 15 genes have been identified that are involved in the pathogenesis of Fanconi anemia. However, it is still a challenge to assign the complementation group and to characterize the molecular defects in patients with Fanconi anemia. In the current study, whole exome sequencing was used to identify the affected gene(s) in a boy with Fanconi anemia. A recurring, non-synonymous mutation was found (c.3971C>T, p.P1324L) as well as a novel frameshift mutation (c.989_995del, p.H330LfsX2) in FANCA gene. Our results indicate that whole exome sequencing may be useful in clinical settings for rapid identification of disease-causing mutations in rare genetic disorders such as Fanconi anemia. © 2013 Elsevier B.V. All rights reserved.

  7. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data

    Science.gov (United States)

    Lea, Amanda J.

    2015-01-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596

  8. Whole-exome sequencing identified a variant in EFTUD2 gene in establishing a genetic diagnosis.

    Science.gov (United States)

    Rengasamy Venugopalan, S; Farrow, E G; Lypka, M

    2017-06-01

    Craniofacial anomalies are complex and have an overlapping phenotype. Mandibulofacial Dysostosis and Oculo-Auriculo-Vertebral Spectrum are conditions that share common craniofacial phenotype and present a challenge in arriving at a diagnosis. In this report, we present a case of female proband who was given a differential diagnosis of Treacher Collins syndrome or Hemifacial Microsomia without certainty. Prior genetic testing reported negative for 22q deletion and FGFR screenings. The objective of this study was to demonstrate the critical role of whole-exome sequencing in establishing a genetic diagnosis of the proband. The participants were 14½-year-old affected female proband/parent trio. Proband/parent trio were enrolled in the study. Surgical tissue sample from the proband and parental blood samples were collected and prepared for whole-exome sequencing. Illumina HiSeq 2500 instrument was used for sequencing (125 nucleotide reads/84X coverage). Analyses of variants were performed using custom-developed software, RUNES and VIKING. Variant analyses following whole-exome sequencing identified a heterozygous de novo pathogenic variant, c.259C>T (p.Gln87*), in EFTUD2 (NM_004247.3) gene in the proband. Previous studies have reported that the variants in EFTUD2 gene were associated with Mandibulofacial Dysostosis with Microcephaly. Patients with facial asymmetry, micrognathia, choanal atresia and microcephaly should be analyzed for variants in EFTUD2 gene. Next-generation sequencing techniques, such as whole-exome sequencing offer great promise to improve the understanding of etiologies of sporadic genetic diseases. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. Identifying transposon insertions and their effects from RNA-sequencing data.

    Science.gov (United States)

    de Ruiter, Julian R; Kas, Sjors M; Schut, Eva; Adams, David J; Koudijs, Marco J; Wessels, Lodewyk F A; Jonkers, Jos

    2017-07-07

    Insertional mutagenesis using engineered transposons is a potent forward genetic screening technique used to identify cancer genes in mouse model systems. In the analysis of these screens, transposon insertion sites are typically identified by targeted DNA-sequencing and subsequently assigned to predicted target genes using heuristics. As such, these approaches provide no direct evidence that insertions actually affect their predicted targets or how transcripts of these genes are affected. To address this, we developed IM-Fusion, an approach that identifies insertion sites from gene-transposon fusions in standard single- and paired-end RNA-sequencing data. We demonstrate IM-Fusion on two separate transposon screens of 123 mammary tumors and 20 B-cell acute lymphoblastic leukemias, respectively. We show that IM-Fusion accurately identifies transposon insertions and their true target genes. Furthermore, by combining the identified insertion sites with expression quantification, we show that we can determine the effect of a transposon insertion on its target gene(s) and prioritize insertions that have a significant effect on expression. We expect that IM-Fusion will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and provide initial insight into the biological effects of insertions on candidate cancer genes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Exome Sequencing Identifies Potential Risk Variants for Mendelian Disorders at High Prevalence in Qatar

    Science.gov (United States)

    Rodriguez-Flores, Juan L.; Fakhro, Khalid; Hackett, Neil R.; Salit, Jacqueline; Fuller, Jennifer; Agosto-Perez, Francisco; Gharbiah, Maey; Malek, Joel A.; Zirie, Mahmoud; Jayyousi, Amin; Badii, Ramin; Al-Marri, Ajayeb Al-Nabet; Chouchane, Lotfi; Stadler, Dora J.; Hunter-Zinck, Haley; Mezey, Jason G.; Crystal, Ronald G.

    2013-01-01

    Exome sequencing of families of related individuals has been highly successful in identifying genetic polymorphisms responsible for Mendelian disorders. Here, we demonstrate the value of the reverse approach, where we use exome sequencing of a sample of unrelated individuals to analyze allele frequencies of known causal mutations for Mendelian diseases. We sequenced the exomes of 100 individuals representing the three major genetic subgroups of the Qatari population (Q1 Bedouin, Q2 Persian-South Asian, Q3 African) and identified 37 variants in 33 genes with effects on 36 clinically significant Mendelian diseases. These include variants not present in 1000 Genomes and variants at high frequency when compared to 1000 Genomes populations. Several of these Mendelian variants were only segregating in one Qatari subpopulation, where the observed subpopulation specificity trends were confirmed in an independent population of 386 Qataris. Pre-marital genetic screening in Qatar tests for only 4 out of the 37, such that this study provides a set of Mendelian disease variants with potential impact on the epidemiological profile of the population that could be incorporated into the testing program if further experimental and clinical characterization confirms high penetrance. PMID:24123366

  11. Complete genome sequence of Clostridium estertheticum DSM 8809, a microbe identified in spoiled vacuum packed beef

    Directory of Open Access Journals (Sweden)

    Zhongyi Yu

    2016-11-01

    Full Text Available Blown pack spoilage (BPS) is a major issue for the beef industry. Aetiological agents of BPS involve members of a group of Clostridium species, including Clostridium estertheticum which has the ability to produce gas, mostly carbon dioxide, under anaerobic psychrotrophic growth conditions. This spore-forming bacterium grows slowly under laboratory conditions, and it can take up to 3 months to produce a workable culture. These characteristics have limited the study of this commercially challenging bacterium. Consequently information on this bacterium is limited and no effective controls are currently available to confidently detect and manage this production risk. In this study the complete genome of Clostridium estertheticum DSM 8809 was determined by SMRT® sequencing. The genome consists of a circular chromosome of 4.7 Mbp along with a single plasmid carrying a potential tellurite resistance gene tehB and a Tn3-like resolvase-encoding gene tnpR. The genome sequence was searched for central metabolic pathways that would support its biochemical profile and several enzymes contributing to this phenotype were identified. Several putative antibiotic/biocide/metal resistance-encoding genes and virulence factors were also identified in the genome, a feature that requires further research. The availability of the genome sequence will provide a basic blueprint from which to develop valuable biomarkers that could support and improve the detection and control of this bacterium along the beef production chain.

  12. Somatic mosaicism of a CDKL5 mutation identified by next-generation sequencing.

    Science.gov (United States)

    Kato, Takeshi; Morisada, Naoya; Nagase, Hiroaki; Nishiyama, Masahiro; Toyoshima, Daisaku; Nakagawa, Taku; Maruyama, Azusa; Fu, Xue Jun; Nozu, Kandai; Wada, Hiroko; Takada, Satoshi; Iijima, Kazumoto

    2015-10-01

    CDKL5-related encephalopathy is an X-linked dominantly inherited disorder that is characterized by early infantile epileptic encephalopathy or atypical Rett syndrome. We describe a 5-year-old Japanese boy with intractable epilepsy, severe developmental delay, and Rett syndrome-like features. Onset was at 2 months, when his electroencephalogram showed sporadic single poly spikes and diffuse irregular poly spikes. We conducted a genetic analysis using an Illumina® TruSight™ One sequencing panel on a next-generation sequencer. We identified two epilepsy-associated single nucleotide variants in our case: CDKL5 p.Ala40Val and KCNQ2 p.Glu515Asp. CDKL5 p.Ala40Val has been previously reported to be responsible for early infantile epileptic encephalopathy. In our case, the CDKL5 heterozygous mutation showed somatic mosaicism because the boy's karyotype was 46,XY. The KCNQ2 variant p.Glu515Asp is known to cause benign familial neonatal seizures-1, and this variant showed paternal inheritance. Although we believe that the somatic mosaic CDKL5 mutation is mainly responsible for the neurological phenotype in the patient, the KCNQ2 variant might have some neurological effect. Genetic analysis by next-generation sequencing is capable of identifying multiple variants in a patient. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

  13. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Directory of Open Access Journals (Sweden)

    Devier Benjamin

    2007-08-01

    Full Text Available Abstract Background The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

  14. Use of a mitochondrial COI sequence to identify species of the subtribe Aphidina (Hemiptera, Aphididae)

    Directory of Open Access Journals (Sweden)

    Jianfeng WANG

    2011-08-01

    Full Text Available Aphids of the subtribe Aphidina are found mainly in the North Temperate Zone. The relative lack of diagnostic morphological characteristics has obscured the identification of species in this group. However, DNA-based taxonomic methods can clarify species relationships within this group. Sequence variation in a partial segment of the mitochondrial COI gene was highly effective for resolving species relationships within Aphidina. Forty-five species were correctly identified in a neighbor-joining tree. Mean intraspecific sequence divergence was 0.17%, with a range of 0.00% to 1.54%. Mean interspecific divergence within previously recognized genera or morphologically similar species groups was 4.54%, with variation mainly in the range of 3.50% to 8.00%. Possible reasons for anomalous levels of mean nucleotide divergence within or between some taxa are discussed.
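
    The divergence percentages reported in this abstract can be illustrated with an uncorrected pairwise distance (p-distance) between aligned COI fragments. The record does not state which distance model the authors used; barcoding studies often apply a corrected model such as Kimura 2-parameter, so the plain p-distance here is a swapped-in simplification. The function name, the gap handling and the example sequences are assumptions.

```python
def p_distance(seq_a, seq_b):
    """Percent of differing sites between two aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped, which is one common convention (assumed here).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in "-N" and b not in "-N"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    mismatches = sum(1 for a, b in pairs if a != b)
    return 100.0 * mismatches / len(pairs)


# Invented 20-bp fragments differing at one site.
print(p_distance("ATGGCAATTCGATTAGGAAC", "ATGGCAATTCGGTTAGGAAC"))
```

    Averaging such values over all pairs within a species, and over all pairs between species of a genus, gives the kind of intraspecific and interspecific means quoted in the abstract.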

  15. Whole-genome and Transcriptome Sequencing of Prostate Cancer Identify New Genetic Alterations Driving Disease Progression

    DEFF Research Database (Denmark)

    Ren, Shancheng; Wei, Gong-Hong; Liu, Dongbing

    2018-01-01

    BACKGROUND: Global disparities in prostate cancer (PCa) incidence highlight the urgent need to identify genomic abnormalities in prostate tumors in different ethnic populations including Asian men. OBJECTIVE: To systematically explore the genomic complexity and define disease-driven genetic......-scale and comprehensive genomic data of prostate cancer from Asian population. Identification of these genetic alterations may help advance prostate cancer diagnosis, prognosis, and treatment....... alterations in PCa. DESIGN, SETTING, AND PARTICIPANTS: The study sequenced whole-genome and transcriptome of tumor-benign paired tissues from 65 treatment-naive Chinese PCa patients. Subsequent targeted deep sequencing of 293 PCa-relevant genes was performed in another cohort of 145 prostate tumors. OUTCOME...

  16. Microsatellite Primers Identified by 454 Sequencing in the Floodplain Tree Species Eucalyptus victrix (Myrtaceae)

    Directory of Open Access Journals (Sweden)

    Paul G. Nevill

    2013-05-01

    Full Text Available Premise of the study: Microsatellite primers were developed for Eucalyptus victrix (Myrtaceae) to evaluate the population and spatial genetic structure of this widespread northwestern Australian riparian tree species, which may be impacted by hydrological changes associated with mining activity. Methods and Results: 454 GS-FLX shotgun sequencing was used to obtain 1895 sequences containing putative microsatellite motifs. Ten polymorphic microsatellite loci were identified and screened for variation in individuals from two populations in the Pilbara region. Observed heterozygosities ranged from 0.44 to 0.91 (mean: 0.66) and the number of alleles per locus ranged from five to 25 (average: 11). Conclusions: These microsatellite loci will be useful in future studies of population and spatial genetic structure in E. victrix, and inform the development of seed sourcing strategies for the species.
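
    The two summary statistics quoted in this abstract, observed heterozygosity and number of alleles per locus, can be computed from diploid genotype calls roughly as follows. This is a sketch under assumed conventions: genotypes are pairs of allele sizes, missing calls are `None`, and the function names and example data are invented rather than taken from the study.

```python
def observed_heterozygosity(genotypes):
    """Fraction of genotyped individuals carrying two different alleles."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no genotype calls at this locus")
    return sum(1 for a, b in called if a != b) / len(called)


def allele_count(genotypes):
    """Number of distinct alleles observed at one locus."""
    return len({allele for g in genotypes if g is not None for allele in g})


# Invented microsatellite locus: allele sizes in base pairs, one missing call.
locus = [(182, 186), (182, 182), (186, 190), (190, 190), None]

print(observed_heterozygosity(locus), allele_count(locus))
```

    Repeating this per locus and averaging across the ten loci would give figures comparable in form to the means reported in the abstract.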

  17. Sequence-Based Introgression Mapping Identifies Candidate White Mold Tolerance Genes in Common Bean

    Directory of Open Access Journals (Sweden)

    Sujan Mamidi

    2016-07-01

    Full Text Available White mold, caused by the necrotrophic fungus Sclerotinia sclerotiorum (Lib.) de Bary, is a major disease of common bean (Phaseolus vulgaris L.). WM7.1 and WM8.3 are two quantitative trait loci (QTL) with major effects on tolerance to the pathogen. Advanced backcross populations segregating individually for either of the two QTL, and a recombinant inbred (RI) population segregating for both QTL, were used to fine map and confirm the genetic location of the QTL. The QTL intervals were physically mapped using the reference common bean genome sequence, and the physical intervals for each QTL were further confirmed by sequence-based introgression mapping. Using whole-genome sequence data from susceptible and tolerant DNA pools, introgressed regions were identified as those with significantly higher numbers of single-nucleotide polymorphisms (SNPs) relative to the whole genome. By combining the QTL and SNP data, WM7.1 was located to a 660-kb region that contained 41 gene models on the proximal end of chromosome Pv07, while the WM8.3 introgression was narrowed to a 1.36-Mb region containing 70 gene models. The most polymorphic candidate gene in the WM7.1 region encodes a BEACH-domain protein associated with apoptosis. Within the WM8.3 interval, a receptor-like protein with the potential to recognize pathogen effectors was the most polymorphic gene. The use of gene and sequence-based mapping identified two candidate genes whose putative functions are consistent with the current model of pathogenicity.

  18. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Science.gov (United States)

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  19. Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

    Science.gov (United States)

    Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

    1998-01-01

    Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902-D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR, SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C. The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA

  20. Exome Sequencing and Linkage Analysis Identified Novel Candidate Genes in Recessive Intellectual Disability Associated with Ataxia.

    Science.gov (United States)

    Jazayeri, Roshanak; Hu, Hao; Fattahi, Zohreh; Musante, Luciana; Abedini, Seyedeh Sedigheh; Hosseini, Masoumeh; Wienker, Thomas F; Ropers, Hans Hilger; Najmabadi, Hossein; Kahrizi, Kimia

    2015-10-01

    Intellectual disability (ID) is a neuro-developmental disorder which causes considerable socio-economic problems. Some ID individuals are also affected by ataxia, and the condition includes different mutations affecting several genes. We used whole exome sequencing (WES) in combination with homozygosity mapping (HM) to identify the genetic defects in five consanguineous families among our cohort study, with two affected children with ID and ataxia as major clinical symptoms. We identified three novel candidate genes, RIPPLY1, MRPL10, SNX14, and a new mutation in known gene SURF1. All are autosomal genes, except RIPPLY1, which is located on the X chromosome. Two are housekeeping genes, implicated in transcription and translation regulation and intracellular trafficking, and two encode mitochondrial proteins. The pathogenesis of these variants was evaluated by mutation classification, bioinformatic methods, review of medical and biological relevance, co-segregation studies in the particular family, and a normal population study. Linkage analysis and exome sequencing of a small number of affected family members is a powerful new technique which can be used to decrease the number of candidate genes in heterogenic disorders such as ID, and may even identify the responsible gene(s).

  1. Globicatella sanguinis bacteraemia identified by partial 16S rRNA gene sequencing

    DEFF Research Database (Denmark)

    Abdul-Redha, Rawaa Jalil; Balslew, Ulla; Christensen, Jens Jørgen

    2007-01-01

    Globicatella sanguinis is a gram-positive coccus, resembling non-haemolytic streptococci. The organism has been isolated infrequently from normally sterile sites of humans. Three isolates obtained by blood culture could not be identified by Rapid 32 ID Strep, but partial sequencing of the 16S r......RNA gene revealed the identity of the isolated bacteria, and supplementary biochemical tests confirmed the species identification. The case histories illustrate the dilemma of finding relevant, newly recognized, opportunistic pathogens and the identification achievement (s) that can be obtained by using...

  2. RePS: a sequence assembler that masks exact repeats identified from the shotgun data

    DEFF Research Database (Denmark)

    Wang, Jun; Wong, Gane Ka-Shu; Ni, Peixiang

    2002-01-01

    We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone......-end-pairing information is used to construct scaffolds that order and orient the contigs. We show with real data for human and rice that reasonable assemblies are possible even at coverages of only 4x to 6x, despite having up to 42.2% in exact repeats. Publication date: May 2002...
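
    The idea of finding exact k-mer repeats directly in the shotgun reads and masking them before assembly can be sketched as follows. This is only an illustration under stated assumptions, not the RePS implementation: the abstract names k = 20, whereas the toy example uses k = 4 to stay readable; the `max_count` cutoff, the `X` mask character, and both function names are hypothetical; and reverse-complement k-mers are ignored for brevity.

```python
from collections import Counter
from typing import Iterable


def count_kmers(reads: Iterable[str], k: int) -> Counter:
    """Count every exact k-mer occurrence across all reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mask_repeats(read: str, counts: Counter, k: int, max_count: int) -> str:
    """Replace every k-mer seen more than max_count times with 'X' characters."""
    masked = list(read)
    for i in range(len(read) - k + 1):
        if counts[read[i:i + k]] > max_count:
            masked[i:i + k] = "X" * k
    return "".join(masked)


# Toy reads; 'ACGT' occurs three times in total and is treated as a repeat.
reads = ["ACGTACGTTT", "GGACGTAC", "TTTTCCCC"]
counts = count_kmers(reads, k=4)

print(mask_repeats("GGACGTAC", counts, k=4, max_count=2))  # GGXXXXAC
```

    In a real data set the cutoff would have to be chosen relative to the sequencing coverage, since every unique k-mer is expected to appear roughly as often as the coverage depth.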

  3. Completed Ensemble Empirical Mode Decomposition: a Robust Signal Processing Tool to Identify Sequence Strata

    Science.gov (United States)

    Purba, H.; Musu, J. T.; Diria, S. A.; Permono, W.; Sadjati, O.; Sopandi, I.; Ruzi, F.

    2018-03-01

    Well logging data provide a great deal of geological information, and their trends resemble nonlinear or non-stationary signals. While well log data are being recorded, external factors can interfere with or degrade the signal resolution. A sensitive signal analysis is therefore required to improve the accuracy of log interpretation, which is important for determining sequence stratigraphy. Complete Ensemble Empirical Mode Decomposition (CEEMD) is a nonlinear and non-stationary signal analysis method that decomposes a complex signal into a series of intrinsic mode functions (IMFs). Gamma Ray and Spontaneous Potential well log parameters were decomposed into IMF-1 up to IMF-10, and their combinations and correlations allow a physical meaning to be identified. The method identifies the stratigraphy and cycle sequence and provides an effective signal treatment method for sequence interfaces. This method was applied to BRK-30 and BRK-13 well logging data. The results show that the combined pattern of IMF-5, IMF-6, and IMF-7 represents short-term and middle-term sedimentation, while IMF-9 and IMF-10 represent long-term sedimentation, which describe distal front and delta front facies, and inter-distributary mouth bar facies, respectively. Thus, CEEMD can clearly determine the interfaces between different sedimentary layers and better identify the cycles of stratigraphic base level.

  4. Exome sequencing of a large family identifies potential candidate genes contributing risk to bipolar disorder.

    Science.gov (United States)

    Zhang, Tianxiao; Hou, Liping; Chen, David T; McMahon, Francis J; Wang, Jen-Chyong; Rice, John P

    2018-03-01

    Bipolar disorder is a mental illness with lifetime prevalence of about 1%. Previous genetic studies have identified multiple chromosomal linkage regions and candidate genes that might be associated with bipolar disorder. The present study aimed to identify potential susceptibility variants for bipolar disorder using 6 related case samples from a four-generation family. A combination of exome sequencing and linkage analysis was performed to identify potential susceptibility variants for bipolar disorder. Our study identified a list of five potential candidate genes for bipolar disorder. Among these five genes, GRID1 (Glutamate Receptor Delta-1 Subunit), which was previously reported to be associated with several psychiatric disorders and brain related traits, is particularly interesting. Variants with functional significance in this gene were identified from two cousins in our bipolar disorder pedigree. Our findings suggest a potential role for these genes and the related rare variants in the onset and development of bipolar disorder in this one family. Additional research is needed to replicate these findings and evaluate their patho-biological significance. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Transcriptome Sequencing of Chemically Induced Aquilaria sinensis to Identify Genes Related to Agarwood Formation.

    Science.gov (United States)

    Ye, Wei; Wu, Hongqing; He, Xin; Wang, Lei; Zhang, Weimin; Li, Haohua; Fan, Yunfei; Tan, Guohui; Liu, Taomei; Gao, Xiaoxia

    2016-01-01

    Agarwood is a traditional Chinese medicine used as a clinical sedative, carminative, and antiemetic drug. Agarwood is formed in Aquilaria sinensis when A. sinensis trees are threatened by external physical, chemical injury or endophytic fungal irritation. However, the mechanism of agarwood formation via chemical induction remains unclear. In this study, we characterized the transcriptome of different parts of a chemically induced A. sinensis trunk sample with agarwood. The Illumina sequencing platform was used to identify the genes involved in agarwood formation. A five-year-old Aquilaria sinensis treated by formic acid was selected. The white wood part (B1 sample), the transition part between agarwood and white wood (W2 sample), the agarwood part (J3 sample), and the rotten wood part (F5 sample) were collected for transcriptome sequencing. Accordingly, 54,685,634 clean reads, which were assembled into 83,467 unigenes, were obtained with a Q20 value of 97.5%. A total of 50,565 unigenes were annotated using the Nr, Nt, SWISS-PROT, KEGG, COG, and GO databases. In particular, 171,331,352 unigenes were annotated by various pathways, including the sesquiterpenoid (ko00909) and plant-pathogen interaction (ko03040) pathways. These pathways were related to sesquiterpenoid biosynthesis and defensive responses to chemical stimulation. The transcriptome data of the different parts of the chemically induced A. sinensis trunk provide a rich source of materials for discovering and identifying the genes involved in sesquiterpenoid production and in defensive responses to chemical stimulation. This study is the first to use de novo sequencing and transcriptome assembly for different parts of chemically induced A. sinensis. Results demonstrate that the sesquiterpenoid biosynthesis pathway and WRKY transcription factor play important roles in agarwood formation via chemical induction. The comparative analysis of the transcriptome data of agarwood and A. sinensis lays the foundation

  6. Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies.

    Science.gov (United States)

    Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

    2016-07-07

    The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and that considers both FST and difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.
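
    The two per-SNP quantities this record relies on, the allele frequency difference and FST between the two domains, can be illustrated with a minimal sketch. It assumes a biallelic SNP, two equally weighted populations, and the simple heterozygosity-based definition FST = (H_T - H_S) / H_T; the estimator actually used in the study, and any correction for pooled-sequencing sampling noise, are not given in the abstract. The frequencies and function names are hypothetical.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) for a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1: float, p2: float) -> float:
    """Heterozygosity-based FST for two equally weighted populations."""
    p_mean = (p1 + p2) / 2.0
    h_total = expected_heterozygosity(p_mean)
    if h_total == 0.0:
        return 0.0  # SNP is fixed for the same allele in both populations
    h_within = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies estimated from the two pools.
p_fennoscandian, p_alpine = 0.2, 0.8

print(abs(p_fennoscandian - p_alpine))                 # frequency difference
print(fst_two_populations(p_fennoscandian, p_alpine))  # approximately 0.36
```

    Looking at both numbers together is useful because a SNP at intermediate frequency and one near fixation can show the same frequency difference but different FST values.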

  7. Identifying and sequencing a Mycobacterium sp. strain F4 as a potential bioremediation agent for quinclorac.

    Science.gov (United States)

    Li, Yingying; Chen, Wu; Wang, Yunsheng; Luo, Kun; Li, Yue; Bai, Lianyang; Luo, Feng

    2017-01-01

    Quinclorac is a widely used herbicide in rice fields. Unfortunately, quinclorac residues are phytotoxic to many crops/vegetables. The degradation of quinclorac in nature is very slow. On the other hand, degradation of quinclorac using bacteria can be an effective and efficient method to reduce its contamination. In this study, we isolated a quinclorac bioremediation bacterium strain F4 from quinclorac contaminated soils. Based on morphological characteristics and 16S rRNA gene sequence analysis, we identified strain F4 as Mycobacterium sp. We investigated the effects of temperature, pH, inoculation size and initial quinclorac concentration on growth and degrading efficiency of F4 and determined the optimal quinclorac degrading condition of F4. Under optimal degrading conditions, F4 degraded 97.38% of quinclorac from an initial concentration of 50 mg/L in seven days. Our indoor pot experiment demonstrated that the degradation products were non-phytotoxic to tobacco. After analyzing the quinclorac degradation products of F4, we proposed that F4 could employ two pathways to degrade quinclorac: one is through methylation, the other is through dechlorination. Furthermore, we reconstructed the whole genome of F4 through single-molecule sequencing and de novo assembly. We identified 77 methyltransferases and eight dehalogenases in the F4 genome to support our hypothesized degradation path.

  8. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles

    Directory of Open Access Journals (Sweden)

    Yanara Marincevic-Zuniga

    2017-08-01

    Full Text Available Background: Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL). In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. Methods: We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. Results: We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Conclusion: Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  9. Epidemiological analysis of Salmonella clusters identified by whole genome sequencing, England and Wales 2014.

    Science.gov (United States)

    Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J

    2018-05-01

    The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following; eating out, contact with another case in the home and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
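
    Single-linkage clustering at a SNP threshold, as described in this record, groups two isolates together whenever a chain of pairwise links at or below the threshold connects them. The sketch below illustrates the idea with a union-find structure; the isolate names and pairwise SNP distances are hypothetical, and the function is an illustration rather than the pipeline used in the study.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def single_linkage_clusters(
    names: List[str],
    snp_distance: Dict[FrozenSet[str], int],
    threshold: int,
) -> List[List[str]]:
    """Group isolates linked by any chain of pairs within `threshold` SNPs."""
    parent = {name: name for name in names}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if snp_distance[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical pairwise SNP distances between four isolates.
isolates = ["A", "B", "C", "D"]
distances = {
    frozenset(("A", "B")): 3,
    frozenset(("A", "C")): 7,
    frozenset(("A", "D")): 50,
    frozenset(("B", "C")): 4,
    frozenset(("B", "D")): 48,
    frozenset(("C", "D")): 52,
}

print(single_linkage_clusters(isolates, distances, threshold=5))
print(single_linkage_clusters(isolates, distances, threshold=0))
```

    At a threshold of 5, isolates A and C fall in one cluster even though they differ by 7 SNPs, because each is within 5 SNPs of B; this chaining is the defining property of single linkage.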

  10. Identifying and sequencing a Mycobacterium sp. strain F4 as a potential bioremediation agent for quinclorac.

    Directory of Open Access Journals (Sweden)

    Yingying Li

    Full Text Available Quinclorac is a widely used herbicide in rice fields. Unfortunately, quinclorac residues are phytotoxic to many crops/vegetables. The degradation of quinclorac in nature is very slow. On the other hand, degradation of quinclorac using bacteria can be an effective and efficient method to reduce its contamination. In this study, we isolated a quinclorac bioremediation bacterium strain F4 from quinclorac contaminated soils. Based on morphological characteristics and 16S rRNA gene sequence analysis, we identified strain F4 as Mycobacterium sp. We investigated the effects of temperature, pH, inoculation size and initial quinclorac concentration on growth and degrading efficiency of F4 and determined the optimal quinclorac degrading condition of F4. Under optimal degrading conditions, F4 degraded 97.38% of quinclorac from an initial concentration of 50 mg/L in seven days. Our indoor pot experiment demonstrated that the degradation products were non-phytotoxic to tobacco. After analyzing the quinclorac degradation products of F4, we proposed that F4 could employ two pathways to degrade quinclorac: one is through methylation, the other is through dechlorination. Furthermore, we reconstructed the whole genome of F4 through single-molecule sequencing and de novo assembly. We identified 77 methyltransferases and eight dehalogenases in the F4 genome to support our hypothesized degradation path.

  11. microRNA expression profiling in fetal single ventricle malformation identified by deep sequencing.

    Science.gov (United States)

    Yu, Zhang-Bin; Han, Shu-Ping; Bai, Yun-Fei; Zhu, Chun; Pan, Ya; Guo, Xi-Rong

    2012-01-01

    microRNAs (miRNAs) have emerged as key regulators in many biological processes, particularly cardiac growth and development, although the specific miRNA expression profile associated with this process remains to be elucidated. This study aimed to characterize the cellular microRNA profile involved in the development of congenital heart malformation, through the investigation of single ventricle (SV) defects. Comprehensive miRNA profiling in human fetal SV cardiac tissue was performed by deep sequencing. Differential expression of 48 miRNAs was revealed by sequencing by oligonucleotide ligation and detection (SOLiD) analysis. Of these, 38 were down-regulated and 10 were up-regulated in differentiated SV cardiac tissue, compared to control cardiac tissue. This was confirmed by real-time quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analysis. Predicted target genes of the 48 differentially expressed miRNAs were analyzed by gene ontology and categorized according to cellular process, regulation of biological process and metabolic process. Pathway-Express analysis identified the WNT and mTOR signaling pathways as the most significant processes putatively affected by the differential expression of these miRNAs. The candidate genes involved in cardiac development were identified as potential targets for these differentially expressed microRNAs and the collaborative network of microRNAs and cardiac development related-mRNAs was constructed. These data provide the basis for future investigation of the mechanism of the occurrence and development of fetal SV malformations.

  12. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Science.gov (United States)

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  13. A new assay to identify recurrent mutations in acute myeloid leukemia using next-generation sequencing

    Directory of Open Access Journals (Sweden)

    Coriu Daniel

    2014-03-01

    Full Text Available Introduction: Acute myeloid leukemia (AML) is a heterogeneous disease characterized by onset at an advanced age, an aggressive phenotype, and an unfavorable prognosis, especially in the age group over 65 years. Classical cytogenetics is used together with molecular methods for identifying point mutations in order to stratify patients into risk groups. In this article we describe a new method for identifying mutations in 5 genes involved in AML: RUNX1, FLT3, DNMT3A, IDH1 and IDH2, using next-generation sequencing. Materials and methods: Samples from 40 AML patients with normal karyotype admitted to the Fundeni Clinical Institute were sequenced. Primer design was performed using the LaserGene Genomics suite. Next-generation sequencing was performed on the Illumina MiSeq platform. The results were analyzed using the LaserGene Genomics suite. The results obtained by next-generation sequencing were compared with Sanger sequencing. Results: No additional mutations were identified in the samples from nine patients positive for the FLT3-ITD and/or NPM1 mutations. In the samples from 25 of 31 patients with normal karyotype and without FLT3-ITD and NPM1 mutations, mutations were identified in one of the 5 genes studied. All of these mutations, identified by next-generation sequencing, were confirmed by classical Sanger sequencing. Conclusions: In this study we validated a method for identifying mutations occurring in AML patients using next-generation sequencing. This method has a number of advantages: it is cheaper than Sanger sequencing, it has increased sensitivity for detecting mutations, it has been described as quantitative, and in our case it allowed the stratification into risk groups of most patients with normal karyotype and without muta

  14. Exome sequencing identifies three novel candidate genes implicated in intellectual disability.

    Directory of Open Access Journals (Sweden)

    Zehra Agha

    Full Text Available Intellectual disability (ID) is a major health problem mostly with an unknown etiology. Recently exome sequencing of individuals with ID identified novel genes implicated in the disease. Therefore the purpose of the present study was to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. Whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K)-specific methyltransferase 2B (KMT2B) and zinc finger protein 589 (ZNF589), as well as hedgehog acyltransferase (HHAT) with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that large number of ID genes converge on limited number of common networks i.e. ZNF589 belongs to KRAB-domain zinc-finger proteins previously implicated in ID, HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID, KMT2B associated with syndromic ID fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID.

  15. Integrated sequence analysis pipeline provides one-stop solution for identifying disease-causing mutations.

    Science.gov (United States)

    Hu, Hao; Wienker, Thomas F; Musante, Luciana; Kalscheuer, Vera M; Kahrizi, Kimia; Najmabadi, Hossein; Ropers, H Hilger

    2014-12-01

    Next-generation sequencing has greatly accelerated the search for disease-causing defects, but even for experts the data analysis can be a major challenge. To facilitate the data processing in a clinical setting, we have developed a novel medical resequencing analysis pipeline (MERAP). MERAP assesses the quality of sequencing, and has optimized capacity for calling variants, including single-nucleotide variants, insertions and deletions, copy-number variation, and other structural variants. MERAP identifies polymorphic and known causal variants by filtering against public domain databases, and flags nonsynonymous and splice-site changes. MERAP uses a logistic model to estimate the causal likelihood of a given missense variant. MERAP considers the relevant information such as phenotype and interaction with known disease-causing genes. MERAP compares favorably with GATK, one of the widely used tools, because of its higher sensitivity for detecting indels, its easy installation, and its economical use of computational resources. Upon testing more than 1,200 individuals with mutations in known and novel disease genes, MERAP proved highly reliable, as illustrated here for five families with disease-causing variants. We believe that the clinical implementation of MERAP will expedite the diagnostic process of many disease-causing defects. © 2014 WILEY PERIODICALS, INC.

  16. Whole-exome sequencing identifies novel candidate predisposition genes for familial polycythemia vera.

    Science.gov (United States)

    Hirvonen, Elina A M; Pitkänen, Esa; Hemminki, Kari; Aaltonen, Lauri A; Kilpivaara, Outi

    2017-04-20

    Polycythemia vera (PV), characterized by massive production of erythrocytes, is one of the myeloproliferative neoplasms. Most patients carry a somatic gain-of-function mutation in JAK2, c.1849G > T (p.Val617Phe), leading to constitutive activation of JAK-STAT signaling pathway. Familial clustering is also observed occasionally, but high-penetrance predisposition genes to PV have remained unidentified. We studied the predisposition to PV by exome sequencing (three cases) in a Finnish PV family with four patients. The 12 shared variants (maximum allowed minor allele frequency  G (p.Phe418Leu) in ZXDC, c.1931C > G (p.Pro644Arg) in ATN1, and c.701G > A (p.Arg234Gln) in LRRC3. We also observed a rare, predicted benign germline variant c.2912C > G (p.Ala971Gly) in BCORL1 in all four patients. Somatic mutations in BCORL1 have been reported in myeloid malignancies. We further screened the variants in eight PV patients in six other Finnish families, but no other carriers were found. Exome sequencing provides a powerful tool for the identification of novel variants, and understanding the familial predisposition of diseases. This is the first report on Finnish familial PV cases, and we identified three novel candidate variants that may predispose to the disease.

  17. Targeted exome sequencing identified novel USH2A mutations in Usher syndrome families.

    Directory of Open Access Journals (Sweden)

    Xiu-Feng Huang

    Full Text Available Usher syndrome (USH) is a leading cause of deaf-blindness and is inherited as an autosomal recessive trait. Phenotypic and genetic heterogeneity in USH makes molecular diagnosis difficult. This is a pilot study aiming to develop an approach based on next-generation sequencing to determine the genetic defects in patients with USH or allied diseases precisely and effectively. Eight affected patients and twelve unaffected relatives from five unrelated Chinese USH families, including 2 pseudo-dominant ones, were recruited. A total of 144 known genes of inherited retinal diseases were selected for deep exome resequencing. Through systematic data analysis using an established bioinformatics pipeline and segregation analysis, a number of genetic variants were revealed. Eleven mutations in the USH2A gene, eight of them novel, were identified. Biparental mutations in USH2A were revealed in 2 families with pseudo-dominant inheritance. One proband was found to have three mutations, two of which were presumed to lie on the same chromosome. In conclusion, this study revealed the genetic defects in the USH2A gene and demonstrated the robustness of targeted exome sequencing for precisely and rapidly determining genetic defects. The methodology provides a reliable strategy for routine gene diagnosis of USH.

  18. LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone

    KAUST Repository

    Chen, Peng

    2014-12-03

    Background: Protein-ligand binding is important for some proteins to perform their functions. Protein-ligand binding sites are the residues of proteins that physically bind to ligands. Despite recent advances in computational prediction of protein-ligand binding sites, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. Results: In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. We propose a combination technique to reduce the effects of different sliding residue windows in the process of encoding input feature vectors. Moreover, because the samples are highly imbalanced between ligand-binding sites and non-ligand-binding sites, we construct several balanced data sets, for each of which a random forest (RF)-based classifier is trained. The ensemble of these RF classifiers forms a sequence-based protein-ligand binding site predictor. Conclusions: Experimental results on CASP9 and CASP8 data sets demonstrate that our method compares favorably with the state-of-the-art protein-ligand binding site prediction methods.

  19. WHITE-DWARF-MAIN-SEQUENCE BINARIES IDENTIFIED FROM THE LAMOST PILOT SURVEY

    International Nuclear Information System (INIS)

    Ren Juanjuan; Luo Ali; Li Yinbi; Wei Peng; Zhao Jingkun; Zhao Yongheng; Song Yihan; Zhao Gang

    2013-01-01

    We present a set of white-dwarf-main-sequence (WDMS) binaries identified spectroscopically from the Large sky Area Multi-Object fiber Spectroscopic Telescope (LAMOST, also called the Guo Shou Jing Telescope) pilot survey. We develop color selection criteria based on what is so far the largest and most complete Sloan Digital Sky Survey (SDSS) DR7 WDMS binary catalog and identify 28 WDMS binaries within the LAMOST pilot survey. The primaries in our binary sample are mostly DA white dwarfs except for one DB white dwarf. We derive the stellar atmospheric parameters, masses, and radii for the two components of 10 of our binaries. We also provide cooling ages for the white dwarf primaries as well as the spectral types for the companion stars of these 10 WDMS binaries. These binaries tend to contain hot white dwarfs and early-type companions. Through cross-identification, we note that nine binaries in our sample have been published in the SDSS DR7 WDMS binary catalog. Nineteen spectroscopic WDMS binaries identified by the LAMOST pilot survey are new. Using the 3σ radial velocity variation as a criterion, we find two post-common-envelope binary candidates from our WDMS binary sample.

  20. Microfluidic screening and whole-genome sequencing identifies mutations associated with improved protein secretion by yeast

    DEFF Research Database (Denmark)

    Huang, Mingtao; Bai, Yunpeng; Sjostrom, Staffan L.

    2015-01-01

    There is an increasing demand for biotech-based production of recombinant proteins for use as pharmaceuticals in the food and feed industry and in industrial applications. The yeast Saccharomyces cerevisiae is among the preferred cell factories for recombinant protein production, and there is increasing...... interest in improving its protein secretion capacity. Due to the complexity of the secretory machinery in eukaryotic cells, it is difficult to apply rational engineering for construction of improved strains. Here we used high-throughput microfluidics for the screening of yeast libraries, generated by UV...... mutagenesis. Several screening and sorting rounds resulted in the selection of eight yeast clones with significantly improved secretion of recombinant α-amylase. Efficient secretion was genetically stable in the selected clones. We performed whole-genome sequencing of the eight clones and identified 330...

  1. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    Science.gov (United States)

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
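The third step described above, finding sequences that carry simple sequence repeats matching user-specified parameters, can be illustrated with a short Python sketch. This is not SSR_pipeline's own code or algorithm: the function name `find_microsatellites`, its parameters, and the regular-expression approach are assumptions made for illustration, and the sketch only handles perfect (uninterrupted) simple repeats, not compound ones.

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper for illustration only; not part of SSR_pipeline.
    """
    # A lazily matched motif of min_motif..max_motif bases, followed by at
    # least (min_repeats - 1) further exact copies of that same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as "AAAAAAAA", which the pattern would
        # otherwise report as a dinucleotide repeat with motif "AA".
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


print(find_microsatellites("TTGACACACACACAGGT"))
```

Because the motif is matched lazily, the shortest repeating unit is reported (for example "AC" rather than "ACAC"). A production tool would also need to handle compound repeats and read quality, which this sketch ignores.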

  2. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    Science.gov (United States)

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  3. Third-Generation Sequencing and Analysis of Four Complete Pig Liver Esterase Gene Sequences in Clones Identified by Screening BAC Library.

    Science.gov (United States)

    Zhou, Qiongqiong; Sun, Wenjuan; Liu, Xiyan; Wang, Xiliang; Xiao, Yuncai; Bi, Dingren; Yin, Jingdong; Shi, Deshi

    2016-01-01

    Pig liver carboxylesterase (PLE) gene sequences in GenBank are incomplete, which has led to difficulties in studying the genetic structure and regulation mechanisms of gene expression of PLE family genes. The aim of this study was to obtain and analyse the complete gene sequences of the PLE family by screening a Rongchang pig BAC library and by third-generation PacBio gene sequencing. After a number of existing incomplete PLE isoform gene sequences were analysed, primers were designed based on conserved regions in PLE exons, and the whole pig genome was used as a template for polymerase chain reaction (PCR) amplification. Specific primers were then selected based on the PCR amplification results. A three-step PCR screening method was used to identify PLE-positive clones by screening a Rongchang pig BAC library, and PacBio third-generation sequencing was performed. BLAST comparisons and other bioinformatics methods were applied for sequence analysis. Five PLE-positive BAC clones, designated BAC-10, BAC-70, BAC-75, BAC-119 and BAC-206, were identified. Sequence analysis yielded the complete sequences of four PLE genes, PLE1, PLE-B9, PLE-C4, and PLE-G2. Complete PLE gene sequences were defined as those containing regulatory sequences, exons, and introns. Not only did the PLE exon sequences of the four genes show a high degree of homology, but the intron sequences were also highly similar. Additionally, the regulatory region of the genes contained two 720-bp reverse-complement sequences that may have an important function in the regulation of PLE gene expression. This is the first report to confirm the complete sequences of four PLE genes. In addition, the study demonstrates that each PLE isoform is encoded by a single gene and that the various genes exhibit a high degree of sequence homology, suggesting that the PLE family evolved from a single ancestral gene. Obtaining the complete sequences of these PLE genes provides the necessary foundation for

  4. Application of small RNA sequencing to identify microRNAs in acute kidney injury and fibrosis

    Energy Technology Data Exchange (ETDEWEB)

    Pellegrini, Kathryn L. [Department of Medicine, Renal Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (United States); Gerlach, Cory V. [Department of Medicine, Renal Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (United States); Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA (United States); Laboratory of Systems Pharmacology, Harvard Program in Therapeutic Sciences, Harvard Medical School, Boston, MA (United States); Craciun, Florin L.; Ramachandran, Krithika [Department of Medicine, Renal Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (United States); Bijol, Vanesa [Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (United States); Kissick, Haydn T. [Department of Surgery, Urology Division, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA (United States); Vaidya, Vishal S., E-mail: vvaidya@bwh.harvard.edu [Department of Medicine, Renal Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA (United States); Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA (United States); Laboratory of Systems Pharmacology, Harvard Program in Therapeutic Sciences, Harvard Medical School, Boston, MA (United States)

    2016-12-01

    Establishing a microRNA (miRNA) expression profile in affected tissues provides an important foundation for the discovery of miRNAs involved in the development or progression of pathologic conditions. We conducted small RNA sequencing to generate a temporal profile of miRNA expression in the kidneys using a mouse model of folic acid-induced (250 mg/kg i.p.) kidney injury and fibrosis. From the 103 miRNAs that were differentially expressed over the time course (> 2-fold, p < 0.05), we chose to further investigate miR-18a-5p, which is expressed during the acute stage of the injury; miR-132-3p, which is upregulated during transition between acute and fibrotic injury; and miR-146b-5p, which is highly expressed at the peak of fibrosis. Using qRT-PCR, we confirmed the increased expression of these candidate miRNAs in the folic acid model as well as in other established mouse models of acute injury (ischemia/reperfusion injury) and fibrosis (unilateral ureteral obstruction). In situ hybridization confirmed high expression of miR-18a-5p, miR-132-3p and miR-146b-5p throughout the kidney cortex in mice and humans with severe kidney injury or fibrosis. When primary human proximal tubular epithelial cells were treated with model nephrotoxicants such as cadmium chloride (CdCl2), arsenic trioxide, aristolochic acid (AA), potassium dichromate (K2Cr2O7) and cisplatin, miRNA-132-3p was upregulated 4.3-fold after AA treatment and 1.5-fold after K2Cr2O7 and CdCl2 treatment. These results demonstrate the application of temporal small RNA sequencing to identify miR-18a, miR-132 and miR-146b as differentially expressed miRNAs during distinct phases of kidney injury and fibrosis progression. - Highlights: • We used small RNA sequencing to identify differentially expressed miRNAs in kidney. • Distinct patterns were found for acute injury and fibrotic stages in the kidney. • Upregulation of miR-18a, -132 and -146b was confirmed in mice

  5. Molecular profiling of appendiceal epithelial tumors using massively parallel sequencing to identify somatic mutations.

    Science.gov (United States)

    Liu, Xiaoying; Mody, Kabir; de Abreu, Francine B; Pipas, J Marc; Peterson, Jason D; Gallagher, Torrey L; Suriawinata, Arief A; Ripple, Gregory H; Hourdequin, Kathryn C; Smith, Kerrington D; Barth, Richard J; Colacchio, Thomas A; Tsapakos, Michael J; Zaki, Bassem I; Gardner, Timothy B; Gordon, Stuart R; Amos, Christopher I; Wells, Wendy A; Tsongalis, Gregory J

    2014-07-01

    Some epithelial neoplasms of the appendix, including low-grade appendiceal mucinous neoplasm and adenocarcinoma, can result in pseudomyxoma peritonei (PMP). Little is known about the mutational spectra of these tumor types and whether mutations may be of clinical significance with respect to therapeutic selection. In this study, we identified somatic mutations using the Ion Torrent AmpliSeq Cancer Hotspot Panel v2. Specimens consisted of 3 nonneoplastic retention cysts/mucocele, 15 low-grade mucinous neoplasms (LAMNs), 8 low-grade/well-differentiated mucinous adenocarcinomas with pseudomyxoma peritonei, and 12 adenocarcinomas with/without goblet cell/signet ring cell features. Barcoded libraries were prepared from up to 10 ng of extracted DNA and multiplexed on single 318 chips for sequencing. Data analysis was performed using Golden Helix SVS. Variants that remained after the analysis pipeline were individually interrogated using the Integrative Genomics Viewer. A single Janus kinase 3 (JAK3) mutation was detected in the mucocele group. Eight mutations were identified in the V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) and GNAS complex locus (GNAS) genes among LAMN samples. Additional gene mutations were identified in the AKT1 (v-akt murine thymoma viral oncogene homolog 1), APC (adenomatous polyposis coli), JAK3, MET (met proto-oncogene), phosphatidylinositol-4,5-bisphosphate 3-kinase (PIK3CA), RB1 (retinoblastoma 1), STK11 (serine/threonine kinase 11), and tumor protein p53 (TP53) genes. Among the PMPs, 6 mutations were detected in the KRAS gene and also in the GNAS, TP53, and RB1 genes. Appendiceal cancers showed mutations in the APC, ATM (ataxia telangiectasia mutated), KRAS, IDH1 [isocitrate dehydrogenase 1 (NADP+)], NRAS [neuroblastoma RAS viral (v-ras) oncogene homolog], PIK3CA, SMAD4 (SMAD family member 4), and TP53 genes. Our results suggest molecular heterogeneity among epithelial tumors of the appendix. 
Next generation sequencing efforts

  6. Extended exome sequencing identifies BACH2 as a novel major risk locus for Addison's disease.

    Science.gov (United States)

    Eriksson, D; Bianchi, M; Landegren, N; Nordin, J; Dalin, F; Mathioudaki, A; Eriksson, G N; Hultin-Rosenberg, L; Dahlqvist, J; Zetterqvist, H; Karlsson, Å; Hallgren, Å; Farias, F H G; Murén, E; Ahlgren, K M; Lobell, A; Andersson, G; Tandre, K; Dahlqvist, S R; Söderkvist, P; Rönnblom, L; Hulting, A-L; Wahlberg, J; Ekwall, O; Dahlqvist, P; Meadows, J R S; Bensing, S; Lindblad-Toh, K; Kämpe, O; Pielberg, G R

    2016-12-01

    Autoimmune disease is one of the leading causes of morbidity and mortality worldwide. In Addison's disease, the adrenal glands are targeted by destructive autoimmunity. Despite being the most common cause of primary adrenal failure, little is known about its aetiology. To understand the genetic background of Addison's disease, we utilized the extensively characterized patients of the Swedish Addison Registry. We developed an extended exome capture array comprising a selected set of 1853 genes and their potential regulatory elements, for the purpose of sequencing 479 patients with Addison's disease and 1394 controls. We identified BACH2 (rs62408233-A, OR = 2.01 (1.71-2.37), P = 1.66 × 10^-15, MAF 0.46/0.29 in cases/controls) as a novel gene associated with Addison's disease development. We also confirmed the previously known associations with the HLA complex. Whilst BACH2 has been previously reported to associate with organ-specific autoimmune diseases co-inherited with Addison's disease, we have identified BACH2 as a major risk locus in Addison's disease, independent of concomitant autoimmune diseases. Our results may enable future research towards preventive disease treatment. © 2016 The Authors. Journal of Internal Medicine published by John Wiley & Sons Ltd on behalf of Association for Publication of The Journal of Internal Medicine.

  7. Exome sequencing in 53 sporadic cases of schizophrenia identifies 18 putative candidate genes.

    Directory of Open Access Journals (Sweden)

    Michel Guipponi

    Full Text Available Schizophrenia (SCZ) is a severe, debilitating mental illness which has a significant genetic component. The identification of genetic factors related to SCZ has been challenging and these factors remain largely unknown. To evaluate the contribution of de novo variants (DNVs) to SCZ, we sequenced the exomes of 53 individuals with sporadic SCZ and of their non-affected parents. We identified 49 DNVs, 18 of which were predicted to alter gene function, including 13 damaging missense mutations, 2 conserved splice site mutations, 2 nonsense mutations, and 1 frameshift deletion. The average number of exonic DNVs per proband was 0.88, which corresponds to an exonic point mutation rate of 1.7×10^-8 per nucleotide per generation. The non-synonymous-to-synonymous mutation ratio of 2.06 did not differ from neutral expectations. Overall, this study provides a list of 18 putative candidate genes for sporadic SCZ, and when combined with the results of similar reports, identifies a second proband carrying a non-synonymous DNV in the RGS12 gene.
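The link between the two figures quoted in this abstract (0.88 exonic de novo variants per proband and a rate of 1.7×10^-8 per nucleotide per generation) can be checked with a little arithmetic. The sketch below assumes the commonly used relation rate = DNVs per proband / (2 × callable haploid bases); the abstract does not state the formula or the number of callable bases the authors used, so both the relation and the function name are assumptions for illustration.

```python
def implied_callable_bases(dnv_per_proband, rate_per_site_per_generation):
    """Haploid callable bases implied by a DNV count and a per-site rate.

    Assumes rate = dnv_per_proband / (2 * callable_bases), where the factor
    of 2 accounts for the two parental genome copies in a diploid proband.
    """
    return dnv_per_proband / (2 * rate_per_site_per_generation)


# Values taken from the abstract above.
bases = implied_callable_bases(0.88, 1.7e-8)
print(f"Implied callable exome: {bases / 1e6:.1f} Mb")
```

Under that assumption the two numbers imply roughly 26 Mb of callable exonic sequence per haploid genome, which is the order of magnitude expected for an exome capture target.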

  8. Next-generation sequencing identifies transportin 3 as the causative gene for LGMD1F.

    Directory of Open Access Journals (Sweden)

    Annalaura Torella

    Full Text Available Limb-girdle muscular dystrophies (LGMD) are genetically and clinically heterogeneous conditions. We investigated a large family with an autosomal dominant transmission pattern, previously classified as LGMD1F and mapped to chromosome 7q32. Affected members are characterized by muscle weakness that first affects the pelvic girdle and the iliopsoas muscles. We sequenced the whole exome of four family members and identified a shared heterozygous frame-shift variant in the Transportin 3 (TNPO3) gene, encoding a member of the importin-β super-family. The TNPO3 gene maps within the LGMD1F critical interval and its 923-amino acid human gene product is also expressed in skeletal muscle. In addition, we identified an isolated case of LGMD with a new missense mutation in the same gene. We localized the mutant TNPO3 around the nucleus, but not inside. The involvement of a gene related to nuclear transport suggests a novel disease mechanism leading to muscular dystrophy.

  9. Screening of whole genome sequences identified high-impact variants for stallion fertility.

    Science.gov (United States)

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-04-14

    g.37455302G>A in NOTCH1 with the de-regressed estimated breeding values of the paternal component of the pregnancy rate per estrus (EBV-PAT). For 9 high-impact variants within the genes CFTR, OVGP1, FBXO43, TSSK6, PKD1, FOXP1, TCP11, SPATA31E1 and NOTCH1 (g.37453246G>C) absence of the homozygous mutant genotype in the validation sample of all 337 fertile stallions was obvious. Therefore, these variants were considered as potentially deleterious factors for stallion fertility. In conclusion, this study revealed 17 genetic variants with a predicted high damaging effect on protein structure and missing homozygous mutant genotype. The g.37455302G>A NOTCH1 variant was identified as a significant stallion fertility locus in Hanoverian stallions and further 9 candidate fertility loci with missing homozygous mutant genotypes were validated in a panel including 19 horse breeds. To our knowledge this is the first study in horses using next generation sequencing data to uncover strong candidate factors for stallion fertility.

  10. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Science.gov (United States)

    Bonnefond, Amélie; Philippe, Julien; Durand, Emmanuelle; Dechaume, Aurélie; Huyvaert, Marlène; Montagne, Louise; Marre, Michel; Balkau, Beverley; Fajardy, Isabelle; Vambergue, Anne; Vatin, Vincent; Delplanque, Jérôme; Le Guilcher, David; De Graeve, Franck; Lecoeur, Cécile; Sand, Olivier; Vaxillaire, Martine; Froguel, Philippe

    2012-01-01

    Maturity-onset diabetes of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY. WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene ('MODY13'), confirming the wide spectrum of diabetes-related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  11. Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.

    Directory of Open Access Journals (Sweden)

    Amélie Bonnefond

    Full Text Available BACKGROUND: Maturity-onset diabetes of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY. METHODOLOGY: WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. PRINCIPAL FINDINGS: By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. CONCLUSIONS/SIGNIFICANCE: Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene ('MODY13'), confirming the wide spectrum of diabetes-related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.

  12. Diagnostic SNPs for inferring population structure in American mink (Neovison vison) identified through RAD sequencing

    DEFF Research Database (Denmark)

    2015-01-01

    Data from: "Diagnostic SNPs for inferring population structure in American mink (Neovison vison) identified through RAD sequencing" in Genomic Resources Notes accepted 1 October 2014 to 30 November 2014....

  13. The Ebola virus VP35 protein binds viral immunostimulatory and host RNAs identified through deep sequencing.

    Directory of Open Access Journals (Sweden)

    Kari A Dilley

    Full Text Available Ebola virus and Marburg virus are members of the Filoviridae family and causative agents of hemorrhagic fever with high fatality rates in humans. Filovirus virulence is partially attributed to the VP35 protein, a well-characterized inhibitor of the RIG-I-like receptor pathway that triggers the antiviral interferon (IFN) response. Prior work demonstrates the ability of VP35 to block potent RIG-I activators, such as Sendai virus (SeV), and this IFN-antagonist activity is directly correlated with its ability to bind RNA. Several structural studies demonstrate that VP35 binds short synthetic dsRNAs; yet, there are no data that identify viral immunostimulatory RNAs (isRNA) or host RNAs bound to VP35 in cells. Utilizing a SeV infection model, we demonstrate that both viral isRNA and host RNAs are bound to Ebola and Marburg VP35s in cells. By deep sequencing the purified VP35-bound RNA, we identified the SeV copy-back defective interfering (DI) RNA, previously identified as a robust RIG-I activator, as the isRNA bound by multiple filovirus VP35 proteins, including the VP35 protein from the West African outbreak strain (Makona EBOV). Moreover, RNAs isolated from a VP35 RNA-binding mutant were not immunostimulatory and did not include the SeV DI RNA. Strikingly, an analysis of host RNAs bound by wild-type, but not mutant, VP35 revealed that select host RNAs are preferentially bound by VP35 in cell culture. Taken together, these data support a model in which VP35 sequesters isRNA in virus-infected cells to avert RIG-I like receptor (RLR) activation.

  14. The Ebola virus VP35 protein binds viral immunostimulatory and host RNAs identified through deep sequencing.

    Science.gov (United States)

    Dilley, Kari A; Voorhies, Alexander A; Luthra, Priya; Puri, Vinita; Stockwell, Timothy B; Lorenzi, Hernan; Basler, Christopher F; Shabman, Reed S

    2017-01-01

    Ebola virus and Marburg virus are members of the Filoviridae family and causative agents of hemorrhagic fever with high fatality rates in humans. Filovirus virulence is partially attributed to the VP35 protein, a well-characterized inhibitor of the RIG-I-like receptor pathway that triggers the antiviral interferon (IFN) response. Prior work demonstrates the ability of VP35 to block potent RIG-I activators, such as Sendai virus (SeV), and this IFN-antagonist activity is directly correlated with its ability to bind RNA. Several structural studies demonstrate that VP35 binds short synthetic dsRNAs; yet, there are no data that identify viral immunostimulatory RNAs (isRNA) or host RNAs bound to VP35 in cells. Utilizing a SeV infection model, we demonstrate that both viral isRNA and host RNAs are bound to Ebola and Marburg VP35s in cells. By deep sequencing the purified VP35-bound RNA, we identified the SeV copy-back defective interfering (DI) RNA, previously identified as a robust RIG-I activator, as the isRNA bound by multiple filovirus VP35 proteins, including the VP35 protein from the West African outbreak strain (Makona EBOV). Moreover, RNAs isolated from a VP35 RNA-binding mutant were not immunostimulatory and did not include the SeV DI RNA. Strikingly, an analysis of host RNAs bound by wild-type, but not mutant, VP35 revealed that select host RNAs are preferentially bound by VP35 in cell culture. Taken together, these data support a model in which VP35 sequesters isRNA in virus-infected cells to avert RIG-I like receptor (RLR) activation.

  15. Identifying Genetic Differences Between Dongxiang Blue-Shelled and White Leghorn Chickens Using Sequencing Data

    Directory of Open Access Journals (Sweden)

    Qing-bo Zhao

    2018-02-01

    Full Text Available The Dongxiang Blue-shelled chicken is one of the most valuable Chinese indigenous poultry breeds. However, compared to the Italian native White Leghorn, although this Chinese breed possesses numerous favorable characteristics, it also exhibits lower growth performance and fertility. Here, we utilized genotyping sequencing data obtained via genome reduction on a sequencing platform to detect 100,114 single nucleotide polymorphisms and perform further biological analysis and functional annotation. We employed cross-population extended haplotype homozygosity, eigenvector decomposition combined with genome-wide association studies (EigenGWAS), and efficient mixed-model association expedited methods to detect areas of the genome that are potential selected regions (PSR) in both chicken breeds, and performed gene ontology (GO) enrichment and quantitative trait loci (QTL) analyses annotating using the Kyoto Encyclopedia of Genes and Genomes. The results of this study revealed a total of 2424 outlier loci (p-value <0.01), of which 2144 occur in the White Leghorn breed and 280 occur in the Dongxiang Blue-shelled chicken. These correspond to 327 and 94 PSRs containing 297 and 54 genes, respectively. The most significantly selected genes in Blue-shelled chicken are TMEM141 and CLIC3, while the SLCO1B3 gene, related to eggshell color, was identified via EigenGWAS. We show that the White Leghorn genes JARID2, RBMS3, GPC3, TRIB2, ROBO1, SAMSN1, OSBP2, and IGFALS are involved in immunity, reproduction, and growth, and thus might represent footprints of the selection process. In contrast, we identified six significantly enriched pathways in the Dongxiang Blue-shelled chicken that are related to amino acid and lipid metabolism as well as signal transduction. Our results also reveal the presence of a GO term associated with cell metabolism that occurs mainly in the White Leghorn breed, while the most significant QTL regions mapped to the Chicken QTL Database (GG_4

  16. Genome-wide analysis of regulatory proteases sequences identified through bioinformatics data mining in Taenia solium.

    Science.gov (United States)

    Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng

    2014-06-04

    Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases

  17. Pooled-DNA sequencing identifies genomic regions of selection in Nigerian isolates of Plasmodium falciparum.

    Science.gov (United States)

    Oyebola, Kolapo M; Idowu, Emmanuel T; Olukosi, Yetunde A; Awolola, Taiwo S; Amambua-Ngwa, Alfred

    2017-06-29

    The burden of falciparum malaria is especially high in sub-Saharan Africa. Differences in pressure from host immunity and antimalarial drugs lead to adaptive changes responsible for high level of genetic variations within and between the parasite populations. Population-specific genetic studies to survey for genes under positive or balancing selection resulting from drug pressure or host immunity will allow for refinement of interventions. We performed a pooled sequencing (pool-seq) of the genomes of 100 Plasmodium falciparum isolates from Nigeria. We explored allele-frequency based neutrality test (Tajima's D) and integrated haplotype score (iHS) to identify genes under selection. Fourteen shared iHS regions that had at least 2 SNPs with a score > 2.5 were identified. These regions code for genes that were likely to have been under strong directional selection. Two of these genes were the chloroquine resistance transporter (CRT) on chromosome 7 and the multidrug resistance 1 (MDR1) on chromosome 5. There was a weak signature of selection in the dihydrofolate reductase (DHFR) gene on chromosome 4 and MDR5 genes on chromosome 13, with only 2 and 3 SNPs respectively identified within the iHS window. We observed strong selection pressure attributable to continued chloroquine and sulfadoxine-pyrimethamine use despite their official proscription for the treatment of uncomplicated malaria. There was also a major selective sweep on chromosome 6 which had 32 SNPs within the shared iHS region. Tajima's D of circumsporozoite protein (CSP), erythrocyte-binding antigen (EBA-175), merozoite surface proteins - MSP3 and MSP7, merozoite surface protein duffy binding-like (MSPDBL2) and serine repeat antigen (SERA-5) were 1.38, 1.29, 0.73, 0.84 and 0.21, respectively. We have demonstrated the use of pool-seq to understand genomic patterns of selection and variability in P. falciparum from Nigeria, which bears the highest burden of infections. This investigation identified known

  18. Whole exome sequencing identifies mutations in Usher syndrome genes in profoundly deaf Tunisian patients.

    Science.gov (United States)

    Riahi, Zied; Bonnet, Crystel; Zainine, Rim; Lahbib, Saida; Bouyacoub, Yosra; Bechraoui, Rym; Marrakchi, Jihène; Hardelin, Jean-Pierre; Louha, Malek; Largueche, Leila; Ben Yahia, Salim; Kheirallah, Moncef; Elmatri, Leila; Besbes, Ghazi; Abdelhak, Sonia; Petit, Christine

    2015-01-01

    Usher syndrome (USH) is an autosomal recessive disorder characterized by combined deafness-blindness. It accounts for about 50% of all hereditary deafness blindness cases. Three clinical subtypes (USH1, USH2, and USH3) are described, of which USH1 is the most severe form, characterized by congenital profound deafness, constant vestibular dysfunction, and a prepubertal onset of retinitis pigmentosa. We performed whole exome sequencing in four unrelated Tunisian patients affected by apparently isolated, congenital profound deafness, with reportedly normal ocular fundus examination. Four biallelic mutations were identified in two USH1 genes: a splice acceptor site mutation, c.2283-1G>T, and a novel missense mutation, c.5434G>A (p.Glu1812Lys), in MYO7A, and two previously unreported mutations in USH1G, i.e. a frameshift mutation, c.1195_1196delAG (p.Leu399Alafs*24), and a nonsense mutation, c.52A>T (p.Lys18*). Another ophthalmological examination including optical coherence tomography actually showed the presence of retinitis pigmentosa in all the patients. Our findings provide evidence that USH is under-diagnosed in Tunisian deaf patients. Yet, early diagnosis of USH is of utmost importance because these patients should undergo cochlear implant surgery in early childhood, in anticipation of the visual loss.

  19. Exome sequencing identifies a novel SMCHD1 mutation in facioscapulohumeral muscular dystrophy 2.

    Science.gov (United States)

    Mitsuhashi, Satomi; Boyden, Steven E; Estrella, Elicia A; Jones, Takako I; Rahimov, Fedik; Yu, Timothy W; Darras, Basil T; Amato, Anthony A; Folkerth, Rebecca D; Jones, Peter L; Kunkel, Louis M; Kang, Peter B

    2013-12-01

    FSHD2 is a rare form of facioscapulohumeral muscular dystrophy (FSHD) characterized by the absence of a contraction in the D4Z4 macrosatellite repeat region on chromosome 4q35 that is the hallmark of FSHD1. However, hypomethylation of this region is common to both subtypes. Recently, mutations in SMCHD1 combined with a permissive 4q35 allele were reported to cause FSHD2. We identified a novel p.Lys275del SMCHD1 mutation in a family affected with FSHD2 using whole-exome sequencing and linkage analysis. This mutation alters a highly conserved amino acid in the ATPase domain of SMCHD1. Subject III-11 is a male who developed asymmetrical muscle weakness characteristic of FSHD at 13 years. Physical examination revealed marked bilateral atrophy at biceps brachii, bilateral scapular winging, some asymmetrical weakness at tibialis anterior and peroneal muscles, and mild lower facial weakness. Biopsy of biceps brachii in subject II-5, the father of III-11, demonstrated lobulated fibers and dystrophic changes. Endomysial and perivascular inflammation was found, which has been reported in FSHD1 but not FSHD2. Given the previous report of SMCHD1 mutations in FSHD2 and the clinical presentations consistent with the FSHD phenotype, we conclude that the SMCHD1 mutation is the likely cause of the disease in this family. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    Science.gov (United States)

    Yuen, Ryan KC; Merico, Daniele; Bookman, Matt; Howe, Jennifer L; Thiruvahindrapuram, Bhooma; Patel, Rohan V; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A; Walker, Susan; Marshall, Christian R; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D’Abate, Lia; Chan, Ada JS; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R; Nalpathamkalam, Thomas; Sung, Wilson WL; Tsoi, Fiona J; Wei, John; Xu, Lizhen; Tasse, Anne-Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie MacKinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A; Parr, Jeremy R; Spence, Sarah J; Vorstman, Jacob; Frey, Brendan J; Robinson, James T; Strug, Lisa J; Fernandez, Bridget A; Elsabbagh, Mayada; Carter, Melissa T; Hallmayer, Joachim; Knoppers, Bartha M; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H; Glazer, David; Pletcher, Mathew T; Scherer, Stephen W

    2017-01-01

    We are performing whole genome sequencing (WGS) of families with Autism Spectrum Disorder (ASD) to build a resource, named MSSNG, to enable the sub-categorization of phenotypes and underlying genetic factors involved. Here, we report WGS of 5,205 samples from families with ASD, accompanied by clinical information, creating a database accessible in a cloud platform, and through an internet portal with controlled access. We found an average of 73.8 de novo single nucleotide variants and 12.6 de novo insertion/deletions (indels) or copy number variations (CNVs) per ASD subject. We identified 18 new candidate ASD-risk genes such as MED13 and PHF3, and found that participants bearing mutations in susceptibility genes had significantly lower adaptive ability (p=6×10−4). In 294/2,620 (11.2%) of ASD cases, a molecular basis could be determined and 7.2% of these carried CNV/chromosomal abnormalities, emphasizing the importance of detecting all forms of genetic variation as diagnostic and therapeutic targets in ASD. PMID:28263302

  1. Antimicrobial susceptibility among clinical Nocardia species identified by multilocus sequence analysis.

    Science.gov (United States)

    McTaggart, Lisa R; Doucet, Jennifer; Witkowska, Maria; Richardson, Susan E

    2015-01-01

    Antimicrobial susceptibility patterns of 112 clinical isolates, 28 type strains, and 9 reference strains of Nocardia were determined using the Sensititre Rapmyco microdilution panel (Thermo Fisher, Inc.). Isolates were identified by highly discriminatory multilocus sequence analysis and were chosen to represent the diversity of species recovered from clinical specimens in Ontario, Canada. Susceptibility to the most commonly used drug, trimethoprim-sulfamethoxazole, was observed in 97% of isolates. Linezolid and amikacin were also highly effective; 100% and 99% of all isolates demonstrated a susceptible phenotype. For the remaining antimicrobials, resistance was species specific with isolates of Nocardia otitidiscaviarum, N. brasiliensis, N. abscessus complex, N. nova complex, N. transvalensis complex, N. farcinica, and N. cyriacigeorgica displaying the traditional characteristic drug pattern types. In addition, the antimicrobial susceptibility profiles of a variety of rarely encountered species isolated from clinical specimens are reported for the first time and were categorized into four additional drug pattern types. Finally, MICs for the control strains N. nova ATCC BAA-2227, N. asteroides ATCC 19247(T), and N. farcinica ATCC 23826 were robustly determined to demonstrate method reproducibility and suitability of the commercial Sensititre Rapmyco panel for antimicrobial susceptibility testing of Nocardia spp. isolated from clinical specimens. The reported values will facilitate quality control and standardization among laboratories. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  2. Whole exome sequencing identifies mutations in Usher syndrome genes in profoundly deaf Tunisian patients.

    Directory of Open Access Journals (Sweden)

    Zied Riahi

    Full Text Available Usher syndrome (USH) is an autosomal recessive disorder characterized by combined deafness-blindness. It accounts for about 50% of all hereditary deafness blindness cases. Three clinical subtypes (USH1, USH2, and USH3) are described, of which USH1 is the most severe form, characterized by congenital profound deafness, constant vestibular dysfunction, and a prepubertal onset of retinitis pigmentosa. We performed whole exome sequencing in four unrelated Tunisian patients affected by apparently isolated, congenital profound deafness, with reportedly normal ocular fundus examination. Four biallelic mutations were identified in two USH1 genes: a splice acceptor site mutation, c.2283-1G>T, and a novel missense mutation, c.5434G>A (p.Glu1812Lys), in MYO7A, and two previously unreported mutations in USH1G, i.e. a frameshift mutation, c.1195_1196delAG (p.Leu399Alafs*24), and a nonsense mutation, c.52A>T (p.Lys18*). Another ophthalmological examination including optical coherence tomography actually showed the presence of retinitis pigmentosa in all the patients. Our findings provide evidence that USH is under-diagnosed in Tunisian deaf patients. Yet, early diagnosis of USH is of utmost importance because these patients should undergo cochlear implant surgery in early childhood, in anticipation of the visual loss.

  3. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    Science.gov (United States)

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  4. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3' UTRs and coding sequences.

    Science.gov (United States)

    Šulc, Miroslav; Marín, Ray M; Robins, Harlan S; Vaníček, Jiří

    2015-07-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3' untranslated regions (3' UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3' UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA-mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA-mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3′ UTRs and coding sequences

    Science.gov (United States)

    Šulc, Miroslav; Marín, Ray M.; Robins, Harlan S.; Vaníček, Jiří

    2015-01-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3′ untranslated regions (3′ UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3′ UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA–mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA–mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. PMID:25948580
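
    The ranking and significance step described in this abstract can be illustrated with a minimal sketch. The abstract says only that significance is evaluated "according to the false discovery rate"; the Benjamini-Hochberg procedure used below is an assumption, not necessarily what the server applies, and the function names and microRNA-mRNA pairs are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def rank_pairs(pair_p_values):
    """Rank (microRNA, mRNA) pairs by P-value and attach q-values.

    pair_p_values: dict mapping a (microRNA, mRNA) tuple to its P-value
    (hypothetical input; the real server reads precomputed values).
    """
    pairs = list(pair_p_values)
    q_values = benjamini_hochberg([pair_p_values[p] for p in pairs])
    rows = [(p, pair_p_values[p], q) for p, q in zip(pairs, q_values)]
    return sorted(rows, key=lambda row: row[1])
```

    Pairs would then be reported in ranked order, keeping those whose q-value falls below a chosen false discovery rate threshold.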

  6. Global Transcriptome Sequencing Identifies Chlamydospore Specific Markers in Candida albicans and Candida dubliniensis

    LENUS (Irish Health Repository)

    Palige, Katja

    2013-04-15

    Candida albicans and Candida dubliniensis are pathogenic fungi that are highly related but differ in virulence and in some phenotypic traits. During in vitro growth on certain nutrient-poor media, C. albicans and C. dubliniensis are the only yeast species which are able to produce chlamydospores, large thick-walled cells of unknown function. Interestingly, only C. dubliniensis forms pseudohyphae with abundant chlamydospores when grown on Staib medium, while C. albicans grows exclusively as a budding yeast. In order to further our understanding of chlamydospore development and assembly, we compared the global transcriptional profile of both species during growth in liquid Staib medium by RNA sequencing. We also included a C. albicans mutant in our study which lacks the morphogenetic transcriptional repressor Nrg1. This strain, which is characterized by its constitutive pseudohyphal growth, specifically produces masses of chlamydospores in Staib medium, similar to C. dubliniensis. This comparative approach identified a set of putatively chlamydospore-related genes. Two of the homologous C. albicans and C. dubliniensis genes (CSP1 and CSP2) which were most strongly upregulated during chlamydospore development were analysed in more detail. By use of the green fluorescent protein as a reporter, the encoded putative cell wall related proteins were found to exclusively localize to C. albicans and C. dubliniensis chlamydospores. Our findings uncover the first chlamydospore specific markers in Candida species and provide novel insights in the complex morphogenetic development of these important fungal pathogens.

  7. SEQATOMS: a web tool for identifying missing regions in PDB in sequence context

    NARCIS (Netherlands)

    Brandt, B.W.; Heringa, J.; Leunissen, J.A.M.

    2008-01-01

    With over 46 000 proteins, the Protein Data Bank (PDB) is the most important database with structural information of biological macromolecules. PDB files contain sequence and coordinate information. Residues present in the sequence can be absent from the coordinate section, which means their

  8. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder

    NARCIS (Netherlands)

    Yuen, Ryan K C; Merico, Daniele; Bookman, Matt; Howe, Jennifer L.; Thiruvahindrapuram, Bhooma; Patel, Rohan V.; Whitney, Joe; Deflaux, Nicole; Bingham, Jonathan; Wang, Zhuozhi; Pellecchia, Giovanna; Buchanan, Janet A.; Walker, Susan; Marshall, Christian R.; Uddin, Mohammed; Zarrei, Mehdi; Deneault, Eric; D'Abate, Lia; Chan, Ada J S; Koyanagi, Stephanie; Paton, Tara; Pereira, Sergio L.; Hoang, Ny; Engchuan, Worrawat; Higginbotham, Edward J.; Ho, Karen; Lamoureux, Sylvia; Li, Weili; MacDonald, Jeffrey R.; Nalpathamkalam, Thomas; Sung, Wilson W L; Tsoi, Fiona J.; Wei, John; Xu, Lizhen; Tasse, Anne Marie; Kirby, Emily; Van Etten, William; Twigger, Simon; Roberts, Wendy; Drmic, Irene; Jilderda, Sanne; Modi, Bonnie Mackinnon; Kellam, Barbara; Szego, Michael; Cytrynbaum, Cheryl; Weksberg, Rosanna; Zwaigenbaum, Lonnie; Woodbury-Smith, Marc; Brian, Jessica; Senman, Lili; Iaboni, Alana; Doyle-Thomas, Krissy; Thompson, Ann; Chrysler, Christina; Leef, Jonathan; Savion-Lemieux, Tal; Smith, Isabel M.; Liu, Xudong; Nicolson, Rob; Seifer, Vicki; Fedele, Angie; Cook, Edwin H.; Dager, Stephen; Estes, Annette; Gallagher, Louise; Malow, Beth A.; Parr, Jeremy R.; Spence, Sarah J.; Vorstman, Jacob; Frey, Brendan J.; Robinson, James T.; Strug, Lisa J.; Fernandez, Bridget A.; Elsabbagh, Mayada; Carter, Melissa T.; Hallmayer, Joachim; Knoppers, Bartha M.; Anagnostou, Evdokia; Szatmari, Peter; Ring, Robert H.; Glazer, David; Pletcher, Mathew T.; Scherer, Stephen W.

    2017-01-01

    We are performing whole-genome sequencing of families with autism spectrum disorder (ASD) to build a resource (MSSNG) for subcategorizing the phenotypes and underlying genetic factors involved. Here we report sequencing of 5,205 samples from families with ASD, accompanied by clinical information,

  9. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways

    NARCIS (Netherlands)

    Cirulli, Elizabeth T.; Lasseigne, Brittany N.; Petrovski, Slavé; Sapp, Peter C.; Dion, Patrick A.; Leblond, Claire S.; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J.; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E.; Boone, Braden E.; Wimbish, Jack R.; Waite, Lindsay L.; Jones, Angela L.; Carulli, John P.; Day-Williams, Aaron G.; Staropoli, John F.; Xin, Winnie W.; Chesi, Alessandra; Raphael, Alya R.; McKenna-Yasek, Diane; Cady, Janet; de Jong, J. M. B. Vianney; Kenna, Kevin P.; Smith, Bradley N.; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H.; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E.; Baloh, Robert H.; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M.; Gibson, Summer; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Baas, Frank; ten Asbroek, Anneloor L. M. A.

    2015-01-01

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS

  10. Identifying Students' Conceptions of Basic Principles in Sequence Stratigraphy

    Science.gov (United States)

    Herrera, Juan S.; Riggs, Eric M.

    2013-01-01

    Sequence stratigraphy is a major research subject in the geosciences academia and the oil industry. However, the geoscience education literature addressing students' understanding of the basic concepts of sequence stratigraphy is relatively thin, and the topic has not been well explored. We conducted an assessment of 27 students' conceptions of…

  11. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Full Text Available Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  12. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  13. Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability.

    Science.gov (United States)

    Riazuddin, S; Hussain, M; Razzaq, A; Iqbal, Z; Shahzad, M; Polla, D L; Song, Y; van Beusekom, E; Khan, A A; Tomas-Roca, L; Rashid, M; Zahoor, M Y; Wissink-Lindhout, W M; Basra, M A R; Ansar, M; Agha, Z; van Heeswijk, K; Rasheed, F; Van de Vorst, M; Veltman, J A; Gilissen, C; Akram, J; Kleefstra, T; Assir, M Z; Grozeva, D; Carss, K; Raymond, F L; O'Connor, T D; Riazuddin, S A; Khan, S N; Ahmed, Z M; de Brouwer, A P M; van Bokhoven, H; Riazuddin, S

    2017-11-01

    Intellectual disability (ID) is a clinically and genetically heterogeneous disorder, affecting 1-3% of the general population. Although research into the genetic causes of ID has recently gained momentum, identification of pathogenic mutations that cause autosomal recessive ID (ARID) has lagged behind, predominantly due to non-availability of sizeable families. Here we present the results of exome sequencing in 121 large consanguineous Pakistani ID families. In 60 families, we identified homozygous or compound heterozygous DNA variants in a single gene, 30 affecting reported ID genes and 30 affecting novel candidate ID genes. Potential pathogenicity of these alleles was supported by co-segregation with the phenotype, low frequency in control populations and the application of stringent bioinformatics analyses. In another eight families segregation of multiple pathogenic variants was observed, affecting 19 genes that were either known or are novel candidates for ID. Transcriptome profiles of normal human brain tissues showed that the novel candidate ID genes formed a network significantly enriched for transcriptional co-expression (P<0.0001) in the frontal cortex during fetal development and in the temporal-parietal and sub-cortex during infancy through adulthood. In addition, proteins encoded by 12 novel ID genes directly interact with previously reported ID proteins in six known pathways essential for cognitive function (P<0.0001). These results suggest that disruptions of temporal parietal and sub-cortical neurogenesis during infancy are critical to the pathophysiology of ID. These findings further expand the existing repertoire of genes involved in ARID, and provide new insights into the molecular mechanisms and the transcriptome map of ID.

  14. A bioinformatics approach for identifying transgene insertion sites using whole genome sequencing data.

    Science.gov (United States)

    Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young

    2017-08-15

    Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice genomes, Illumina sequencing reads are mapped and classified to the rice genome and plasmid sequence. Reads that map to both are then used to characterize the junction site between plant and transgene sequence by sequence alignment. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
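
    The read-classification idea in this abstract (map reads to both the rice genome and the plasmid, then use reads that hit both to locate junctions) can be sketched as follows. This is an illustrative simplification under stated assumptions, not the authors' pipeline: the `Alignment` record, the function names, the contig names and the 1 kb binning are all hypothetical, and a real workflow would read alignments from a mapper's output.

```python
from collections import Counter
from typing import NamedTuple


class Alignment(NamedTuple):
    """One aligned segment of a read (hypothetical, simplified record)."""
    read_id: str
    reference: str  # "genome" or "plasmid"
    contig: str
    position: int   # leftmost mapped position


def find_junction_reads(alignments):
    """Keep only reads with segments on both the genome and the plasmid."""
    by_read = {}
    for aln in alignments:
        by_read.setdefault(aln.read_id, []).append(aln)
    return {
        read_id: segments
        for read_id, segments in by_read.items()
        if {s.reference for s in segments} == {"genome", "plasmid"}
    }


def candidate_insertion_sites(junction_reads, bin_size=1000):
    """Count junction reads per genomic bin; well-supported bins are candidates."""
    counts = Counter()
    for segments in junction_reads.values():
        for s in segments:
            if s.reference == "genome":
                bin_start = s.position // bin_size * bin_size
                counts[(s.contig, bin_start)] += 1
    return counts
```

    Bins supported by many junction reads would then be inspected by sequence alignment to pin down the exact plant-transgene boundary.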

  15. The complete genomic sequence of a tentative new polerovirus identified in barley in South Korea.

    Science.gov (United States)

    Zhao, Fumei; Lim, Seungmo; Yoo, Ran Hee; Igori, Davaajargal; Kim, Sang-Min; Kwak, Do Yeon; Kim, Sun Lim; Lee, Bong Choon; Moon, Jae Sun

    2016-07-01

    The complete nucleotide sequence of a new barley polerovirus, tentatively named barley virus G (BVG), which was isolated in Gimje, South Korea, has been determined using an RNA sequencing technique combined with polymerase chain reaction methods. The viral genomic RNA of BVG is 5,620 nucleotides long and contains six typical open reading frames commonly observed in other poleroviruses. Sequence comparisons revealed that BVG is most closely related to maize yellow dwarf virus-RMV, with the highest amino acid identities being less than 90 % for all of the corresponding proteins. These results suggested that BVG is a member of a new species in the genus Polerovirus.

  16. Next-generation sequencing for genetic testing of familial colorectal cancer syndromes.

    Science.gov (United States)

    Simbolo, Michele; Mafficini, Andrea; Agostini, Marco; Pedrazzani, Corrado; Bedin, Chiara; Urso, Emanuele D; Nitti, Donato; Turri, Giona; Scardoni, Maria; Fassan, Matteo; Scarpa, Aldo

    2015-01-01

    Genetic screening in families at high risk of developing colorectal cancer (CRC) prevents incurable disease and permits personalized therapeutic and follow-up strategies. The advancement of next-generation sequencing (NGS) technologies has revolutionized the throughput of DNA sequencing. A series of 16 probands for either familial adenomatous polyposis (FAP; 8 cases) or hereditary nonpolyposis colorectal cancer (HNPCC; 8 cases) were investigated for intragenic mutations in five CRC familial syndromes-associated genes (APC, MUTYH, MLH1, MSH2, MSH6) applying both a custom multigene Ion AmpliSeq NGS panel and conventional Sanger sequencing. Fourteen pathogenic variants were detected in 13/16 FAP/HNPCC probands (81.3 %); one FAP proband presented two co-existing pathogenic variants, one in APC and one in MUTYH. Thirteen of these 14 pathogenic variants were detected by both NGS and Sanger, while one MSH2 mutation (L280FfsX3) was identified only by Sanger sequencing. This is due to a limitation of the NGS approach in resolving sequences close to or within homopolymeric stretches of DNA. To evaluate the performance of our NGS custom panel, we assessed its capability to resolve the DNA sequences corresponding to 2225 pathogenic variants reported in the COSMIC database for APC, MUTYH, MLH1, MSH2, MSH6. Our NGS custom panel resolves the sequences where 2108 (94.7 %) of these variants occur. The remaining 117 mutations reside inside or in close proximity to homopolymer stretches; of these, 27 (1.2 %) are imprecisely identified by the software but can be resolved by visual inspection of the region, while the remaining 90 variants (4.0 %) are blind spots. In summary, our custom panel would miss 4 % (90/2225) of pathogenic variants, which would need a small set of Sanger sequencing reactions to be resolved. The multiplex NGS approach has the advantage of analyzing multiple genes in multiple samples simultaneously, requiring only a reduced number of Sanger sequences to resolve
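
    The panel-coverage percentages quoted above follow directly from the reported counts. The short Python check below reproduces them; the variable and function names are illustrative, and only the counts themselves come from the abstract.

```python
# Counts quoted in the abstract (COSMIC pathogenic variants in the five genes).
TOTAL = 2225
counts = {"resolved": 2108, "imprecise": 27, "blind_spot": 90}


def percent(n, total):
    """Share of `total` represented by `n`, rounded to one decimal place."""
    return round(100 * n / total, 1)


def breakdown(counts, total):
    """Return the percentage for each category after checking the counts add up."""
    if sum(counts.values()) != total:
        raise ValueError("category counts do not sum to the total")
    return {name: percent(n, total) for name, n in counts.items()}


shares = breakdown(counts, TOTAL)
print(shares)  # {'resolved': 94.7, 'imprecise': 1.2, 'blind_spot': 4.0}
```

    The 117 variants near homopolymer stretches are the sum of the imprecisely called (27) and blind-spot (90) categories.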

  17. The Comparison of Biochemical and Sequencing 16S rDNA Gene Methods to Identify Nontuberculous Mycobacteria

    Directory of Open Access Journals (Sweden)

    Shafipour, M.

    2014-11-01

    The identification of Mycobacteria at the species level has great medical importance. Biochemical tests are laborious and time-consuming, so new techniques could be used to identify the species. This research aimed to compare biochemical and 16S rDNA gene sequencing methods for identifying nontuberculous Mycobacteria in patients suspected of having tuberculosis in Golestan province, the region with the highest prevalence of tuberculosis in Iran. Among 3336 patients suspected of having tuberculosis who were referred to hospitals and health care centres in Golestan province during 2010-2011, 319 (9.56%) culture-positive cases were collected. Species were identified using biochemical tests. For the samples recognized as nontuberculous Mycobacteria, DNA was extracted by boiling, 16S rDNA PCR was performed, and the resulting sequences were identified with NCBI BLAST. Of the 319 positive samples in Golestan Province, 300 cases were M. tuberculosis and 19 cases (5.01%) were identified as nontuberculous Mycobacteria by biochemical tests. For 15 of the 19 nontuberculous Mycobacteria, PCR and sequencing gave the same identification as the biochemical methods (similarity rate: 78.9%). However, after PCR, 1 case identified as M. simiae by biochemical testing was identified as M. lentiflavum, and 3 other cases were identified as Nocardia. Biochemical methods corresponded to the 16S rDNA PCR and sequencing in 78.9% of cases. However, for identifying M. lentiflavum and Nocardia sp., the molecular method is better than biochemical methods.
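
    As a worked check of the agreement rate reported above, the sketch below recomputes it from the counts in the abstract. The names are illustrative; the only inputs are the 19 biochemically identified cases and the 4 that sequencing reassigned.

```python
def rate(n, total, digits=1):
    """Percentage of `total` represented by `n`, rounded to `digits` decimals."""
    return round(100 * n / total, digits)


ntm_by_biochemistry = 19
# Cases where 16S rDNA sequencing disagreed with the biochemical result.
reclassified = {"M. lentiflavum": 1, "Nocardia": 3}

n_agree = ntm_by_biochemistry - sum(reclassified.values())
agreement = rate(n_agree, ntm_by_biochemistry)
culture_positive = rate(319, 3336, digits=2)

print(n_agree, agreement, culture_positive)  # 15 78.9 9.56
```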

  18. VWF mutations and new sequence variations identified in healthy controls are more frequent in the African-American population.

    Science.gov (United States)

    Bellissimo, Daniel B; Christopherson, Pamela A; Flood, Veronica H; Gill, Joan Cox; Friedman, Kenneth D; Haberichter, Sandra L; Shapiro, Amy D; Abshire, Thomas C; Leissinger, Cindy; Hoots, W Keith; Lusher, Jeanne M; Ragni, Margaret V; Montgomery, Robert R

    2012-03-01

    Diagnosis and classification of VWD is aided by molecular analysis of the VWF gene. Because VWF polymorphisms have not been fully characterized, we performed VWF laboratory testing and gene sequencing of 184 healthy controls with a negative bleeding history. The controls included 66 (35.9%) African Americans (AAs). We identified 21 new sequence variations, 13 (62%) of which occurred exclusively in AAs and 2 (G967D, T2666M) that were found in 10%-15% of the AA samples, suggesting they are polymorphisms. We identified 14 sequence variations reported previously as VWF mutations, the majority of which were type 1 mutations. These controls had VWF Ag levels within the normal range, suggesting that these sequence variations might not always reduce plasma VWF levels. Eleven mutations were found in AAs, and the frequency of M740I, H817Q, and R2185Q was 15%-18%. Ten AA controls had the 2N mutation H817Q; 1 was homozygous. The average factor VIII level in this group was 99 IU/dL, suggesting that this variation may confer little or no clinical symptoms. This study emphasizes the importance of sequencing healthy controls to understand ethnic-specific sequence variations so that asymptomatic sequence variations are not misidentified as mutations in other ethnic or racial groups.

  19. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Science.gov (United States)

    Greub, Gilbert; Kebbi-Beghdadi, Carole; Bertelli, Claire; Collyn, François; Riederer, Beat M; Yersin, Camille; Croxatto, Antony; Raoult, Didier

    2009-12-23

    With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  20. Mutations in the newly identified RAX regulatory sequence are not a frequent cause of micro/anophthalmia.

    Science.gov (United States)

    Chassaing, Nicolas; Vigouroux, Adeline; Calvas, Patrick

    2009-06-01

    Microphthalmia and anophthalmia are at the severe end of the spectrum of abnormalities in ocular development. A few genes (SOX2, OTX2, RAX, and CHX10) have been implicated in isolated micro/anophthalmia, but causative mutations of these genes explain less than a quarter of these developmental defects. A specifically conserved SOX2/OTX2-mediated RAX expression regulatory sequence has recently been identified. We postulated that mutations in this sequence could lead to micro/anophthalmia, and thus we performed molecular screening of this regulatory element in patients suffering from micro/anophthalmia. Fifty-one patients suffering from nonsyndromic microphthalmia (n = 40) or anophthalmia (n = 11) were included in this study after negative molecular screening for SOX2, OTX2, RAX, and CHX10 mutations. Mutation screening of the RAX regulatory sequence was performed by direct sequencing for these patients. No mutations were identified in the highly conserved RAX regulatory sequence in any of the 51 patients. Mutations in the newly identified RAX regulatory sequence do not represent a frequent cause of nonsyndromic micro/anophthalmia.

  1. Salmonella Persistence in Tomatoes Requires a Distinct Set of Metabolic Functions Identified by Transposon Insertion Sequencing

    Science.gov (United States)

    Desai, Prerak; Porwollik, Steffen; Canals, Rocio; Perez, Daniel R.; Chu, Weiping; McClelland, Michael; Teplitski, Max

    2016-01-01

    ABSTRACT Human enteric pathogens, such as Salmonella spp. and verotoxigenic Escherichia coli, are increasingly recognized as causes of gastroenteritis outbreaks associated with the consumption of fruits and vegetables. Persistence in plants represents an important part of the life cycle of these pathogens. The identification of the full complement of Salmonella genes involved in the colonization of the model plant (tomato) was carried out using transposon insertion sequencing analysis. With this approach, 230,000 transposon insertions were screened in tomato pericarps to identify loci with reduction in fitness, followed by validation of the screen results using competition assays of the isogenic mutants against the wild type. A comparison with studies in animals revealed a distinct plant-associated set of genes, which only partially overlaps with the genes required to elicit disease in animals. De novo biosynthesis of amino acids was critical to persistence within tomatoes, while amino acid scavenging was prevalent in animal infections. Fitness reduction of the Salmonella amino acid synthesis mutants was generally more severe in the tomato rin mutant, which hyperaccumulates certain amino acids, suggesting that these nutrients remain unavailable to Salmonella spp. within plants. Salmonella lipopolysaccharide (LPS) was required for persistence in both animals and plants, exemplifying some shared pathogenesis-related mechanisms in animal and plant hosts. Similarly to phytopathogens, Salmonella spp. required biosynthesis of amino acids, LPS, and nucleotides to colonize tomatoes. Overall, however, it appears that while Salmonella shares some strategies with phytopathogens and taps into its animal virulence-related functions, colonization of tomatoes represents a distinct strategy, highlighting this pathogen's flexible metabolism. 
    IMPORTANCE Outbreaks of gastroenteritis caused by human pathogens have been increasingly associated with foods of plant origin, with tomatoes

  2. Exome Sequencing Identifies a Novel MAP3K14 Mutation in Recessive Atypical Combined Immunodeficiency

    Directory of Open Access Journals (Sweden)

    Nikola Schlechter

    2017-11-01

    Primary immunodeficiency disorders (PIDs) render patients vulnerable to infection with a wide range of microorganisms and thus provide good in vivo models for the assessment of immune responses during infectious challenges. Priming of the immune system, especially in infancy, depends on different environmental exposures and medical practices. This may determine the timing and phenotype of clinical appearance of immune deficits, as exemplified by early exposure to Bacillus Calmette-Guérin (BCG) vaccination and dissemination in combined immunodeficiencies. Varied phenotype expression poses a challenge to identification of the putative immune deficit. Without the availability of genomic diagnosis and data analysis resources and with limited capacity for functional definition of immune pathways, it is difficult to establish a definitive diagnosis and to decide on appropriate treatment. This study describes the use of exome sequencing to identify a homozygous recessive variant in MAP3K14, NIKVal345Met, in a patient with combined immunodeficiency, disseminated BCG-osis, and paradoxically elevated lymphocytes. Laboratory testing confirmed hypogammaglobulinemia with normal CD19, but failed to confirm a definitive diagnosis for targeted treatment decisions. NIKVal345Met is predicted to be deleterious and pathogenic by two in silico prediction tools and is situated in a gene crucial for effective functioning of the non-canonical nuclear factor-kappa B signaling pathway. Functional analysis of NIKVal345Met- versus NIKWT-transfected human embryonic kidney-293T cells showed that this mutation significantly affects the kinase activity of NIK, leading to decreased levels of phosphorylated IkappaB kinase-alpha (IKKα), the target of NIK. BCG-stimulated RAW264.7 cells transfected with NIKVal345Met also presented with reduced levels of phosphorylated IKKα, significantly increased p100 levels and significantly decreased p52 levels compared to cells transfected

  3. Targeted next generation sequencing identified a novel mutation in MYO7A causing Usher syndrome type 1 in an Iranian consanguineous pedigree.

    Science.gov (United States)

    Kooshavar, Daniz; Razipour, Masoumeh; Movasat, Morteza; Keramatipour, Mohammad

    2018-01-01

    Usher syndrome (USH) is characterized by congenital hearing loss and retinitis pigmentosa (RP) with a later onset. It is an autosomal recessive trait with clinical and genetic heterogeneity, which makes molecular diagnosis difficult. In this study, we introduce a pedigree with two affected members with USH type 1 and present a cost- and time-effective approach for the genetic diagnosis of USH as a genetically heterogeneous disorder. Target region capture of the genes of interest, followed by next generation sequencing (NGS), was used to determine the causative mutations in one of the probands. Segregation analysis in the pedigree was then conducted using PCR-Sanger sequencing. Targeted NGS detected a novel homozygous nonsense variant c.4513G > T (p.Glu1505Ter) in MYO7A. The variant segregates in the pedigree with an autosomal recessive pattern. In this study, a novel stop-gained variant c.4513G > T (p.Glu1505Ter) in MYO7A was found in an Iranian pedigree with two affected members with USH type 1. Bioinformatic and pedigree segregation analyses were consistent with the pathogenic nature of this variant. The targeted NGS panel was shown to be an efficient method for mutation detection in hereditary disorders with locus heterogeneity.

  4. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Science.gov (United States)

    Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

  5. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Directory of Open Access Journals (Sweden)

    Zheng Ping

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  6. An Efficient Method for Identifying Gene Fusions by Targeted RNA Sequencing from Fresh Frozen and FFPE Samples.

    Directory of Open Access Journals (Sweden)

    Jonathan A Scolnick

    Fusion genes are known to be key drivers of tumor growth in several types of cancer. Traditionally, detecting fusion genes has been a difficult task based on fluorescent in situ hybridization to detect chromosomal abnormalities. More recently, RNA sequencing has enabled an increased pace of fusion gene identification. However, RNA-Seq is inefficient for the identification of fusion genes due to the high number of sequencing reads needed to detect the small number of fusion transcripts present in cells of interest. Here we describe a method, Single Primer Enrichment Technology (SPET), for targeted RNA sequencing that is customizable to any target genes, is simple to use, and efficiently detects gene fusions. Using SPET to target 5701 exons of 401 known cancer fusion genes for sequencing, we were able to identify known and previously unreported gene fusions from both fresh-frozen and formalin-fixed paraffin-embedded (FFPE) tissue RNA in both normal tissue and cancer cells.

  7. Exome sequencing identifies NBEAL2 as the causative gene for gray platelet syndrome

    NARCIS (Netherlands)

    Albers, Cornelis A.; Cvejic, Ana; Favier, Rémi; Bouwmans, Evelien E.; Alessi, Marie-Christine; Bertone, Paul; Jordan, Gregory; Kettleborough, Ross N. W.; Kiddle, Graham; Kostadima, Myrto; Read, Randy J.; Sipos, Botond; Sivapalaratnam, Suthesh; Smethurst, Peter A.; Stephens, Jonathan; Voss, Katrin; Nurden, Alan; Rendon, Augusto; Nurden, Paquita; Ouwehand, Willem H.

    2011-01-01

    Gray platelet syndrome (GPS) is a predominantly recessive platelet disorder that is characterized by mild thrombocytopenia with large platelets and a paucity of α-granules; these abnormalities cause mostly moderate but in rare cases severe bleeding. We sequenced the exomes of four unrelated

  8. Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution

    Science.gov (United States)

    Kinnebrew, John S.; Biswas, Gautam

    2012-01-01

    Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…

  9. Use of microsatellite markers derived from whole genome sequence data for identifying polymorphism in Phytophthora ramorum

    Science.gov (United States)

    Kelly Ivors; Matteo Garbelotto; Ineke De Vries; Peter Bonants

    2006-01-01

    Investigating the population genetics of Phytophthora ramorum, the causal agent of sudden oak death (SOD), is critical to understanding the biology and epidemiology of this important phytopathogen. Raw sequence data (445,000 reads) of P. ramorum was provided by the Joint Genome Institute. Our objective was to develop and utilize...

  10. Sequence analysis of the its-2 region: a tool to identify strains of Scenedesmus (Chlorophyceae)

    NARCIS (Netherlands)

    Van Hannen, E.J.; Lürling, M.; Van Donk, E.

    2000-01-01

    The genetic distances between several strains of Scenedesmus obliquus (Turp.) Kutz., S. acutus Hortobagyi, and S. naegelii Chod. calculated from ITS-2 sequences were found to be smaller than the genetic distances within other strains of Scenedesmus-that is, in S. acuminatus (Lagerh.) Chod. and S.

  11. Whole-genome sequencing and comprehensive molecular profiling identify new driver mutations in gastric cancer

    NARCIS (Netherlands)

    Wang, Kai; Yuen, Siu Tsan; Xu, Jiangchun; Lee, Siu Po; Yan, Helen H N; Shi, Stephanie T; Siu, Hoi Cheong; Deng, Shibing; Chu, Kent Man; Law, Simon; Chan, Kok Hoe; Chan, Annie S Y; Tsui, Wai Yin; Ho, Siu Lun; Chan, Anthony K W; Man, Jonathan L K; Foglizzo, Valentina; Ng, Man Kin; Chan, April S; Ching, Yick Pang; Cheng, Grace H W; Xie, Tao; Fernandez, Julio; Li, Vivian S W; Clevers, Hans; Rejto, Paul A; Mao, Mao; Leung, Suet Yi

    Gastric cancer is a heterogeneous disease with diverse molecular and histological subtypes. We performed whole-genome sequencing in 100 tumor-normal pairs, along with DNA copy number, gene expression and methylation profiling, for integrative genomic analysis. We found subtype-specific genetic and

  12. Diverse Array of New Viral Sequences Identified in Worldwide Populations of the Asian Citrus Psyllid (Diaphorina citri) Using Viral Metagenomics.

    Science.gov (United States)

    Nouri, Shahideh; Salem, Nidá; Nigg, Jared C; Falk, Bryce W

    2015-12-16

    The Asian citrus psyllid, Diaphorina citri, is the natural vector of the causal agent of Huanglongbing (HLB), or citrus greening disease. Together, HLB and D. citri represent a major threat to world citrus production. As there is no cure for HLB, insect vector management is considered one strategy to help control the disease, and D. citri viruses might be useful. In this study, we used a metagenomic approach to analyze viral sequences associated with the global population of D. citri. By sequencing small RNAs and the transcriptome coupled with bioinformatics analysis, we showed that the virus-like sequences of D. citri are diverse. We identified novel viral sequences belonging to the picornavirus superfamily, the Reoviridae, Parvoviridae, and Bunyaviridae families, and an unclassified positive-sense single-stranded RNA virus. Moreover, a Wolbachia prophage-related sequence was identified. This is the first comprehensive survey to assess the viral community from worldwide populations of an agricultural insect pest. Our results provide valuable information on new putative viruses, some of which may have the potential to be used as biocontrol agents. Insects have the most species of all animals, and are hosts to, and vectors of, a great variety of known and unknown viruses. Some of these most likely have the potential to be important fundamental and/or practical resources. In this study, we used high-throughput next-generation sequencing (NGS) technology and bioinformatics analysis to identify putative viruses associated with Diaphorina citri, the Asian citrus psyllid. D. citri is the vector of the bacterium causing Huanglongbing (HLB), currently the most serious threat to citrus worldwide. Here, we report several novel viral sequences associated with D. citri.

  13. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome.

    Science.gov (United States)

    Benoit, Joshua B; Adelman, Zach N; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C; Szuter, Elise M; Hagan, Richard W; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M; Nelson, David R; Rosendale, Andrew J; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R; Ioannidis, Panagiotis; Waterhouse, Robert M; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J Spencer; Gondhalekar, Ameya D; Scharf, Michael E; Peterson, Brittany F; Raje, Kapil R; Hottel, Benjamin A; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S T; Duncan, Elizabeth J; Murali, Shwetha C; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C; Muzny, Donna M; Wheeler, David; Panfilio, Kristen A; Vargas Jentzsch, Iris M; Vargo, Edward L; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T; Anderson, Michelle A E; Jones, Jeffery W; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D; Attardo, Geoffrey M; Robertson, Hugh M; Zdobnov, Evgeny M; Ribeiro, Jose M C; Gibbs, Richard A; Werren, John H; Palli, Subba R; Schal, Coby; Richards, Stephen

    2016-02-02

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host-symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human-bed bug and symbiont-bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite.

  14. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome

    Science.gov (United States)

    Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen

    2016-01-01

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814

  15. The genome sequence of the emerging common midwife toad virus identifies an evolutionary intermediate within ranaviruses.

    Science.gov (United States)

    Mavian, Carla; López-Bueno, Alberto; Balseiro, Ana; Casais, Rosa; Alcamí, Antonio; Alejo, Alí

    2012-04-01

    Worldwide amphibian population declines have been ascribed to global warming, increasing pollution levels, and other factors directly related to human activities. These factors may additionally be favoring the emergence of novel pathogens. In this report, we have determined the complete genome sequence of the emerging common midwife toad ranavirus (CMTV), which has caused fatal disease in several amphibian species across Europe. Phylogenetic and gene content analyses of the first complete genomic sequence from a ranavirus isolated in Europe show that CMTV is an amphibian-like ranavirus (ALRV). However, the CMTV genome structure is novel and represents an intermediate evolutionary stage between the two previously described ALRV groups. We find that CMTV clusters with several other ranaviruses isolated from different hosts and locations which might also be included in this novel ranavirus group. This work sheds light on the phylogenetic relationships within this complex group of emerging, disease-causing viruses.

  16. High throughput sequencing identifies chilling responsive genes in sweetpotato (Ipomoea batatas Lam.) during storage.

    Science.gov (United States)

    Xie, Zeyi; Zhou, Zhilin; Li, Hongmin; Yu, Jingjing; Jiang, Jiaojiao; Tang, Zhonghou; Ma, Daifu; Zhang, Baohong; Han, Yonghua; Li, Zongyun

    2018-05-21

    Sweetpotato (Ipomoea batatas L.) is a globally important economic food crop. It belongs to the Convolvulaceae family and originates in the tropics; however, sweetpotato is sensitive to cold stress during storage. In this study, we performed transcriptome sequencing to investigate the sweetpotato response to chilling stress during storage. A total of 110,110 unigenes were generated via high-throughput sequencing. Differentially expressed gene (DEG) analysis showed that 18,681 genes were up-regulated and 21,983 genes were down-regulated under low-temperature conditions. Many DEGs were related to the cell membrane system, antioxidant enzymes, carbohydrate metabolism, and hormone metabolism, which are potentially associated with sweetpotato resistance to low temperature. The existence of DEGs suggests a molecular basis for the biochemical and physiological consequences of low-temperature storage in sweetpotato. Our analysis will provide a new target for enhancement of sweetpotato cold stress tolerance in postharvest storage through genetic manipulation. Copyright © 2018. Published by Elsevier Inc.

  17. Identifying spatial clustering properties of the 1997-2003 Liguria (Northern Italy) forest-fire sequence

    International Nuclear Information System (INIS)

    Telesca, Luciano; Amatulli, Giuseppe; Lasaponara, Rosa; Lovallo, Michele; Santulli, Adriano

    2007-01-01

    The spatial clustering of the forest-fire sequence (1997-2003) of Liguria Region (Northern Italy) has been analysed using the correlation dimension D_C, calculated by means of the correlation integral method. Studying the variations of this parameter, we recognize the presence of a strong variability of the spatial clusterization, modulated by seasonal cycles. Furthermore, we found that the larger fires (size >400 ha) mark the cyclic behaviour of the correlation dimension.
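
    The correlation integral method named in this abstract can be illustrated with a minimal sketch. It assumes the usual definition, in which C(r) is the fraction of point pairs closer than r and the correlation dimension is the slope of log C(r) against log r. The function names, the radii, and the toy points are hypothetical and are not taken from the Liguria study.

```python
import math
from itertools import combinations


def correlation_integral(points, r):
    """Fraction of distinct point pairs separated by less than r."""
    pairs = list(combinations(points, 2))
    close = sum(1 for p, q in pairs if math.dist(p, q) < r)
    return close / len(pairs)


def correlation_dimension(points, radii):
    """Least-squares slope of log C(r) against log r over the given radii."""
    xs, ys = [], []
    for r in radii:
        c = correlation_integral(points, r)
        if c > 0:  # log is undefined when no pair falls within r
            xs.append(math.log(r))
            ys.append(math.log(c))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy data (assumption): events spaced evenly along a line, so the
# estimated dimension should come out close to 1.
line_points = [(float(i), 0.0) for i in range(200)]
estimate = correlation_dimension(line_points, [5.5, 10.5, 20.5, 40.5])
print(round(estimate, 2))
```

    A strongly clustered point set would give a lower slope than a set that fills the plane evenly, which is the kind of variation the abstract tracks over time.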

  18. Epitope Sequences in Dengue Virus NS1 Protein Identified by Monoclonal Antibodies

    Directory of Open Access Journals (Sweden)

    Leticia Barboza Rocha

    2017-10-01

    Dengue nonstructural protein 1 (NS1) is a multi-functional glycoprotein with essential functions both in viral replication and modulation of host innate immune responses. NS1 has been established as a good surrogate marker for infection. In the present study, we generated four anti-NS1 monoclonal antibodies against recombinant NS1 protein from dengue virus serotype 2 (DENV2), which were used to map three NS1 epitopes. The sequence 193AVHADMGYWIESALNDT209 was recognized by monoclonal antibodies 2H5 and 4H1BC, which also cross-reacted with Zika virus (ZIKV) protein. On the other hand, the sequence 25VHTWTEQYKFQPES38 was recognized by mAb 4F6, which did not cross-react with ZIKV. Lastly, a previously unidentified DENV2 NS1-specific epitope, represented by the sequence 127ELHNQTFLIDGPETAEC143, is described in the present study after reaction with mAb 4H2, which also did not cross-react with ZIKV. The selection and characterization of the epitope specificity of anti-NS1 mAbs may contribute to the development of diagnostic tools able to differentiate DENV and ZIKV infections.

  19. MicroRNA repertoire for functional genome research in tilapia identified by deep sequencing.

    Science.gov (United States)

    Yan, Biao; Wang, Zhen-Hua; Zhu, Chang-Dong; Guo, Jin-Tao; Zhao, Jin-Liang

    2014-08-01

    The Nile tilapia (Oreochromis niloticus; Cichlidae) is an economically important species that occupies a prominent position in the aquaculture industry. MicroRNAs (miRNAs) are a class of noncoding RNAs that post-transcriptionally regulate gene expression involved in diverse biological and metabolic processes. To increase the repertoire of miRNAs characterized in tilapia, we used the Illumina/Solexa sequencing technology to sequence a small RNA library using a pooled RNA sample isolated from the different developmental stages of tilapia. Bioinformatic analyses suggest that 197 conserved and 27 novel miRNAs are expressed in tilapia. Sequence alignments indicate that all tested miRNAs and miRNAs* are highly conserved across many species. In addition, we characterized the tissue expression patterns of five miRNAs using real-time quantitative PCR. We found that miR-1/206, miR-7/9, and miR-122 are abundantly expressed in muscle, brain, and liver, respectively, implying a potential role in the regulation of tissue differentiation or the maintenance of tissue identity. Overall, our results expand the number of tilapia miRNAs, and the discovery of miRNAs in the tilapia genome contributes to a better understanding of the role of miRNAs in regulating diverse biological processes.

  20. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety 'Amrapali' (Mangifera indica L.).

    Science.gov (United States)

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called "king of fruits" due to its sweetness, richness of taste, diversity, large production volume and variety of end uses. Despite its huge economic importance, genomic resources in mango are scarce and the genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties 'Neelam', 'Dashehari' and their hybrid 'Amrapali' using next generation sequencing technologies. De-novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. A total of 5,465 SSR loci were identified in 4,912 unigenes, with 288 type I SSRs (n ≥ 20 bp). One hundred type I SSR markers were randomly selected, of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides a useful genomic resource for genetic improvement of mango.
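
    As a rough illustration of how SSR loci of the kind counted above can be located in a transcript, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The 20 bp cut-off follows the type I threshold quoted in the abstract; the motif size range of 1 to 6 bp, the function name, and the toy sequence are assumptions, and the abstract does not say which SSR-mining tool the authors used.

```python
import re

# A motif of 1-6 bases (assumed range) followed by one or more exact copies.
# The lazy quantifier makes the regex prefer the shortest repeating motif.
SSR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1+")


def find_ssrs(sequence, min_len=20):
    """Return (start, motif, total_length) for perfect repeats >= min_len bp."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        total_length = len(match.group(0))
        if total_length >= min_len:
            hits.append((match.start(), match.group(1), total_length))
    return hits


# Toy transcript (hypothetical): twelve copies of AT flanked by short ends.
toy_transcript = "GGC" + "AT" * 12 + "CCG"
print(find_ssrs(toy_transcript))
```

    This only finds perfect repeats; interrupted or compound SSRs would need a more tolerant search.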

  1. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Directory of Open Access Journals (Sweden)

    Gilbert Greub

    BACKGROUND: With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when the genome contains highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during the gap closure stage is clearly medically counterproductive. METHODS/PRINCIPAL FINDINGS: We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. CONCLUSIONS/SIGNIFICANCE: This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics, to rapidly identify new immunogenic proteins useful for the future development of specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  2. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.

    Science.gov (United States)

    Motamayor, Juan C; Mockaitis, Keithanne; Schmutz, Jeremy; Haiminen, Niina; Livingstone, Donald; Cornejo, Omar; Findley, Seth D; Zheng, Ping; Utro, Filippo; Royaert, Stefan; Saski, Christopher; Jenkins, Jerry; Podicheti, Ram; Zhao, Meixia; Scheffler, Brian E; Stack, Joseph C; Feltus, Frank A; Mustiga, Guiliana M; Amores, Freddy; Phillips, Wilbert; Marelli, Jean Philippe; May, Gregory D; Shapiro, Howard; Ma, Jianxin; Bustamante, Carlos D; Schnell, Raymond J; Main, Dorrie; Gilbert, Don; Parida, Laxmi; Kuhn, David N

    2013-06-03

    Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits.
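
    The contig and scaffold N50 values quoted above follow the standard definition: the length L such that sequences of length L or longer contain at least half of the total assembled bases. A minimal sketch of that calculation is given below; the function name and the toy lengths are hypothetical and are not the Matina 1-6 data.

```python
def n50(lengths):
    """Length at which the running total, longest first, reaches half the sum."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths (assumption). Total is 300, so half is 150,
# which is reached after the two longest scaffolds (80 + 70).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70
```

    Because N50 depends only on the length distribution, it says nothing about correctness, which is why the abstract also reports gap fraction and anchored sequence.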

  3. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

    Science.gov (United States)

    2013-01-01

    Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results: We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 as Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions: We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. PMID:23731509

  4. A machine learning model to determine the accuracy of variant calls in capture-based next generation sequencing.

    Science.gov (United States)

    van den Akker, Jeroen; Mishne, Gilad; Zimmer, Anjali D; Zhou, Alicia Y

    2018-04-17

    Next generation sequencing (NGS) has become a common technology for clinical genetic tests. The quality of NGS calls varies widely and is influenced by features like reference sequence characteristics, read depth, and mapping accuracy. With recent advances in NGS technology and software tools, the majority of variants called using NGS alone are in fact accurate and reliable. However, a small subset of difficult-to-call variants that still require orthogonal confirmation exists. For this reason, many clinical laboratories confirm NGS results using orthogonal technologies such as Sanger sequencing. Here, we report the development of a deterministic machine-learning-based model to differentiate between these two types of variant calls: those that do not require confirmation using an orthogonal technology (high confidence), and those that require additional quality testing (low confidence). This approach allows reliable NGS-based calling in a clinical setting by identifying the few important variant calls that require orthogonal confirmation. We developed and tested the model using a set of 7179 variants identified by a targeted NGS panel and re-tested by Sanger sequencing. The model incorporated several signals of sequence characteristics and call quality to determine if a variant was identified at high or low confidence. The model was tuned to eliminate false positives, defined as variants that were called by NGS but not confirmed by Sanger sequencing. The model achieved very high accuracy: 99.4% (95% confidence interval: +/- 0.03%). It categorized 92.2% (6622/7179) of the variants as high confidence, and 100% of these were confirmed to be present by Sanger sequencing. Among the variants that were categorized as low confidence, defined as NGS calls of low quality that are likely to be artifacts, 92.1% (513/557) were found to be not present by Sanger sequencing. This work shows that NGS data contains sufficient characteristics for a machine-learning-based model to
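
    The percentages in this abstract can be checked from the counts it reports. The sketch below does that arithmetic. The accuracy formula is an assumption (a call is counted as correct when a high-confidence variant is confirmed by Sanger or a low-confidence variant is absent by Sanger); the abstract does not state how its accuracy figure was computed, and the function name is hypothetical.

```python
def summarize_calls(total, high_conf, high_conf_confirmed, low_conf_absent):
    """Recompute the reported percentages from raw variant counts."""
    low_conf = total - high_conf
    return {
        "low_conf": low_conf,
        "pct_high_conf": 100 * high_conf / total,
        "pct_low_conf_absent": 100 * low_conf_absent / low_conf,
        # Assumed definition: correct = confirmed high-confidence calls
        # plus low-confidence calls that Sanger found to be absent.
        "pct_accuracy": 100 * (high_conf_confirmed + low_conf_absent) / total,
    }


# Counts taken from the abstract: 7179 variants, 6622 high confidence
# (all confirmed), and 513 of the low-confidence calls absent by Sanger.
stats = summarize_calls(7179, 6622, 6622, 513)
for name, value in stats.items():
    print(name, round(value, 1))
```

    Under that assumed definition the counts give 557 low-confidence calls and values that round to 92.2%, 92.1%, and 99.4%, matching the figures quoted above.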

  5. Recurrent chimeric RNAs enriched in human prostate cancer identified by deep sequencing

    Science.gov (United States)

    Kannan, Kalpana; Wang, Liguo; Wang, Jianghua; Ittmann, Michael M.; Li, Wei; Yen, Laising

    2011-01-01

    Transcription-induced chimeric RNAs, possessing sequences from different genes, are expected to increase the proteomic diversity through chimeric proteins or altered regulation. Despite their importance, few studies have focused on chimeric RNAs, especially regarding their presence/roles in human cancers. By deep sequencing the transcriptome of 20 human prostate cancer and 10 matched benign prostate tissues, we obtained 1.3 billion sequence reads, which led to the identification of 2,369 chimeric RNA candidates. Chimeric RNAs occurred at significantly higher frequency in cancer than in matched benign samples. Experimental investigation of a selected set of 46 led to the confirmation of 32 chimeric RNAs, of which 27 were highly recurrent and previously undescribed in prostate cancer. Importantly, a subset of these chimeras was present in prostate cancer cell lines, but not detectable in primary human prostate epithelium cells, implying their associations with cancer. These chimeras contain discernible 5′ and 3′ splice sites at the RNA junction, indicating that their formation is mediated by splicing. Their presence is also largely independent of the expression of parental genes, suggesting that other factors are involved in their production and regulation. One chimera, TMEM79-SMG5, is highly differentially expressed in human cancer samples and therefore a potential biomarker. The prevalence of chimeric RNAs may allow the limited number of human genes to encode a substantially larger number of RNAs and proteins, forming an additional layer of cellular complexity. Together, our results suggest that chimeric RNAs are widespread, and increased chimeric RNA events could represent a unique class of molecular alteration in cancer. PMID:21571633

  6. TAPDANCE: An automated tool to identify and annotate transposon insertion CISs and associations between CISs from next generation sequence data

    Directory of Open Access Journals (Sweden)

    Sarver Aaron L

    2012-06-01

    Background: Next generation sequencing approaches applied to the analyses of transposon insertion junction fragments generated in high throughput forward genetic screens have created the need for clear informatics and statistical approaches to deal with the massive amount of data currently being generated. Previous approaches utilized to (1) map junction fragments within the genome and (2) identify Common Insertion Sites (CISs) within the genome are not practical due to the volume of data generated by current sequencing technologies. Previous approaches applied to this problem also required significant manual annotation. Results: We describe Transposon Annotation Poisson Distribution Association Network Connectivity Environment (TAPDANCE) software, which automates the identification of CISs within transposon junction fragment insertion data. Starting with barcoded sequence data, the software identifies and trims sequences and maps putative genomic sequence to a reference genome using the bowtie short read mapper. Poisson distribution statistics are then applied to assess and rank genomic regions showing significant enrichment for transposon insertion. Novel methods of counting insertions are used to ensure that the results presented have the expected characteristics of informative CISs. A persistent MySQL database is generated and utilized to keep track of sequences, mappings and common insertion sites. Additionally, associations between phenotypes and CISs are identified using Fisher’s exact test with multiple testing correction. In a case study using previously published data we show that the TAPDANCE software identifies CISs as previously described, prioritizes them based on p-value, allows holistic visualization of the data within genome browser software and identifies relationships present in the structure of the data. Conclusions: The TAPDANCE process is fully automated, performs similarly to previous labor intensive approaches
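
    The Poisson step described in this abstract can be sketched in general terms. The example below assumes insertions fall uniformly across the genome under the null hypothesis, so the expected count in a window is the total number of insertions scaled by the window's share of the genome, and the p-value is the Poisson upper tail at the observed count. This is not the TAPDANCE code: the function names, window size, and counts are hypothetical, and the abstract does not specify how TAPDANCE defines windows or counts insertions.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k <= 0:
        return 1.0
    # Start from the probability mass at k, computed in log space.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > 0.0 and (total == 0.0 or term > total * 1e-16):
        total += term
        i += 1
        term *= lam / i  # pmf(i) = pmf(i - 1) * lam / i
    return min(1.0, total)


def window_pvalue(observed, total_insertions, window_bp, genome_bp):
    """Enrichment p-value for one window under a uniform-insertion null."""
    expected = total_insertions * window_bp / genome_bp
    return poisson_upper_tail(observed, expected)


# Hypothetical numbers: 1,000 mapped insertions in a 100 Mb genome,
# with 10 of them landing in a single 10 kb window.
print(window_pvalue(10, 1000, 10_000, 100_000_000))
```

    Ranking windows by this p-value mirrors the prioritization described above; a real analysis would also need the multiple testing correction that the abstract mentions.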

  7. Exome sequencing identifies highly recurrent MED12 somatic mutations in breast fibroadenoma.

    Science.gov (United States)

    Lim, Weng Khong; Ong, Choon Kiat; Tan, Jing; Thike, Aye Aye; Ng, Cedric Chuan Young; Rajasegaran, Vikneswari; Myint, Swe Swe; Nagarajan, Sanjanaa; Nasir, Nur Diyana Md; McPherson, John R; Cutcutache, Ioana; Poore, Gregory; Tay, Su Ting; Ooi, Wei Siong; Tan, Veronique Kiak Mien; Hartman, Mikael; Ong, Kong Wee; Tan, Benita K T; Rozen, Steven G; Tan, Puay Hoon; Tan, Patrick; Teh, Bin Tean

    2014-08-01

    Fibroadenomas are the most common breast tumors in women under 30 (refs. 1,2). Exome sequencing of eight fibroadenomas with matching whole-blood samples revealed recurrent somatic mutations solely in MED12, which encodes a Mediator complex subunit. Targeted sequencing of an additional 90 fibroadenomas confirmed highly frequent MED12 exon 2 mutations (58/98, 59%) that are probably somatic, with 71% of mutations occurring in codon 44. Using laser capture microdissection, we show that MED12 fibroadenoma mutations are present in stromal but not epithelial mammary cells. Expression profiling of MED12-mutated and wild-type fibroadenomas revealed that MED12 mutations are associated with dysregulated estrogen signaling and extracellular matrix organization. The fibroadenoma MED12 mutation spectrum is nearly identical to that of previously reported MED12 lesions in uterine leiomyoma but not those of other tumors. Benign tumors of the breast and uterus, both of which are key target tissues of estrogen, may thus share a common genetic basis underpinned by highly frequent and specific MED12 mutations.

  8. Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

    Science.gov (United States)

    Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

    2010-07-01

    We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.

  9. Next generation sequencing identifies abnormal Y chromosome and candidate causal variants in premature ovarian failure patients.

    Science.gov (United States)

    Lee, Yujung; Kim, Changshin; Park, YoungJoon; Pyun, Jung-A; Kwack, KyuBum

    2016-12-01

    Premature ovarian failure (POF) is characterized by heterogeneous genetic causes such as chromosomal abnormalities and variants in causal genes. Recent technical developments have made it possible for next generation sequencing (NGS) to detect genome-wide variants, including chromosomal abnormalities. Among 37 Korean POF patients, an XY karyotype with distal part deletions of the Y chromosome, Yp11.32-31 and the Yp12 end part, was observed in two patients through NGS. Six deleterious variants in POF genes were also detected, which might explain the pathogenesis of POF with abnormalities in the sex chromosomes. Additionally, the two POF patients had no mutation in SRY, but three non-synonymous variants were detected in genes related to sex reversal. These findings suggest candidate causes of POF and sex reversal and show the suitability of NGS for approaching the heterogeneous pathogenesis of POF. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Whole-exome sequencing identifies common and rare variant metabolic QTLs in a Middle Eastern population.

    Science.gov (United States)

    Yousri, Noha A; Fakhro, Khalid A; Robay, Amal; Rodriguez-Flores, Juan L; Mohney, Robert P; Zeriri, Hassina; Odeh, Tala; Kader, Sara Abdul; Aldous, Eman K; Thareja, Gaurav; Kumar, Manish; Al-Shakaki, Alya; Chidiac, Omar M; Mohamoud, Yasmin A; Mezey, Jason G; Malek, Joel A; Crystal, Ronald G; Suhre, Karsten

    2018-01-23

    Metabolomics-genome-wide association studies (mGWAS) have uncovered many metabolic quantitative trait loci (mQTLs) influencing human metabolic individuality, though predominantly in European cohorts. By combining whole-exome sequencing with a high-resolution metabolomics profiling for a highly consanguineous Middle Eastern population, we discover 21 common variant and 12 functional rare variant mQTLs, of which 45% are novel altogether. We fine-map 10 common variant mQTLs to new metabolite ratio associations, and 11 common variant mQTLs to putative protein-altering variants. This is the first work to report common and rare variant mQTLs linked to diseases and/or pharmacological targets in a consanguineous Arab cohort, with wide implications for precision medicine in the Middle East.

  11. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Alexander C Outhred

    Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.

  12. The role of the physician: Eugene Sanger and a standard of care at the Elmira prison camp.

    Science.gov (United States)

    Waggoner, Jesse

    2008-01-01

    The conduct of American military physicians in prisoner of war (POW) camps has been called into question by the abuse scandals at Abu Ghraib and Guantánamo Bay. This essay explores the experiences of the first U.S. military physicians to confront POW patients in large numbers-events that occurred during the American Civil War. While POWs received sub-standard care in camps north and south, the war also saw the issuance of the first document to outline the rights of POWs. This ambivalence toward the proper care and treatment of the POW is evident in the career of Dr. Eugene Sanger, the first Union surgeon at the prison camp in Elmira, New York. Sanger demonstrated both concern about the sanitary condition of the camp and pride in the deaths of POWs as furthering the overall war aims. His cruelty attracted some censure, but Sanger never faced disciplinary action. He was honorably discharged and went on to become the Surgeon General of his home state. This article places his actions at Elmira in the context of medical ethics, Army orders, and Northern opinion in 1864, and it will argue that the lack of Federal response to Eugene Sanger's poor record while serving at the prison set a precedent for inferior medical care of POWs by American military physicians.

  13. Greater than the sum of its parts: single-nucleus sequencing identifies convergent evolution of independent EGFR mutants in GBM.

    Science.gov (United States)

    Gini, Beatrice; Mischel, Paul S

    2014-08-01

    Single-cell sequencing approaches are needed to characterize the genomic diversity of complex tumors, shedding light on their evolutionary paths and potentially suggesting more effective therapies. In this issue of Cancer Discovery, Francis and colleagues develop a novel integrative approach to identify distinct tumor subpopulations based on joint detection of clonal and subclonal events from bulk tumor and single-nucleus whole-genome sequencing, allowing them to infer a subclonal architecture. Surprisingly, the authors identify convergent evolution of multiple, mutually exclusive, independent EGFR gain-of-function variants in a single tumor. This study demonstrates the value of integrative single-cell genomics and highlights the biologic primacy of EGFR as an actionable target in glioblastoma. ©2014 American Association for Cancer Research.

  14. Integrative analysis of functional genomic annotations and sequencing data to identify rare causal variants via hierarchical modeling

    Directory of Open Access Journals (Sweden)

    Marinela Capanu

    2015-05-01

    Identifying the small number of rare causal variants contributing to disease has been a major focus of investigation in recent years, but represents a formidable statistical challenge due to the rare frequencies with which these variants are observed. In this commentary we draw attention to a formal statistical framework, namely hierarchical modeling, to combine functional genomic annotations with sequencing data with the objective of enhancing our ability to identify rare causal variants. Using simulations we show that in all configurations studied, the hierarchical modeling approach has superior discriminatory ability compared to a recently proposed aggregate measure of deleteriousness, the Combined Annotation-Dependent Depletion (CADD) score, supporting our premise that aggregate functional genomic measures can more accurately identify causal variants when used in conjunction with sequencing data through a hierarchical modeling approach.

  15. Use of Whole-Genus Genome Sequence Data To Develop a Multilocus Sequence Typing Tool That Accurately Identifies Yersinia Isolates to the Species and Subspecies Levels

    Science.gov (United States)

    Hall, Miquette; Chattaway, Marie A.; Reuter, Sandra; Savin, Cyril; Strauch, Eckhard; Carniel, Elisabeth; Connor, Thomas; Van Damme, Inge; Rajakaruna, Lakshani; Rajendram, Dunstan; Jenkins, Claire; Thomson, Nicholas R.

    2014-01-01

    The genus Yersinia is a large and diverse bacterial genus consisting of human-pathogenic species, a fish-pathogenic species, and a large number of environmental species. Recently, the phylogenetic and population structure of the entire genus was elucidated through the genome sequence data of 241 strains encompassing every known species in the genus. Here we report the mining of this enormous data set to create a multilocus sequence typing-based scheme that can identify Yersinia strains to the species level to a level of resolution equal to that for whole-genome sequencing. Our assay is designed to be able to accurately subtype the important human-pathogenic species Yersinia enterocolitica to whole-genome resolution levels. We also report the validation of the scheme on 386 strains from reference laboratory collections across Europe. We propose that the scheme is an important molecular typing system to allow accurate and reproducible identification of Yersinia isolates to the species level, a process often inconsistent in nonspecialist laboratories. Additionally, our assay is the most phylogenetically informative typing scheme available for Y. enterocolitica. PMID:25339391

  16. Activity of Posaconazole and Other Antifungal Agents against Mucorales Strains Identified by Sequencing of Internal Transcribed Spacers

    Science.gov (United States)

    Alastruey-Izquierdo, Ana; Castelli, Maria Victoria; Cuesta, Isabel; Monzon, Araceli; Cuenca-Estrella, Manuel; Rodriguez-Tudela, Juan Luis

    2009-01-01

    The antifungal susceptibility profiles of 77 clinical strains of Mucorales species, identified by internal transcribed spacer sequencing, were analyzed. MICs obtained at 24 and 48 h were compared. Amphotericin B was the most active agent against all isolates, except for Cunninghamella and Apophysomyces isolates. Posaconazole also showed good activity for all species but Cunninghamella bertholletiae. Voriconazole had no activity against any of the fungi tested. Terbinafine showed good activity, except for Rhizopus oryzae, Mucor circinelloides, and Rhizomucor variabilis isolates. PMID:19171801

  17. Activity of posaconazole and other antifungal agents against Mucorales strains identified by sequencing of internal transcribed spacers.

    Science.gov (United States)

    Alastruey-Izquierdo, Ana; Castelli, Maria Victoria; Cuesta, Isabel; Monzon, Araceli; Cuenca-Estrella, Manuel; Rodriguez-Tudela, Juan Luis

    2009-04-01

    The antifungal susceptibility profiles of 77 clinical strains of Mucorales species, identified by internal transcribed spacer sequencing, were analyzed. MICs obtained at 24 and 48 h were compared. Amphotericin B was the most active agent against all isolates, except for Cunninghamella and Apophysomyces isolates. Posaconazole also showed good activity for all species but Cunninghamella bertholletiae. Voriconazole had no activity against any of the fungi tested. Terbinafine showed good activity, except for Rhizopus oryzae, Mucor circinelloides, and Rhizomucor variabilis isolates.

  18. Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.

    Science.gov (United States)

    Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun

    2017-10-01

    Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  19. Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol

    NARCIS (Netherlands)

    L.A. Lange (Leslie); Y. Hu (Youna); H. Zhang (He); C. Xue (Chenyi); E.M. Schmidt (Ellen); Z.-Z. Tang (Zheng-Zheng); C. Bizon (Chris); E.M. Lange (Ethan); G.D. Smith; E.H. Turner (Emily); Y. Jun (Yang); H.M. Kang (Hyun Min); G.M. Peloso (Gina); P. Auer (Paul); K.-P. Li (Kuo-Ping); J. Flannick (Jason); J. Zhang (Ji); C. Fuchsberger (Christian); K. Gaulton (Kyle); C.M. Lindgren (Cecilia); A. Locke (Adam); A.K. Manning (Alisa); X. Sim (Xueling); M.A. Rivas (Manuel); O.L. Holmen (Oddgeir); R.F. Gottesman (Rebecca); Y. Lu (Yingchang); D. Ruderfer (Douglas); E.A. Stahl (Eli); Q. Duan (Qing); Y. Li (Yun); P. Durda (Peter); S. Jiao (Shuo); A.J. Isaacs (Aaron); A. Hofman (Albert); J.C. Bis (Joshua); D.D. Correa; M.D. Griswold (Michael); M. Jakobsdottir (Margret); G.D. Smith; P.J. Schreiner (Pamela); M.F. Feitosa (Mary Furlan); Q. Zhang (Qunyuan); J.E. Huffman (Jennifer); S. Crosby; C.L. Wassel (Christina); R. Do (Ron); N. Franceschini (Nora); L.W. Martin (Lisa); J.G. Robinson (Jennifer); T.L. Assimes (Themistocles); D.R. Crosslin (David); E.A. Rosenthal (Elisabeth); M.Y. Tsai (Michael); M. Rieder (Mark); D.N. Farlow (Deborah); A.R. Folsom (Aaron); T. Lumley (Thomas); E.R. Fox (Ervin); C.S. Carlson (Christopher); U. Peters (Ulrike); R.D. Jackson (Rebecca); C.M. van Duijn (Cornelia); A.G. Uitterlinden (André); D. Levy (Daniel); J.I. Rotter (Jerome); H.A. Taylor (Herman); V. Gudnason (Vilmundur); D.S. Siscovick (David); M. Fornage (Myriam); I.B. Borecki (Ingrid); C. Hayward (Caroline); I. Rudan (Igor); Y.E. Chen (Y. Eugene); E.P. Bottinger (Erwin); R.J.F. Loos (Ruth); P. Sætrom (Pål); K. Hveem (Kristian); M. Boehnke (Michael); L. Groop (Leif); M.I. McCarthy (Mark); T. Meitinger (Thomas); C. Ballantyne (Christie); S.B. Gabriel (Stacey); C.J. O'Donnell (Christopher); W.S. Post (Wendy S.); K.E. North (Kari); A. Reiner (Alexander); E.A. Boerwinkle (Eric); B.M. Psaty (Bruce); D. Altshuler (David); S. Kathiresan (Sekar); D.Y. Lin (Dan); G.P. Jarvik (Gail); L.A. Cupples (Adrienne); C. Kooperberg (Charles); J.G. Wilson (James); D.A. Nickerson (Deborah); G.R. Abecasis (Gonçalo); S.S. Rich (Stephen); R.P. Tracy (Russell); C.J. Willer (Cristen)

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency

  20. Discovery and molecular characterization of a new luteovirus identified by high-throughput sequencing from apple

    Science.gov (United States)

    ‘Rapid Apple Decline’ (RAD) is a newly emerging problem of young, dwarf apple trees in the northeastern USA. The affected trees show trunk necrosis, bark cracking and canker formation before collapsing in the summer. In this study, a new luteovirus and three common viruses were identified from apple...

  1. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    NARCIS (Netherlands)

    Hu, H; Haas, S.A.; Chelly, J.; Esch, H. Van; Raynaud, M.; Brouwer, A.P. de; Weinert, S.; Froyen, G.; Frints, S.G.; Laumonnier, F.; Zemojtel, T.; Love, M.I.; Richard, H.; Emde, A.K.; Bienek, M.; Jensen, C.; Hambrock, M.; Fischer, U.; Langnick, C.; Feldkamp, M.; Wissink-Lindhout, W.; Lebrun, N.; Castelnau, L.; Rucci, J.; Montjean, R.; Dorseuil, O.; Billuart, P.; Stuhlmann, T.; Shaw, M.; Corbett, M.A.; Gardner, A.; Willis-Owen, S.; Tan, C.; Friend, K.L.; Belet, S.; Roozendaal, K.E. van; Jimenez-Pocquet, M.; Moizard, M.P.; Ronce, N.; Sun, R.; O'Keeffe, S.; Chenna, R.; Bommel, A. van; Goke, J.; Hackett, A.; Field, M.; Christie, L.; Boyle, J.; Haan, E.; Nelson, J.; Turner, G.; Baynam, G.; Gillessen-Kaesbach, G.; Muller, U.; Steinberger, D.; Budny, B.; Badura-Stronka, M.; Latos-Bielenska, A.; Ousager, L.B.; Wieacker, P.; Rodriguez Criado, G.; Bondeson, M.L.; Anneren, G.; Dufke, A.; Cohen, M.; Maldergem, L. Van; Vincent-Delorme, C.; Echenne, B.; Simon-Bouy, B.; Kleefstra, T.; Willemsen, M.H.; Fryns, J.P.; Devriendt, K.; Ullmann, R.; Vingron, M.; Wrogemann, K.; Wienker, T.F.; Tzschach, A.; Bokhoven, H. van; Gecz, J.; Jentsch, T.J.; Chen, W.; Ropers, H.H.; Kalscheuer, V.M.

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or

  2. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    DEFF Research Database (Denmark)

    Hu, H; Haas, S A; Chelly, J

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes...

  3. Exome sequencing identifies early gastric carcinoma as an early stage of advanced gastric cancer.

    Directory of Open Access Journals (Sweden)

    Guhyun Kang

    Full Text Available Gastric carcinoma is one of the major causes of cancer-related mortality worldwide. Early detection and treatment lead to an excellent prognosis in patients with early gastric cancer (EGC), whereas the prognosis of patients with advanced gastric cancer (AGC) remains poor. It is unclear whether EGCs and AGCs are distinct entities or whether EGCs are the beginning stages of AGCs. We performed whole exome sequencing of four samples from patients with EGC and compared the results with those from AGCs. In both EGCs and AGCs, a total of 268 genes were commonly mutated and independent mutations were additionally found in EGCs (516 genes) and AGCs (3104 genes). A higher frequency of C>G transitions was observed in intestinal-type compared to diffuse-type carcinomas (P = 0.010). The DYRK3, GPR116, MCM10, PCDH17, PCDHB1, RDH5 and UNC5C genes are recurrently mutated in EGCs and may be involved in early carcinogenesis.

  4. A survey of single nucleotide polymorphisms identified from whole-genome sequencing and their functional effect in the porcine genome.

    Science.gov (United States)

    Keel, B N; Nonneman, D J; Rohrer, G A

    2017-08-01

    Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
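    The "over 99%" figure in the abstract above follows from the two SNP counts it quotes. A minimal Python sketch of that arithmetic, assuming the rounded value of 139 000 functional SNPs (the variable names are illustrative and do not come from the paper):

```python
# Counts quoted in the abstract; 139 000 is given there as an approximation.
total_snps = 22_342_915
functional_snps = 139_000

# Share of SNPs classified as loss-of-function, nonsynonymous or regulatory.
functional_fraction = functional_snps / total_snps

# Share that could potentially be set aside in downstream analyses.
ignorable_fraction = 1 - functional_fraction

print(f"functional: {functional_fraction:.2%}")
print(f"potentially ignorable: {ignorable_fraction:.2%}")
```

    Under these inputs the functional share is well below 1%, which is consistent with the abstract's statement that more than 99% of the detected variation could potentially be ignored.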

  5. Deep sequencing identifies ethnicity-specific bacterial signatures in the oral microbiome.

    Directory of Open Access Journals (Sweden)

    Matthew R Mason

    Full Text Available Oral infections have a strong ethnic predilection, suggesting that ethnicity is a critical determinant of oral microbial colonization. Dental plaque and saliva samples from 192 subjects belonging to four major ethnicities in the United States were analyzed using terminal restriction fragment length polymorphism (t-RFLP) and 16S pyrosequencing. Ethnicity-specific clustering of microbial communities was apparent in saliva and subgingival biofilms, and a machine-learning classifier was capable of identifying an individual's ethnicity from subgingival microbial signatures. The classifier identified African Americans with a 100% sensitivity and 74% specificity and Caucasians with a 50% sensitivity and 91% specificity. The data demonstrate a significant association between ethnic affiliation and the composition of the oral microbiome, to the extent that these microbial signatures appear to be capable of discriminating between ethnicities.

  6. Exome sequencing identifies rare deleterious mutations in DNA repair genes FANCC and BLM as potential breast cancer susceptibility alleles.

    Directory of Open Access Journals (Sweden)

    Ella R Thompson

    2012-09-01

    Full Text Available Despite intensive efforts using linkage and candidate gene approaches, the genetic etiology for the majority of families with a multi-generational breast cancer predisposition is unknown. In this study, we used whole-exome sequencing of thirty-three individuals from 15 breast cancer families to identify potential predisposing genes. Our analysis identified families with heterozygous, deleterious mutations in the DNA repair genes FANCC and BLM, which are responsible for the autosomal recessive disorders Fanconi Anemia and Bloom syndrome. In total, screening of all exons in these genes in 438 breast cancer families identified three with truncating mutations in FANCC and two with truncating mutations in BLM. Additional screening of FANCC mutation hotspot exons identified one pathogenic mutation among an additional 957 breast cancer families. Importantly, none of the deleterious mutations were identified among 464 healthy controls and are not reported in the 1,000 Genomes data. Given the rarity of Fanconi Anemia and Bloom syndrome disorders among Caucasian populations, the finding of multiple deleterious mutations in these critical DNA repair genes among high-risk breast cancer families is intriguing and suggestive of a predisposing role. Our data demonstrate the utility of intra-family exome-sequencing approaches to uncover cancer predisposition genes, but highlight the major challenge of definitively validating candidates where the incidence of sporadic disease is high, germline mutations are not fully penetrant, and individual predisposition genes may only account for a tiny proportion of breast cancer families.

  7. High-throughput sequencing enhanced phage display identifies peptides that bind mycobacteria

    CSIR Research Space (South Africa)

    Ngubane, NAC

    2013-11-01

    Full Text Available . The displayed peptides are flanked by two cysteine residues, which are oxidized during phage assembly to a disulfide bond, resulting in a loop constrained peptide. We initially used the traditional clone picking method to identify the enriched clones... of the library, 1.236109 heptapeptides, it represented sufficient depth to measure the quantitative enrichment of relevant peptides. To confirm successful enrichment during selection, we characterized the reduction in diversity of the pool in the consecutive...

  8. Exome sequencing identifies DYNC2H1 mutations as a common cause of asphyxiating thoracic dystrophy (Jeune syndrome) without major polydactyly, renal or retinal involvement

    Science.gov (United States)

    Schmidts, Miriam; Arts, Heleen H; Bongers, Ernie M H F; Yap, Zhimin; Oud, Machteld M; Antony, Dinu; Duijkers, Lonneke; Emes, Richard D; Stalker, Jim; Yntema, Jan-Bart L; Plagnol, Vincent; Hoischen, Alexander; Gilissen, Christian; Forsythe, Elisabeth; Lausch, Ekkehart; Veltman, Joris A; Roeleveld, Nel; Superti-Furga, Andrea; Kutkowska-Kazmierczak, Anna; Kamsteeg, Erik-Jan; Elçioğlu, Nursel; van Maarle, Merel C; Graul-Neumann, Luitgard M; Devriendt, Koenraad; Smithson, Sarah F; Wellesley, Diana; Verbeek, Nienke E; Hennekam, Raoul C M; Kayserili, Hulya; Scambler, Peter J; Beales, Philip L; Knoers, Nine VAM; Roepman, Ronald; Mitchison, Hannah M

    2013-01-01

    Background: Jeune asphyxiating thoracic dystrophy (JATD) is a rare, often lethal, recessively inherited chondrodysplasia characterised by shortened ribs and long bones, sometimes accompanied by polydactyly, and renal, liver and retinal disease. Mutations in intraflagellar transport (IFT) genes cause JATD, including the IFT dynein-2 motor subunit gene DYNC2H1. Genetic heterogeneity and the large DYNC2H1 gene size have hindered JATD genetic diagnosis. Aims and methods: To determine the contribution to JATD we screened DYNC2H1 in 71 JATD patients combining SNP mapping, Sanger sequencing and exome sequencing. Results and conclusions: We detected 34 DYNC2H1 mutations in 29/71 (41%) patients from 19/57 families (33%), showing it as a major cause of JATD especially in Northern European patients. This included 13 early protein termination mutations (nonsense/frameshift, deletion, splice site) but no patients carried these in combination, suggesting the human phenotype is at least partly hypomorphic. In addition, 21 missense mutations were distributed across DYNC2H1 and these showed some clustering to functional domains, especially the ATP motor domain. DYNC2H1 patients largely lacked significant extra-skeletal involvement, demonstrating an important genotype–phenotype correlation in JATD. Significant variability exists in the course and severity of the thoracic phenotype, both between affected siblings with identical DYNC2H1 alleles and among individuals with different alleles, which suggests the DYNC2H1 phenotype might be subject to modifier alleles, non-genetic or epigenetic factors. Assessment of fibroblasts from patients showed accumulation of anterograde IFT proteins in the ciliary tips, confirming defects similar to patients with other retrograde IFT machinery mutations, which may be of undervalued potential for diagnostic purposes. PMID:23456818

  9. Particular Candida albicans strains in the digestive tract of dyspeptic patients, identified by multilocus sequence typing.

    Directory of Open Access Journals (Sweden)

    Yan-Bing Gong

    Full Text Available BACKGROUND: Candida albicans is a human commensal that is also responsible for chronic gastritis and peptic ulcerous disease. Little is known about the genetic profiles of the C. albicans strains in the digestive tract of dyspeptic patients. The aim of this study was to evaluate the prevalence, diversity, and genetic profiles among C. albicans isolates recovered from natural colonization of the digestive tract in the dyspeptic patients. METHODS AND FINDINGS: Oral swab samples (n = 111) and gastric mucosa samples (n = 102) were obtained from a group of patients who presented dyspeptic symptoms or ulcer complaints. Oral swab samples (n = 162) were also obtained from healthy volunteers. C. albicans isolates were characterized and analyzed by multilocus sequence typing. The prevalence of Candida spp. in the oral samples was not significantly different between the dyspeptic group and the healthy group (36.0%, 40/111 vs. 29.6%, 48/162; P > 0.05). However, there were significant differences between the groups in the distribution of species isolated and the genotypes of the C. albicans isolates. C. albicans was isolated from 97.8% of the Candida-positive subjects in the dyspeptic group, but from only 56.3% in the healthy group (P < 0.001). DST1593 was the dominant C. albicans genotype from the digestive tract of the dyspeptic group (60%, 27/45), but not the healthy group (14.8%, 4/27) (P < 0.001). CONCLUSIONS: Our data suggest a possible link between particular C. albicans strain genotypes and the host microenvironment. Positivity for particular C. albicans genotypes could signify susceptibility to dyspepsia.
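    The percentages in the abstract above can be recomputed from the counts it reports. A minimal Python sketch, assuming only the count pairs quoted in the text (the dictionary labels are illustrative; no significance test is attempted here):

```python
# (positives, total) pairs as quoted in the abstract.
counts = {
    "Candida spp., oral, dyspeptic": (40, 111),
    "Candida spp., oral, healthy": (48, 162),
    "DST1593, dyspeptic": (27, 45),
    "DST1593, healthy": (4, 27),
}

# Convert each pair to a percentage rounded to one decimal place.
percentages = {
    label: round(100 * positives / total, 1)
    for label, (positives, total) in counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

    The recomputed values agree with the 36.0%, 29.6%, 60% and 14.8% given in the abstract.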

  10. Multiplexed microsatellite recovery using massively parallel sequencing

    Science.gov (United States)

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
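    The cost projection in the abstract above can be checked with simple arithmetic. A minimal Python sketch, assuming the quoted figures (1373 listed organisms, a per-species cost below $400, and a $0.5M total); the variable names are illustrative:

```python
# Figures quoted in the abstract.
species = 1373
budget_usd = 500_000
quoted_cost_ceiling_usd = 400  # "less than $400 (USD) per species"

# Total if every species cost exactly the quoted ceiling.
cost_at_ceiling = species * quoted_cost_ceiling_usd

# Average per-species cost at which the total equals the budget.
break_even = budget_usd / species

print(f"total at $400 each: ${cost_at_ceiling:,}")
print(f"break-even per species: ${break_even:.2f}")
```

    At exactly $400 per species the total would slightly exceed $0.5M, so the abstract's projection implies an average cost of roughly $364 per species or lower, which is compatible with "less than $400".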

  11. Saprolegniaceae identified on amphibian eggs throughout the Pacific Northwest, USA, by internal transcribed spacer sequences and phylogenetic analysis.

    Science.gov (United States)

    Petrisko, Jill E; Pearl, Christopher A; Pilliod, David S; Sheridan, Peter P; Williams, Charles F; Peterson, Charles R; Bury, R Bruce

    2008-01-01

    We assessed the diversity and phylogeny of Saprolegniaceae on amphibian eggs from the Pacific Northwest, with particular focus on Saprolegnia ferax, a species implicated in high egg mortality. We identified isolates from eggs of six amphibians with the internal transcribed spacer (ITS) and 5.8S gene regions and BLAST of the GenBank database. We identified 68 sequences as Saprolegniaceae and 43 sequences as true fungi from at least nine genera. Our phylogenetic analysis of the Saprolegniaceae included isolates within the genera Saprolegnia, Achlya and Leptolegnia. Our phylogeny grouped S. semihypogyna with Achlya rather than with the Saprolegnia reference sequences. We found only one isolate that grouped closely with S. ferax, and this came from a hatchery-raised salmon (Idaho) that we sampled opportunistically. We had representatives of 7-12 species and three genera of Saprolegniaceae on our amphibian eggs. Further work on the ecological roles of different species of Saprolegniaceae is needed to clarify their potential importance in amphibian egg mortality and potential links to population declines.

  12. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    Science.gov (United States)

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  13. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  14. Validation of rearrangement break points identified by paired-end sequencing in natural populations of Drosophila melanogaster.

    Science.gov (United States)

    Cridland, Julie M; Thornton, Kevin R

    2010-01-13

    Several recent studies have focused on the evolution of recently duplicated genes in Drosophila. Currently, however, little is known about the evolutionary forces acting upon duplications that are segregating in natural populations. We used a high-throughput, paired-end sequencing platform (Illumina) to identify structural variants in a population sample of African D. melanogaster. Polymerase chain reaction and sequencing confirmation of duplications detected by multiple, independent paired-ends showed that paired-end sequencing reliably uncovered the break points of structural rearrangements and allowed us to identify a number of tandem duplications segregating within a natural population. Our confirmation experiments show that rates of confirmation are very high, even at modest coverage. Our results also compare well with previous studies using microarrays (Emerson J, Cardoso-Moreira M, Borevitz JO, Long M. 2008. Natural selection shapes genome wide patterns of copy-number polymorphism in Drosophila melanogaster. Science. 320:1629-1631; Dopman EB, Hartl DL. 2007. A portrait of copy-number polymorphism in Drosophila melanogaster. Proc Natl Acad Sci U S A. 104:19920-19925), which both gives us confidence in the results of this study and confirms the previous microarray results. We were also able to identify whole-gene duplications, such as a novel duplication of Or22a, an olfactory receptor, and identify copy-number differences in genes previously known to be under positive selection, like Cyp6g1, which confers resistance to dichlorodiphenyltrichloroethane. Several "hot spots" of duplications were detected in this study, indicating that particular regions of the genome may be more prone to generating duplications. Finally, population frequency analysis of confirmed events also showed an excess of rare variants in our population, which indicates that duplications segregating in the population may be deleterious and ultimately destined to be lost from the

  15. Cytoplasmic male sterility-associated chimeric open reading frames identified by mitochondrial genome sequencing of four Cajanus genotypes.

    Science.gov (United States)

    Tuteja, Reetu; Saxena, Rachit K; Davila, Jaime; Shah, Trushar; Chen, Wenbin; Xiao, Yong-Li; Fan, Guangyi; Saxena, K B; Alverson, Andrew J; Spillane, Charles; Town, Christopher; Varshney, Rajeev K

    2013-10-01

    The hybrid pigeonpea (Cajanus cajan) breeding technology based on cytoplasmic male sterility (CMS) is currently unique among legumes and displays major potential for yield increase. CMS is defined as a condition in which a plant is unable to produce functional pollen grains. The novel chimeric open reading frames (ORFs) produced as a result of mitochondrial genome rearrangements are considered to be the main cause of CMS. To identify these CMS-related ORFs in pigeonpea, we sequenced the mitochondrial genomes of three C. cajan lines (the male-sterile line ICPA 2039, the maintainer line ICPB 2039, and the hybrid line ICPH 2433) and of the wild relative (Cajanus cajanifolius ICPW 29). A single, circular-mapping molecule of length 545.7 kb was assembled and annotated for the ICPA 2039 line. Sequence annotation predicted 51 genes, including 34 protein-coding and 17 RNA genes. Comparison of the mitochondrial genomes from different Cajanus genotypes identified 31 ORFs, which differ between lines within which CMS is present or absent. Among these chimeric ORFs, 13 were identified by comparison of the related male-sterile and maintainer lines. These ORFs display features that are known to trigger CMS in other plant species and represent the most promising candidates for CMS-related mitochondrial rearrangements in pigeonpea.

  16. T cell receptor sequencing of early-stage breast cancer tumors identifies altered clonal structure of the T cell repertoire.

    Science.gov (United States)

    Beausang, John F; Wheeler, Amanda J; Chan, Natalie H; Hanft, Violet R; Dirbas, Frederick M; Jeffrey, Stefanie S; Quake, Stephen R

    2017-11-28

    Tumor-infiltrating T cells play an important role in many cancers, and can improve prognosis and yield therapeutic targets. We characterized T cells infiltrating both breast cancer tumors and the surrounding normal breast tissue to identify T cells specific to each, as well as their abundance in peripheral blood. Using immune profiling of the T cell beta-chain repertoire in 16 patients with early-stage breast cancer, we show that the clonal structure of the tumor is significantly different from adjacent breast tissue, with the tumor containing ∼2.5-fold greater density of T cells and higher clonality compared with normal breast. The clonal structure of T cells in blood and normal breast is more similar than between blood and tumor, and could be used to distinguish tumor from normal breast tissue in 14 of 16 patients. Many T cell sequences overlap between tissue and blood from the same patient, including ∼50% of T cells between tumor and normal breast. Both tumor and normal breast contain high-abundance "enriched" sequences that are absent or of low abundance in the other tissue. Many of these T cells are either not detected or detected with very low frequency in the blood, suggesting the existence of separate compartments of T cells in both tumor and normal breast. Enriched T cell sequences are typically unique to each patient, but a subset is shared between many different patients. We show that many of these are commonly generated sequences, and thus unlikely to play an important role in the tumor microenvironment. Copyright © 2017 the Author(s). Published by PNAS.

  17. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    DEFF Research Database (Denmark)

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans

    to map risk variants involved in complex traits. We aim at utilizing samples of cases and controls of the isolated population of the Faroe Islands to conduct whole-genome-sequence analysis in order to identify rare genetic variants associated with schizophrenia. We will search for rare genetic variants...... of developing SZ. However, these studies are designed to examining only “the common variant” proportion of the genomic landscape of SZ. Due to increased genetic drift during founding and potential bottlenecks, followed by population expansion, isolated populations may be particularly useful in identifying rare...... disease variants, that may appear at higher frequencies and/or within a more clearly distinct haplotype structure compared to outbred populations. Small isolated populations also typically show reduced phenotypic, genetic and environmental heterogeneity, thus making them advantageous in studies aiming...

  18. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf; Korol, Abraham; Hübner, Sariel; Hernandez, Alvaro G.; Thimmapuram, Jyothi; Ali, Shahjahan; Glaser, Fabian; Paz, Arnon; Avivi, Aaron; Band, Mark

    2011-01-01

    sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly

  19. Cross-comparison of the genome sequences from human, chimpanzee, Neanderthal and a Denisovan hominin identifies novel potentially compensated mutations

    Directory of Open Access Journals (Sweden)

    Zhang Guojie

    2011-07-01

    Full Text Available The recent publication of the draft genome sequences of the Neanderthal and a ~50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database) and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs) of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met) was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.

  20. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    Science.gov (United States)

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.

  1. Targeted next generation sequencing identifies functionally deleterious germline mutations in novel genes in early-onset/familial prostate cancer.

    Directory of Open Access Journals (Sweden)

    Paula Paulo

    2018-04-01

    Full Text Available Considering that mutations in known prostate cancer (PrCa) predisposition genes, including those responsible for hereditary breast/ovarian cancer and Lynch syndromes, explain less than 5% of early-onset/familial PrCa, we have sequenced 94 genes associated with cancer predisposition using next generation sequencing (NGS) in a series of 121 PrCa patients. We found monoallelic truncating/functionally deleterious mutations in seven genes, including ATM and CHEK2, which have previously been associated with PrCa predisposition, and five new candidate PrCa associated genes involved in cancer predisposing recessive disorders, namely RAD51C, FANCD2, FANCI, CEP57 and RECQL4. Furthermore, using in silico pathogenicity prediction of missense variants among 18 genes associated with breast/ovarian cancer and/or Lynch syndrome, followed by KASP genotyping in 710 healthy controls, we identified "likely pathogenic" missense variants in ATM, BRIP1, CHEK2 and TP53. In conclusion, this study has identified putative PrCa predisposing germline mutations in 14.9% of early-onset/familial PrCa patients. Further data will be necessary to confirm the genetic heterogeneity of inherited PrCa predisposition hinted in this study.

  2. Somatic mutation profiles of MSI and MSS colorectal cancer identified by whole exome next generation sequencing and bioinformatics analysis.

    Directory of Open Access Journals (Sweden)

    Bernd Timmermann

    Full Text Available BACKGROUND: Colorectal cancer (CRC) is, with approximately 1 million cases, the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope to improve early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large amount of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent unaffected normal colonic tissue from microsatellite stable (MSS) and microsatellite instable (MSI) colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. According to predictions based on MSS and MSI pathomechanisms we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the rate of most significant mutations to 359 for MSI and 45 for MSS CRCs with predicted altered protein functions. In both CRCs, MSI and MSS, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A, BMPR1A, a gene in which germline mutations have so far been associated with juvenile polyposis syndrome, and show that the mutations functionally impair the protein function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.

  3. Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A

    Directory of Open Access Journals (Sweden)

    Regina Stoltenburg

    2018-02-01

    Full Text Available New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aureus. In this study, we show the extension of the SELEX results by re-sequencing of the same aptamer pool using a medium throughput NGS approach and data analysis. Both data pools were compared. They confirm the selection of a highly complex and heterogeneous oligonucleotide pool and show consistently a high content of orphans as well as a similar relative frequency of certain sequence groups. But in contrast to the Sanger data pool, the NGS pool was clearly dominated by one sequence group containing the known Protein A-binding aptamer PA#2/8 as the most frequent sequence in this group. In addition, we found two new sequence groups in the NGS pool represented by PA-C10 and PA-C8, respectively, which also have high specificity for Protein A. Comparative affinity studies reveal differences between the aptamers and confirm that PA#2/8 remains the most potent sequence within the selected aptamer pool reaching affinities in the low nanomolar range of KD = 20 ± 1 nM.

  4. Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A.

    Science.gov (United States)

    Stoltenburg, Regina; Strehlitz, Beate

    2018-02-24

    New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aureus. In this study, we show the extension of the SELEX results by re-sequencing of the same aptamer pool using a medium throughput NGS approach and data analysis. Both data pools were compared. They confirm the selection of a highly complex and heterogeneous oligonucleotide pool and show consistently a high content of orphans as well as a similar relative frequency of certain sequence groups. But in contrast to the Sanger data pool, the NGS pool was clearly dominated by one sequence group containing the known Protein A-binding aptamer PA#2/8 as the most frequent sequence in this group. In addition, we found two new sequence groups in the NGS pool represented by PA-C10 and PA-C8, respectively, which also have high specificity for Protein A. Comparative affinity studies reveal differences between the aptamers and confirm that PA#2/8 remains the most potent sequence within the selected aptamer pool reaching affinities in the low nanomolar range of KD = 20 ± 1 nM.

  5. Targeted exome sequencing reveals novel USH2A mutations in Chinese patients with simplex Usher syndrome.

    Science.gov (United States)

    Shu, Hai-Rong; Bi, Huai; Pan, Yang-Chun; Xu, Hang-Yu; Song, Jian-Xin; Hu, Jie

    2015-09-16

    Usher syndrome (USH) is an autosomal recessive disorder characterized by hearing impairment and vision dysfunction due to retinitis pigmentosa. Phenotypic and genetic heterogeneities of this disease make it impractical to obtain a genetic diagnosis by conventional Sanger sequencing. In this study, we applied a next-generation sequencing approach to detect genetic abnormalities in patients with USH. Two unrelated Chinese families were recruited, consisting of two USH-afflicted patients and four unaffected relatives. We selected 199 genes related to inherited retinal diseases as targets for deep exome sequencing. Through systematic data analysis using an established bioinformatics pipeline, all variants that passed filter criteria were validated by Sanger sequencing and co-segregation analysis. A homozygous frameshift mutation (c.4382delA, p.T1462Lfs*2) was revealed in exon 20 of the USH2A gene in the F1 family. Two compound heterozygous mutations, IVS47+1G>A and c.13156A>T (p.I4386F), located in intron 48 and exon 63 respectively, of USH2A, were identified as causative mutations for the F2 family. Of note, the missense mutation c.13156A>T has not been reported so far. In conclusion, targeted exome sequencing precisely and rapidly identified the genetic defects in two Chinese USH families and this technique can be applied as a routine examination for these disorders with significant clinical and genetic heterogeneity.

  6. De novo sequencing of circulating miRNAs identifies novel markers predicting clinical outcome of locally advanced breast cancer

    Directory of Open Access Journals (Sweden)

    Wu Xiwei

    2012-03-01

    Full Text Available Background: MicroRNAs (miRNAs) have been recently detected in the circulation of cancer patients, where they are associated with clinical parameters. Discovery profiling of circulating small RNAs has not been reported in breast cancer (BC), and was carried out in this study to identify blood-based small RNA markers of BC clinical outcome. Methods: The pre-treatment sera of 42 stage II-III locally advanced and inflammatory BC patients who received neoadjuvant chemotherapy (NCT) followed by surgical tumor resection were analyzed for marker identification by deep sequencing all circulating small RNAs. An independent validation cohort of 26 stage II-III BC patients was used to assess the power of identified miRNA markers. Results: More than 800 miRNA species were detected in the circulation, and observed patterns showed association with histopathological profiles of BC. Groups of circulating miRNAs differentially associated with ER/PR/HER2 status and inflammatory BC were identified. The relative levels of selected miRNAs measured by PCR showed consistency with their abundance determined by deep sequencing. Two circulating miRNAs, miR-375 and miR-122, exhibited strong correlations with clinical outcomes, including NCT response and relapse with metastatic disease. In the validation cohort, higher levels of circulating miR-122 specifically predicted metastatic recurrence in stage II-III BC patients. Conclusions: Our study indicates that certain miRNAs can serve as potential blood-based biomarkers for NCT response, and that miR-122 prevalence in the circulation predicts BC metastasis in early-stage patients. These results may allow optimized chemotherapy treatments and preventive anti-metastasis interventions in future clinical applications.

  7. Full genome sequencing and genetic characterization of Eubenangee viruses identify Pata virus as a distinct species within the genus Orbivirus.

    Directory of Open Access Journals (Sweden)

    Manjunatha N Belaganahalli

    Full Text Available Eubenangee virus has previously been identified as the cause of Tammar sudden death syndrome (TSDS). Eubenangee virus (EUBV), Tilligery virus (TILV), Pata virus (PATAV) and Ngoupe virus (NGOV) are currently all classified within the Eubenangee virus species of the genus Orbivirus, family Reoviridae. Full genome sequencing confirmed that EUBV and TILV (both of which are from Australia) show high levels of amino acid (aa) sequence identity (>92%) in the conserved polymerase VP1(Pol), sub-core VP3(T2) and outer core VP7(T13) proteins, and are therefore appropriately classified within the same virus species. However, they show much lower aa identity levels in their larger outer-capsid protein VP2 (<53%), consistent with membership of two different serotypes, EUBV-1 and EUBV-2, respectively. In contrast, PATAV showed significantly lower levels of aa sequence identity with either EUBV or TILV (with <71% in VP1(Pol) and VP3(T2), and <57% aa identity in VP7(T13)), consistent with membership of a distinct virus species. A proposal has therefore been sent to the Reoviridae Study Group of ICTV to recognise 'Pata virus' as a new Orbivirus species, with the PATAV isolate as serotype 1 (PATAV-1). Amongst the other orbiviruses, PATAV shows closest relationships to Epizootic Haemorrhagic Disease virus (EHDV), with 80.7%, 72.4% and 66.9% aa identity in VP3(T2), VP1(Pol), and VP7(T13), respectively. Although Ngoupe virus was not available for these studies, like PATAV it was isolated in Central Africa, and therefore seems likely to also belong to the new species, possibly as a distinct 'type'. The data presented will facilitate diagnostic assay design and the identification of additional isolates of these viruses.
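
    The amino acid identity percentages quoted in this record are the kind of figure obtained by comparing two aligned protein sequences column by column. The sketch below is only an illustration of that arithmetic, not the authors' method: the function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are all assumptions, and other definitions of identity (for example, dividing by the full alignment length) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are ignored,
    so the denominator is the number of gap-free columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up toy sequences, not real VP1/VP3/VP7 data.
    seq_1 = "MKT-LLVA"
    seq_2 = "MKTALIVA"
    print(f"{percent_identity(seq_1, seq_2):.1f}% identity")
```

    In this toy pair, seven columns are gap-free and six of them match, so the function reports about 85.7%.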

  8. The First Endogenous Herpesvirus, Identified in the Tarsier Genome, and Novel Sequences from Primate Rhadinoviruses and Lymphocryptoviruses

    Science.gov (United States)

    Aswad, Amr; Katzourakis, Aris

    2014-01-01

    Herpesviridae is a diverse family of large and complex pathogens whose genomes are extremely difficult to sequence. This is particularly true for clinical samples, and if the virus, host, or both genomes are being sequenced for the first time. Although herpesviruses are known to occasionally integrate in host genomes, and can also be inherited in a Mendelian fashion, they are notably absent from the genomic fossil record comprised of endogenous viral elements (EVEs). Here, we combine paleovirological and metagenomic approaches to both explore the constituent viral diversity of mammalian genomes and search for endogenous herpesviruses. We describe the first endogenous herpesvirus from the genome of the Philippine tarsier, belonging to the Roseolovirus genus, and characterize its highly defective genome that is integrated and flanked by unambiguous host DNA. From a draft assembly of the aye-aye genome, we use bioinformatic tools to reveal over 100,000 bp of a novel rhadinovirus that is the first lemur gammaherpesvirus, closely related to Kaposi's sarcoma-associated virus. We also identify 58 genes of Pan paniscus lymphocryptovirus 1, the bonobo equivalent of human Epstein-Barr virus. For each of the viruses, we postulate gene function via comparative analysis to known viral relatives. Most notably, the evidence from gene content and phylogenetics suggests that the aye-aye sequences represent the most basal known rhadinovirus, and indicates that tumorigenic herpesviruses have been infecting primates since their emergence in the late Cretaceous. Overall, these data show that a genomic fossil record of herpesviruses exists despite their extremely large genomes, and expand the known diversity of Herpesviridae, which will aid the characterization of pathogenesis. Our analytical approach illustrates the benefit of intersecting evolutionary approaches with metagenomics, genetics and paleovirology. PMID:24945689

  9. Genotyping-by-sequencing in an orphan plant species Physocarpus opulifolius helps identify the evolutionary origins of the genus Prunus.

    Science.gov (United States)

    Buti, Matteo; Sargent, Daniel J; Mhelembe, Khethani G; Delfino, Pietro; Tobutt, Kenneth R; Velasco, Riccardo

    2016-05-11

    The Rosaceae family encompasses numerous genera exhibiting morphological diversification in fruit types and plant habit as well as a wide variety of chromosome numbers. Comparative genomics between various Rosaceous genera has led to the hypothesis that the ancestral genome of the family contained nine chromosomes. However, the synteny studies performed in the Rosaceae to date encompass species with base chromosome numbers x = 7 (Fragaria), x = 8 (Prunus), and x = 17 (Malus), and no study has included species from one of the many Rosaceous genera containing a base chromosome number of x = 9. A genetic linkage map of the species Physocarpus opulifolius (x = 9) was populated with sequence characterised SNP markers using genotyping by sequencing. This allowed, for the first time, the extent of genome diversification of a Rosaceous genus with a base chromosome number of x = 9 to be assessed. Orthologous loci distributed throughout the nine chromosomes of Physocarpus and the eight chromosomes of Prunus were identified, which permitted a meaningful comparison of the genomes of these two genera to be made. The study revealed a high level of macro-synteny between the two genomes, and relatively few chromosomal rearrangements, as has been observed in studies of other Rosaceous genomes, lending further support for a relatively simple model of genomic evolution in Rosaceae.

  10. Flavonoid Biosynthesis Genes Putatively Identified in the Aromatic Plant Polygonum minus via Expressed Sequences Tag (EST) Analysis

    Directory of Open Access Journals (Sweden)

    Zamri Zainal

    2012-02-01

    Full Text Available P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients such as flavonoid. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Thus, large-scale sequencing of P. minus cDNA libraries identified 4196 expressed sequences tags (ESTs), which were deposited in dbEST in the National Center of Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs, chalcone synthase, CHS (JG745304), flavonol synthase, FLS (JG705819), and leucoanthocyanidin dioxygenase, LDOX (JG745247), were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.

  11. De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube.

    Science.gov (United States)

    Iaria, Domenico; Chiappetta, Adriana; Muzzalupo, Innocenzo

    2016-01-01

    In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear and the molecular basis underlying this process is still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing its own pistil and in combination pollen/pistil from self-sterile and self-fertile cultivars, have a distinct gene expression profile and many of the differentially expressed sequences between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members in actin and actin depolymerization factor and fibrin gene family and member of the Ca(2+) binding gene family related to the development and polarization of pollen apical tip. The whole transcriptomic analysis, through the identification of the differentially expressed transcripts set and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.

  12. Use of deep whole-genome sequencing data to identify structure risk variants in breast cancer susceptibility genes.

    Science.gov (United States)

    Guo, Xingyi; Shi, Jiajun; Cai, Qiuyin; Shu, Xiao-Ou; He, Jing; Wen, Wanqing; Allen, Jamie; Pharoah, Paul; Dunning, Alison; Hunter, David J; Kraft, Peter; Easton, Douglas F; Zheng, Wei; Long, Jirong

    2018-03-01

    Functional disruptions of susceptibility genes by large genomic structure variant (SV) deletions in germlines are known to be associated with cancer risk. However, few studies have been conducted to systematically search for SV deletions in breast cancer susceptibility genes. We analysed deep (> 30x) whole-genome sequencing (WGS) data generated in blood samples from 128 breast cancer patients of Asian and European descent with either a strong family history of breast cancer or early cancer onset disease. To identify SV deletions in known or suspected breast cancer susceptibility genes, we used multiple SV calling tools including Genome STRiP, Delly, Manta, BreakDancer and Pindel. SV deletions were detected by at least three of these bioinformatics tools in five genes. Specifically, we identified heterozygous deletions covering a fraction of the coding regions of BRCA1 (with approximately 80kb in two patients), and TP53 genes (with ∼1.6 kb in two patients), and of intronic regions (∼1 kb) of the PALB2 (one patient), PTEN (three patients) and RAD51C genes (one patient). We confirmed the presence of these deletions using real-time quantitative PCR (qPCR). Our study identified novel SV deletions in breast cancer susceptibility genes and the identification of such SV deletions may improve clinical testing.
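
    The "detected by at least three of these bioinformatics tools" criterion in this record amounts to a consensus filter over per-tool call sets. The following is a minimal sketch of one way such a filter could look, not the authors' pipeline: the 50% reciprocal-overlap rule for deciding that two tools report the same deletion, the function names, and the coordinates in the example are all assumptions introduced for illustration.

```python
def reciprocal_overlap(a, b):
    """Reciprocal overlap of two (chrom, start, end) intervals, in [0, 1]."""
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def consensus_deletions(calls_by_tool, min_tools=3, min_overlap=0.5):
    """Keep deletions supported by at least `min_tools` distinct tools.

    calls_by_tool maps a tool name to a list of (chrom, start, end) calls.
    Assumption: two calls describe the same event when their reciprocal
    overlap is at least `min_overlap`.
    """
    kept = []
    all_calls = sorted(
        (call, tool) for tool, calls in calls_by_tool.items() for call in calls
    )
    for call, _tool in all_calls:
        supporters = {
            tool
            for tool, calls in calls_by_tool.items()
            if any(reciprocal_overlap(call, other) >= min_overlap for other in calls)
        }
        if len(supporters) < min_tools:
            continue
        # Skip calls that duplicate an event already kept from another tool.
        if any(reciprocal_overlap(call, k) >= min_overlap for k, _ in kept):
            continue
        kept.append((call, supporters))
    return kept


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from the study.
    example_calls = {
        "delly": [("chr17", 1000, 2600)],
        "manta": [("chr17", 1010, 2590)],
        "pindel": [("chr17", 990, 2610), ("chr10", 500, 900)],
        "breakdancer": [("chr10", 480, 950)],
    }
    for call, tools in consensus_deletions(example_calls):
        print(call, sorted(tools))
```

    In the example, the chr17 deletion is reported by three tools and is retained once, while the chr10 deletion is seen by only two tools and is dropped.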

  13. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Science.gov (United States)

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  14. Massively parallel signature sequencing and bioinformatics analysis identifies up-regulation of TGFBI and SOX4 in human glioblastoma.

    Directory of Open Access Journals (Sweden)

    Biaoyang Lin

    Full Text Available BACKGROUND: A comprehensive network-based understanding of molecular pathways abnormally altered in glioblastoma multiforme (GBM) is essential for developing effective therapeutic approaches for this deadly disease. METHODOLOGY/PRINCIPAL FINDINGS: Applying a next generation sequencing technology, massively parallel signature sequencing (MPSS), we identified a total of 4535 genes that are differentially expressed between normal brain and GBM tissue. The expression changes of three up-regulated genes, CHI3L1, CHI3L2, and FOXM1, and two down-regulated genes, neurogranin and L1CAM, were confirmed by quantitative PCR. Pathway analysis revealed that TGF-beta pathway related genes were significantly up-regulated in GBM tumor samples. An integrative pathway analysis of the TGF-beta signaling network identified two alternative TGF-beta signaling pathways mediated by SOX4 (sex determining region Y-box 4) and TGFBI (Transforming growth factor beta induced). Quantitative RT-PCR and immunohistochemistry staining demonstrated that SOX4 and TGFBI expression is elevated in GBM tissues compared with normal brain tissues at both the RNA and protein levels. In vitro functional studies confirmed that TGFBI and SOX4 expression is increased by TGF-beta stimulation and decreased by a specific inhibitor of TGF-beta receptor 1 kinase. CONCLUSIONS/SIGNIFICANCE: Our MPSS database for GBM and normal brain tissues provides a useful resource for the scientific community. The identification of non-SMAD mediated TGF-beta signaling pathways acting through SOX4 and TGFBI (GENE ID:7045) in GBM indicates that these alternative pathways should be considered, in addition to the canonical SMAD mediated pathway, in the development of new therapeutic strategies targeting TGF-beta signaling in GBM. Finally, the construction of an extended TGF-beta signaling network with overlaid gene expression changes between GBM and normal brain extends our understanding of the biology of GBM.

  15. Sequencing illustrates the transcriptional response of Legionella pneumophila during infection and identifies seventy novel small non-coding RNAs.

    LENUS (Irish Health Repository)

    Weissenmayer, Barbara A

    2011-01-01

    Second generation sequencing has prompted a number of groups to re-interrogate the transcriptomes of several bacterial and archaeal species. One of the central findings has been the identification of complex networks of small non-coding RNAs that play central roles in transcriptional regulation in all growth conditions and for the pathogen's interaction with and survival within host cells. Legionella pneumophila is a gram-negative facultative intracellular human pathogen with a distinct biphasic lifestyle. One of its primary environmental hosts is the free-living amoeba Acanthamoeba castellanii, and its infection by L. pneumophila mimics that seen in human macrophages. Here we present analysis of strand specific sequencing of the transcriptional response of L. pneumophila during exponential and post-exponential broth growth and during the replicative and transmissive phase of infection inside A. castellanii. We extend previous microarray based studies and uncover evidence of a complex regulatory architecture underpinned by numerous non-coding RNAs. Over seventy new non-coding RNAs could be identified; many of them appear to be strain specific and in configurations not previously reported. We discover a family of non-coding RNAs preferentially expressed during infection conditions and identify a second copy of 6S RNA in L. pneumophila. We show that the newly discovered putative 6S RNA as well as a number of other non-coding RNAs show evidence for antisense transcription. The nature and extent of the non-coding RNAs and their expression patterns suggest that these may well play central roles in the regulation of Legionella spp. specific traits and offer clues as to how L. pneumophila adapts to its intracellular niche. The expression profiles outlined in the study have been deposited into Genbank's Gene Expression Omnibus (GEO) database under the series accession GSE27232.

  16. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

    Science.gov (United States)

    2012-01-01

    Background: Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results: A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions: This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding
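
    The survey of perfect microsatellites described in this record can be illustrated with a minimal sketch. It assumes a plain DNA string as input and a hypothetical minimum of six tandem copies for a di-nucleotide repeat; the abstract does not state the thresholds or the software the authors used, so the function names and parameters below are illustrative only.

```python
import re


def find_perfect_ssrs(seq, motif_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    motif_len and min_repeats are assumed values for illustration,
    not thresholds taken from the study.
    """
    seq = seq.upper()
    # One motif followed by at least (min_repeats - 1) exact copies of it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAA...
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


def ssr_density_per_mb(n_ssrs, n_bases):
    """SSR count normalised to one megabase of surveyed sequence."""
    return n_ssrs / (n_bases / 1_000_000)


example = "GGC" + "AT" * 8 + "CCT" + "AG" * 6 + "TTT"
ssrs = find_perfect_ssrs(example)
```

    By the same arithmetic, the reported 28,538 SSRs at 40.8 SSRs per Mb imply roughly 700 Mb of surveyed contig sequence (28,538 / 40.8 ≈ 699.5), and 245 polymorphic markers out of 310 correspond to the stated 79.0%.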

  17. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  18. RNA sequencing of Populus x canadensis roots identifies key molecular mechanisms underlying physiological adaption to excess zinc.

    Directory of Open Access Journals (Sweden)

    Andrea Ariani

    Full Text Available Populus x canadensis clone I-214 exhibits a general indicator phenotype in response to excess Zn, and a higher metal uptake in roots than in shoots with a reduced translocation to aerial parts under hydroponic conditions. This physiological adaptation seems mainly regulated by roots, although the molecular mechanisms that underlie these processes are still poorly understood. Here, differential expression analysis using RNA-sequencing technology was used to identify the molecular mechanisms involved in the response to excess Zn in roots. In order to maximize specificity of detection of differentially expressed (DE) genes, we consider the intersection of genes identified by three distinct statistical approaches (61 up- and 19 down-regulated) and validate them by RT-qPCR, yielding an agreement of 93% between the two experimental techniques. Gene Ontology (GO) terms related to oxidation-reduction processes, transport and cellular iron ion homeostasis were enriched among DE genes, highlighting the importance of metal homeostasis in adaptation to excess Zn by P. x canadensis clone I-214. We identified the up-regulation of two Populus metal transporters (ZIP2 and NRAMP1) probably involved in metal uptake, and the down-regulation of a NAS4 gene involved in metal translocation. We also identified four Fe-homeostasis transcription factors (two bHLH38 genes, FIT and BTS) that were differentially expressed, probably for reducing Zn-induced Fe-deficiency. In particular, we suggest that the down-regulation of FIT transcription factor could be a mechanism to cope with Zn-induced Fe-deficiency in Populus. These results provide insight into the molecular mechanisms involved in adaption to excess Zn in Populus spp., but could also constitute a starting point for the identification and characterization of molecular markers or biotechnological targets for possible improvement of phytoremediation performances of poplar trees.

  19. The Quest for Rare Variants: Pooled Multiplexed Next Generation Sequencing in Plants

    Directory of Open Access Journals (Sweden)

    Fabio Marroni

    2012-06-01

    Full Text Available Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, only three research groups working in plant sciences have exploited this potential. They showed that pooled NGS can provide results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method we will explain in detail the variations in study design and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled next generation sequencing can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity and Tajima’s D. Finally we will discuss applications and future perspectives of the multiplexed NGS approach.
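
    As a rough illustration of how allele frequencies can feed the population genetics indexes named in this record, the sketch below computes nucleotide diversity, Watterson's estimator and Tajima's D using the standard textbook formulas. It assumes that each frequency is the alternate-allele frequency at one segregating site and that n is the number of sampled chromosomes; corrections specific to pooled sequencing (for example, for read sampling noise) are not included, and the function name is hypothetical.

```python
import math


def tajimas_d(allele_freqs, n):
    """Return (pi, theta_w, D) from per-site allele frequencies.

    allele_freqs: alternate-allele frequency at each segregating site.
    n: number of sampled chromosomes (assumed known; n >= 4 is required
       because the variance term is zero for smaller samples).
    """
    s = len(allele_freqs)
    if s == 0 or n < 4:
        raise ValueError("need at least one segregating site and n >= 4")

    # Nucleotide diversity: expected pairwise differences summed over sites.
    pi = sum(2 * p * (1 - p) * n / (n - 1) for p in allele_freqs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_w = s / a1  # Watterson's estimator
    d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_w, d


# Five singleton sites in ten chromosomes: an excess of rare variants.
pi_rare, theta_rare, d_rare = tajimas_d([0.1] * 5, n=10)
# Five sites at intermediate frequency.
pi_mid, theta_mid, d_mid = tajimas_d([0.5] * 5, n=10)
```

    An excess of rare variants, the focus of the review, pulls D below zero, while intermediate-frequency variants push it above zero.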

  20. Clinical Use of Next-Generation Sequencing in the Diagnosis of Wilson’s Disease

    Directory of Open Access Journals (Sweden)

    Dániel Németh

    2016-01-01

    Full Text Available Objective. Wilson’s disease is a disorder of copper metabolism which is fatal without treatment. The great number of disease-causing ATP7B gene mutations and the variable clinical presentation of WD may cause a real diagnostic challenge. The emergence of next-generation sequencing provides a time-saving, cost-effective method for full sequencing of the whole ATP7B gene compared to the traditional Sanger sequencing. This is the first report on the clinical use of NGS to examine ATP7B gene. Materials and Methods. We used Ion Torrent Personal Genome Machine in four heterozygous patients for the identification of the other mutations and also in two patients with no known mutation. One patient with acute on chronic liver failure was a candidate for acute liver transplantation. The results were validated by Sanger sequencing. Results. In each case, the diagnosis of Wilson’s disease was confirmed by identifying the mutations in both alleles within 48 hours. One novel mutation (p.Ala1270Ile) was found beyond the eight other known ones. The rapid detection of the mutations made possible the prompt diagnosis of WD in a patient with acute liver failure. Conclusions. According to our results we found next-generation sequencing a very useful, reliable, time-saving, and cost-effective method for diagnosing Wilson’s disease in selected cases.

  1. Analyses of Tissue Culture Adaptation of Human Herpesvirus-6A by Whole Genome Deep Sequencing Redefines the Reference Sequence and Identifies Virus Entry Complex Changes.

    Science.gov (United States)

    Tweedy, Joshua G; Escriva, Eric; Topf, Maya; Gompels, Ursula A

    2017-12-31

    Tissue-culture adaptation of viruses can modulate infection. Laboratory passage and bacterial artificial chromosome (BAC)mid cloning of human cytomegalovirus, HCMV, resulted in genomic deletions and rearrangements altering genes encoding the virus entry complex, which affected cellular tropism, virulence, and vaccine development. Here, we analyse these effects on the reference genome for related betaherpesviruses, Roseolovirus, human herpesvirus 6A (HHV-6A) strain U1102. This virus is also naturally "cloned" by germline subtelomeric chromosomal-integration in approximately 1% of human populations, and accurate references are key to understanding pathological relationships between exogenous and endogenous virus. Using whole genome next-generation deep-sequencing Illumina-based methods, we compared the original isolate to tissue-culture passaged and the BACmid-cloned virus. This re-defined the reference genome showing 32 corrections and 5 polymorphisms. Furthermore, minor variant analyses of passaged and BACmid virus identified emerging populations of a further 32 single nucleotide polymorphisms (SNPs) in 10 loci, half non-synonymous indicating cell-culture selection. Analyses of the BAC-virus genome showed deletion of the BAC cassette via loxP recombination removing green fluorescent protein (GFP)-based selection. As shown for HCMV culture effects, select HHV-6A SNPs mapped to genes encoding mediators of virus cellular entry, including virus envelope glycoprotein genes gB and the gH/gL complex. Comparative models suggest stabilisation of the post-fusion conformation. These SNPs are essential to consider in vaccine-design, antimicrobial-resistance, and pathogenesis.

  2. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Full Text Available Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV, Petuvirus genus). ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  3. Targeted high-throughput sequencing identifies mutations in atlastin-1 as a cause of hereditary sensory neuropathy type I.

    Science.gov (United States)

    Guelly, Christian; Zhu, Peng-Peng; Leonardis, Lea; Papić, Lea; Zidar, Janez; Schabhüttl, Maria; Strohmaier, Heimo; Weis, Joachim; Strom, Tim M; Baets, Jonathan; Willems, Jan; De Jonghe, Peter; Reilly, Mary M; Fröhlich, Eleonore; Hatz, Martina; Trajanoski, Slave; Pieber, Thomas R; Janecke, Andreas R; Blackstone, Craig; Auer-Grumbach, Michaela

    2011-01-07

    Hereditary sensory neuropathy type I (HSN I) is an axonal form of autosomal-dominant hereditary motor and sensory neuropathy distinguished by prominent sensory loss that leads to painless injuries. Unrecognized, these can result in delayed wound healing and osteomyelitis, necessitating distal amputations. To elucidate the genetic basis of an HSN I subtype in a family in which mutations in the few known HSN I genes had been excluded, we employed massive parallel exon sequencing of the 14.3 Mb disease interval on chromosome 14q. We detected a missense mutation (c.1065C>A, p.Asn355Lys) in atlastin-1 (ATL1), a gene that is known to be mutated in early-onset hereditary spastic paraplegia SPG3A and that encodes the large dynamin-related GTPase atlastin-1. The mutant protein exhibited reduced GTPase activity and prominently disrupted ER network morphology when expressed in COS7 cells, strongly supporting pathogenicity. An expanded screen in 115 additional HSN I patients identified two further dominant ATL1 mutations (c.196G>C [p.Glu66Gln] and c.976 delG [p.Val326TrpfsX8]). This study highlights an unexpected major role for atlastin-1 in the function of sensory neurons and identifies HSN I and SPG3A as allelic disorders.

  4. Use of DNA sequences to identify forensically important fly species and their distribution in the coastal region of Central California.

    Science.gov (United States)

    Nakano, Angie; Honda, Jeff

    2015-08-01

    Forensic entomology has gained prominence in recent years, as improvements in DNA technology and molecular methods have allowed insect and other arthropod evidence to become increasingly useful in criminal and civil investigations. However, comprehensive faunal inventories are still needed, including cataloging local DNA sequences for forensically significant Diptera. This multi-year fly-trapping study built upon and expanded a previous survey of these flies in Santa Clara County, including the addition of genetic barcoding data from collected species of flies. Flies from the families Calliphoridae, Sarcophagidae, and Muscidae were trapped in meat-baited traps set in a variety of locations throughout the county. Flies were identified using morphological features and confirmed by molecular analysis. A total of 16 calliphorid species, 11 sarcophagid species, and four muscid species were collected and differentiated. This study found more species of flies than previous area surveys and established new county records for two calliphorid species: Cynomya cadaverina and Chrysomya rufifacies. Differences were found in fly fauna in different areas of the county, indicating the importance of microclimates in the distribution of these flies. Molecular analysis supported the use of DNA barcoding as an effective method of identifying cryptic fly species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  5. Epidemiological study on the penicillin resistance of clinical Streptococcus pneumoniae isolates identified as the common sequence types.

    Science.gov (United States)

    Gao, Wei; Shi, Wei; Chen, Chang-hui; Wen, De-nian; Tian, Jin; Yao, Kai-hu

    2016-10-20

    There are some limitations in the current interpretation of the penicillin resistance mechanism of clinical Streptococcus pneumoniae isolates at the strain level. To explore the possibility of studying the mechanism based on the sequence types (ST) of this bacterium, 488 isolates collected in Beijing from 1997-2014 and 88 isolates collected in Youyang County, Chongqing and Zhongjiang County, Sichuan in 2015 were analyzed by penicillin minimum inhibitory concentration (MIC) distribution and annual distribution. The results showed that the penicillin MICs of all the isolates covered by a given ST in Beijing have a defined range, either penicillin MIC penicillin MICs in the first few years after it was identified. The penicillin MIC of isolates identified as common STs and collected in Youyang County, Chongqing and Zhongjiang County, Sichuan, including ST271, ST320 and ST81, was around 0.25~2 mg/L (≥0.25 mg/L). Our study revealed the epidemiological distribution of penicillin MICs of the given STs determined in clinical S. pneumoniae isolates, suggesting that it is reasonable to research the penicillin resistance mechanism based on the STs of this bacterium.

  6. TSSer: an automated method to identify transcription start sites in prokaryotic genomes from differential RNA sequencing data.

    Science.gov (United States)

    Jorjani, Hadi; Zavolan, Mihaela

    2014-04-01

    Accurate identification of transcription start sites (TSSs) is an essential step in the analysis of transcription regulatory networks. In higher eukaryotes, the capped analysis of gene expression technology enabled comprehensive annotation of TSSs in genomes such as those of mice and humans. In bacteria, an equivalent approach, termed differential RNA sequencing (dRNA-seq), has recently been proposed, but the application of this approach to a large number of genomes is hindered by the paucity of computational analysis methods. With few exceptions, when the method has been used, annotation of TSSs has been largely done manually. In this work, we present a computational method called 'TSSer' that enables the automatic inference of TSSs from dRNA-seq data. The method rests on a probabilistic framework for identifying both genomic positions that are preferentially enriched in the dRNA-seq data as well as preferentially captured relative to neighboring genomic regions. Evaluating our approach for TSS calling on several publicly available datasets, we find that TSSer achieves high consistency with the curated lists of annotated TSSs, but identifies many additional TSSs. Therefore, TSSer can accelerate genome-wide identification of TSSs in bacterial genomes and can aid in further characterization of bacterial transcription regulatory networks. TSSer is freely available under GPL license at http://www.clipz.unibas.ch/TSSer/index.php
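
    The two criteria named in this record, enrichment in the dRNA-seq data and preferential capture relative to neighboring positions, can be illustrated with a deliberately simplified threshold heuristic. This is not TSSer's probabilistic framework: the function name, the fold-change cutoff, the window size and the pseudocount are all assumptions made for illustration, and the inputs are taken to be per-position read-start counts from an enriched library and a control library.

```python
def call_tss_candidates(enriched, control, min_ratio=2.0, window=5, pseudocount=1.0):
    """Toy TSS caller on per-position read-start counts.

    A position is reported when (1) its count in the enriched library
    exceeds the control by at least min_ratio (after adding a pseudocount)
    and (2) it is a strict local maximum within +/- window positions.
    All thresholds are hypothetical defaults.
    """
    if len(enriched) != len(control):
        raise ValueError("libraries must cover the same positions")
    calls = []
    for i, count in enumerate(enriched):
        ratio = (count + pseudocount) / (control[i] + pseudocount)
        lo = max(0, i - window)
        hi = min(len(enriched), i + window + 1)
        neighbours = enriched[lo:i] + enriched[i + 1:hi]
        is_local_peak = count > 0 and all(count > x for x in neighbours)
        if ratio >= min_ratio and is_local_peak:
            calls.append(i)
    return calls


enriched_counts = [0, 1, 0, 40, 2, 1, 0, 0, 3, 0, 0, 25, 1]
control_counts = [0, 1, 0, 5, 2, 1, 0, 0, 3, 0, 0, 30, 1]
candidates = call_tss_candidates(enriched_counts, control_counts)
```

    In this toy input, position 3 passes both criteria, whereas position 11 is a local peak that is not enriched over the control and is therefore not reported.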

  7. Whole-Exome Sequencing Identified a Novel Compound Heterozygous Mutation of LRRC6 in a Chinese Primary Ciliary Dyskinesia Patient

    Directory of Open Access Journals (Sweden)

    Lv Liu

    2018-01-01

    Full Text Available Primary ciliary dyskinesia (PCD) is a clinically rare disorder, mainly featured by respiratory infection, tympanitis, nasosinusitis, and male infertility. Previous studies demonstrated that it is an autosomal recessive disease, and by 2017 almost 40 pathogenic genes had been identified. Among them, leucine-rich repeat (LRR)-containing 6 (LRRC6) codes for a 463-amino-acid cytoplasmic protein, expressed distinctively in motile cilia cells, including the testis cells and the respiratory epithelial cells. In this study, we applied whole-exome sequencing combined with PCD-known genes filtering to explore the genetic lesion of a PCD patient. A novel compound heterozygous mutation in LRRC6 (c.183T>G/p.N61K; c.179-1G>A) was identified and cosegregated in this family. The missense mutation (c.183T>G/p.N61K) may lead to a substitution of asparagine by lysine at position 61 in exon 3 of LRRC6. The splice site mutation (c.179-1G>A) may cause a premature stop codon in exon 4 and decrease the mRNA levels of LRRC6. Both mutations were not present in our 200 local controls, dbSNP, and 1000 genomes. Three bioinformatics programs also predicted that both mutations are deleterious. Our study not only further supported the importance of LRRC6 in PCD, but also expanded the spectrum of LRRC6 mutations and will contribute to the genetic diagnosis and counseling of PCD patients.

  8. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    DEFF Research Database (Denmark)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang

    2015-01-01

    . Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication...

  9. De Novo Transcriptome Sequencing in Passiflora edulis Sims to Identify Genes and Signaling Pathways Involved in Cold Tolerance

    Directory of Open Access Journals (Sweden)

    Sian Liu

    2017-11-01

    Full Text Available The passion fruit (Passiflora edulis Sims), also known as the purple granadilla, is widely cultivated as the new darling of the fruit market throughout southern China. This exotic and perennial climber is adapted to warm and humid climates, and thus is generally intolerant of cold. There is limited information about gene regulation and signaling pathways related to the cold stress response in this species. In this study, two transcriptome libraries (KEDU_AP vs. GX_AP) were constructed from the aerial parts of cold-tolerant and cold-susceptible varieties of P. edulis, respectively. Overall, 126,284,018 clean reads were obtained, and 86,880 unigenes with a mean size of 1449 bp were assembled. Of these, there were 64,067 (73.74%) unigenes with significant similarity to publicly available plant protein sequences. Expression profiles were generated, and 3045 genes were found to be significantly differentially expressed between the KEDU_AP and GX_AP libraries, including 1075 (35.3%) up-regulated and 1970 (64.7%) down-regulated. These included 36 genes in enriched pathways of plant hormone signal transduction, and 56 genes encoding putative transcription factors. Six genes involved in the ICE1–CBF–COR pathway were induced in the cold-tolerant variety, and their expression levels were further verified using quantitative real-time PCR. This report is the first to identify genes and signaling pathways involved in cold tolerance using high-throughput transcriptome sequencing in P. edulis. These findings may provide useful insights into the molecular mechanisms regulating cold tolerance and genetic breeding in Passiflora spp.

  10. Identifying Faults Associated with the 2001 Avoca Induced(?) Seismicity Sequence of Western New York State Using Potential Field Wavelets.

    Science.gov (United States)

    Horowitz, F. G.; Ebinger, C.; Jordan, T. E.

    2017-12-01

    Results from recent DOE and USGS sponsored projects in the (intraplate) northeastern portions of the US and southeastern portions of Canada have identified locations of steeply dipping structures - many previously unknown - from a Poisson wavelet multiscale edge ('worm') analysis of gravity and magnetic fields. The Avoca sequence of induced(?) seismicity in western New York state occurred during January and February of 2001. The Avoca earthquake sequence is associated with industrial hydraulic fracturing activity "related to a proposed natural gas storage facility near Avoca to be constructed by solution mining" (Kim, 2001). The main Avoca event was a felt Mb = 3.2 earthquake on Feb. 3, 2001 recorded by the Lamont Cooperative Seismic Network. Earlier, smaller events were located by the Canadian Geological Survey's seismic network north of the Canadian border - implying that the event locations might be biased because they occurred off the southern edge of the array. Some of these events were also felt locally, according to local newspaper reports. By plotting the location of the seismic events and that of the injection well - reported via its API number - we find a strong correlation with structures detected via our potential field worms. The injection occurred near a NE-SW striking structure that was not activated. All but one of the earthquakes occurred about 5 km north of the injection well on or near an E-W striking structure that appears to intersect the NE-SW structure. The final, small (MN=2.2) earthquake was located on a different complex structure about 10 km north of the other events. We suggest that potential field methods such as ours might be appropriate for locating structures of concern for induced seismic activity in association with industrial activity. Reference: Kim, W.-Y. (2001). The Lamont cooperative seismic network and the national seismic system: Earthquake hazard studies in the northeastern United States. Tech. Rep. 98-01, Lamont

  11. High-throughput sequencing of the T cell receptor β gene identifies aggressive early-stage mycosis fungoides.

    Science.gov (United States)

    de Masson, Adele; O'Malley, John T; Elco, Christopher P; Garcia, Sarah S; Divito, Sherrie J; Lowry, Elizabeth L; Tawa, Marianne; Fisher, David C; Devlin, Phillip M; Teague, Jessica E; Leboeuf, Nicole R; Kirsch, Ilan R; Robins, Harlan; Clark, Rachael A; Kupper, Thomas S

    2018-05-09

    Mycosis fungoides (MF), the most common cutaneous T cell lymphoma (CTCL), is a malignancy of skin-tropic memory T cells. Most MF cases present as early stage (stage I A/B, limited to the skin), and these patients typically have a chronic, indolent clinical course. However, a small subset of early-stage cases develop progressive and fatal disease. Because outcomes can be so different, early identification of this high-risk population is an urgent unmet clinical need. We evaluated the use of next-generation high-throughput DNA sequencing of the T cell receptor β gene (TCRB) in lesional skin biopsies to predict progression and survival in a discovery cohort of 208 patients with CTCL (177 with MF) from a 15-year longitudinal observational clinical study. We compared these data to the results in an independent validation cohort of 101 CTCL patients (87 with MF). The tumor clone frequency (TCF) in lesional skin, measured by high-throughput sequencing of the TCRB gene, was an independent prognostic factor of both progression-free and overall survival in patients with CTCL and MF in particular. In early-stage patients, a TCF of >25% in the skin was a stronger predictor of progression than any other established prognostic factor (stage IB versus IA, presence of plaques, high blood lactate dehydrogenase concentration, large-cell transformation, or age). The TCF therefore may accurately predict disease progression in early-stage MF. Early identification of patients at high risk for progression could help identify candidates who may benefit from allogeneic hematopoietic stem cell transplantation before their disease becomes treatment-refractory. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
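
    The tumor clone frequency threshold described in this record can be sketched as a simple calculation. The abstract does not define the denominator, so the sketch assumes TCF is the share of the most abundant TCRB clonotype among all TCRB sequences counted in a biopsy; the clonotype labels, counts and function names are hypothetical, and only the >25% cutoff comes from the abstract.

```python
def tumor_clone_frequency(clone_counts):
    """Return (top_clonotype, fraction) for a mapping of clonotype -> count.

    Assumes the dominant clonotype is the malignant clone and that the
    denominator is the total of all counted TCRB sequences.
    """
    total = sum(clone_counts.values())
    if total == 0:
        raise ValueError("no TCRB sequences counted")
    top_clone, top_count = max(clone_counts.items(), key=lambda kv: kv[1])
    return top_clone, top_count / total


def exceeds_tcf_threshold(tcf, threshold=0.25):
    """True when TCF is strictly above the >25% cutoff reported in the abstract."""
    return tcf > threshold


# Hypothetical clonotype counts from one lesional skin biopsy.
biopsy = {"clonotype_A": 300, "clonotype_B": 500, "clonotype_C": 200}
top, tcf = tumor_clone_frequency(biopsy)
flagged = exceeds_tcf_threshold(tcf)
```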

  12. A RAD-Based Genetic Map for Anchoring Scaffold Sequences and Identifying QTLs in Bitter Gourd (Momordica charantia)

    Science.gov (United States)

    Cui, Junjie; Luo, Shaobo; Niu, Yu; Huang, Rukui; Wen, Qingfang; Su, Jianwen; Miao, Nansheng; He, Weiming; Dong, Zhensheng; Cheng, Jiaowen; Hu, Kailin

    2018-01-01

    Genetic mapping is a basic tool necessary for anchoring assembled scaffold sequences and for identifying QTLs controlling important traits. Though bitter gourd (Momordica charantia) is both consumed and used as a medicinal, research on its genomics and genetic mapping is severely limited. Here, we report the construction of a restriction site associated DNA (RAD)-based genetic map for bitter gourd using an F2 mapping population comprising 423 individuals derived from two cultivated inbred lines, the gynoecious line ‘K44’ and the monoecious line ‘Dali-11.’ This map comprised 1,009 SNP markers and spanned a total genetic distance of 2,203.95 cM across the 11 linkage groups. It anchored a total of 113 assembled scaffolds that covered about 251.32 Mb (85.48%) of the 294.01 Mb assembled genome. In addition, three horticulturally important traits including sex expression, fruit epidermal structure, and immature fruit color were evaluated using a combination of qualitative and quantitative data. As a result, we identified three QTL/gene loci responsible for these traits in three environments. The QTL/gene gy/fffn/ffn, controlling sex expression involved in gynoecy, first female flower node, and female flower number was detected in the reported region. Particularly, two QTLs/genes, Fwa/Wr and w, were found to be responsible for fruit epidermal structure and white immature fruit color, respectively. This RAD-based genetic map promotes the assembly of the bitter gourd genome and the identified genetic loci will accelerate the cloning of relevant genes in the future. PMID:29706980

  13. Panel-based whole exome sequencing identifies novel mutations in microphthalmia and anophthalmia patients showing complex Mendelian inheritance patterns.

    Science.gov (United States)

    Riera, Marina; Wert, Ana; Nieto, Isabel; Pomares, Esther

    2017-11-01

    Microphthalmia and anophthalmia (MA) are congenital eye abnormalities that show an extremely high clinical and genetic complexity. In this study, we evaluated the implementation of whole exome sequencing (WES) for the genetic analysis of MA patients. This approach was used to investigate three unrelated families in which previous single-gene analyses failed to identify the molecular cause. A total of 47 genes previously associated with nonsyndromic MA were included in our panel. WES was performed in one affected patient from each family using the AmpliSeq™ Exome technology and the Ion Proton™ platform. A novel heterozygous OTX2 missense mutation was identified in a patient showing bilateral anophthalmia who inherited the variant from a parent who was a carrier, but showed no sign of the condition. We also describe a new PAX6 missense variant in an autosomal-dominant pedigree affected by mild bilateral microphthalmia showing high intrafamilial variability, with germline mosaicism determined to be the most plausible molecular cause of the disease. Finally, a heterozygous missense mutation in RBP4 was found to be responsible in an isolated case of bilateral complex microphthalmia. This study highlights that panel-based WES is a reliable and effective strategy for the genetic diagnosis of MA. Furthermore, using this technique, the mutational spectrum of these diseases was broadened, with novel variants identified in each of the OTX2, PAX6, and RBP4 genes. Moreover, we report new cases of reduced penetrance, mosaicism, and variable phenotypic expressivity associated with MA, further demonstrating the heterogeneity of such disorders. © 2017 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.

  14. A RAD-Based Genetic Map for Anchoring Scaffold Sequences and Identifying QTLs in Bitter Gourd (Momordica charantia)

    Directory of Open Access Journals (Sweden)

    Junjie Cui

    2018-04-01

    Full Text Available Genetic mapping is a basic tool necessary for anchoring assembled scaffold sequences and for identifying QTLs controlling important traits. Though bitter gourd (Momordica charantia) is both consumed and used as a medicinal, research on its genomics and genetic mapping is severely limited. Here, we report the construction of a restriction site associated DNA (RAD)-based genetic map for bitter gourd using an F2 mapping population comprising 423 individuals derived from two cultivated inbred lines, the gynoecious line ‘K44’ and the monoecious line ‘Dali-11.’ This map comprised 1,009 SNP markers and spanned a total genetic distance of 2,203.95 cM across the 11 linkage groups. It anchored a total of 113 assembled scaffolds that covered about 251.32 Mb (85.48%) of the 294.01 Mb assembled genome. In addition, three horticulturally important traits including sex expression, fruit epidermal structure, and immature fruit color were evaluated using a combination of qualitative and quantitative data. As a result, we identified three QTL/gene loci responsible for these traits in three environments. The QTL/gene gy/fffn/ffn, controlling sex expression involved in gynoecy, first female flower node, and female flower number was detected in the reported region. Particularly, two QTLs/genes, Fwa/Wr and w, were found to be responsible for fruit epidermal structure and white immature fruit color, respectively. This RAD-based genetic map promotes the assembly of the bitter gourd genome and the identified genetic loci will accelerate the cloning of relevant genes in the future.

  15. Identification of the first homozygous 1-bp deletion in GDF9 gene leading to primary ovarian insufficiency by using targeted massively parallel sequencing.

    Science.gov (United States)

    França, M M; Funari, M F A; Nishi, M Y; Narcizo, A M; Domenice, S; Costa, E M F; Lerario, A M; Mendonca, B B

    2018-02-01

    Targeted massively parallel sequencing (TMPS) has been used in genetic diagnosis for Mendelian disorders. In the past few years, TMPS has identified new and already described genes associated with the primary ovarian insufficiency (POI) phenotype. Here, we performed targeted gene sequencing to find a genetic diagnosis in idiopathic cases from a Brazilian POI cohort. A custom SureSelect XT DNA target enrichment panel was designed and the sequencing was performed on an Illumina NextSeq sequencer. We identified 1 homozygous 1-bp deletion variant (c.783delC) in the GDF9 gene in 1 patient with POI. The variant was confirmed and segregated using Sanger sequencing. The c.783delC GDF9 variant causes a frameshift that changes an amino acid and creates a premature termination codon (p.Ser262Hisfs*2). This variant was absent from all public databases consulted (ExAC/gnomAD, NHLBI/EVS and 1000Genomes). Moreover, it was absent in 400 alleles from fertile Brazilian women screened by Sanger sequencing. The patient's mother and her unaffected sister carried the c.783delC variant in a heterozygous state, as expected for an autosomal recessive inheritance. Here, the TMPS identified the first homozygous 1-bp deletion variant in GDF9. This finding reveals a novel inheritance pattern of pathogenic variant in GDF9 associated with POI, thus improving the genetic diagnosis of this disorder. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. Identification of genomic insertion and flanking sequence of G2-EPSPS and GAT transgenes in soybean using whole genome sequencing method

    Directory of Open Access Journals (Sweden)

    Bingfu Guo

    2016-07-01

    Full Text Available Molecular characterization of sequences flanking exogenous fragment insertions is essential for safety assessment and labeling of genetically modified organisms (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. About 21 Gb sequence data (~21× coverage) for each line was generated on Illumina HiSeq 2500 platform. The junction reads mapped to boundary of T-DNA and flanking sequences in these two events were identified by comparing all sequencing reads with soybean reference genome and sequence of transgenic vector. The putative insertion loci and flanking sequences were further confirmed by PCR amplification, Sanger sequencing, and co-segregation analysis. All these analyses supported that exogenous T-DNA fragments were integrated in positions of Chr19: 50543767-50543792 and Chr17: 7980527-7980541 in these two transgenic lines. Identification of the genomic insertion site of the G2-EPSPS and GAT transgenes will facilitate the use of their glyphosate-tolerant traits in soybean breeding program. These results also demonstrated that WGS is a cost-effective and rapid method of identifying sites of T-DNA insertions and flanking sequences in soybean.

  17. Whole exome sequencing identifies novel genes for fetal hemoglobin response to hydroxyurea in children with sickle cell anemia.

    Science.gov (United States)

    Sheehan, Vivien A; Crosby, Jacy R; Sabo, Aniko; Mortier, Nicole A; Howard, Thad A; Muzny, Donna M; Dugan-Perez, Shannon; Aygun, Banu; Nottage, Kerri A; Boerwinkle, Eric; Gibbs, Richard A; Ware, Russell E; Flanagan, Jonathan M

    2014-01-01

    Hydroxyurea has proven efficacy in children and adults with sickle cell anemia (SCA), but with considerable inter-individual variability in the amount of fetal hemoglobin (HbF) produced. Sibling and twin studies indicate that some of that drug response variation is heritable. To test the hypothesis that genetic modifiers influence pharmacological induction of HbF, we investigated phenotype-genotype associations using whole exome sequencing of children with SCA treated prospectively with hydroxyurea to maximum tolerated dose (MTD). We analyzed 171 unrelated patients enrolled in two prospective clinical trials, all treated with dose escalation to MTD. We examined two MTD drug response phenotypes: ΔHbF (final %HbF minus baseline %HbF), and final %HbF. Analyzing individual genetic variants, we identified multiple low frequency and common variants associated with HbF induction by hydroxyurea. A validation cohort of 130 pediatric sickle cell patients treated to MTD with hydroxyurea was genotyped for 13 non-synonymous variants with the strongest association with HbF response to hydroxyurea in the discovery cohort. A coding variant in Spalt-like transcription factor, or SALL2, was associated with higher final HbF in this second independent replication sample and SALL2 represents an outstanding novel candidate gene for further investigation. These findings may help focus future functional studies and provide new insights into the pharmacological HbF upregulation by hydroxyurea in patients with SCA.

  18. Automatically Identifying Fusion Events between GLUT4 Storage Vesicles and the Plasma Membrane in TIRF Microscopy Image Sequences

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2015-01-01

    Full Text Available Quantitative analysis of the dynamic behavior of membrane-bound secretory vesicles has proven to be important in biological research. This paper proposes a novel approach to automatically identify the elusive fusion events between VAMP2-pHluorin labeled GLUT4 storage vesicles (GSVs) and the plasma membrane. The differentiation is implemented to detect the initiation of fusion events by modified forward subtraction of consecutive frames in the TIRFM image sequence. Spatially connected pixels in difference images brighter than a specified adaptive threshold are grouped into a distinct fusion spot. The vesicles are located at the intensity-weighted centroid of their fusion spots. To reveal the true in vivo nature of a fusion event, 2D Gaussian fitting for the fusion spot is used to derive the intensity-weighted centroid and the spot size during the fusion process. The fusion event and its termination can be determined according to the change of spot size. The method is evaluated on real experiment data with ground truth annotated by expert cell biologists. The evaluation results show that it can achieve relatively high accuracy, comparing favorably to manual analysis, in a small fraction of the time.
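
    The detection steps named in this abstract (forward subtraction of consecutive frames, thresholding, grouping of connected pixels, intensity-weighted centroid) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: a fixed `threshold` stands in for the paper's adaptive threshold, 4-connectivity is an assumption, the 2D Gaussian fitting step is omitted, and all function names and the toy frames are hypothetical.

```python
from collections import deque


def forward_difference(prev_frame, next_frame):
    """Subtract the earlier frame from the later one, clipping negatives to 0."""
    return [[max(b - a, 0) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prev_frame, next_frame)]


def find_spots(diff, threshold):
    """Group 4-connected pixels brighter than `threshold` into spots.

    `threshold` is a fixed value here; the paper describes an adaptive one.
    """
    rows, cols = len(diff), len(diff[0])
    seen = [[False] * cols for _ in range(rows)]
    spots = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or diff[r][c] <= threshold:
                continue
            # Breadth-first flood fill from this seed pixel.
            queue, spot = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                spot.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and diff[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            spots.append(spot)
    return spots


def weighted_centroid(diff, spot):
    """Intensity-weighted centroid (row, column) of one spot."""
    total = sum(diff[y][x] for y, x in spot)
    cy = sum(y * diff[y][x] for y, x in spot) / total
    cx = sum(x * diff[y][x] for y, x in spot) / total
    return cy, cx


# Toy 4x4 frames: two neighbouring pixels brighten between frames.
frame0 = [[10] * 4 for _ in range(4)]
frame1 = [row[:] for row in frame0]
frame1[1][1] = 50
frame1[1][2] = 30

diff = forward_difference(frame0, frame1)
spots = find_spots(diff, threshold=5)
centroid = weighted_centroid(diff, spots[0])
```

    In this toy case the two brightened pixels form a single spot whose centroid is pulled toward the brighter pixel. Tracking how the spot size changes over later frames, as the abstract describes, would be layered on top of this.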

  19. A targeted sequencing panel identifies rare damaging variants in multiple genes in the cranial neural tube defect, anencephaly.

    Science.gov (United States)

    Ishida, M; Cullup, T; Boustred, C; James, C; Docker, J; English, C; Lench, N; Copp, A J; Moore, G E; Greene, N D E; Stanier, P

    2018-04-01

    Neural tube defects (NTDs) affecting the brain (anencephaly) are lethal before or at birth, whereas lower spinal defects (spina bifida) may lead to lifelong neurological handicap. Collectively, NTDs rank among the most common birth defects worldwide. This study focuses on anencephaly, which despite having a similar frequency to spina bifida and being the most common type of NTD observed in mouse models, has had more limited inclusion in genetic studies. A genetic influence is strongly implicated in determining risk of NTDs and a molecular diagnosis is of fundamental importance to families both in terms of understanding the origin of the condition and for managing future pregnancies. Here we used a custom panel of 191 NTD candidate genes to screen 90 patients with cranial NTDs (n = 85 anencephaly and n = 5 craniorachischisis) with a targeted exome sequencing platform. After filtering and comparing to our in-house control exome database (N = 509), we identified 397 rare variants (minor allele frequency, MAF < 1%), 21 of which were previously unreported and predicted damaging. This included 1 frameshift (PDGFRA), 2 stop-gained (MAT1A; NOS2) and 18 missense variations. Together with evidence for oligogenic inheritance, this study provides new information on the possible genetic causation of anencephaly. © 2017 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  20. Whole Genome Sequencing Identifies a Missense Mutation in HES7 Associated with Short Tails in Asian Domestic Cats.

    Science.gov (United States)

    Xu, Xiao; Sun, Xin; Hu, Xue-Song; Zhuang, Yan; Liu, Yue-Chen; Meng, Hao; Miao, Lin; Yu, He; Luo, Shu-Jin

    2016-08-25

    Domestic cats exhibit abundant variations in tail morphology and serve as an excellent model to study the development and evolution of vertebrate tails. Cats with shortened and kinked tails were first recorded in the Malayan archipelago by Charles Darwin in 1868 and remain quite common today in Southeast and East Asia. To elucidate the genetic basis of short tails in Asian cats, we built a pedigree of 13 cats segregating for the trait with a founder from southern China and performed linkage mapping based on whole genome sequencing data from the pedigree. The short-tailed trait was mapped to a 5.6 Mb region of Chr E1, within which the substitution c.5T>C in the somite segmentation-related gene HES7 was identified as the causal mutation resulting in a missense change (p.V2A). Validation in 245 unrelated cats confirmed the correlation between HES7-c.5T>C and Chinese short-tailed feral cats as well as the Japanese Bobtail breed, indicating a common genetic basis of the two. In addition, some of our sampled kinked-tailed cats could not be explained by either HES7 or the Manx-related T-box, suggesting at least three independent events in the evolution of domestic cats giving rise to short-tailed traits.

  1. Mining environmental high-throughput sequence data sets to identify divergent amplicon clusters for phylogenetic reconstruction and morphotype visualization.

    Science.gov (United States)

    Gimmler, Anna; Stoeck, Thorsten

    2015-08-01

    Environmental high-throughput sequencing (envHTS) is a very powerful tool, which in protistan ecology is predominantly used for the exploration of diversity and its geographic and local patterns. We here used a pyrosequenced V4-SSU rDNA data set from a solar saltern pond as test case to exploit such massive protistan amplicon data sets beyond this descriptive purpose. Therefore, we combined a Swarm-based blastn network including 11 579 ciliate V4 amplicons to identify divergent amplicon clusters with targeted polymerase chain reaction (PCR) primer design for full-length small subunit of the ribosomal DNA retrieval and probe design for fluorescence in situ hybridization (FISH). This powerful strategy makes it possible to use envHTS data sets to (i) reveal the phylogenetic position of the taxon behind divergent amplicons; (ii) improve phylogenetic resolution and evolutionary history of specific taxon groups; (iii) solidly assess an amplicon's (species') degree of similarity to its closest described relative; (iv) visualize the morphotype behind a divergent amplicon cluster; (v) rapidly FISH screen many environmental samples for geographic/habitat distribution and abundances of the respective organism and (vi) monitor the success of enrichment strategies in live samples for cultivation and isolation of the respective organisms. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  2. Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots.

    Science.gov (United States)

    Franta, Zdeněk; Vogel, Heiko; Lehmann, Rüdiger; Rupp, Oliver; Goesmann, Alexander; Vilcinskas, Andreas

    2016-01-01

    Lucilia sericata larvae are used as an alternative treatment for recalcitrant and chronic wounds. Their excretions/secretions contain molecules that facilitate tissue debridement, disinfect, or accelerate wound healing and have therefore been recognized as a potential source of novel therapeutic compounds. Among the substances present in excretions/secretions various peptidase activities promoting the wound healing processes have been detected but the peptidases responsible for these activities remain mostly unidentified. To explore these enzymes we applied next generation sequencing to analyze the transcriptomes of different maggot tissues (salivary glands, gut, and crop) associated with the production of excretions/secretions and/or with digestion as well as the rest of the larval body. As a result we obtained more than 123.8 million paired-end reads, which were assembled de novo using Trinity and Oases assemblers, yielding 41,421 contigs with an N50 contig length of 2.22 kb and a total length of 67.79 Mb. BLASTp analysis against the MEROPS database identified 1729 contigs in 577 clusters encoding five peptidase classes (serine, cysteine, aspartic, threonine, and metallopeptidases), which were assigned to 26 clans, 48 families, and 185 peptidase species. The individual enzymes were differentially expressed among maggot tissues and included peptidase activities related to the therapeutic effects of maggot excretions/secretions.
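
    For readers unfamiliar with the N50 statistic quoted above, the following short sketch shows one common way to compute it: the contig length L such that contigs of length L or longer together cover at least half of the assembly. The function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


# Toy assembly (lengths in arbitrary units, not from the paper).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)  # 10 + 9 + 8 = 27, half of the total of 54
```

    Because N50 is weighted toward long contigs, it is usually larger than the mean contig length of the same assembly.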

  3. Patterns of oligonucleotide sequences in viral and host cell RNA identify mediators of the host innate immune system.

    Directory of Open Access Journals (Sweden)

    Benjamin D Greenbaum

    Full Text Available The innate immune response provides a first line of defense against pathogens by targeting generic differential features that are present in foreign organisms but not in the host. These innate responses generate selection forces acting both in pathogens and hosts that further determine their co-evolution. Here we analyze the nucleic acid sequence fingerprints of these selection forces acting in parallel on both host innate immune genes and ssRNA viral genomes. We do this by identifying dinucleotide biases in the coding regions of innate immune response genes in plasmacytoid dendritic cells, and then use this signal to identify other significant host innate immune genes. The persistence of these biases in the orthologous groups of genes in humans and chickens is also examined. We then compare the significant motifs in highly expressed genes of the innate immune system to those in ssRNA viruses and study the evolution of these motifs in the H1N1 influenza genome. We argue that the significant under-represented motif pattern of CpG in an AU context--which is found in both the ssRNA viruses and innate genes, and has decreased throughout the history of H1N1 influenza replication in humans--is immunostimulatory and has been selected against during the co-evolution of viruses and host innate immune genes. This shows how differences in host immune biology can drive the evolution of viruses that jump into species with different immune priorities than the original host.
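
    A dinucleotide bias of the kind discussed in this abstract is often summarised with an observed/expected odds ratio, the dinucleotide frequency divided by the product of its two mononucleotide frequencies. The sketch below illustrates that generic statistic only; it is an assumption that it resembles the authors' measure, which may differ (for example by conditioning on codon usage or sequence context). The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq, dinuc):
    """Observed/expected ratio f_xy / (f_x * f_y) for one dinucleotide.

    Values well below 1 suggest under-representation (as reported for
    CpG in many genomes); values above 1 suggest over-representation.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter motif")
    mono = Counter(seq)
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    f_x = mono[dinuc[0]] / n
    f_y = mono[dinuc[1]] / n
    if f_x == 0 or f_y == 0:
        raise ValueError("motif contains a base absent from the sequence")
    f_xy = pairs[dinuc] / (n - 1)
    return f_xy / (f_x * f_y)


# Toy RNA sequence in which CG is deliberately frequent.
toy_rna = "ACGU" * 3
cg_ratio = dinucleotide_odds_ratio(toy_rna, "CG")
cc_ratio = dinucleotide_odds_ratio(toy_rna, "CC")
```

    In practice such a ratio would be computed per gene or per genome and compared against a suitable null model before calling a motif significantly under-represented.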

  4. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
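
    The first two DSAP steps, cleanup and clustering, can be pictured with the following minimal sketch. It is not DSAP's code: the adapter string, the homopolymer cutoff of 8 bases, the minimum tag length of 15, and the function names are all assumptions chosen for illustration. Only the tab-delimited "tag, copy number" input format is taken from the abstract.

```python
import re
from collections import Counter

# Hypothetical 3' adapter prefix, for illustration only.
ADAPTER = "TCGTATGCCG"
# Homopolymer run of 8+ bases at the 3' end; the cutoff is an assumption.
POLY_TAIL = re.compile(r"(A{8,}|T{8,}|C{8,}|G{8,}|N{8,})$")


def clean_tag(tag, adapter=ADAPTER):
    """Trim everything from the adapter onward, then any 3' homopolymer tail."""
    tag = tag.upper()
    cut = tag.find(adapter)
    if cut != -1:
        tag = tag[:cut]
    return POLY_TAIL.sub("", tag)


def cluster_tags(lines, min_length=15):
    """Collapse cleaned tags into unique sequences with summed copy numbers.

    Each input line is 'sequence<TAB>count'. Tags shorter than
    `min_length` after cleanup are discarded (an assumed filter).
    """
    clusters = Counter()
    for line in lines:
        tag, count = line.rstrip("\n").split("\t")
        cleaned = clean_tag(tag)
        if len(cleaned) >= min_length:
            clusters[cleaned] += int(count)
    return clusters


# Toy input: the same 22-nt tag with and without adapter, plus a poly-A read.
toy_lines = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t120",
    "TGAGGTAGTAGGTTGTATAGTT\t30",
    "ACGTAAAAAAAAAA\t5",
]
toy_clusters = cluster_tags(toy_lines)
```

    Here the first two lines collapse into one cluster with a combined count, and the poly-A read is dropped after trimming. The later matching steps against Rfam and miRBase would take these clusters as input.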

  5. The utility of Next Generation Sequencing for molecular diagnostics in Rett syndrome.

    Science.gov (United States)

    Vidal, Silvia; Brandi, Núria; Pacheco, Paola; Gerotina, Edgar; Blasco, Laura; Trotta, Jean-Rémi; Derdak, Sophia; Del Mar O'Callaghan, Maria; Garcia-Cazorla, Àngels; Pineda, Mercè; Armstrong, Judith

    2017-09-25

    Rett syndrome (RTT) is an early-onset neurodevelopmental disorder that almost exclusively affects girls and is totally disabling. Three genes have been identified that cause RTT: MECP2, CDKL5 and FOXG1. However, the etiology of some of RTT patients still remains unknown. Recently, next generation sequencing (NGS) has promoted genetic diagnoses because of the quickness and affordability of the method. To evaluate the usefulness of NGS in genetic diagnosis, we present the genetic study of RTT-like patients using different techniques based on this technology. We studied 1577 patients with RTT-like clinical diagnoses and reviewed patients who were previously studied and thought to have RTT genes by Sanger sequencing. Genetically, 477 of 1577 patients with a RTT-like suspicion have been diagnosed. Positive results were found in 30% by Sanger sequencing, 23% with a custom panel, 24% with a commercial panel and 32% with whole exome sequencing. A genetic study using NGS allows the study of a larger number of genes associated with RTT-like symptoms simultaneously, providing genetic study of a wider group of patients as well as significantly reducing the response time and cost of the study.

  6. The quest for rare variants: pooled multiplexed next generation sequencing in plants.

    Science.gov (United States)

    Marroni, Fabio; Pinosio, Sara; Morgante, Michele

    2012-01-01

    Next generation sequencing (NGS) instruments produce an unprecedented amount of sequence data at contained costs. This gives researchers the possibility of designing studies with adequate power to identify rare variants at a fraction of the economic and labor resources required by individual Sanger sequencing. As of today, few research groups working in plant sciences have exploited this potentiality, showing that pooled NGS provides results in excellent agreement with those obtained by individual Sanger sequencing. The aim of this review is to convey to the reader the general ideas underlying the use of pooled NGS for the identification of rare variants. To facilitate a thorough understanding of the possibilities of the method, we will explain in detail the possible experimental and analytical approaches and discuss their advantages and disadvantages. We will show that information on allele frequency obtained by pooled NGS can be used to accurately compute basic population genetics indexes such as allele frequency, nucleotide diversity, and Tajima's D. Finally, we will discuss applications and future perspectives of the multiplexed NGS approach.
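
    As a rough illustration of the population genetics indexes mentioned above, the sketch below computes nucleotide diversity from per-site allele frequencies and Tajima's D from the standard 1989 formula. It is a simplified sketch under stated assumptions: `n` is treated as the number of sampled chromosomes, and the extra corrections that pooled sequencing needs (unequal contribution of individuals, sequencing error, coverage variation) are ignored. Function names are hypothetical.

```python
from math import sqrt


def nucleotide_diversity(alt_freqs, n):
    """Total nucleotide diversity over a set of sites.

    Sums the per-site heterozygosity 2p(1-p) and applies the n/(n-1)
    small-sample correction. Divide by the number of bases surveyed to
    obtain a per-site value.
    """
    return sum(2 * p * (1 - p) for p in alt_freqs) * n / (n - 1)


def tajimas_d(pi, s, n):
    """Tajima's D from total diversity `pi`, `s` segregating sites, `n` sequences."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy values: 10 sequences, 16 segregating sites, mean pairwise difference 175/45.
toy_d = tajimas_d(pi=175 / 45, s=16, n=10)
```

    A negative D, as in this toy case, indicates an excess of low-frequency variants relative to the neutral expectation, which is exactly the class of variants that pooled designs aim to capture.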

  7. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Science.gov (United States)

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
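
    To make the idea of a variant copy ratio concrete, the sketch below picks the number of gene copies carrying a variant base that best explains the read counts at one position, using a simple binomial likelihood. This is only an illustration of maximum likelihood estimation in this setting, not the study's analytical method: the per-base `error_rate`, the assumption of equal amplification of all copies, and the function name are assumptions, and the copy number must be supplied by the caller.

```python
from math import log


def ml_variant_copies(alt_reads, total_reads, copies, error_rate=0.01):
    """Most likely number of gene copies carrying the variant base.

    For each candidate k in 0..copies, the expected fraction of variant
    reads is k/copies, blurred by a symmetric per-base error rate so the
    log-likelihood stays finite at k = 0 and k = copies.
    """
    best_k, best_ll = None, float("-inf")
    for k in range(copies + 1):
        frac = k / copies
        p = frac * (1 - error_rate) + (1 - frac) * error_rate
        ll = alt_reads * log(p) + (total_reads - alt_reads) * log(1 - p)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Toy counts for a genome assumed to carry 7 copies of the gene.
toy_k = ml_variant_copies(alt_reads=290, total_reads=1000, copies=7)
```

    With 29% variant reads and 7 copies, the best-supported ratio in this toy case is 2 variant copies to 5 reference copies. With few reads per position, neighbouring values of k become hard to distinguish, which is consistent with the abstract's finding that copy ratios were reproducible only for high-throughput methods.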

  8. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Directory of Open Access Journals (Sweden)

    Nathan D. Olson

    2015-03-01

    Full Text Available This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  9. Targeted sequencing identifies genetic alterations that confer primary resistance to EGFR tyrosine kinase inhibitor (Korean Lung Cancer Consortium).

    Science.gov (United States)

    Lim, Sun Min; Kim, Hye Ryun; Cho, Eun Kyung; Min, Young Joo; Ahn, Jin Seok; Ahn, Myung-Ju; Park, Keunchil; Cho, Byoung Chul; Lee, Ji-Hyun; Jeong, Hye Cheol; Kim, Eun Kyung; Kim, Joo-Hang

    2016-06-14

    Non-small-cell lung cancer (NSCLC) patients with activating epidermal growth factor receptor (EGFR) mutations may exhibit primary resistance to EGFR tyrosine kinase inhibitor (TKI). We aimed to examine genomic alterations associated with de novo resistance to gefitinib in a prospective study of NSCLC patients. One hundred and fifty-two patients with activating EGFR mutations were included in this study, and tumor samples from 136 patients were available for targeted sequencing of genomic alterations in 22 genes using the Colon and Lung Cancer panel (Ampliseq, Life Technologies). All 132 patients with EGFR mutation were treated with gefitinib for advanced NSCLC. Twenty patients showed primary resistance to EGFR TKI, and were classified as non-responders. A total of 543 somatic single-nucleotide variants (498 missense, 13 nonsense) and 32 frameshift insertions/deletions were identified, with a median of 3 mutations per sample. TP53 was most commonly mutated (47%) and mutations in SMAD4 were also common (19%), as well as DDR2 (16%), PIK3CA (15%), STK11 (14%), and BRAF (7%). Genomic mutations in the PI3K/Akt/mTOR pathway were commonly found in non-responders (45%) compared to responders (27%), and they had significantly shorter progression-free survival and overall survival compared to patients without mutations (2.1 vs. 12.8 months, P=0.04; 15.7 months vs. not reached, P […]). […] PI3K/Akt/mTOR pathway were commonly identified in non-responders and may confer resistance to EGFR TKI. Screening lung adenocarcinoma patients with clinical cancer gene test may aid in selecting out those who show primary resistance to EGFR TKI (NCT01697163).

  10. PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

    Science.gov (United States)

    Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

    2016-01-01

    Simple Sequence Repeats (SSRs) or microsatellites are resourceful molecular genetic markers. There are only a few reports of SSR identification and development in pineapple. The complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple, which will help in deciphering the genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from genome sequence, 45 from chloroplast sequence, 249 in mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purpose at http://app.bioelm.com/ with a mapping tool which can develop circular maps of selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.

  11. Identity-by-descent filtering of exome sequence data identifies PIGV mutations in hyperphosphatasia mental retardation syndrome.

    NARCIS (Netherlands)

    Krawitz, P.M.; Schweiger, M.R.; Rodelsperger, C.; Marcelis, C.L.M.; Kolsch, U.; Meisel, C.; Stephani, F.; Kinoshita, T.; Murakami, Y.; Bauer, S.; Isau, M.; Fischer, A.; Dahl, A.; Kerick, M.; Hecht, J.; Kohler, S.; Jager, M. de; Grunhagen, J.; Condor, B.J. de; Doelken, S.; Brunner, H.G.; Meinecke, P.; Passarge, E.; Thompson, M.D.; Cole, D.E.; Horn, D.; Roscioli, T.; Mundlos, S.; Robinson, P.N.

    2010-01-01

    Hyperphosphatasia mental retardation (HPMR) syndrome is an autosomal recessive form of mental retardation with distinct facial features and elevated serum alkaline phosphatase. We performed whole-exome sequencing in three siblings of a nonconsanguineous union with HPMR and performed computational

  12. Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

    Science.gov (United States)

    Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

    2014-01-01

    Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.

  13. Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

    Directory of Open Access Journals (Sweden)

    Ramesh Reddy

    Full Text Available Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.

  14. Molecular Genetics of the Usher Syndrome in Lebanon: Identification of 11 Novel Protein Truncating Mutations by Whole Exome Sequencing

    Science.gov (United States)

    Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

    2014-01-01

    Background Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons, each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Methods Whole exome sequencing followed by expanded familial validation by Sanger sequencing. Results We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Conclusion Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes. PMID:25211151

  15. Multiple viral infections in Agaricus bisporus - Characterisation of 18 unique RNA viruses and 8 ORFans identified by deep sequencing

    OpenAIRE

    Deakin, Gregory; Dobbs, Edward; Bennett, Julie M.; Jones, Ian M.; Grogan, Helen M.; Burton, Kerry S.

    2017-01-01

    Thirty unique non-host RNAs were sequenced in the cultivated fungus, Agaricus bisporus, comprising 18 viruses each encoding an RdRp domain with an additional 8 ORFans (non-host RNAs with no similarity to known sequences). Two viruses were multipartite with component RNAs showing correlative abundances and common 3′ motifs. The viruses, all positive sense single-stranded, were classified into diverse orders/families. Multiple infections of Agaricus may represent a diverse, dynamic and interact...

  16. Memory Efficient Sequence Analysis Using Compressed Data Structures (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    Energy Technology Data Exchange (ETDEWEB)

    Simpson, Jared

    2011-10-13

    Wellcome Trust Sanger Institute's Jared Simpson on Memory efficient sequence analysis using compressed data structures at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  17. Deep Ion Torrent sequencing identifies soil fungal community shifts after frequent prescribed fires in a southeastern US forest ecosystem.

    Science.gov (United States)

    Brown, Shawn P; Callaham, Mac A; Oliver, Alena K; Jumpponen, Ari

    2013-12-01

    Prescribed burning is a common management tool to control fuel loads, ground vegetation, and facilitate desirable game species. We evaluated soil fungal community responses to long-term prescribed fire treatments in a loblolly pine forest on the Piedmont of Georgia and utilized deep Internal Transcribed Spacer Region 1 (ITS1) amplicon sequencing afforded by the recent Ion Torrent Personal Genome Machine (PGM). These deep sequence data (19,000 + reads per sample after subsampling) indicate that frequent fires (3-year fire interval) shift soil fungus communities, whereas infrequent fires (6-year fire interval) permit system resetting to a state similar to that without prescribed fire. Furthermore, in nonmetric multidimensional scaling analyses, primarily ectomycorrhizal taxa were correlated with axes associated with long fire intervals, whereas soil saprobes tended to be correlated with the frequent fire recurrence. We conclude that (1) multiplexed Ion Torrent PGM analyses allow deep cost effective sequencing of fungal communities but may suffer from short read lengths and inconsistent sequence quality adjacent to the sequencing adaptor; (2) frequent prescribed fires elicit a shift in soil fungal communities; and (3) such shifts do not occur when fire intervals are longer. Our results emphasize the general responsiveness of these forests to management, and the importance of fire return intervals in meeting management objectives. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  18. Complete genome sequence analysis identifies a new genotype of brassica yellows virus that infects cabbage and radish in China.

    Science.gov (United States)

    Zhang, Xiao-Yan; Xiang, Hai-Ying; Zhou, Cui-Ji; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2014-08-01

    For brassica yellows virus (BrYV), proposed to be a member of a new polerovirus species, two clearly distinct genotypes (BrYV-A and BrYV-B) have been described. In this study, the complete nucleotide sequences of two BrYV isolates from radish and Chinese cabbage were determined. Sequence analysis suggested that these isolates represent a new genotype, referred to here as BrYV-C. The full-length sequences of the two BrYV-C isolates shared 93.4-94.8 % identity with BrYV-A and BrYV-B. Further phylogenetic analysis showed that the BrYV-C isolates formed a subgroup that was distinct from the BrYV-A and BrYV-B isolates based on all of the proteins except P5.

  19. Detecting authorized and unauthorized genetically modified organisms containing vip3A by real-time PCR and next-generation sequencing.

    Science.gov (United States)

    Liang, Chanjuan; van Dijk, Jeroen P; Scholtens, Ingrid M J; Staats, Martijn; Prins, Theo W; Voorhuijzen, Marleen M; da Silva, Andrea M; Arisi, Ana Carolina Maisonnave; den Dunnen, Johan T; Kok, Esther J

    2014-04-01

    The growing number of biotech crops with novel genetic elements increasingly complicates the detection of genetically modified organisms (GMOs) in food and feed samples using conventional screening methods. Unauthorized GMOs (UGMOs) in food and feed are currently identified through combining GMO element screening with sequencing the DNA flanking these elements. In this study, a specific and sensitive qPCR assay was developed for vip3A element detection based on the vip3Aa20 coding sequences of the recently marketed MIR162 maize and COT102 cotton. Furthermore, SiteFinding-PCR in combination with Sanger, Illumina or Pacific BioSciences (PacBio) sequencing was performed targeting the flanking DNA of the vip3Aa20 element in MIR162. De novo assembly and Basic Local Alignment Search Tool searches were used to mimic UGMO identification. PacBio data resulted in relatively long contigs in the upstream (1,326 nucleotides (nt); 95 % identity) and downstream (1,135 nt; 92 % identity) regions, whereas Illumina data resulted in two smaller contigs of 858 and 1,038 nt with higher sequence identity (>99 % identity). Both approaches outperformed Sanger sequencing, underlining the potential for next-generation sequencing in UGMO identification.

  20. MicroRNA of the fifth-instar posterior silk gland of silkworm identified by Solexa sequencing

    Directory of Open Access Journals (Sweden)

    Jisheng Li

    2014-12-01

    Full Text Available No studies have focused specifically on the microRNAs (miRNAs) in the fifth-instar posterior silk gland of Bombyx mori. Here, using next-generation sequencing, we acquired 93.2 million processed reads from 10 small RNA libraries. In this paper, we describe in detail how our dataset, recently published in BMC Genomics, was generated by deep sequencing. Our findings substantially enrich the silkworm miRNA repository and may help to reveal miRNA functions in the process of silk production.

  1. High-fidelity target sequencing of individual molecules identified using barcode sequences: de novo detection and absolute quantitation of mutations in plasma cell-free DNA from cancer patients.

    Science.gov (United States)

    Kukita, Yoji; Matoba, Ryo; Uchida, Junji; Hamakawa, Takuya; Doki, Yuichiro; Imamura, Fumio; Kato, Kikuya

    2015-08-01

    Circulating tumour DNA (ctDNA) is an emerging field of cancer research. However, current ctDNA analysis is usually restricted to one or a few mutation sites due to technical limitations. In the case of massively parallel DNA sequencers, the number of false positives caused by a high read error rate is a major problem. In addition, the final sequence reads do not represent the original DNA population due to the global amplification step during the template preparation. We established a high-fidelity target sequencing system of individual molecules identified in plasma cell-free DNA using barcode sequences; this system consists of the following two steps. (i) A novel target sequencing method that adds barcode sequences by adaptor ligation. This method uses linear amplification to eliminate the errors introduced during the early cycles of polymerase chain reaction. (ii) The monitoring and removal of erroneous barcode tags. This process involves the identification of individual molecules that have been sequenced and for which the number of mutations has been quantitated absolutely. Using plasma cell-free DNA from patients with gastric or lung cancer, we demonstrated that the system achieved near complete elimination of false positives and enabled de novo detection and absolute quantitation of mutations in plasma cell-free DNA. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
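    The record gives no implementation details, so the following is only a rough Python sketch of the general barcode idea: reads sharing a barcode are assumed to derive from one original molecule, and a per-position majority vote suppresses random read errors. It does not reproduce the authors' linear amplification step or their removal of erroneous barcode tags. The function name, `min_reads` and `min_agreement` are hypothetical.

```python
from collections import Counter, defaultdict

def consensus_by_barcode(reads, min_reads=3, min_agreement=0.8):
    """Collapse (barcode, sequence) pairs into one consensus sequence per barcode.

    Barcodes with fewer than min_reads reads, or with reads of unequal length,
    are skipped. Positions without sufficient agreement are reported as 'N'.
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    result = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue
        bases = []
        for column in zip(*seqs):
            # Most common base at this position and how many reads support it.
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        result[barcode] = "".join(bases)
    return result

reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"), ("GGT", "ACGA")]
print(consensus_by_barcode(reads, min_agreement=0.6))
```

    Under these assumptions, the number of keys in the returned dictionary is a count of distinct tagged molecules, which is what makes absolute quantitation possible in principle.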

  2. An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.

    Science.gov (United States)

    Vujaklija, Ivan; Bielen, Ana; Paradžik, Tina; Biđin, Siniša; Goldstein, Pavle; Vujaklija, Dušica

    2016-02-18

    The massive accumulation of protein sequences arising from the rapid development of high-throughput sequencing, coupled with automatic annotation, results in high levels of incorrect annotations. In this study, we describe an approach to decrease annotation errors of protein families characterized by low overall sequence similarity. The GDSL lipolytic family comprises proteins with multifunctional properties and high potential for pharmaceutical and industrial applications. The number of proteins assigned to this family has increased rapidly over the last few years. In particular, the natural abundance of GDSL enzymes reported recently in plants indicates that they could be a good source of novel GDSL enzymes. We noticed that a significant proportion of annotated sequences lack specific GDSL motif(s) or catalytic residue(s). Here, we applied motif-based sequence analyses to identify enzymes possessing conserved GDSL motifs in selected proteomes across the plant kingdom. Motif-based HMM scanning (Viterbi decoding-VD and posterior decoding-PD) and the PD/VD protocol described here were successfully applied on 12 selected plant proteomes to identify sequences with GDSL motifs. A significant number of identified GDSL sequences were novel. Moreover, our scanning approach successfully detected protein sequences lacking at least one of the essential motifs (171/820) annotated by Pfam profile search (PfamA) as GDSL. Based on these analyses we provide a curated list of GDSL enzymes from the selected plants. CLANS clustering and phylogenetic analysis helped us to gain a better insight into the evolutionary relationship of all identified GDSL sequences. Three novel GDSL subfamilies as well as unreported variations in GDSL motifs were discovered in this study. In addition, analyses of selected proteomes showed a remarkable expansion of GDSL enzymes in the lycophyte, Selaginella moellendorffii. Finally, we provide a general motif-HMM scanner which is easily accessible through

  3. Isolation of Canine parvovirus with a view to identify the prevalent serotype on the basis of partial sequence analysis

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2015-01-01

    Full Text Available Aim: The aim of this study was to isolate Canine parvovirus (CPV) from suspected dogs on the Madin-Darby canine kidney (MDCK) cell line and to confirm it by polymerase chain reaction (PCR) and nested PCR (NPCR). Further, the VP2 gene of the CPV isolates was amplified and sequenced to determine the prevailing antigenic type. Materials and Methods: A total of 60 rectal swabs were collected from dogs showing signs of gastroenteritis, processed and subjected to isolation in the MDCK cell line. The samples showing cytopathic effects (CPE) were confirmed by PCR and NPCR. These samples were subjected to PCR for amplification of the VP2 gene of CPV, sequenced and analyzed to study the prevailing antigenic types of CPV. Results: Of the 60 samples subjected to isolation in the MDCK cell line, five showed CPE in the form of rounding of cells, clumping of cells and finally detachment of the cells. When these samples and the two commercially available vaccines were subjected to PCR for amplification of the VP2 gene, a 1710 bp product was amplified. The sequence analysis revealed that the vaccines belonged to the CPV-2 type and the samples were of the CPV-2b type. Conclusion: Of a total of 60 samples, 5 exhibited CPE in the MDCK cell line. Sequence analysis of the VP2 gene among the samples and vaccine strains revealed that the samples belonged to the CPV-2b type and the vaccines to CPV-2.

  4. Representational difference analysis of Neisseria meningitidis identifies sequences that are specific for the hyper-virulent lineage III clone

    NARCIS (Netherlands)

    Bart, A.; Dankert, J.; van der Ende, A.

    2000-01-01

    Neisseria meningitidis may cause meningitis and septicemia. Since the early 1980s, an increased incidence of meningococcal disease has been caused by the lineage III clone in many countries in Europe and in New Zealand. We hypothesized that lineage III meningococci have specific DNA sequences,

  5. Isolation of Canine parvovirus with a view to identify the prevalent serotype on the basis of partial sequence analysis.

    Science.gov (United States)

    Kaur, Gurpreet; Chandra, Mudit; Dwivedi, P N; Sharma, N S

    2015-01-01

    The aim of this study was to isolate Canine parvovirus (CPV) from suspected dogs on the Madin-Darby canine kidney (MDCK) cell line and to confirm it by polymerase chain reaction (PCR) and nested PCR (NPCR). Further, the VP2 gene of the CPV isolates was amplified and sequenced to determine the prevailing antigenic type. A total of 60 rectal swabs were collected from dogs showing signs of gastroenteritis, processed and subjected to isolation in the MDCK cell line. The samples showing cytopathic effects (CPE) were confirmed by PCR and NPCR. These samples were subjected to PCR for amplification of the VP2 gene of CPV, sequenced and analyzed to study the prevailing antigenic types of CPV. Of the 60 samples subjected to isolation in the MDCK cell line, five showed CPE in the form of rounding of cells, clumping of cells and finally detachment of the cells. When these samples and the two commercially available vaccines were subjected to PCR for amplification of the VP2 gene, a 1710 bp product was amplified. The sequence analysis revealed that the vaccines belonged to the CPV-2 type and the samples were of the CPV-2b type. Of a total of 60 samples, 5 exhibited CPE in the MDCK cell line. Sequence analysis of the VP2 gene among the samples and vaccine strains revealed that the samples belonged to the CPV-2b type and the vaccines to CPV-2.

  6. Criteria for confirming sequence periodicity identified by Fourier transform analysis: application to GCR2, a candidate plant GPCR?

    Science.gov (United States)

    Illingworth, Christopher J R; Parkes, Kevin E; Snell, Christopher R; Mullineaux, Philip M; Reynolds, Christopher A

    2008-03-01

    Methods to determine periodicity in protein sequences are useful for inferring function. Fourier transformation is one approach, but care is required to ensure the periodicity is genuine. Here we have shown that empirically-derived statistical tables can be used as a measure of significance. Genuine protein sequence data rather than randomly generated sequences were used as the statistical backdrop. The method has been applied to G-protein coupled receptor (GPCR) sequences, by Fourier transformation of hydrophobicity values, codon frequencies and the extent of over-representation of codon pairs; the latter being related to translational step times. Genuine periodicity was observed in the hydrophobicity whereas the apparent periodicity (as inferred from previously reported measures) in the translation step times was not validated statistically. GCR2 has recently been proposed as the plant GPCR receptor for the hormone abscisic acid. It has homology to the Lanthionine synthetase C-like family of proteins, an observation confirmed by fold recognition. Application of the Fourier transform algorithm to the GCR2 family revealed strongly predicted sevenfold periodicity in hydrophobicity, suggesting why GCR2 has been reported to be a GPCR, despite negative indications in most transmembrane prediction algorithms. The underlying multiple sequence alignment, also required for the Fourier transform analysis of periodicity, indicated that the hydrophobic regions around the 7 GXXG motifs commence near the C-terminal end of each of the 7 inner helices of the alpha-toroid and continue to the N-terminal region of the helix. The results clearly explain why GCR2 has been understandably but erroneously predicted to be a GPCR.
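    To illustrate the Fourier step only, here is a minimal Python sketch that computes the power of a single Fourier component for a series of per-residue values (for example hydrophobicity values) and picks the strongest of several candidate periods. The function names are hypothetical. The significance testing against genuine protein sequences that the abstract emphasises is not implemented here.

```python
import cmath
import math

def periodicity_power(values, period):
    """Power of the Fourier component with the given period (in residues)."""
    mean = sum(values) / len(values)
    # Subtract the mean so a constant offset does not contribute to the power.
    total = sum((v - mean) * cmath.exp(-2j * math.pi * k / period)
                for k, v in enumerate(values))
    return abs(total) ** 2 / len(values)

def best_period(values, periods):
    """Return the candidate period with the highest power."""
    return max(periods, key=lambda p: periodicity_power(values, p))

# Synthetic signal with a period of 3.6 residues, as expected for an ideal alpha-helix.
signal = [math.cos(2 * math.pi * k / 3.6) for k in range(72)]
print(best_period(signal, [2.0, 3.0, 3.6, 5.0, 7.0]))
```

    A high power on its own does not show that a periodicity is genuine, which is the point the abstract makes; the value would have to be compared with an empirical background distribution.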

  7. THE USE OF INTER SIMPLE SEQUENCE REPEATS (ISSR) IN DISTINGUISHING NEIGHBORING DOUGLAS-FIR TREES AS A MEANS TO IDENTIFYING TREE ROOTS WITH ABOVE-GROUND BIOMASS

    Science.gov (United States)

    We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...

  8. Next-generation sequencing identifies a novel compound heterozygous mutation in MYO7A in a Chinese patient with Usher Syndrome 1B.

    Science.gov (United States)

    Wei, Xiaoming; Sun, Yan; Xie, Jiansheng; Shi, Quan; Qu, Ning; Yang, Guanghui; Cai, Jun; Yang, Yi; Liang, Yu; Wang, Wei; Yi, Xin

    2012-11-20

    Targeted enrichment and next-generation sequencing (NGS) have been employed for detection of genetic diseases. The purpose of this study was to validate the accuracy and sensitivity of our method for comprehensive mutation detection of hereditary hearing loss, and identify inherited mutations involved in human deafness accurately and economically. To make genetic diagnosis of hereditary hearing loss simple and timesaving, we designed a 0.60 MB array-based chip containing 69 nuclear genes and the mitochondrial genome responsible for human deafness and performed NGS on ten patients with five known mutations and a Chinese family with hearing loss (never genetically investigated). Ten patients with five known mutations were sequenced using next-generation sequencing to validate the sensitivity of the method. We identified four known mutations in two nuclear deafness-causing genes (GJB2 and SLC26A4) and one in mitochondrial DNA. We then applied this method to analyze the variants in a Chinese family with hearing loss and identified compound heterozygosity for two novel mutations in the gene MYO7A. The compound heterozygosity identified in MYO7A causes Usher Syndrome 1B with severe phenotypes. The results support that the combination of enrichment of targeted genes and next-generation sequencing is a valuable molecular diagnostic tool for hereditary deafness and suitable for clinical application. Copyright © 2012 Elsevier B.V. All rights reserved.

  9. Older persons' worries expressed during home care visits: exploring the content of cues and concerns identified by the Verona coding definitions of emotional sequences.

    NARCIS (Netherlands)

    Hafskjold, L.; Eide, T.; Holmström, I.K.; Sundling, V.; Dulmen, S. van; Eide, H.

    2016-01-01

    Objective: Little is known about how older persons in home care express their concerns. Emotional cues and concerns can be identified by the Verona coding definitions of emotional sequences (VR-CoDES), but the method gives no insight into what causes the distress and the emotions involved. The aims

  10. Spatially conserved regulatory elements identified within human and mouse Cd247 gene using high-throughput sequencing data from the ENCODE project

    DEFF Research Database (Denmark)

    Pundhir, Sachin; Hannibal, Tine Dahlbæk; Bang-Berthelsen, Claus Heiner

    2014-01-01

    In this study, we have utilized the wealth of high-throughput sequencing data produced during the Encyclopedia of DNA Elements (ENCODE) project to identify spatially conserved regulatory elements within the Cd247 gene from human and mouse. We show the presence of two transcription factor binding sites...

  11. Whole-Genome Sequencing of Invasion-Resistant Cells Identifies Laminin α2 as a Host Factor for Bacterial Invasion

    DEFF Research Database (Denmark)

    van Wijk, Xander M.; Döhrmann, Simon; Hallstrom, Bjorn

    2017-01-01

    cells. Whole-genome sequencing and transcriptome sequencing (RNA-Seq) uncovered a deletion in the gene encoding the laminin subunit α2 (Lama2) that eliminated much of domain L4a. Silencing of the long Lama2 isoform in wild-type cells strongly reduced bacterial invasion, whereas transfection with human...... LAMA2 cDNA significantly enhanced invasion in pgsA745 cells. The addition of exogenous laminin-α2β1γ1/laminin-α2β2γ1 strongly increased bacterial invasion in CHO cells, as well as in human alveolar basal epithelial and human brain microvascular endothelial cells. Thus, the L4a domain in laminin α2...

  12. ATRX mutation in two adult brothers with non-specific moderate intellectual disability identified by exome sequencing

    OpenAIRE

    Moncini, S.; Bedeschi, M.F.; Castronovo, P.; Crippa, M.; Calvello, M.; Garghentino, R.R.; Scuvera, G.; Finelli, P.; Venturin, M.

    2013-01-01

    In this report, we describe two adult brothers affected by moderate non-specific intellectual disability (ID). They showed minor facial anomalies, not clearly ascribable to any specific syndromic patterns, microcephaly, brachydactyly and broad toes. Both brothers presented seizures. Karyotype, subtelomeric and FMR1 analysis were normal in both cases. We performed array-CGH analysis that revealed no copy-number variations potentially associated with ID. Subsequent exome sequence analysis allow...

  13. Exome sequencing identifies pathogenic variants of VPS13B in a patient with familial 16p11.2 duplication

    OpenAIRE

    Dastan, Jila; Chijiwa, Chieko; Tang, Flamingo; Martell, Sally; Qiao, Ying; Rajcan-Separovic, Evica; Lewis, M. E. Suzanne

    2016-01-01

    Background The recurrent microduplication of 16p11.2 (dup16p11.2) is associated with a broad spectrum of neurodevelopmental disorders (NDD) confounded by incomplete penetrance and variable expressivity. This inter- and intra-familial clinical variability highlights the importance of personalized genetic counselling in individuals at-risk. Case presentation In this study, we performed whole exome sequencing (WES) to look for other genomic alterations that could explain the clinical variability...

  14. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety ‘Amrapali’ (Mangifera indica L.)

    Science.gov (United States)

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called the “king of fruits” due to its sweetness, richness of taste, diversity, large production volume and variety of end uses. Despite its huge economic importance, genomic resources in mango are scarce and the genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for the mango parental varieties ‘Neelam’ and ‘Dashehari’ and their hybrid ‘Amrapali’ using next generation sequencing technologies. De novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. A total of 5,465 SSR loci were identified in 4,912 unigenes, with 288 type I SSRs (n ≥ 20 bp). One hundred type I SSR markers were randomly selected, of which 43 yielded PCR amplicons of the expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides a useful genomic resource for genetic improvement of mango. PMID:27736892

  15. Evaluation of sequence ambiguities of the HIV-1 pol gene as a method to identify recent HIV-1 infection in transmitted drug resistance surveys.

    Science.gov (United States)

    Andersson, Emmi; Shao, Wei; Bontell, Irene; Cham, Fatim; Cuong, Do Duy; Wondwossen, Amogne; Morris, Lynn; Hunt, Gillian; Sönnerborg, Anders; Bertagnolio, Silvia; Maldarelli, Frank; Jordan, Michael R

    2013-08-01

    Identification of recent HIV infection within populations is a public health priority for accurate estimation of HIV incidence rates and transmitted drug resistance at population level. Determining HIV incidence rates by prospective follow-up of HIV-uninfected individuals is challenging and serological assays have important limitations. HIV diversity within an infected host increases with duration of infection. We explore a simple bioinformatics approach to assess viral diversity by determining the percentage of ambiguous base calls in sequences derived from standard genotyping of HIV-1 protease and reverse transcriptase. Sequences from 691 recently infected (≤1 year) and chronically infected (>1 year) individuals from Sweden, Vietnam and Ethiopia were analyzed for ambiguity. A significant difference (p<0.0001) in the proportion of ambiguous bases was observed between sequences from individuals with recent and chronic infection in both HIV-1 subtype B and non-B infection, consistent with previous studies. In our analysis, a cutoff of <0.47% ambiguous base calls identified recent infection with a sensitivity and specificity of 88.8% and 74.6% respectively. 1,728 protease and reverse transcriptase sequences from 36 surveys of transmitted HIV drug resistance performed following World Health Organization guidance were analyzed for ambiguity. The 0.47% ambiguity cutoff was applied and survey sequences were classified as likely derived from recently or chronically infected individuals. 71% of patients were classified as likely to have been infected within one year of genotyping but results varied considerably amongst surveys. This bioinformatics approach may provide supporting population-level information to identify recent infection but its application is limited by infection with more than one viral variant, decreasing viral diversity in advanced disease and technical aspects of population based sequencing. Standardization of sequencing techniques and base calling
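    The ambiguity measure described above can be sketched in a few lines of Python. This is an illustrative assumption of how such a calculation might look, not the authors' code. In particular, counting `N` as an ambiguous call and the function names are choices made here; only the 0.47% cutoff comes from the abstract.

```python
# IUPAC nucleotide codes that stand for more than one base.
# Including "N" is an assumption of this sketch.
AMBIGUITY_CODES = set("RYKMSWBDHVN")

def percent_ambiguous(seq):
    """Percentage of base calls in seq that are IUPAC ambiguity codes."""
    bases = [b for b in seq.upper() if b.isalpha()]  # ignore gaps and whitespace
    if not bases:
        raise ValueError("sequence contains no base calls")
    return 100.0 * sum(b in AMBIGUITY_CODES for b in bases) / len(bases)

def classify_infection(seq, cutoff=0.47):
    """Label a sequence 'recent' if its ambiguity percentage is below the cutoff."""
    return "recent" if percent_ambiguous(seq) < cutoff else "chronic"

clean = "ACGT" * 250                   # 1000 bases, no ambiguous calls
mixed = "ACGT" * 245 + "RYRY" * 5      # 1000 bases, 20 ambiguous calls
print(classify_infection(clean), classify_infection(mixed))
```

    As the abstract notes, such a label is only supporting population-level information, since infection with more than one viral variant or advanced disease can shift the ambiguity percentage.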

  16. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  17. An analysis of the sequence of the BAD gene among patients with maturity-onset diabetes of the young (MODY).

    Science.gov (United States)

    Antosik, Karolina; Gnyś, Piotr; Jarosz-Chobot, Przemysława; Myśliwiec, Małgorzata; Szadkowska, Agnieszka; Małecki, Maciej; Młynarski, Wojciech; Borowiec, Maciej

    2017-01-01

    Monogenic diabetes is a rare disease caused by single gene mutations. Maturity onset diabetes of the young (MODY) is one of the major forms of monogenic diabetes recognised in the paediatric population. To date, 13 genes have been related to MODY development. The aim of the study was to analyse the sequence of the BCL2-associated agonist of cell death (BAD) gene in patients with clinical suspicion of GCK-MODY, but who were negative for glucokinase (GCK) gene mutations. A group of 122 diabetic patients were recruited from the "Polish Registry for Paediatric and Adolescent Diabetes - nationwide genetic screening for monogenic diabetes" project. The molecular testing was performed by Sanger sequencing. A total of 10 sequence variants of the BAD gene were identified in 122 analysed diabetic patients. Among the analysed patients suspected of MODY, one possible pathogenic variant was identified in one patient; however, further confirmation is required for a certain identification.

  18. Case Report Identification of a novel SLC45A2 mutation in albinism by targeted next-generation sequencing.

    Science.gov (United States)

    Xue, J J; Xue, J F; Xue, H Q; Guo, Y Y; Liu, Y; Ouyang, N

    2016-09-19

    Albinism is a diverse group of hypopigmentary disorders caused by multiple-genetic defects. The genetic diagnosis of patients affected with albinism by Sanger sequencing is often complex, expensive, and time-consuming. In this study, we performed targeted next-generation sequencing to screen for 16 genes in a patient with albinism, and identified 21 genetic variants, including 19 known single nucleotide polymorphisms, one novel missense mutation (c.1456 G>A), and one disease-causing mutation (c.478 G>C). The novel mutation was not observed in 100 controls, and was predicted to be a damaging mutation by SIFT and Polyphen. Thus, we identified a novel mutation in SLC45A2 in a Chinese family, expanding the mutational spectrum of albinism. Our results also demonstrate that targeted next-generation sequencing is an effective genetic test for albinism.

  19. Analysis of Perioperative Chemotherapy in Resected Pancreatic Cancer: Identifying the Number and Sequence of Chemotherapy Cycles Needed to Optimize Survival.

    Science.gov (United States)

    Epelboym, Irene; Zenati, Mazen S; Hamad, Ahmad; Steve, Jennifer; Lee, Kenneth K; Bahary, Nathan; Hogg, Melissa E; Zeh, Herbert J; Zureikat, Amer H

    2017-09-01

    Receipt of 6 cycles of adjuvant chemotherapy (AC) is standard of care in pancreatic cancer (PC). Neoadjuvant chemotherapy (NAC) is increasingly utilized; however, the optimal number of cycles needed alone or in combination with AC remains unknown. We sought to determine the optimal number and sequence of perioperative chemotherapy cycles in PC. This was a single-institution review of all resected PCs from 2008 to 2015. The impact of the cumulative number of chemotherapy cycles received (0, 1-5, and ≥6 cycles) and their sequence (NAC, AC, or NAC + AC) on overall survival was evaluated by Cox proportional hazards modeling, using 6 cycles of AC as reference. A total of 522 patients were analyzed. Based on sample size distribution, four combinations were evaluated: 0 cycles = 12.1%, 1-5 cycles of combined NAC + AC = 29%, 6 cycles of AC = 25%, and ≥6 cycles of combined NAC + AC = 34%, with corresponding survival of 13.1, 18.5, 37, and 36.8 months. On MVA, relative to 6 cycles of AC, receipt of 0 cycles [HR 3.57, confidence interval (CI) 2.47-5.18] or 1-5 cycles in any combination (HR 2.37, CI 1.73-3.23) was associated with increased hazard of death, whereas receipt of ≥6 cycles in any sequence was associated with optimal and comparable survival (HR 1.07, CI 0.78-1.47). Receipt of 6 or more perioperative cycles of chemotherapy, either as combined neoadjuvant and adjuvant or as adjuvant alone, may be associated with optimal and comparable survival in resected PC.

  20. Identifying return-to-work trajectories using sequence analysis in a cohort of workers with work-related musculoskeletal disorders

    NARCIS (Netherlands)

    McLeod, Christopher B.; Reiff, Eline; Maas, Esther; Bultmann, Ute

    2018-01-01

    Objectives This study aimed to identify return-to-work (RTW) trajectories among workers with work-related musculoskeletal disorders (MSD) and examine the associations between different MSD and these RTW trajectories. Methods We used administrative workers' compensation data to identify accepted MSD

  1. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    Science.gov (United States)

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, to take raw next-gen sequence reads and perform quality control checks and convert the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
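
    The read-sorting step this record describes (raw reads in, FASTA organized by gene and sample out) might look roughly like the sketch below. It is not BarcodeCrucher's implementation. The read layout (10-bp identifier, then primer, then insert), the exact-match lookup, and every barcode, primer, and read shown are hypothetical; only the 10-bp identifier length comes from the abstract.

```python
from collections import defaultdict

MID_LEN = 10  # the abstract reports 10-bp multiplexing identifiers


def demultiplex(reads, mid_to_sample, primer_to_gene):
    """Group reads by (gene, sample); reads with an unknown MID or primer are skipped.

    reads is an iterable of (read_id, sequence) pairs. Each sequence is assumed
    to start with the MID, followed directly by the amplification primer.
    """
    bins = defaultdict(list)
    for read_id, seq in reads:
        sample = mid_to_sample.get(seq[:MID_LEN])
        if sample is None:
            continue
        insert = seq[MID_LEN:]
        for primer, gene in primer_to_gene.items():
            if insert.startswith(primer):
                # Store the read with MID and primer trimmed off.
                bins[(gene, sample)].append((read_id, insert[len(primer):]))
                break
    return bins


def to_fasta(records):
    """Render (read_id, sequence) pairs as FASTA text."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq in records)


# Hypothetical identifiers, primers, and reads for illustration only.
mids = {"ACGTACGTAC": "taxonA", "TGCATGCATG": "taxonB"}
primers = {"GGTT": "18S", "CCAA": "COI"}
reads = [
    ("r1", "ACGTACGTAC" + "GGTT" + "AAATTT"),
    ("r2", "TGCATGCATG" + "CCAA" + "GGGCCC"),
    ("r3", "NNNNNNNNNN" + "GGTT" + "AAA"),  # unknown MID, dropped
]

bins = demultiplex(reads, mids, primers)
print(to_fasta(bins[("18S", "taxonA")]))
```

    Exact prefix matching keeps the sketch short; a production tool would also need to tolerate sequencing errors in the identifier and apply the quality checks mentioned in the abstract.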

  2. Novel mutation of FKBP10 in a pediatric patient with osteogenesis imperfecta type XI identified by clinical exome sequencing

    Science.gov (United States)

    Velasco, Harvy Mauricio; Morales, Jessica L

    2017-01-01

    Osteogenesis imperfecta (OI) is a hereditary disease characterized by bone fragility caused by mutations in the proteins that support the formation of the extracellular matrix in the bone. The diagnosis of OI begins with clinical suspicion, from phenotypic findings at birth, low-impact fractures during childhood or family history that may lead to it. However, the variability in the semiology of the disease does not allow establishing an early diagnosis in all cases, and unfortunately, specific clinical data provided by the literature only report 28 patients with OI type XI. This information is limited and heterogeneous, and therefore, detailed information on the natural history of this disease is not yet available. This paper reports the case of a male patient who, despite undergoing multidisciplinary management, did not have a diagnosis for a long period of time, and could only be given one with the use of whole-exome sequencing. The use of the next-generation sequencing in patients with ultrarare genetic diseases, including skeletal dysplasias, should be justified when clear clinical criteria and an improvement in the quality of life of the patients and their families are intended while reducing economic and time costs. Thus, this case report corresponds to the 29th patient affected with OI type XI, and the 18th mutation in FKBP10, causative of this pathology. PMID:29158687

  3. Complete nucleotide sequence of a novel Hibiscus-infecting Cilevirus from Florida and its relationship with closely associated Cileviruses

    Science.gov (United States)

    The complete nucleotide sequence of a recently discovered Florida (FL) isolate of Hibiscus infecting Cilevirus (HiCV) was determined by Sanger sequencing. The movement- and coat- protein gene sequences of the HiCV-FL isolate are more divergent than other genes of the previously sequenced HiCV-HA (Ha...

  4. Sequencing of a patient with balanced chromosome abnormalities and neurodevelopmental disease identifies disruption of multiple high risk loci by structural variation.

    Directory of Open Access Journals (Sweden)

    Jonathon Blake

    Full Text Available Balanced chromosome abnormalities (BCAs) occur at a high frequency in healthy and diseased individuals, but cost-efficient strategies to identify BCAs and evaluate whether they contribute to a phenotype have not yet become widespread. Here we apply genome-wide mate-pair library sequencing to characterize structural variation in a patient with unclear neurodevelopmental disease (NDD) and complex de novo BCAs at the karyotype level. Nucleotide-level characterization of the clinically described BCA breakpoints revealed disruption of at least three NDD candidate genes (LINC00299, NUP205, PSMD14) that gave rise to abnormal mRNAs and could be assumed as disease-causing. However, unbiased genome-wide analysis of the sequencing data for cryptic structural variation was key to reveal an additional submicroscopic inversion that truncates the schizophrenia- and bipolar disorder-associated brain transcription factor ZNF804A as an equally likely NDD-driving gene. Deep sequencing of fluorescent-sorted wild-type and derivative chromosomes confirmed the clinically undetected BCA. Moreover, deep sequencing further validated a high accuracy of mate-pair library sequencing to detect structural variants larger than 10 kb, proposing that this approach is powerful for clinical-grade genome-wide structural variant detection. Our study supports previous evidence for a role of ZNF804A in NDD and highlights the need for a more comprehensive assessment of structural variation in karyotypically abnormal individuals and patients with neurocognitive disease to avoid diagnostic deception.

  5. Sequencing of a Patient with Balanced Chromosome Abnormalities and Neurodevelopmental Disease Identifies Disruption of Multiple High Risk Loci by Structural Variation

    Science.gov (United States)

    Blake, Jonathon; Riddell, Andrew; Theiss, Susanne; Gonzalez, Alexis Perez; Haase, Bettina; Jauch, Anna; Janssen, Johannes W. G.; Ibberson, David; Pavlinic, Dinko; Moog, Ute; Benes, Vladimir; Runz, Heiko

    2014-01-01

    Balanced chromosome abnormalities (BCAs) occur at a high frequency in healthy and diseased individuals, but cost-efficient strategies to identify BCAs and evaluate whether they contribute to a phenotype have not yet become widespread. Here we apply genome-wide mate-pair library sequencing to characterize structural variation in a patient with unclear neurodevelopmental disease (NDD) and complex de novo BCAs at the karyotype level. Nucleotide-level characterization of the clinically described BCA breakpoints revealed disruption of at least three NDD candidate genes (LINC00299, NUP205, PSMD14) that gave rise to abnormal mRNAs and could be assumed as disease-causing. However, unbiased genome-wide analysis of the sequencing data for cryptic structural variation was key to reveal an additional submicroscopic inversion that truncates the schizophrenia- and bipolar disorder-associated brain transcription factor ZNF804A as an equally likely NDD-driving gene. Deep sequencing of fluorescent-sorted wild-type and derivative chromosomes confirmed the clinically undetected BCA. Moreover, deep sequencing further validated a high accuracy of mate-pair library sequencing to detect structural variants larger than 10 kb, proposing that this approach is powerful for clinical-grade genome-wide structural variant detection. Our study supports previous evidence for a role of ZNF804A in NDD and highlights the need for a more comprehensive assessment of structural variation in karyotypically abnormal individuals and patients with neurocognitive disease to avoid diagnostic deception. PMID:24625750

  6. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers

    DEFF Research Database (Denmark)

    Varshney, Rajeev K.; Chen, Wenbin; Li, Yupeng

    2012-01-01

    Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences...

  7. Enriched whole genome sequencing identified compensatory mutations in the RNA polymerase gene of rifampicin-resistant Mycobacterium leprae strains

    Directory of Open Access Journals (Sweden)

    Lavania M

    2018-01-01

    Full Text Available Mallika Lavania (1), Itu Singh (1), Ravindra P Turankar (1), Anuj Kumar Gupta (2), Madhvi Ahuja (1), Vinay Pathak (1), Utpal Sengupta (1). (1) Stanley Browne Laboratory, The Leprosy Mission Trust India, TLM Community Hospital Nand Nagari; (2) Agilent Technologies India Pvt Ltd, Jasola District Centre, New Delhi, India. Abstract: Despite more than three decades of multidrug therapy (MDT), leprosy remains a major public health issue in several endemic countries, including India. The emergence of drug resistance in Mycobacterium leprae (M. leprae) is a cause of concern and poses a threat to the leprosy-control program, which might ultimately dampen the achievement of the elimination program of the country. Rifampicin resistance in clinical strains of M. leprae is supposed to arise from harboring bacterial strains with mutations in the 81-bp rifampicin resistance determining region (RRDR) of the rpoB gene. However, complete dynamics of rifampicin resistance are not explained only by this mutation in leprosy strains. To understand the role of other compensatory mutations and transmission dynamics of drug-resistant leprosy, a genome-wide sequencing of 11 M. leprae strains – comprising five rifampicin-resistant strains, five sensitive strains, and one reference strain – was done in this study. We observed the presence of compensatory mutations in two rifampicin-resistant strains in rpoC and mmpL7 genes, along with rpoB, that may additionally be responsible for conferring resistance in those strains. Our findings support the role for compensatory mutation(s) in RNA polymerase gene(s), resulting in rifampicin resistance in relapsed leprosy patients. Keywords: leprosy, rifampicin resistance, compensatory mutations, next generation sequencing, relapsed, MDT, India

  8. 3D MR cisternography to identify distal dural rings. Comparison of 3D-CISS and 3D-SPACE sequences

    International Nuclear Information System (INIS)

    Watanabe, Yoshiyuki; Makidono, Akari; Nakamura, Miho; Saida, Yukihisa

    2011-01-01

    The distal dural ring (DDR) is an anatomical landmark used to distinguish intra- and extradural aneurysms. We investigated identification of the DDR using 2 three-dimensional (3D) magnetic resonance (MR) cisternography sequences, 3D constructive interference in steady state (CISS) and 3D sampling perfection with application optimized contrasts using different flip angle evolutions (SPACE), at 3.0 tesla. Ten healthy adult volunteers underwent imaging with 3D-CISS, 3D-SPACE, and time-of-flight (TOF) MR angiography (TOF-MRA) sequences at 3.0T. We analyzed DDR identification and internal carotid artery (ICA) signal intensity and classified the shape of the carotid cave. We identified the DDR using both 3D-SPACE and 3D-CISS, with no significant difference between the sequences. Visualization of the outline of the ICA in the cavernous sinus (CS) was significantly clearer with 3D-SPACE than 3D-CISS. In the CS and petrous portions, signal intensity was lower with 3D-SPACE, and the flow void was poor with 3D-CISS in some subjects. We identified the DDR with both 3D-SPACE and 3D-CISS, but the superior contrast of the ICA in the CS using 3D-SPACE suggests the superiority of this sequence for evaluating the DDR. (author)

  9. Whole-Exome Sequencing Identifies ALMS1, IQCB1, CNGA3, and MYO7A Mutations in Patients with Leber Congenital Amaurosis

    OpenAIRE

    Wang, Xia; Wang, Hui; Cao, Ming; Li, Zhe; Chen, Xianfeng; Patenia, Claire; Gore, Athurva; Abboud, Emad B.; Al-Rajhi, Ali A.; Lewis, Richard A.; Lupski, James R.; Mardon, Graeme; Zhang, Kun; Muzny, Donna; Gibbs, Richard A.

    2011-01-01

    It has been well documented that mutations in the same retinal disease gene can result in different clinical phenotypes due to difference in the mutant allele and/or genetic background. To evaluate this, a set of consanguineous patient families with Leber congenital amaurosis (LCA) that do not carry mutations in known LCA disease genes was characterized through homozygosity mapping followed by targeted exon/whole-exome sequencing to identify genetic variations. Among these families, a total o...

  10. Impetigo-like tinea faciei around the nostrils caused by Arthroderma vanbreuseghemii identified using polymerase chain reaction-based sequencing of crusts.

    Science.gov (United States)

    Kang, Daoxian; Ran, Yuping; Li, Conghui; Dai, Yaling; Lama, Jebina

    2013-01-01

    We report a case of Arthroderma vanbreuseghemii (a teleomorph of Trichophyton interdigitale) infection around the nostrils in a 3-year-old girl. The culture was negative, so the pathogenic agent was identified using polymerase chain reaction-based sequencing of the crusts taken from the lesion on the nostril. Treatment with oral itraconazole and topical 1% naftifine/0.25% ketoconazole cream after a topical wash with ketoconazole shampoo was effective. © 2012 Wiley Periodicals, Inc.

  11. Introduction of the hybcell-based compact sequencing technology and comparison to state-of-the-art methodologies for KRAS mutation detection.

    Science.gov (United States)

    Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus

    2015-03-01

    The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
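
    The sensitivity, specificity, and accuracy figures quoted in this record follow from a two-by-two comparison against the reference method. The sketch below shows the arithmetic only. The counts passed in are invented for illustration and are not the study's per-method confusion matrix, which the abstract does not give.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy against a reference method.

    tp / fn: reference-positive specimens called positive / negative by the test
    tn / fp: reference-negative specimens called negative / positive by the test
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for a 35-specimen panel (not taken from the study).
m = diagnostic_metrics(tp=24, fp=0, tn=10, fn=1)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

    With a panel this small, a single discordant specimen moves sensitivity by several percentage points, which is worth keeping in mind when comparing the methods.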

  12. Saprolegniaceae identified on amphibian eggs throughout the Pacific Northwest, USA, by internal transcribed spacer sequences and phylogenetic analysis

    Science.gov (United States)

    Jill E. Petrisko; Christopher A. Pearl; David S. Pilliod; Peter P. Sheridan; Charles F. Williams; Charles R. Peterson; R. Bruce Bury

    2008-01-01

    We assessed the diversity and phylogeny of Saprolegniaceae on amphibian eggs from the Pacific Northwest, with particular focus on Saprolegnia ferax, a species implicated in high egg mortality. We identified isolates from eggs of six amphibians with the internal transcribed spacer (ITS) and 5.8S gene regions and BLAST of the GenBank database. We...

  13. Application of massively parallel sequencing to genetic diagnosis in multiplex families with idiopathic sensorineural hearing impairment.

    Directory of Open Access Journals (Sweden)

    Chen-Chi Wu

    Full Text Available Despite the clinical utility of genetic diagnosis to address idiopathic sensorineural hearing impairment (SNHI), the current strategy for screening mutations via Sanger sequencing suffers from the limitation that only a limited number of DNA fragments associated with common deafness mutations can be genotyped. Consequently, a definitive genetic diagnosis cannot be achieved in many families with discernible family history. To investigate the diagnostic utility of massively parallel sequencing (MPS), we applied the MPS technique to 12 multiplex families with idiopathic SNHI in which common deafness mutations had previously been ruled out. NimbleGen sequence capture array was designed to target all protein coding sequences (CDSs) and 100 bp of the flanking sequence of 80 common deafness genes. We performed MPS on the Illumina HiSeq2000, and applied BWA, SAMtools, Picard, GATK, Variant Tools, ANNOVAR, and IGV for bioinformatics analyses. Initial data filtering with allele frequencies (... 0.95) prioritized 5 indels (insertions/deletions) and 36 missense variants in the 12 multiplex families. After further validation by Sanger sequencing, segregation pattern, and evolutionary conservation of amino acid residues, we identified 4 variants in 4 different genes, which might lead to SNHI in 4 families compatible with autosomal dominant inheritance. These included GJB2 p.R75Q, MYO7A p.T381M, KCNQ4 p.S680F, and MYH9 p.E1256K. Among them, KCNQ4 p.S680F and MYH9 p.E1256K were novel. In conclusion, MPS allows genetic diagnosis in multiplex families with idiopathic SNHI by detecting mutations in relatively uncommon deafness genes.

  14. Somatosensory neuron types identified by high-coverage single-cell RNA-sequencing and functional heterogeneity

    Science.gov (United States)

    Li, Chang-Lin; Li, Kai-Cheng; Wu, Dan; Chen, Yan; Luo, Hao; Zhao, Jing-Rong; Wang, Sa-Shuang; Sun, Ming-Ming; Lu, Ying-Jin; Zhong, Yan-Qing; Hu, Xu-Ye; Hou, Rui; Zhou, Bei-Bei; Bao, Lan; Xiao, Hua-Sheng; Zhang, Xu

    2016-01-01

    Sensory neurons are distinguished by distinct signaling networks and receptive characteristics. Thus, sensory neuron types can be defined by linking transcriptome-based neuron typing with the sensory phenotypes. Here we classify somatosensory neurons of the mouse dorsal root ganglion (DRG) by high-coverage single-cell RNA-sequencing (10 950 ± 1 218 genes per neuron) and neuron size-based hierarchical clustering. Moreover, single DRG neurons responding to cutaneous stimuli are recorded using an in vivo whole-cell patch clamp technique and classified by neuron-type genetic markers. Small diameter DRG neurons are classified into one type of low-threshold mechanoreceptor and five types of mechanoheat nociceptors (MHNs). Each of the MHN types is further categorized into two subtypes. Large DRG neurons are categorized into four types, including neurexophilin 1-expressing MHNs and mechanical nociceptors (MNs) expressing BAI1-associated protein 2-like 1 (Baiap2l1). Mechanoreceptors expressing trafficking protein particle complex 3-like and Baiap2l1-marked MNs are subdivided into two subtypes each. These results provide a new system for cataloging somatosensory neurons and their transcriptome databases. PMID:26691752

  15. Enriched whole genome sequencing identified compensatory mutations in the RNA polymerase gene of rifampicin-resistant Mycobacterium leprae strains.

    Science.gov (United States)

    Lavania, Mallika; Singh, Itu; Turankar, Ravindra P; Gupta, Anuj Kumar; Ahuja, Madhvi; Pathak, Vinay; Sengupta, Utpal

    2018-01-01

    Despite more than three decades of multidrug therapy (MDT), leprosy remains a major public health issue in several endemic countries, including India. The emergence of drug resistance in Mycobacterium leprae (M. leprae) is a cause of concern and poses a threat to the leprosy-control program, which might ultimately dampen the achievement of the elimination program of the country. Rifampicin resistance in clinical strains of M. leprae is supposed to arise from harboring bacterial strains with mutations in the 81-bp rifampicin resistance determining region (RRDR) of the rpoB gene. However, complete dynamics of rifampicin resistance are not explained only by this mutation in leprosy strains. To understand the role of other compensatory mutations and transmission dynamics of drug-resistant leprosy, a genome-wide sequencing of 11 M. leprae strains - comprising five rifampicin-resistant strains, five sensitive strains, and one reference strain - was done in this study. We observed the presence of compensatory mutations in two rifampicin-resistant strains in rpoC and mmpL7 genes, along with rpoB, that may additionally be responsible for conferring resistance in those strains. Our findings support the role for compensatory mutation(s) in RNA polymerase gene(s), resulting in rifampicin resistance in relapsed leprosy patients.

  16. Novel compound heterozygous mutations in the GPR98 (USH2C) gene identified by whole exome sequencing in a Moroccan deaf family.

    Science.gov (United States)

    Bousfiha, Amale; Bakhchane, Amina; Charoute, Hicham; Detsouli, Mustapha; Rouba, Hassan; Charif, Majida; Lenaers, Guy; Barakat, Abdelhamid

    2017-10-01

    In the present work, we identified two novel compound heterozygous mutations in the GPR98 (G protein-coupled receptor 98) gene causing Usher syndrome. Whole-exome sequencing was performed to study the genetic causes of Usher syndrome in a Moroccan family with three affected siblings. We identified two novel compound heterozygous mutations (c.1054C > A, c.16544delT) in the GPR98 gene in the three affected siblings, who had post-lingual bilateral moderate hearing loss with normal vestibular function, before the onset of visual disturbances. This is the first time that mutations in the GPR98 gene have been described in Moroccan deaf patients.

  17. Laser capture microdissection followed by next-generation sequencing identifies disease-related microRNAs in psoriatic skin that reflect systemic microRNA changes in psoriasis

    DEFF Research Database (Denmark)

    Løvendorf, Marianne B; Mitsui, Hiroshi; Zibert, John R

    2015-01-01

    Psoriasis is a systemic disease with cutaneous manifestations. MicroRNAs (miRNAs) are small non-coding RNA molecules that are differentially expressed in psoriatic skin; however, only few cell- and region-specific miRNAs have been identified in psoriatic lesions. We used laser capture microdissection (LCM) and next-generation sequencing (NGS) to study the specific miRNA expression profiles in the epidermis (Epi) and dermal inflammatory infiltrates (RD) of psoriatic skin (N = 6). We identified 24 deregulated miRNAs in the Epi and 37 deregulated miRNAs in the RD of psoriatic plaque compared with normal psoriatic skin (FCH > 2, FDR

  18. Targeted 'Next-Generation' sequencing in anophthalmia and microphthalmia patients confirms SOX2, OTX2 and FOXE3 mutations

    Directory of Open Access Journals (Sweden)

    Lopez Jimenez Nelson

    2011-12-01

    Full Text Available Abstract Background: Anophthalmia/microphthalmia (A/M) is caused by mutations in several different transcription factors, but mutations in each causative gene are relatively rare, emphasizing the need for a testing approach that screens multiple genes simultaneously. We used next-generation sequencing to screen 15 A/M patients for mutations in 9 pathogenic genes to evaluate this technology for screening in A/M. Methods: We used a pooled sequencing design, together with custom single nucleotide polymorphism (SNP) calling software. We verified predicted sequence alterations using Sanger sequencing. Results: We verified three mutations - c.542delC in SOX2, resulting in p.Pro181Argfs*22, p.Glu105X in OTX2 and p.Cys240X in FOXE3. We found several novel sequence alterations and SNPs that were likely to be non-pathogenic - p.Glu42Lys in CRYBA4, p.Val201Met in FOXE3 and p.Asp291Asn in VSX2. Our analysis methodology gave one false positive result comprising a mutation in PAX6 (c.1268A > T, predicting p.X423LeuextX*15) that was not verified by Sanger sequencing. We also failed to detect one 20 base pair (bp) deletion and one 3 bp duplication in SOX2. Conclusions: Our results demonstrated the power of next-generation sequencing with pooled sample groups for the rapid screening of candidate genes for A/M as we were correctly able to identify disease-causing mutations. However, next-generation sequencing was less useful for small, intragenic deletions and duplications. We did not find mutations in 10/15 patients and conclude that there is a need for further gene discovery in A/M.

  19. Targeted 'next-generation' sequencing in anophthalmia and microphthalmia patients confirms SOX2, OTX2 and FOXE3 mutations.

    Science.gov (United States)

    Jimenez, Nelson Lopez; Flannick, Jason; Yahyavi, Mani; Li, Jiang; Bardakjian, Tanya; Tonkin, Leath; Schneider, Adele; Sherr, Elliott H; Slavotinek, Anne M

    2011-12-28

    Anophthalmia/microphthalmia (A/M) is caused by mutations in several different transcription factors, but mutations in each causative gene are relatively rare, emphasizing the need for a testing approach that screens multiple genes simultaneously. We used next-generation sequencing to screen 15 A/M patients for mutations in 9 pathogenic genes to evaluate this technology for screening in A/M. We used a pooled sequencing design, together with custom single nucleotide polymorphism (SNP) calling software. We verified predicted sequence alterations using Sanger sequencing. We verified three mutations - c.542delC in SOX2, resulting in p.Pro181Argfs*22, p.Glu105X in OTX2 and p.Cys240X in FOXE3. We found several novel sequence alterations and SNPs that were likely to be non-pathogenic - p.Glu42Lys in CRYBA4, p.Val201Met in FOXE3 and p.Asp291Asn in VSX2. Our analysis methodology gave one false positive result comprising a mutation in PAX6 (c.1268A > T, predicting p.X423LeuextX*15) that was not verified by Sanger sequencing. We also failed to detect one 20 base pair (bp) deletion and one 3 bp duplication in SOX2. Our results demonstrated the power of next-generation sequencing with pooled sample groups for the rapid screening of candidate genes for A/M as we were correctly able to identify disease-causing mutations. However, next-generation sequencing was less useful for small, intragenic deletions and duplications. We did not find mutations in 10/15 patients and conclude that there is a need for further gene discovery in A/M.

  20. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
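
    The two consensus error rates quoted in this record can be placed on the Phred quality scale, Q = -10 log10(P), which is commonly used to report sequence accuracy. The abstract itself does not use Phred scores, so the conversion below is an added worked example and the function name is illustrative.

```python
import math


def phred_quality(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality score."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10.0 * math.log10(error_rate)


draft = phred_quality(1 / 2000)       # draft consensus: about 1 error per 2000 bp
finished = phred_quality(1 / 10_000)  # finished genome: about 1 error per 10,000 bp

print(f"draft assembly:  Q{draft:.1f}")
print(f"finished genome: Q{finished:.1f}")
```

    On this scale the draft consensus sits near Q33 and the finished standard at Q40, so finishing corresponds to a fivefold reduction in expected errors per base.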

  1. snpTree - a web-server to identify and construct SNP trees from whole genome sequence data

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Kaas, Rolf Sommer; Thomsen, Martin Christen Frølund

    2012-01-01

    identify SNPs and construct phylogenetic trees from WGS as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on position on reference genome and a tree is constructed...... to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs including various options and cut-off values. Furthermore, all current methods require bioinformatic...... skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct phylogenetic tree from WGS data. Results Here we introduce snpTree, a server for online-automatic SNPs analysis. This tool is composed of different SNPs analysis suites, perl and python scripts. snpTree can...
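
    The step described above, concatenating SNPs by reference position before building a tree, can be sketched as follows. This is not the snpTree implementation: the data layout (0-based positions, one dictionary of variant calls per isolate), the function names, and the toy sequences are all assumptions made for illustration, and the pairwise difference count merely stands in for whatever distance a tree-building program would use.

```python
from itertools import combinations


def concatenate_snps(reference, isolates):
    """Build one SNP string per isolate, ordered by reference position.

    isolates maps an isolate name to {0-based position: called base};
    positions without a call are filled with the reference base.
    """
    positions = sorted({pos for calls in isolates.values() for pos in calls})
    alignment = {
        name: "".join(calls.get(pos, reference[pos]) for pos in positions)
        for name, calls in isolates.items()
    }
    return positions, alignment


def pairwise_differences(alignment):
    """Count differing SNP columns for every pair of isolates."""
    return {
        (a, b): sum(x != y for x, y in zip(alignment[a], alignment[b]))
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy data
reference = "ACGTACGTAC"
isolates = {
    "isolate_A": {1: "T", 5: "G"},
    "isolate_B": {1: "T"},
    "isolate_C": {8: "G"},
}
positions, alignment = concatenate_snps(reference, isolates)
distances = pairwise_differences(alignment)
print(positions, alignment, distances)
```

    The resulting concatenated strings form a gap-free alignment of variable sites only, which is the kind of input a distance-based or likelihood-based tree builder would typically consume.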

  2. Deep sequencing of Brachypodium small RNAs at the global genome level identifies microRNAs involved in cold stress response

    Directory of Open Access Journals (Sweden)

    Chong Kang

    2009-09-01

    Full Text Available Abstract Background MicroRNAs (miRNAs) are endogenous small RNAs having large-scale regulatory effects on plant development and stress responses. Extensive studies of miRNAs have only been performed in a few model plants. Although miRNAs have been shown to be involved in plant cold stress responses, little is known about winter-habit monocots. Brachypodium distachyon, with its close evolutionary relationship to cool-season cereals, has recently emerged as a novel model plant. There are few reports of Brachypodium miRNAs. Results High-throughput sequencing and whole-genome-wide data mining led to the identification of 27 conserved miRNAs, as well as 129 predicted miRNAs in Brachypodium. For multiple-member conserved miRNA families, their sizes in Brachypodium were much smaller than those in rice and Populus. The genome organization of the miR395 family in Brachypodium was quite different from that in rice. The expression of 3 conserved miRNAs and 25 predicted miRNAs showed significant changes in response to cold stress. Among these miRNAs, some were cold-induced and some were cold-suppressed, but all the conserved miRNAs were up-regulated under cold stress conditions. Conclusion Our results suggest that Brachypodium miRNAs are composed of a set of conserved miRNAs and a large proportion of non-conserved miRNAs with low expression levels. Both kinds of miRNAs were involved in cold stress response, but all the conserved miRNAs were up-regulated, implying an important role for cold-induced miRNAs. The different size and genome organization of miRNA families in Brachypodium and rice suggest that the frequency of duplication events or the selection pressure on duplicated miRNAs is different between these two closely related plant species.

  3. Microbial Contaminants of Cord Blood Units Identified by 16S rRNA Sequencing and by API Test System, and Antibiotic Sensitivity Profiling.

    Directory of Open Access Journals (Sweden)

    Luís França

    Full Text Available Over a period of ten months a total of 5618 cord blood units (CBU) were screened for microbial contamination under routine conditions. The antibiotic resistance profile for all isolates was also examined using ATB strips. The detection rate for culture positive units was 7.5%, corresponding to 422 samples. 16S rRNA sequence analysis and identification with API test system were used to identify the culturable aerobic, microaerophilic and anaerobic bacteria from CBUs. From these samples we recovered 485 isolates (84 operational taxonomic units, OTUs) assigned to the classes Bacteroidia, Actinobacteria, Clostridia, Bacilli, Betaproteobacteria and primarily to the Gammaproteobacteria. Sixty-nine OTUs, corresponding to 447 isolates, showed 16S rRNA sequence similarities above 99.0% with known cultured bacteria. However, 14 OTUs had 16S rRNA sequence similarities between 95 and 99% in support of genus level identification and one OTU with 16S rRNA sequence similarity of 90.3% supporting a family level identification only. The phenotypic identification formed 29 OTUs that could be identified to the species level and 9 OTUs that could be identified to the genus level by API test system. We failed to obtain identification for 14 OTUs, while 32 OTUs comprised organisms producing mixed identifications. Forty-two OTUs covered species not included in the API system databases. The API test system Rapid ID 32 Strep and Rapid ID 32 E showed the highest proportion of identifications to the species level, the lowest ratio of unidentified results and the highest agreement to the results of 16S rRNA assignments. Isolates affiliated to the Bacilli and Bacteroidia showed the highest antibiotic multi-resistance indices and microorganisms of the Clostridia displayed the most antibiotic sensitive phenotypes.

  4. Microbial Contaminants of Cord Blood Units Identified by 16S rRNA Sequencing and by API Test System, and Antibiotic Sensitivity Profiling.

    Science.gov (United States)

    França, Luís; Simões, Catarina; Taborda, Marco; Diogo, Catarina; da Costa, Milton S

    2015-01-01

    Over a period of ten months a total of 5618 cord blood units (CBU) were screened for microbial contamination under routine conditions. The antibiotic resistance profile for all isolates was also examined using ATB strips. The detection rate for culture positive units was 7.5%, corresponding to 422 samples. 16S rRNA sequence analysis and identification with API test system were used to identify the culturable aerobic, microaerophilic and anaerobic bacteria from CBUs. From these samples we recovered 485 isolates (84 operational taxonomic units, OTUs) assigned to the classes Bacteroidia, Actinobacteria, Clostridia, Bacilli, Betaproteobacteria and primarily to the Gammaproteobacteria. Sixty-nine OTUs, corresponding to 447 isolates, showed 16S rRNA sequence similarities above 99.0% with known cultured bacteria. However, 14 OTUs had 16S rRNA sequence similarities between 95 and 99% in support of genus level identification and one OTU with 16S rRNA sequence similarity of 90.3% supporting a family level identification only. The phenotypic identification formed 29 OTUs that could be identified to the species level and 9 OTUs that could be identified to the genus level by API test system. We failed to obtain identification for 14 OTUs, while 32 OTUs comprised organisms producing mixed identifications. Forty-two OTUs covered species not included in the API system databases. The API test system Rapid ID 32 Strep and Rapid ID 32 E showed the highest proportion of identifications to the species level, the lowest ratio of unidentified results and the highest agreement to the results of 16S rRNA assignments. Isolates affiliated to the Bacilli and Bacteroidia showed the highest antibiotic multi-resistance indices and microorganisms of the Clostridia displayed the most antibiotic sensitive phenotypes.

  5. Sequencing of sporadic Attention-Deficit Hyperactivity Disorder (ADHD) identifies novel and potentially pathogenic de novo variants and excludes overlap with genes associated with autism spectrum disorder.

    Science.gov (United States)

    Kim, Daniel Seung; Burt, Amber A; Ranchalis, Jane E; Wilmot, Beth; Smith, Joshua D; Patterson, Karynne E; Coe, Bradley P; Li, Yatong K; Bamshad, Michael J; Nikolas, Molly; Eichler, Evan E; Swanson, James M; Nigg, Joel T; Nickerson, Deborah A; Jarvik, Gail P

    2017-06-01

    Attention-Deficit Hyperactivity Disorder (ADHD) has high heritability; however, studies of common variation account for only a small share of ADHD variance. Using data from affected participants without a family history of ADHD, we sought to identify de novo variants that could account for sporadic ADHD. Considering a total of 128 families, two analyses were conducted in parallel: first, in 11 unaffected parent/affected proband trios (or quads with the addition of an unaffected sibling) we completed exome sequencing. Six de novo missense variants at highly conserved bases were identified and validated from four of the 11 families: the brain-expressed genes TBC1D9, DAGLA, QARS, CSMD2, TRPM2, and WDR83. Separately, in 117 unrelated probands with sporadic ADHD, we sequenced a panel of 26 genes implicated in intellectual disability (ID) and autism spectrum disorder (ASD) to evaluate whether variation in ASD/ID-associated genes was also present in participants with ADHD. Only one putative deleterious variant (Gln600STOP) in CHD1L was identified; this was found in a single proband. Notably, no other nonsense, splice, frameshift, or highly conserved missense variants in the 26 gene panel were identified and validated. These data suggest that de novo variant analysis in families with independently adjudicated sporadic ADHD diagnosis can identify novel genes implicated in ADHD pathogenesis. Moreover, the fact that only one of the 128 cases (0.8%, 11 exome, and 117 MIP sequenced participants) had putative deleterious variants within our data in 26 genes related to ID and ASD suggests significant independence in the genetic pathogenesis of ADHD as compared to ASD and ID phenotypes. © 2017 Wiley Periodicals, Inc.

  6. Genome-Wide Association Study Identifies Loci for Salt Tolerance during Germination in Autotetraploid Alfalfa (Medicago sativa L.) Using Genotyping-by-Sequencing

    Science.gov (United States)

    Yu, Long-Xi; Liu, Xinchun; Boge, William; Liu, Xiang-Ping

    2016-01-01

    Salinity is one of the major abiotic stresses limiting alfalfa (Medicago sativa L.) production in the arid and semi-arid regions of the US and other countries. In this study, we used a diverse panel of alfalfa accessions previously described by Zhang et al. (2015) to identify molecular markers associated with salt tolerance during germination using genome-wide association study (GWAS) and genotyping-by-sequencing (GBS). Phenotyping was done by germinating alfalfa seeds under different levels of salt stress. Phenotypic data of adjusted germination rates and SNP markers generated by GBS were used for marker-trait association. Thirty-six markers were significantly associated with salt tolerance in at least one level of salt treatment. Alignment of sequence tags to the Medicago truncatula genome revealed genetic locations of the markers on all chromosomes except chromosome 3. The most significant markers were found on chromosomes 1, 2, and 4. BLAST search using the flanking sequences of significant markers identified 14 putative candidate genes linked to 23 significant markers. Most of them were repeatedly identified in two or three salt treatments. Several loci identified in the present study had similar genetic locations to the reported QTL associated with salt tolerance in M. truncatula. A locus identified on chromosome 6 in this study overlapped with one associated with drought in our previous study. To our knowledge, this is the first report on mapping loci associated with salt tolerance during germination in autotetraploid alfalfa. Further investigation of these loci and their linked genes would provide insight into the molecular mechanisms by which salt and drought stresses affect alfalfa growth. Functional markers closely linked to the resistance loci would be useful for MAS to improve alfalfa cultivars with enhanced resistance to drought and salt stresses. PMID:27446182

  7. Identification of Novel Variants in LTBP2 and PXDN Using Whole-Exome Sequencing in Developmental and Congenital Glaucoma.

    Directory of Open Access Journals (Sweden)

    Shazia Micheal

    Full Text Available Primary congenital glaucoma (PCG) is the most common form of glaucoma in children. PCG occurs due to the developmental defects in the trabecular meshwork and anterior chamber of the eye. The purpose of this study is to identify the causative genetic variants in three families with developmental and primary congenital glaucoma (PCG) with a recessive inheritance pattern. DNA samples were obtained from consanguineous families of Pakistani ancestry. The CYP1B1 gene was sequenced in the affected probands by conventional Sanger DNA sequencing. Whole exome sequencing (WES) was performed in DNA samples of four individuals belonging to three different CYP1B1-negative families. Variants identified by WES were validated by Sanger sequencing. WES identified potentially causative novel mutations in the latent transforming growth factor beta binding protein 2 (LTBP2) gene in two PCG families. In the first family a novel missense mutation (c.4934G>A; p.Arg1645Glu) co-segregates with the disease phenotype, and in the second family a novel frameshift mutation (c.4031_4032insA; p.Asp1345Glyfs*6) was identified. In a third family with developmental glaucoma a novel mutation (c.3496G>A; p.Gly1166Arg) was identified in the PXDN gene, which segregates with the disease. We identified three novel mutations in glaucoma families using WES; two in the LTBP2 gene and one in the PXDN gene. The results will not only enhance our current understanding of the genetic basis of glaucoma, but may also contribute to a better understanding of the diverse phenotypic consequences caused by mutations in these genes.

  8. Implementing targeted region capture sequencing for the clinical detection of Alagille syndrome: An efficient and cost‑effective method.

    Science.gov (United States)

    Huang, Tianhong; Yang, Guilin; Dang, Xiao; Ao, Feijian; Li, Jiankang; He, Yizhou; Tang, Qiyuan; He, Qing

    2017-11-01

    Alagille syndrome (AGS) is a highly variable, autosomal dominant disease that affects multiple structures including the liver, heart, eyes, bones and face. Targeted region capture sequencing focuses on a panel of known pathogenic genes and provides a rapid, cost‑effective and accurate method for molecular diagnosis. In a Chinese family, this method was used on the proband and Sanger sequencing was applied to validate the candidate mutation. A de novo heterozygous mutation (c.3254_3255insT p.Leu1085PhefsX24) of the jagged 1 gene was identified as the potential disease‑causing gene mutation. In conclusion, the present study suggested that target region capture sequencing is an efficient, reliable and accurate approach for the clinical diagnosis of AGS. Furthermore, these results expand on the understanding of the pathogenesis of AGS.

  9. Detection of Emerging Vaccine-Related Polioviruses by Deep Sequencing.

    Science.gov (United States)

    Sahoo, Malaya K; Holubar, Marisa; Huang, ChunHong; Mohamed-Hadley, Alisha; Liu, Yuanyuan; Waggoner, Jesse J; Troy, Stephanie B; Garcia-Garcia, Lourdes; Ferreyra-Reyes, Leticia; Maldonado, Yvonne; Pinsky, Benjamin A

    2017-07-01

    Oral poliovirus vaccine can mutate to regain neurovirulence. To date, evaluation of these mutations has been performed primarily on culture-enriched isolates by using conventional Sanger sequencing. We therefore developed a culture-independent, deep-sequencing method targeting the 5' untranslated region (UTR) and P1 genomic region to characterize vaccine-related poliovirus variants. Error analysis of the deep-sequencing method demonstrated reliable detection of poliovirus mutations at levels of vaccinated, asymptomatic children and their close contacts collected during a prospective cohort study in Veracruz, Mexico, revealed no vaccine-derived polioviruses. This was expected given that the longest duration between sequenced sample collection and the end of the most recent national immunization week was 66 days. However, we identified many low-level variants (Sabin serotypes, as well as vaccine-related viruses with multiple canonical mutations associated with phenotypic reversion present at high levels (>90%). These results suggest that monitoring emerging vaccine-related poliovirus variants by deep sequencing may aid in the poliovirus endgame and efforts to ensure global polio eradication. Copyright © 2017 Sahoo et al.

  10. Reduced Representation Libraries from DNA Pools Analysed with Next Generation Semiconductor Based-Sequencing to Identify SNPs in Extreme and Divergent Pigs for Back Fat Thickness

    Directory of Open Access Journals (Sweden)

    Samuele Bovo

    2015-01-01

    Full Text Available The aim of this study was to identify single nucleotide polymorphisms (SNPs) that could be associated with back fat thickness (BFT) in pigs. To achieve this goal, we evaluated the potential and limits of an experimental design that combined several methodologies. DNA samples from two groups of Italian Large White pigs with divergent estimated breeding value (EBV) for BFT were separately pooled and sequenced, after preparation of reduced representation libraries (RRLs), on the Ion Torrent technology. Taking advantage of SNAPE for SNP calling in sequenced DNA pools, 39,165 SNPs were identified; 1/4 of them were novel variants not reported in dbSNP. Combining sequencing data with Illumina PorcineSNP60 BeadChip genotyping results on the same animals, 661 genomic positions overlapped with a good approximation of minor allele frequency estimation. A total of 54 SNPs showing enriched alleles in one or the other RRL might be potential markers associated with BFT. Some of these SNPs were close to genes involved in obesity-related phenotypes.
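
    The idea of alleles being "enriched" in one DNA pool relative to the other can be illustrated with a naive read-count comparison. This is a simplified stand-in, not the SNAPE method named in the abstract (which is a dedicated pooled-data SNP caller); the depth and frequency-difference thresholds, the function names, and the read counts below are all hypothetical.

```python
def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Naive allele frequency estimate: alternative-allele reads over depth."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def enriched_snps(pool_counts, min_depth=20, min_difference=0.3):
    """Return SNPs whose allele frequency differs strongly between two pools.

    pool_counts maps a SNP id to ((alt_a, depth_a), (alt_b, depth_b)).
    Both thresholds are illustrative assumptions, not values from the study.
    """
    hits = []
    for snp, ((alt_a, depth_a), (alt_b, depth_b)) in pool_counts.items():
        if min(depth_a, depth_b) < min_depth:
            continue  # too few reads for a usable estimate
        diff = allele_frequency(alt_a, depth_a) - allele_frequency(alt_b, depth_b)
        if abs(diff) >= min_difference:
            hits.append((snp, round(diff, 3)))
    return sorted(hits, key=lambda hit: -abs(hit[1]))


# Hypothetical read counts for two pools (e.g. high-EBV and low-EBV animals)
counts = {
    "snp1": ((30, 40), (8, 40)),   # 0.75 vs 0.20
    "snp2": ((20, 40), (18, 40)),  # 0.50 vs 0.45
    "snp3": ((5, 10), (0, 10)),    # skipped: depth below min_depth
}
print(enriched_snps(counts))
```

    In practice a pooled design would also need to account for sampling variance in both the animals and the reads, which is why a statistical caller is used rather than raw count ratios.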

  11. Genome-Wide Association Study with Sequence Variants Identifies Candidate Genes for Mastitis Resistance in Dairy Cattle

    DEFF Research Database (Denmark)

    Sahana, Goutam; Guldbrandtsen, Bernt; Bendixen, Christian

    Six genomic regions affecting clinical mastitis were identified through a GWAS study with imputed BovineHD chip genotype data in the Nordic Holstein cattle population. The association analyses were carried out using a SNP-by-SNP analysis by fitting the regression of allele dosage and a polygenic...... Effect Predictor (VEP) vers. 2.6 using ENSEMBL vers. 67 databases. Candidate polymorphisms affecting clinical mastitis were selected based on their association with the traits and functional annotations. A strong positional candidate gene for mastitis resistance on chromosome-6 is the NPFFR2 which...... Factor Receptor Alpha (LIFR) emerged as a strong candidate gene for mastitis resistance. The LIFR gene is involved in acute phase response and is expressed in saliva and mammary gland....

  12. Features of Two New Proteins with OmpA-Like Domains Identified in the Genome Sequences of Leptospira interrogans

    Science.gov (United States)

    Teixeira, Aline F.; de Morais, Zenaide M.; Kirchgatter, Karin; Romero, Eliete C.; Vasconcellos, Silvio A.; Nascimento, Ana Lucia T. O.

    2015-01-01

    Leptospirosis is an acute febrile disease caused by pathogenic spirochetes of the genus Leptospira. It is considered an important re-emerging infectious disease that affects humans worldwide. The knowledge about the mechanisms by which pathogenic leptospires invade and colonize the host remains limited since very few virulence factors contributing to the pathogenesis of the disease have been identified. Here, we report the identification and characterization of two new leptospiral proteins with OmpA-like domains. The recombinant proteins, which exhibit extracellular matrix-binding properties, are called Lsa46 - LIC13479 and Lsa77 - LIC10050 (Leptospiral surface adhesins of 46 and 77 kDa, respectively). Attachment of Lsa46 and Lsa77 to laminin was specific, dose dependent and saturable, with KD values of 24.3 ± 17.0 and 53.0 ± 17.5 nM, respectively. Lsa46 and Lsa77 also bind plasma fibronectin, and both adhesins are plasminogen (PLG)-interacting proteins, capable of generating plasmin (PLA) and as such, increase the proteolytic ability of leptospires. The proteins corresponding to Lsa46 and Lsa77 are present in virulent L. interrogans L1-130 and in saprophyte L. biflexa Patoc 1 strains, as detected by immunofluorescence. The adhesins are recognized by human leptospirosis serum samples at the onset and convalescent phases of the disease, suggesting that they are expressed during infection. Taken together, our data could offer valuable information to the understanding of leptospiral pathogenesis. PMID:25849456

  13. Features of two new proteins with OmpA-like domains identified in the genome sequences of Leptospira interrogans.

    Directory of Open Access Journals (Sweden)

    Aline F Teixeira

    Full Text Available Leptospirosis is an acute febrile disease caused by pathogenic spirochetes of the genus Leptospira. It is considered an important re-emerging infectious disease that affects humans worldwide. The knowledge about the mechanisms by which pathogenic leptospires invade and colonize the host remains limited since very few virulence factors contributing to the pathogenesis of the disease have been identified. Here, we report the identification and characterization of two new leptospiral proteins with OmpA-like domains. The recombinant proteins, which exhibit extracellular matrix-binding properties, are called Lsa46 - LIC13479 and Lsa77 - LIC10050 (Leptospiral surface adhesins of 46 and 77 kDa, respectively). Attachment of Lsa46 and Lsa77 to laminin was specific, dose dependent and saturable, with KD values of 24.3 ± 17.0 and 53.0 ± 17.5 nM, respectively. Lsa46 and Lsa77 also bind plasma fibronectin, and both adhesins are plasminogen (PLG)-interacting proteins, capable of generating plasmin (PLA) and as such, increase the proteolytic ability of leptospires. The proteins corresponding to Lsa46 and Lsa77 are present in virulent L. interrogans L1-130 and in saprophyte L. biflexa Patoc 1 strains, as detected by immunofluorescence. The adhesins are recognized by human leptospirosis serum samples at the onset and convalescent phases of the disease, suggesting that they are expressed during infection. Taken together, our data could offer valuable information to the understanding of leptospiral pathogenesis.

  14. Recurrent targeted genes of hepatitis B virus in the liver cancer genomes identified by a next-generation sequencing-based approach.

    Directory of Open Access Journals (Sweden)

    Dong Ding

    Full Text Available Integration of the viral DNA into host chromosomes was found in most of the hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs). Here we devised a massive anchored parallel sequencing (MAPS) method using next-generation sequencing to isolate and sequence HBV integrants. Applying MAPS to 40 pairs of HBV-related HCC tissues (cancer and adjacent tissues), we identified 296 HBV integration events corresponding to 286 unique integration sites (UISs) with precise HBV-Human DNA junctions. HBV integration favored chromosome 17 and preferentially integrated into human transcript units. HBV targeted genes were enriched in GO terms: cAMP metabolic processes, T cell differentiation and activation, TGF beta receptor pathway, ncRNA catabolic process, and dsRNA fragmentation and cellular response to dsRNA. The HBV targeted genes include 7 genes (PTPRJ, CNTN6, IL12B, MYOM1, FNDC3B, LRFN2, FN1) containing IPR003961 (Fibronectin, type III domain), 7 genes (NRG3, MASP2, NELL1, LRP1B, ADAM21, NRXN1, FN1) containing IPR013032 (EGF-like region, conserved site), and three genes (PDE7A, PDE4B, PDE11A) containing IPR002073 (3', 5'-cyclic-nucleotide phosphodiesterase). Enriched pathways include hsa04512 (ECM-receptor interaction), hsa04510 (Focal adhesion), and hsa04012 (ErbB signaling pathway). Fewer integration events were found in cancers compared to cancer-adjacent tissues, suggesting a clonal expansion model in HCC development. Finally, we identified 8 genes that were recurrent target genes by HBV integration including fibronectin 1 (FN1) and telomerase reverse transcriptase (TERT1), two known recurrent target genes, and additional novel target genes such as SMAD family member 5 (SMAD5), phosphatase and actin regulator 4 (PHACTR4), and RNA binding protein fox-1 homolog (C. elegans) 1 (RBFOX1). Integrating analysis with recently published whole-genome sequencing analysis, we identified 14 additional recurrent HBV target genes, greatly expanding the HBV recurrent target list.
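
    The distinction drawn in this abstract between integration events, unique integration sites, and recurrent target genes can be sketched with a small tally. The record layout, the function name, the rule that a gene is "recurrent" when hit in at least two samples, and all coordinates below are assumptions for illustration; only the gene symbols echo those named in the abstract.

```python
from collections import defaultdict


def summarize_integrations(events, min_samples=2):
    """Count unique integration sites and genes hit in several samples.

    events is a list of (sample_id, chromosome, position, gene) tuples;
    gene is None for intergenic events. A gene counts as recurrent here
    when it is hit in at least min_samples distinct samples (an assumed rule).
    """
    unique_sites = {(chrom, pos) for _, chrom, pos, _ in events}
    samples_per_gene = defaultdict(set)
    for sample, _, _, gene in events:
        if gene is not None:
            samples_per_gene[gene].add(sample)
    recurrent = {
        gene: len(samples)
        for gene, samples in samples_per_gene.items()
        if len(samples) >= min_samples
    }
    return len(unique_sites), recurrent


# Hypothetical events; coordinates are made up for the example
events = [
    ("P01", "chr5", 1295000, "TERT"),
    ("P02", "chr5", 1295100, "TERT"),
    ("P02", "chr2", 216000000, "FN1"),
    ("P03", "chr2", 216000000, "FN1"),
    ("P03", "chr17", 5000000, None),
]
n_sites, recurrent = summarize_integrations(events)
print(len(events), n_sites, recurrent)
```

    As in the abstract, the number of events can exceed the number of unique sites when the same junction is observed more than once.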

  15. 16S rRNA amplicon sequencing identifies microbiota associated with oral cancer, human papilloma virus infection and surgical treatment.

    Science.gov (United States)

    Guerrero-Preston, Rafael; Godoy-Vitorino, Filipa; Jedlicka, Anne; Rodríguez-Hilario, Arnold; González, Herminio; Bondy, Jessica; Lawson, Fahcina; Folawiyo, Oluwasina; Michailidi, Christina; Dziedzic, Amanda; Thangavel, Rajagowthamee; Hadar, Tal; Noordhuis, Maartje G; Westra, William; Koch, Wayne; Sidransky, David

    2016-08-09

    Systemic inflammatory events and localized disease, mediated by the microbiome, may be measured in saliva as head and neck squamous cell carcinoma (HNSCC) diagnostic and prognostic biomonitors. We used a 16S rRNA V3-V5 marker gene approach to compare the saliva microbiome in DNA isolated from Oropharyngeal (OPSCC), Oral Cavity Squamous Cell Carcinoma (OCSCC) patients and normal epithelium controls, to characterize the HNSCC saliva microbiota and examine their abundance before and after surgical resection. The analyses identified a predominance of Firmicutes, Proteobacteria and Bacteroidetes, with less frequent presence of Actinobacteria and Fusobacteria before surgery. At lower taxonomic levels, the most abundant genera were Streptococcus, Prevotella, Haemophilus, Lactobacillus and Veillonella, with lower numbers of Citrobacter and Neisseraceae genus Kingella. HNSCC patients had a significant loss in richness and diversity of microbiota species (p<0.05) compared to the controls. Overall, the Operational Taxonomic Units network shows that the relative abundance of OTUs within genus Streptococcus, Dialister, and Veillonella can be used to discriminate tumor from control samples (p<0.05). Tumor samples lost Neisseria, Aggregatibacter (Proteobacteria), Haemophillus (Firmicutes) and Leptotrichia (Fusobacteria). Paired taxa within family Enterobacteriaceae, together with genus Oribacterium, distinguish OCSCC samples from OPSCC and normal samples (p<0.05). Similarly, only HPV positive samples have an abundance of genus Gemellaceae and Leuconostoc (p<0.05). Longitudinal analyses of samples taken before and after surgery revealed a reduction in the alpha diversity measure after surgery, together with an increase of this measure in patients that recurred (p<0.05). These results suggest that microbiota may be used as HNSCC diagnostic and prognostic biomonitors.
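
    Richness and alpha diversity, the two quantities reported as reduced in this abstract, are commonly computed from OTU count tables. The abstract does not say which alpha diversity measure was used, so the sketch below assumes the Shannon index as one common choice; the OTU counts are invented for illustration.

```python
import math


def richness(otu_counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for count in otu_counts if count > 0)


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over observed OTUs."""
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum(
        (count / total) * math.log(count / total)
        for count in otu_counts
        if count > 0
    )


# Hypothetical OTU counts for an even sample and a skewed, less rich sample
control_sample = [25, 25, 25, 25]
tumor_sample = [85, 10, 5, 0]

for label, sample in (("control", control_sample), ("tumor", tumor_sample)):
    print(label, richness(sample), round(shannon_index(sample), 3))
```

    With these made-up counts the skewed sample has both lower richness and a lower Shannon index, which is the direction of change the abstract reports for patients relative to controls.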

  16. Viral metagenomics: Analysis of begomoviruses by illumina high-throughput sequencing

    KAUST Repository

    Idris, Ali

    2014-03-12

    Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes possible deep sequence coverage at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions. © 2014 by the authors; licensee MDPI, Basel, Switzerland.

  17. Viral Metagenomics: Analysis of Begomoviruses by Illumina High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Ali Idris

    2014-03-01

    Full Text Available Traditional DNA sequencing methods are inefficient, lack the ability to discern the least abundant viral sequences, and are ineffective for determining the extent of variability in viral populations. Here, populations of single-stranded DNA plant begomoviral genomes and their associated beta- and alpha-satellite molecules (virus-satellite complexes) (genus, Begomovirus; family, Geminiviridae) were enriched from total nucleic acids isolated from symptomatic, field-infected plants, using rolling circle amplification (RCA). Enriched virus-satellite complexes were subjected to Illumina-Next Generation Sequencing (NGS). CASAVA and SeqMan NGen programs were implemented, respectively, for quality control and for de novo and reference-guided contig assembly of viral-satellite sequences. The authenticity of the begomoviral sequences, and the reproducibility of the Illumina-NGS approach for begomoviral deep sequencing projects, were validated by comparing NGS results with those obtained using traditional molecular cloning and Sanger sequencing of viral components and satellite DNAs, also enriched by RCA or amplified by polymerase chain reaction. As the use of NGS approaches, together with advances in software development, makes possible deep sequence coverage at a lower cost, the approach described herein will streamline the exploration of begomovirus diversity and population structure from naturally infected plants, irrespective of viral abundance. This is the first report of the implementation of Illumina-NGS to explore the diversity and identify begomoviral-satellite SNPs directly from plants naturally infected with begomoviruses under field conditions.

  18. Molecular diagnosis of glycogen storage disease and disorders with overlapping clinical symptoms by massive parallel sequencing.

    Science.gov (United States)

    Vega, Ana I; Medrano, Celia; Navarrete, Rosa; Desviat, Lourdes R; Merinero, Begoña; Rodríguez-Pombo, Pilar; Vitoria, Isidro; Ugarte, Magdalena; Pérez-Cerdá, Celia; Pérez, Belen

    2016-10-01

    Glycogen storage disease (GSD) is an umbrella term for a group of genetic disorders that involve the abnormal metabolism of glycogen; to date, 23 types of GSD have been identified. The nonspecific clinical presentation of GSD and the lack of specific biomarkers mean that Sanger sequencing is now widely relied on for making a diagnosis. However, this gene-by-gene sequencing technique is both laborious and costly, which is a consequence of the number of genes to be sequenced and the large size of some genes. This work reports the use of massive parallel sequencing to diagnose patients at our laboratory in Spain using either a customized gene panel (targeted exome sequencing) or the Illumina Clinical-Exome TruSight One Gene Panel (clinical exome sequencing (CES)). Sequence variants were matched against biochemical and clinical hallmarks. Pathogenic mutations were detected in 23 patients. Twenty-two mutations were recognized (mostly loss-of-function mutations), including 11 that were novel in GSD-associated genes. In addition, CES detected five patients with mutations in ALDOB, LIPA, NKX2-5, CPT2, or ANO5. Although these genes are not involved in GSD, they are associated with overlapping phenotypic characteristics such as hepatic, muscular, and cardiac dysfunction. These results show that next-generation sequencing, in combination with the detection of biochemical and clinical hallmarks, provides an accurate, high-throughput means of making genetic diagnoses of GSD and related diseases. Genet Med 18 10, 1037-1043.

  19. Identification of two novel pathogenic compound heterozygous MYO7A mutations in Usher syndrome by whole exome sequencing.

    Science.gov (United States)

    Jia, Ying; Li, Xiaoge; Yang, Dong; Xu, Yi; Guo, Ying; Li, Xin

    2018-01-01

    The current study aims to identify the pathogenic sites in a core pedigree of Usher syndrome (USH). A core pedigree of USH was analyzed by whole exome sequencing (WES). Mutations were verified by polymerase chain reaction (PCR) amplification and Sanger sequencing. Two pathogenic variants (c.849+2T>C and c.5994G>A) in MYO7A were identified, each segregating from one of the parents. One variant (c.849+2T>C) was reported as a nonsense mutation causing premature termination of the protein, and the other (c.5994G>A), located near an exon boundary, could cause aberrant splicing. This study provides a meaningful exploration for the identification of clinical core genetic pedigrees. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Gamma-Irradiated Mannheimia (Pasteurella) Haemolytica Identified by rRNA Gene Sequencing as a Potential Vaccine in Mice

    International Nuclear Information System (INIS)

    Araby, E.

    2014-01-01

    Pneumonic pasteurellosis is a significant disease in beef production medicine. Most available information suggests that this disease imposes an economic burden of $700 million per year on bovine food animal production. The current study was designed to assess the immune efficacy of a whole-cell killed M. haemolytica strain isolated from satisfactory cases (infected lung from sheep). The efficacy of a gamma-irradiated M. haemolytica vaccine (GIV) was evaluated in mice in comparison with the classical aqueous formalized vaccine (AFV). The bacteria under study were cultivated on blood agar, purified, and genetically identified. The bacterial cells were then exposed to different doses of gamma radiation (2-20 kGy, at 2 kGy intervals), the dose-response curve of the survivors was plotted, and 20 kGy was selected as the dose for preparing the vaccine. A total of 30 male mice (two weeks old) were used for the further experimental investigations. The animals were divided into three equal groups of 10 animals each. The first group (group A) was given GIV. The second group (group B) received AFV. The third group (group C) was injected with sterile saline solution and served as the control. Animals were vaccinated via intraperitoneal (i.p.) injection with 1x10^8 CFU per treated mouse. After vaccination, the immune response was determined by detecting antibodies reactive to cellular surface antigens using a modified protein-electrophoresis procedure. Antibody-antigen hybrids were visualized at a molecular weight of more than 225 kDa in samples representing the M. haemolytica antibody groups (A, B) against both bacterial samples (M. haemolytica and Pasteurella multocida), whereas non-treated bacterial cells, i.e. cells incubated with serum from mice of group C, revealed no hybridization reaction. These results verify that the two Pasteurella species share cellular surface antigens. Also, the bacterial distribution with (LD50) 2x10^7 CFU of live M. haemolytica into vaccinated and non

  1. Exome Sequencing Identifies a Missense Variant in EFEMP1 Co-Segregating in a Family with Autosomal Dominant Primary Open-Angle Glaucoma.

    Directory of Open Access Journals (Sweden)

    Donna S Mackay

    Full Text Available Primary open-angle glaucoma (POAG) is a clinically important and genetically heterogeneous cause of progressive vision loss as a result of retinal ganglion cell death. Here we have utilized trio-based, whole-exome sequencing to identify the genetic defect underlying an autosomal dominant form of adult-onset POAG segregating in an African-American family. Exome sequencing identified a novel missense variant (c.418C>T, p.Arg140Trp) in exon 5 of the gene coding for epidermal growth factor (EGF)-containing fibulin-like extracellular matrix protein 1 (EFEMP1) that co-segregated with disease in the family. Linkage and haplotype analyses with microsatellite markers indicated that the disease interval overlapped a known POAG locus (GLC1H) on chromosome 2p. The p.Arg140Trp substitution was predicted in silico to have damaging effects on protein function, and transient expression studies in cultured cells revealed that the Trp140-mutant protein exhibited increased intracellular accumulation compared with wild-type EFEMP1. In situ hybridization of the mouse eye with oligonucleotide probes detected the highest levels of EFEMP1 transcripts in the ciliary body, cornea, inner nuclear layer of the retina, and the optic nerve head. The recent finding that a common variant near EFEMP1 was associated with optic nerve-head morphology supports the possibility that the EFEMP1 variant identified in this POAG family may be pathogenic.

  2. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds.

    Directory of Open Access Journals (Sweden)

    Nedenia Bonvino Stafuzza

    Full Text Available Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7 to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were over-represented in all four cattle breeds, while the ECM-receptor interaction pathway was over-represented in the Girolando and Guzerat breeds, the ABC transporters pathway was over-represented only in the Holstein breed, and the metabolic pathways were over-represented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.

  3. Application of CHD1 Gene and EE0.6 Sequences to Identify Sexes of Several Protected Bird Species in Taiwan

    Directory of Open Access Journals (Sweden)

    E.-C. Lin

    2011-06-01

    Full Text Available Many bird species, for example Crested Serpent Eagle (Spilornis cheela hoya), Collared Scops Owl (Otus bakkamoena), Tawny Fish Owl (Ketupa flavipes), Crested Goshawk (Accipiter trivirgatus), and Grass Owl (Tyto longimembris), are monomorphic, which makes it difficult to identify their sex simply by their outward appearance. Especially for monomorphic endangered species, an effective tool for identifying sex other than by outward appearance is needed for captive breeding programs or other conservation plans. In this study, we collected samples of Black Swan (Cygnus atratus) and Nicobar Pigeon (Caloenas nicobarica), two introduced monomorphic aviary species that served as the control group, and of Crested Serpent Eagle, Collared Scops Owl, Tawny Fish Owl, Crested Goshawk, and Grass Owl, five protected monomorphic species in Taiwan. We used sex-specific primers for the avian CHD1 (chromo-helicase-DNA-binding) gene and EE0.6 (EcoRI 0.6-kb fragment) sequences to identify the sex of these birds. The results showed that CHD1 gene primers could be used to correctly identify the sex of Black Swans, Nicobar Pigeons and Crested Serpent Eagles, but could not be used to correctly identify sex in Collared Scops Owls, Tawny Fish Owls, and Crested Goshawks. In the sex identification using the EE0.6 sequence fragment, the A, C, D and E primer sets could be used for sexing Black Swans; the A, B, C, and D primer sets could be used for sexing Crested Serpent Eagles; and the E primer set could be used for sexing Nicobar Pigeons and the two owl species. Correct determination of sex is the first step if a captive breeding measure is required. We have demonstrated that several of the existing primer sets can be used for sex determination of several captive breeding and indigenous bird species.

  4. Whole-exome re-sequencing in a family quartet identifies POP1 mutations as the cause of a novel skeletal dysplasia.

    Directory of Open Access Journals (Sweden)

    Evgeny A Glazov

    2011-03-01

    Full Text Available Recent advances in DNA sequencing have enabled mapping of genes for monogenic traits in families with small pedigrees and even in unrelated cases. We report the identification of disease-causing mutations in a rare, severe, skeletal dysplasia, studying a family of two healthy unrelated parents and two affected children using whole-exome sequencing. The two affected daughters have clinical and radiographic features suggestive of anauxetic dysplasia (OMIM 607095), a rare form of dwarfism caused by mutations of RMRP. However, mutations of RMRP were excluded in this family by direct sequencing. Our studies identified two novel compound heterozygous loss-of-function mutations in POP1, which encodes a core component of the RNase mitochondrial RNA processing (RNase MRP) complex that directly interacts with the RMRP RNA domains that are affected in anauxetic dysplasia. We demonstrate that these mutations impair the integrity and activity of this complex and that they impair cell proliferation, providing likely molecular and cellular mechanisms by which POP1 mutations cause this severe skeletal dysplasia.

  5. Assessment of metagenomic assembly using simulated next generation sequencing data

    DEFF Research Database (Denmark)

    Mende, Daniel R; Waller, Alison S; Sunagawa, Shinichi

    2012-01-01

    with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved...... the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition...... the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities...

  6. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology

    Science.gov (United States)

    Ramos, Antonio M.; Crooijmans, Richard P. M. A.; Affara, Nabeel A.; Amaral, Andreia J.; Archibald, Alan L.; Beever, Jonathan E.; Bendixen, Christian; Churcher, Carol; Clark, Richard; Dehais, Patrick; Hansen, Mark S.; Hedegaard, Jakob; Hu, Zhi-Liang; Kerstens, Hindrik H.; Law, Andy S.; Megens, Hendrik-Jan; Milan, Denis; Nonneman, Danny J.; Rohrer, Gary A.; Rothschild, Max F.; Smith, Tim P. L.; Schnabel, Robert D.; Van Tassell, Curt P.; Taylor, Jeremy F.; Wiedmann, Ralph T.; Schook, Lawrence B.; Groenen, Martien A. M.

    2009-01-01

    Background: The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. Methodology/Principal Findings: A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain) and a Wild Boar population and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic, yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. Conclusions/Significance: Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs. PMID:19654876
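The 94% conversion success rate quoted in this record can be checked directly from the two locus counts given in the abstract. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and do not come from the study.

```python
# Counts quoted in the abstract above.
scorable_loci = 62_621      # loci that could be reliably scored
polymorphic_loci = 58_994   # scorable loci that were polymorphic


def percentage(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Conversion success rate = polymorphic loci / reliably scored loci.
conversion_rate = percentage(polymorphic_loci, scorable_loci)
print(f"SNP conversion success rate: {conversion_rate:.1f}%")
```

The result rounds to the 94% stated in the abstract.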

  7. iNR-PhysChem: a sequence-based predictor for identifying nuclear receptors and their subfamilies via physical-chemical property matrix.

    Directory of Open Access Journals (Sweden)

    Xuan Xiao

    Full Text Available Nuclear receptors (NRs) form a family of ligand-activated transcription factors that regulate a wide variety of biological processes, such as homeostasis, reproduction, development, and metabolism. The human genome contains 48 genes encoding NRs. These receptors have become one of the most important targets for therapeutic drug development. According to their different action mechanisms or functions, NRs have been classified into seven subfamilies. With the avalanche of protein sequences generated in the postgenomic age, we are facing the following challenging problems. Given an uncharacterized protein sequence, how can we identify whether it is a nuclear receptor? If it is, what subfamily does it belong to? To address these problems, we developed a predictor called iNR-PhysChem in which the protein samples were expressed by a novel mode of pseudo amino acid composition (PseAAC) whose components were derived from a physical-chemical matrix via a series of auto-covariance and cross-covariance transformations. It was observed that the overall success rate achieved by iNR-PhysChem was over 98% in identifying NRs or non-NRs, and over 92% in identifying NRs among the following seven subfamilies: NR1--thyroid hormone like, NR2--HNF4-like, NR3--estrogen like, NR4--nerve growth factor IB-like, NR5--fushi tarazu-F1 like, NR6--germ cell nuclear factor like, and NR0--knirps like. These rates were derived by the jackknife tests on a stringent benchmark dataset in which none of the protein sequences included has ≥60% pairwise sequence identity to any other in the same subset. As a user-friendly web-server, iNR-PhysChem is freely accessible to the public at either http://www.jci-bioinfo.cn/iNR-PhysChem or http://icpr.jci.edu.cn/bioinfo/iNR-PhysChem. Also a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics involved in developing the predictor. It is anticipated that iNR-PhysChem may
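The abstract names auto-covariance and cross-covariance transformations of a physical-chemical matrix but does not give the formulas. The Python sketch below assumes the commonly used lagged-covariance form (mean-centred products of property values at positions i and i+lag, averaged over the sequence). The property scale `TOY_SCALE`, the function names, and the example sequence are all hypothetical and are not taken from iNR-PhysChem.

```python
from statistics import mean

# Hypothetical two-property scale for a handful of residues. These values are
# illustrative only and are NOT the matrix used by iNR-PhysChem.
TOY_SCALE = {
    "A": (0.62, 0.05),
    "C": (0.29, 0.13),
    "D": (-0.90, 1.00),
    "K": (-1.50, 0.95),
    "L": (1.06, 0.02),
}


def property_matrix(seq):
    """Map each residue to its row of property values (L rows, one per residue)."""
    return [TOY_SCALE[res] for res in seq]


def lagged_covariance(matrix, j1, j2, lag):
    """Covariance between property j1 at position i and property j2 at i+lag.

    With j1 == j2 this is an auto-covariance term; otherwise a cross-covariance.
    """
    n = len(matrix)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < sequence length")
    m1 = mean(row[j1] for row in matrix)
    m2 = mean(row[j2] for row in matrix)
    total = sum(
        (matrix[i][j1] - m1) * (matrix[i + lag][j2] - m2) for i in range(n - lag)
    )
    return total / (n - lag)


def ac_cc_features(seq, max_lag):
    """Fixed-length feature vector: every (lag, j1, j2) covariance term."""
    matrix = property_matrix(seq)
    n_props = len(matrix[0])
    return [
        lagged_covariance(matrix, j1, j2, lag)
        for lag in range(1, max_lag + 1)
        for j1 in range(n_props)
        for j2 in range(n_props)
    ]


features = ac_cc_features("ACDKLLAKDC", max_lag=2)
print(len(features))  # 2 lags x 2 x 2 property pairs = 8 components
```

The point of such a transformation is that sequences of different lengths map to vectors of the same length, which is what a downstream classifier needs.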

  8. Whole-exome sequencing identified a homozygous FNBP4 mutation in a family with a condition similar to microphthalmia with limb anomalies.

    Science.gov (United States)

    Kondo, Yukiko; Koshimizu, Eriko; Megarbane, Andre; Hamanoue, Haruka; Okada, Ippei; Nishiyama, Kiyomi; Kodera, Hirofumi; Miyatake, Satoko; Tsurusaki, Yoshinori; Nakashima, Mitsuko; Doi, Hiroshi; Miyake, Noriko; Saitsu, Hirotomo; Matsumoto, Naomichi

    2013-07-01

    Microphthalmia with limb anomalies (MLA), also known as Waardenburg anophthalmia syndrome or ophthalmoacromelic syndrome, is a rare autosomal recessive disorder. Recently, we and others successfully identified SMOC1 as the causative gene for MLA. However, there are several MLA families without SMOC1 abnormality, suggesting locus heterogeneity in MLA. We aimed to identify a pathogenic mutation in one Lebanese family having an MLA-like condition without SMOC1 mutation by whole-exome sequencing (WES) combined with homozygosity mapping. A c.683C>T (p.Thr228Met) variant in FNBP4 was found as a primary candidate, drawing attention to the possibility that FNBP4 and SMOC1 may both modulate BMP signaling. Copyright © 2013 Wiley Periodicals, Inc.

  9. Whole exome sequencing identifies a POLR1D mutation segregating in a father and two daughters with findings of Klippel-Feil and Treacher Collins syndromes.

    Science.gov (United States)

    Giampietro, Philip F; Armstrong, Linlea; Stoddard, Alex; Blank, Robert D; Livingston, Janet; Raggio, Cathy L; Rasmussen, Kristen; Pickart, Michael; Lorier, Rachel; Turner, Amy; Sund, Sarah; Sobrera, Nara; Neptune, Enid; Sweetser, David; Santiago-Cornier, Alberto; Broeckel, Ulrich

    2015-01-01

    We report on a father and his two daughters diagnosed with Klippel-Feil syndrome (KFS) but with craniofacial differences (zygomatic and mandibular hypoplasia and cleft palate) and external ear abnormalities suggestive of Treacher Collins syndrome (TCS). The diagnosis of KFS was favored, given that the neck anomalies were the predominant manifestations, and that the diagnosis predated later recognition of the association between spinal segmentation abnormalities and TCS. Genetic heterogeneity and the rarity of large families with KFS have limited the ability to identify mutations by traditional methods. Whole exome sequencing identified a nonsynonymous mutation in POLR1D (subunit of RNA polymerase I and II): exon2:c.T332C:p.L111P. Mutations in POLR1D are present in about 5% of individuals diagnosed with TCS. We propose that this mutation is causal in this family, suggesting a pathogenetic link between KFS and TCS. © 2014 Wiley Periodicals, Inc.

  10. Apoptosis-inducing signal sequence mutation in carbonic anhydrase IV identified in patients with the RP17 form of retinitis pigmentosa

    Science.gov (United States)

    Rebello, George; Ramesar, Rajkumar; Vorster, Alvera; Roberts, Lisa; Ehrenreich, Liezle; Oppon, Ekow; Gama, Dumisani; Bardien, Soraya; Greenberg, Jacquie; Bonapace, Giuseppe; Waheed, Abdul; Shah, Gul N.; Sly, William S.

    2004-01-01

    Genetic and physical mapping of the RP17 locus on 17q identified a 3.6-megabase candidate region that includes the gene encoding carbonic anhydrase IV (CA4), a glycosylphosphatidylinositol-anchored protein that is highly expressed in the choriocapillaris of the human eye. By sequencing candidate genes in this region, we identified a mutation that causes replacement of an arginine with a tryptophan (R14W) in the signal sequence of the CA4 gene at position -5 relative to the signal sequence cleavage site. This mutation was found to cosegregate with the disease phenotype in two large families and was not found in 36 unaffected family members or 100 controls. Expression of the mutant cDNA in COS-7 cells produced several findings, suggesting a mechanism by which the mutation can explain the autosomal dominant disease. In transfected COS-7 cells, the R14W mutation (i) reduced the steady-state level of carbonic anhydrase IV activity expressed by 28% due to a combination of decreased synthesis and accelerated turnover; (ii) led to up-regulation of immunoglobulin-binding protein, double-stranded RNA-regulated protein kinase-like ER kinase, and CCAAT/enhancer-binding protein homologous protein, markers of the unfolded protein response and endoplasmic reticulum stress; and (iii) induced apoptosis, as evidenced by annexin V binding and terminal deoxynucleotidyltransferase-mediated dUTP nick end labeling staining, in most cells expressing the mutant, but not the WT, protein. We suggest that a high level of expression of the mutant allele in the endothelial cells of the choriocapillaris leads to apoptosis, leading in turn to ischemia in the overlying retina and producing autosomal dominant retinitis pigmentosa. PMID:15090652

  11. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    Directory of Open Access Journals (Sweden)

    Ceiridwen J Edwards

    Full Text Available BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738+/-68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously

  12. Deep sequencing of Salmonella RNA associated with heterologous Hfq proteins in vivo reveals small RNAs as a major target class and identifies RNA processing phenotypes.

    Science.gov (United States)

    Sittka, Alexandra; Sharma, Cynthia M; Rolle, Katarzyna; Vogel, Jörg

    2009-01-01

    The bacterial Sm-like protein, Hfq, is a key factor for the stability and function of small non-coding RNAs (sRNAs) in Escherichia coli. Homologues of this protein have been predicted in many distantly related organisms, yet their functional conservation as sRNA-binding proteins has not been entirely clear. To address this, we expressed in Salmonella the Hfq proteins of two eubacteria (Neisseria meningitidis, Aquifex aeolicus) and an archaeon (Methanocaldococcus jannaschii), and analyzed the associated RNA by deep sequencing. This in vivo approach identified endogenous Salmonella sRNAs as a major target of the foreign Hfq proteins. New Salmonella sRNA species were also identified, and some of these accumulated specifically in the presence of a foreign Hfq protein. In addition, we observed specific RNA processing defects, e.g., suppression of precursor processing of SraH sRNA by Methanocaldococcus Hfq, or aberrant accumulation of extracytoplasmic target mRNAs of the Salmonella GcvB, MicA or RybB sRNAs. Taken together, our study provides evidence of a conserved inherent sRNA-binding property of Hfq, which may facilitate the lateral transmission of regulatory sRNAs among distantly related species. It also suggests that the expression of heterologous RNA-binding proteins combined with deep sequencing analysis of RNA ligands can be used as a molecular tool to dissect individual steps of RNA metabolism in vivo.

  13. Whole-exome sequencing, without prior linkage, identifies a mutation in LAMB3 as a cause of dominant hypoplastic amelogenesis imperfecta.

    Science.gov (United States)

    Poulter, James A; El-Sayed, Walid; Shore, Roger C; Kirkham, Jennifer; Inglehearn, Chris F; Mighell, Alan J

    2014-01-01

    The conventional approach to identifying the defective gene in a family with an inherited disease is to find the disease locus through family studies. However, the rapid development and decreasing cost of next generation sequencing facilitates a more direct approach. Here, we report the identification of a frameshift mutation in LAMB3 as a cause of dominant hypoplastic amelogenesis imperfecta (AI). Whole-exome sequencing of three affected family members and subsequent filtering of shared variants, without prior genetic linkage, sufficed to identify the pathogenic variant. Simultaneous analysis of multiple family members confirms segregation, enhancing the power to filter the genetic variation found and leading to rapid identification of the pathogenic variant. LAMB3 encodes a subunit of Laminin-5, one of a family of basement membrane proteins with essential functions in cell growth, movement and adhesion. Homozygous LAMB3 mutations cause junctional epidermolysis bullosa (JEB) and enamel defects are seen in JEB cases. However, to our knowledge, this is the first report of dominant AI due to a LAMB3 mutation in the absence of JEB.
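The filtering step described in this record (keep only variants shared by all sequenced affected relatives, then discard known common variation) can be sketched as a set intersection. The example below is a minimal Python illustration under stated assumptions: the variant keys, family member labels, and the `known_common` set are invented for the example and are not data from the study.

```python
def shared_candidates(affected_calls, known_common=frozenset()):
    """Return variants present in every affected member and not known to be common.

    affected_calls maps a member label to the set of variants called in that
    member; each variant is a (chrom, pos, ref, alt) tuple.
    """
    if not affected_calls:
        raise ValueError("at least one affected member is required")
    call_sets = [set(calls) for calls in affected_calls.values()]
    shared = set.intersection(*call_sets)
    return shared - set(known_common)


# Hypothetical calls for three affected relatives (illustrative only).
family = {
    "affected_1": {("1", 1000, "A", "G"), ("1", 2000, "C", "T"), ("2", 500, "G", "GA")},
    "affected_2": {("1", 1000, "A", "G"), ("2", 500, "G", "GA"), ("3", 42, "T", "C")},
    "affected_3": {("1", 1000, "A", "G"), ("2", 500, "G", "GA")},
}
known_common = {("1", 1000, "A", "G")}  # e.g. present in a population database

candidates = shared_candidates(family, known_common)
print(sorted(candidates))
```

Each additional affected relative can only shrink the intersection, which is why sequencing several family members together increases filtering power.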

  14. Private selective sweeps identified from next-generation pool-sequencing reveal convergent pathways under selection in two inbred Schistosoma mansoni strains.

    Directory of Open Access Journals (Sweden)

    Julie A J Clément

    Full Text Available BACKGROUND: The trematode flatworms of the genus Schistosoma, the causative agents of schistosomiasis, are among the most prevalent parasites in humans, affecting more than 200 million people worldwide. In this study, we focused on two well-characterized strains of S. mansoni, to explore signatures of selection. Both strains are highly inbred and exhibit differences in life history traits, in particular in their compatibility with the intermediate host Biomphalaria glabrata. METHODOLOGY/PRINCIPAL FINDINGS: We performed high throughput sequencing of DNA from pools of individuals of each strain using Illumina technology and identified single nucleotide polymorphisms (SNP) and copy number variations (CNV). In total, 708,898 SNPs and roughly 2,000 CNVs were identified. The SNPs revealed low nucleotide diversity (π = 2 × 10^-4) within each strain and a high differentiation level (Fst = 0.73) between them. Based on a recently developed in-silico approach, we further detected 12 and 19 private (i.e. specific non-overlapping) selective sweeps among the 121 and 151 sweeps found in total for each strain. CONCLUSIONS/SIGNIFICANCE: Functional annotation of transcripts lying in the private selective sweeps revealed specific selection for functions related to parasitic interaction (e.g. cell-cell adhesion or redox reactions). Despite high differentiation between strains, we identified evolutionary convergence of genes related to proteolysis, known as a key virulence factor and a potential target of drug and vaccine development. Our data show that pool-sequencing can be used for the detection of selective sweeps in parasite populations and enables one to identify biological functions under selection.
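The abstract reports a differentiation level (Fst) between the two strains but does not say which estimator was used. As a rough illustration of what the statistic measures, the Python sketch below assumes a simple heterozygosity-based form, Fst = (Ht - Hs) / Ht, computed as a ratio of sums over biallelic SNPs with two equally weighted populations. The allele frequencies are invented for the example.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) for a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(freq_pairs):
    """Fst = (Ht - Hs) / Ht over SNPs, using sums across sites (ratio of averages).

    freq_pairs is an iterable of (p1, p2): the frequency of the same allele in
    population 1 and population 2 at each SNP. Both populations are weighted
    equally, which is an assumption of this sketch.
    """
    hs_sum = 0.0  # mean within-population heterozygosity, summed over sites
    ht_sum = 0.0  # heterozygosity of the pooled population, summed over sites
    for p1, p2 in freq_pairs:
        hs_sum += (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
        ht_sum += expected_heterozygosity((p1 + p2) / 2.0)
    if ht_sum == 0.0:
        raise ValueError("no polymorphic sites in the pooled population")
    return (ht_sum - hs_sum) / ht_sum


# Hypothetical pooled allele frequencies for four SNPs (illustrative only).
example = [(1.0, 0.0), (0.9, 0.1), (0.5, 0.5), (0.0, 1.0)]
print(round(fst_two_populations(example), 3))
```

Sites fixed for different alleles in the two strains push the value toward 1, while sites with equal frequencies contribute no differentiation.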

  15. MetaGaAP: A Novel Pipeline to Estimate Community Composition and Abundance from Non-Model Sequence Data

    Directory of Open Access Journals (Sweden)

    Christopher Noune

    2017-02-01

    Full Text Available Next generation sequencing and bioinformatic approaches are increasingly used to quantify microorganisms within populations by analysis of ‘meta-barcode’ data. This approach relies on comparison of amplicon sequences of ‘barcode’ regions from a population with public-domain databases of reference sequences. However, for many organisms relevant ‘barcode’ regions may not have been identified and large databases of reference sequences may not be available. A workflow and software pipeline, ‘MetaGaAP,’ was developed to identify and quantify genotypes through four steps: shotgun sequencing and identification of polymorphisms in a metapopulation to identify custom ‘barcode’ regions of less than 30 polymorphisms within the span of a single ‘read’, amplification and sequencing of the ‘barcode’, generation of a custom database of polymorphisms, and quantitation of the relative abundance of genotypes. The pipeline and workflow were validated in a ‘wild type’ Alphabaculovirus isolate, Helicoverpa armigera single nucleopolyhedrovirus (HaSNPV-AC53), and a tissue-culture derived strain (HaSNPV-AC53-T2). The approach was validated by comparison of polymorphisms in amplicons and shotgun data, and by comparison of predicted dominant and co-dominant genotypes with Sanger sequences. The computational power required to generate and search the database effectively limits the number of polymorphisms that can be included in a barcode to 30 or fewer. The approach can be used in quantitative analysis of the ecology and pathology of non-model organisms.
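The last of the four steps, quantitation of the relative abundance of genotypes, can be illustrated by counting the combination of alleles each barcode read carries at the polymorphic positions. The Python sketch below is not MetaGaAP's implementation, which the abstract does not describe; the reads, positions, and function name are hypothetical, and reads are assumed to be already aligned to the same barcode coordinates.

```python
from collections import Counter


def genotype_abundance(barcode_reads, polymorphic_positions):
    """Relative abundance of each genotype observed in aligned barcode reads.

    A genotype here is the string of bases a read carries at the given
    0-based polymorphic positions.
    """
    if not barcode_reads:
        raise ValueError("no reads supplied")
    counts = Counter(
        "".join(read[pos] for pos in polymorphic_positions) for read in barcode_reads
    )
    total = sum(counts.values())
    # most_common() orders genotypes from most to least abundant.
    return {genotype: n / total for genotype, n in counts.most_common()}


# Hypothetical aligned reads over a five-base barcode (illustrative only).
reads = ["ACGTA", "ACGTA", "ATGTC", "ACGTA"]
abundance = genotype_abundance(reads, polymorphic_positions=[1, 4])
print(abundance)
```

The number of possible genotypes grows exponentially with the number of polymorphic positions, which is consistent with the abstract's remark that computational cost limits a barcode to about 30 polymorphisms.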

  16. A dated molecular phylogeny of manta and devil rays (Mobulidae) based on mitogenome and nuclear sequences

    NARCIS (Netherlands)

    Poortvliet, Marloes; Olsen, Jeanine; Croll, Donald A.; Bernardi, Giacomo; Newton, Kelly; Kollias, Spyros; O'Sullivan, John; Fernando, Daniel; Stevens, Guy; Galván Magaña, Felipe; Seret, Bernard; Wintner, Sabine; Hoarau, Galice

    Manta and devil rays are an iconic group of globally distributed pelagic filter feeders, yet their evolutionary history remains enigmatic. We employed next generation sequencing of mitogenomes for nine of the 11 recognized species and two outgroups; as well as additional Sanger sequencing of two

  17. A novel mutation in PRPF31, causative of autosomal dominant retinitis pigmentosa, using the BGISEQ-500 sequencer

    Directory of Open Access Journals (Sweden)

    Yu Zheng

    2018-01-01

    Full Text Available AIM: To study the genes responsible for retinitis pigmentosa. METHODS: A total of 15 Chinese families with retinitis pigmentosa, containing 94 sporadically afflicted cases, were recruited. The targeted sequences were captured using the Target_Eye_365_V3 chip and sequenced using the BGISEQ-500 sequencer, according to the manufacturer’s instructions. Data were aligned to UCSC Genome Browser build hg19, using the Burrows-Wheeler Aligner MEM algorithm. Local realignment was performed with the Genome Analysis Toolkit (GATK v.3.3.0) IndelRealigner, and variants were called with the Genome Analysis Toolkit HaplotypeCaller, without any use of imputation. Variants were filtered against a panel derived from 1000 Genomes Project, 1000G_ASN, ESP6500, ExAC and dbSNP138. In all members of Family ONE and Family TWO with available DNA samples, the genetic variant was validated using Sanger sequencing. RESULTS: A novel, pathogenic variant of retinitis pigmentosa, c.357_358delAA (p.Ser119SerfsX5), was identified in PRPF31 in 2 of 15 autosomal-dominant retinitis pigmentosa (ADRP) families, as well as in one sporadic case. Sanger sequencing was performed on probands, as well as on other family members. This novel, pathogenic genotype co-segregated with the retinitis pigmentosa phenotype in these two families. CONCLUSION: ADRP is a subtype of retinitis pigmentosa, defined by its genotype, which accounts for 20%-40% of retinitis pigmentosa patients. Our study thus expands the spectrum of PRPF31 mutations known to occur in ADRP, and provides further demonstration of the applicability of the BGISEQ-500 sequencer for genomics research.

  18. A novel mutation in PRPF31, causative of autosomal dominant retinitis pigmentosa, using the BGISEQ-500 sequencer

    Science.gov (United States)

    Zheng, Yu; Wang, Hai-Lin; Li, Jian-Kang; Xu, Li; Tellier, Laurent; Li, Xiao-Lin; Huang, Xiao-Yan; Li, Wei; Niu, Tong-Tong; Yang, Huan-Ming; Zhang, Jian-Guo; Liu, Dong-Ning

    2018-01-01

    AIM: To study the genes responsible for retinitis pigmentosa. METHODS: A total of 15 Chinese families with retinitis pigmentosa, containing 94 sporadically afflicted cases, were recruited. The targeted sequences were captured using the Target_Eye_365_V3 chip and sequenced using the BGISEQ-500 sequencer, according to the manufacturer's instructions. Data were aligned to UCSC Genome Browser build hg19, using the Burrows-Wheeler Aligner MEM algorithm. Local realignment was performed with the Genome Analysis Toolkit (GATK v.3.3.0) IndelRealigner, and variants were called with the Genome Analysis Toolkit HaplotypeCaller, without any use of imputation. Variants were filtered against a panel derived from 1000 Genomes Project, 1000G_ASN, ESP6500, ExAC and dbSNP138. In all members of Family ONE and Family TWO with available DNA samples, the genetic variant was validated using Sanger sequencing. RESULTS: A novel, pathogenic variant of retinitis pigmentosa, c.357_358delAA (p.Ser119SerfsX5), was identified in PRPF31 in 2 of 15 autosomal-dominant retinitis pigmentosa (ADRP) families, as well as in one sporadic case. Sanger sequencing was performed upon probands, as well as upon other family members. This novel, pathogenic genotype co-segregated with the retinitis pigmentosa phenotype in these two families. CONCLUSION: ADRP is a subtype of retinitis pigmentosa, defined by its genotype, which accounts for 20%-40% of the retinitis pigmentosa patients. Our study thus expands the spectrum of PRPF31 mutations known to occur in ADRP, and provides further demonstration of the applicability of the BGISEQ-500 sequencer for genomics research. PMID:29375987
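    The filtering step in this abstract (removing variants already catalogued in population panels) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` class, the `filter_rare` helper, the 1% frequency cut-off, and the second example variant are all hypothetical, since the abstract gives no threshold or data layout.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    # Allele frequency per population panel; a missing key means "not observed".
    population_af: dict = field(default_factory=dict)


def is_rare(variant, max_af=0.01):
    """True when no panel reports the variant above max_af (assumed cut-off)."""
    return all(af <= max_af for af in variant.population_af.values())


def filter_rare(variants, max_af=0.01):
    """Keep only variants that are rare or absent in every panel."""
    return [v for v in variants if is_rare(v, max_af)]


# Hypothetical input: the PRPF31 deletion from the abstract (novel, so absent
# from all panels) plus an invented common variant that should be discarded.
candidates = [
    Variant("PRPF31", "c.357_358delAA"),
    Variant("GENE_X", "c.100A>G", {"ExAC": 0.12, "1000G_ASN": 0.09}),
]

for v in filter_rare(candidates):
    print(v.gene, v.hgvs_c)
```

    A variant that survives such a filter is only a candidate; as the abstract describes, Sanger sequencing and co-segregation in the family are still needed for validation.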

  19. A rapid screening with direct sequencing from blood samples for the diagnosis of Leigh syndrome

    Directory of Open Access Journals (Sweden)

    Hiroko Shimbo

    2014-01-01

    Large numbers of genes are responsible for Leigh syndrome (LS), making genetic confirmation of LS difficult. We screened our patients with LS using a limited set of 21 primers encompassing the frequently reported genes for the respiratory chain complexes I (ND1–ND6 and ND4L), IV (SURF1), and V (ATP6) and the pyruvate dehydrogenase E1α-subunit. Of 18 LS patients, we identified mutations in 11 patients, including 7 in mtDNA (two with ATP6) and 4 in nuclear DNA (three with SURF1). Overall, we identified mutations in 61% of LS patients (11/18 individuals) in this cohort. Sanger sequencing with our limited set of primers allowed rapid genetic confirmation in more than half of the LS patients and appears to be efficient as a primary genetic screening in this cohort.

  20. High Throughput Sequencing of Small RNAs in the Two Cucurbita Germplasm with Different Sodium Accumulation Patterns Identifies Novel MicroRNAs Involved in Salt Stress Response.

    Science.gov (United States)

    Xie, Junjun; Lei, Bo; Niu, Mengliang; Huang, Yuan; Kong, Qiusheng; Bie, Zhilong

    2015-01-01

    MicroRNAs (miRNAs), a class of small non-coding RNAs, recognize their mRNA targets based on perfect sequence complementarity. MiRNAs lead to broader changes in gene expression after plants are exposed to stress. High-throughput sequencing is an effective method to identify and profile small RNA populations in non-model plants under salt stresses, significantly improving our knowledge regarding miRNA functions in salt tolerance. Cucurbits are sensitive to soil salinity, and the Cucurbita genus is used as the rootstock of other cucurbits to enhance salt tolerance. Several cucurbit crops have been used for miRNA sequencing but salt stress-related miRNAs in cucurbit species have not been reported. In this study, we subjected two Cucurbita germplasm, namely, N12 (Cucurbita maxima Duch.) and N15 (Cucurbita moschata Duch.), with different sodium accumulation patterns, to Illumina sequencing to determine small RNA populations in root tissues after 4 h of salt treatment and control. A total of 21,548,326 and 19,394,108 reads were generated from the control and salt-treated N12 root tissues, respectively. By contrast, 19,108,240 and 20,546,052 reads were obtained from the control and salt-treated N15 root tissues, respectively. Fifty-eight conserved miRNA families and 33 novel miRNAs were identified in the two Cucurbita germplasm. Seven miRNAs (six conserved miRNAs and one novel miRNA) were up-regulated in salt-treated N12 and N15 samples. Most target genes of differentially expressed novel miRNAs were transcription factors and salt stress-responsive proteins, including dehydration-induced protein, cation/H+ antiporter 18, and CBL-interacting serine/threonine-protein kinase. The differential expression of miRNAs between the two Cucurbita germplasm under salt stress conditions and their target genes demonstrated that novel miRNAs play an important role in the response of the two Cucurbita germplasm to salt stress. The present study initially explored small RNAs in the

  1. High Throughput Sequencing of Small RNAs in the Two Cucurbita Germplasm with Different Sodium Accumulation Patterns Identifies Novel MicroRNAs Involved in Salt Stress Response.

    Directory of Open Access Journals (Sweden)

    Junjun Xie

    MicroRNAs (miRNAs), a class of small non-coding RNAs, recognize their mRNA targets based on perfect sequence complementarity. MiRNAs lead to broader changes in gene expression after plants are exposed to stress. High-throughput sequencing is an effective method to identify and profile small RNA populations in non-model plants under salt stresses, significantly improving our knowledge regarding miRNA functions in salt tolerance. Cucurbits are sensitive to soil salinity, and the Cucurbita genus is used as the rootstock of other cucurbits to enhance salt tolerance. Several cucurbit crops have been used for miRNA sequencing but salt stress-related miRNAs in cucurbit species have not been reported. In this study, we subjected two Cucurbita germplasm, namely, N12 (Cucurbita maxima Duch.) and N15 (Cucurbita moschata Duch.), with different sodium accumulation patterns, to Illumina sequencing to determine small RNA populations in root tissues after 4 h of salt treatment and control. A total of 21,548,326 and 19,394,108 reads were generated from the control and salt-treated N12 root tissues, respectively. By contrast, 19,108,240 and 20,546,052 reads were obtained from the control and salt-treated N15 root tissues, respectively. Fifty-eight conserved miRNA families and 33 novel miRNAs were identified in the two Cucurbita germplasm. Seven miRNAs (six conserved miRNAs and one novel miRNA) were up-regulated in salt-treated N12 and N15 samples. Most target genes of differentially expressed novel miRNAs were transcription factors and salt stress-responsive proteins, including dehydration-induced protein, cation/H+ antiporter 18, and CBL-interacting serine/threonine-protein kinase. The differential expression of miRNAs between the two Cucurbita germplasm under salt stress conditions and their target genes demonstrated that novel miRNAs play an important role in the response of the two Cucurbita germplasm to salt stress. The present study initially explored small

  2. Targeted next-generation sequencing identifies a homozygous nonsense mutation in ABHD12, the gene underlying PHARC, in a family clinically diagnosed with Usher syndrome type 3

    Science.gov (United States)

    2012-01-01

    Background Usher syndrome (USH) is an autosomal recessive genetically heterogeneous disorder with congenital sensorineural hearing impairment and retinitis pigmentosa (RP). We have identified a consanguineous Lebanese family with two affected members displaying progressive hearing loss, RP and cataracts, therefore clinically diagnosed as USH type 3 (USH3). Our study was aimed at the identification of the causative mutation in this USH3-like family. Methods Candidate loci were identified using genomewide SNP-array-based homozygosity mapping followed by targeted enrichment and next-generation sequencing. Results Using a capture array targeting the three identified homozygosity-by-descent regions on chromosomes 1q43-q44, 20p13-p12.2 and 20p11.23-q12, we identified a homozygous nonsense mutation, p.Arg65X, in ABHD12 segregating with the phenotype. Conclusion Mutations of ABHD12, an enzyme hydrolyzing an endocannabinoid lipid transmitter, cause PHARC (polyneuropathy, hearing loss, ataxia, retinitis pigmentosa, and early-onset cataract). After the identification of the ABHD12 mutation in this family, one patient underwent neurological examination which revealed ataxia, but no polyneuropathy. ABHD12 is not known to be related to the USH protein interactome. The phenotype of our patient represents a variant of PHARC, an entity that should be taken into account as differential diagnosis for USH3. Our study demonstrates the potential of comprehensive genetic analysis for improving the clinical diagnosis. PMID:22938382

  3. Targeted next-generation sequencing identifies a homozygous nonsense mutation in ABHD12, the gene underlying PHARC, in a family clinically diagnosed with Usher syndrome type 3.

    Science.gov (United States)

    Eisenberger, Tobias; Slim, Rima; Mansour, Ahmad; Nauck, Markus; Nürnberg, Gudrun; Nürnberg, Peter; Decker, Christian; Dafinger, Claudia; Ebermann, Inga; Bergmann, Carsten; Bolz, Hanno Jörn

    2012-09-02

    Usher syndrome (USH) is an autosomal recessive genetically heterogeneous disorder with congenital sensorineural hearing impairment and retinitis pigmentosa (RP). We have identified a consanguineous Lebanese family with two affected members displaying progressive hearing loss, RP and cataracts, therefore clinically diagnosed as USH type 3 (USH3). Our study was aimed at the identification of the causative mutation in this USH3-like family. Candidate loci were identified using genomewide SNP-array-based homozygosity mapping followed by targeted enrichment and next-generation sequencing. Using a capture array targeting the three identified homozygosity-by-descent regions on chromosomes 1q43-q44, 20p13-p12.2 and 20p11.23-q12, we identified a homozygous nonsense mutation, p.Arg65X, in ABHD12 segregating with the phenotype. Mutations of ABHD12, an enzyme hydrolyzing an endocannabinoid lipid transmitter, cause PHARC (polyneuropathy, hearing loss, ataxia, retinitis pigmentosa, and early-onset cataract). After the identification of the ABHD12 mutation in this family, one patient underwent neurological examination which revealed ataxia, but no polyneuropathy. ABHD12 is not known to be related to the USH protein interactome. The phenotype of our patient represents a variant of PHARC, an entity that should be taken into account as differential diagnosis for USH3. Our study demonstrates the potential of comprehensive genetic analysis for improving the clinical diagnosis.

  4. Targeted next-generation sequencing identifies a homozygous nonsense mutation in ABHD12, the gene underlying PHARC, in a family clinically diagnosed with Usher syndrome type 3

    Directory of Open Access Journals (Sweden)

    Eisenberger Tobias

    2012-09-01

    Background Usher syndrome (USH) is an autosomal recessive genetically heterogeneous disorder with congenital sensorineural hearing impairment and retinitis pigmentosa (RP). We have identified a consanguineous Lebanese family with two affected members displaying progressive hearing loss, RP and cataracts, therefore clinically diagnosed as USH type 3 (USH3). Our study was aimed at the identification of the causative mutation in this USH3-like family. Methods Candidate loci were identified using genomewide SNP-array-based homozygosity mapping followed by targeted enrichment and next-generation sequencing. Results Using a capture array targeting the three identified homozygosity-by-descent regions on chromosomes 1q43-q44, 20p13-p12.2 and 20p11.23-q12, we identified a homozygous nonsense mutation, p.Arg65X, in ABHD12 segregating with the phenotype. Conclusion Mutations of ABHD12, an enzyme hydrolyzing an endocannabinoid lipid transmitter, cause PHARC (polyneuropathy, hearing loss, ataxia, retinitis pigmentosa, and early-onset cataract). After the identification of the ABHD12 mutation in this family, one patient underwent neurological examination which revealed ataxia, but no polyneuropathy. ABHD12 is not known to be related to the USH protein interactome. The phenotype of our patient represents a variant of PHARC, an entity that should be taken into account as differential diagnosis for USH3. Our study demonstrates the potential of comprehensive genetic analysis for improving the clinical diagnosis.

  5. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    International Nuclear Information System (INIS)

    Mefford, Megan E.; Kunstman, Kevin; Wolinsky, Steven M.; Gabuzda, Dana

    2015-01-01

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions

  6. Integration of sequence data from a Consanguineous family with genetic data from an outbred population identifies PLB1 as a candidate rheumatoid arthritis risk gene.

    Directory of Open Access Journals (Sweden)

    Yukinori Okada

    Integrating genetic data from families with highly penetrant forms of disease together with genetic data from outbred populations represents a promising strategy to uncover the complete frequency spectrum of risk alleles for complex traits such as rheumatoid arthritis (RA). Here, we demonstrate that rare, low-frequency and common alleles at one gene locus, phospholipase B1 (PLB1), might contribute to risk of RA in a 4-generation consanguineous pedigree (Middle Eastern ancestry) and also in unrelated individuals from the general population (European ancestry). Through identity-by-descent (IBD) mapping and whole-exome sequencing, we identified a non-synonymous c.2263G>C (p.G755R) mutation at the PLB1 gene on 2q23, which significantly co-segregated with RA in family members with a dominant mode of inheritance (P = 0.009). We further evaluated PLB1 variants and risk of RA using a GWAS meta-analysis of 8,875 RA cases and 29,367 controls of European ancestry. We identified significant contributions of two independent non-coding variants near PLB1 with risk of RA (rs116018341 [MAF = 0.042] and rs116541814 [MAF = 0.021], combined P = 3.2 × 10^-6). Finally, we performed deep exon sequencing of PLB1 in 1,088 RA cases and 1,088 controls (European ancestry), and identified suggestive dispersion of rare protein-coding variant frequencies between cases and controls (P = 0.049 for C-alpha test and P = 0.055 for SKAT). Together, these data suggest that PLB1 is a candidate risk gene for RA. Future studies to characterize the full spectrum of genetic risk in the PLB1 genetic locus are warranted.

  7. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    Energy Technology Data Exchange (ETDEWEB)

    Mefford, Megan E., E-mail: megan_mefford@hms.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Kunstman, Kevin, E-mail: kunstman@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Wolinsky, Steven M., E-mail: s-wolinsky@northwestern.edu [Northwestern University Medical School, Chicago, IL (United States); Gabuzda, Dana, E-mail: dana_gabuzda@dfci.harvard.edu [Department of Cancer Immunology and AIDS, Dana-Farber Cancer Institute, Boston, MA (United States); Department of Neurology (Microbiology and Immunobiology), Harvard Medical School, Boston, MA (United States)

    2015-07-15

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.

  8. Next-generation sequencing to identify candidate genes and develop diagnostic markers for a novel Phytophthora resistance gene, RpsHC18, in soybean.

    Science.gov (United States)

    Zhong, Chao; Sun, Suli; Li, Yinping; Duan, Canxing; Zhu, Zhendong

    2018-03-01

    A novel Phytophthora sojae resistance gene RpsHC18 was identified and finely mapped on soybean chromosome 3. Two NBS-LRR candidate genes were identified and two diagnostic markers of RpsHC18 were developed. Phytophthora root rot caused by Phytophthora sojae is a destructive disease of soybean. The most effective disease-control strategy is to deploy resistant cultivars carrying Phytophthora-resistant Rps genes. The soybean cultivar Huachun 18 has a broad and distinct resistance spectrum to 12 P. sojae isolates. Quantitative trait loci sequencing (QTL-seq), based on the whole-genome resequencing (WGRS) of two extreme resistant and susceptible phenotype bulks from an F2:3 population, was performed, and one 767-kb genomic region with ΔSNP-index ≥ 0.9 on chromosome 3 was identified as the RpsHC18 candidate region in Huachun 18. The candidate region was reduced to a 146-kb region by fine mapping. Nonsynonymous SNP and haplotype analyses were carried out in the 146-kb region among ten soybean genotypes using WGRS. Four specific nonsynonymous SNPs were identified in two nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes, RpsHC18-NBL1 and RpsHC18-NBL2, which were considered to be the candidate genes. Finally, one specific SNP marker in each candidate gene was successfully developed using a tetra-primer ARMS-PCR assay, and the two markers were verified to be specific for RpsHC18 and to effectively distinguish other known Rps genes. In this study, we applied an integrated genomic-based strategy combining WGRS with traditional genetic mapping to identify RpsHC18 candidate genes and develop diagnostic markers. These results suggest that next-generation sequencing is a precise, rapid and cost-effective way to identify candidate genes and develop diagnostic markers, and it can accelerate Rps gene cloning and marker-assisted selection for breeding of P. sojae-resistant soybean cultivars.
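    The ΔSNP-index criterion mentioned in this abstract can be sketched in a few lines of Python. The sketch assumes the usual QTL-seq definition (SNP-index = reads carrying the non-reference allele divided by total reads at the site, computed per bulk) and takes the difference as resistant bulk minus susceptible bulk; that direction, the function names, and the read counts are illustrative assumptions rather than details taken from the study.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total_reads


def delta_snp_index(resistant, susceptible):
    """SNP-index of the resistant bulk minus that of the susceptible bulk.

    Each argument is an (alt_reads, total_reads) pair for one site.
    """
    return snp_index(*resistant) - snp_index(*susceptible)


def candidate_positions(sites, threshold=0.9):
    """Positions whose delta SNP-index meets the threshold (0.9 in the abstract)."""
    return [
        pos
        for pos, resistant, susceptible in sites
        if delta_snp_index(resistant, susceptible) >= threshold
    ]


# Invented read counts: (position, resistant bulk, susceptible bulk).
sites = [
    (1000, (29, 30), (1, 30)),   # strongly skewed between bulks
    (2000, (15, 30), (14, 30)),  # about equal in both bulks
    (3000, (30, 30), (2, 40)),   # strongly skewed between bulks
]

print(candidate_positions(sites))
```

    In practice the index is usually averaged over sliding windows and compared with a simulated confidence interval before a region is declared a candidate; the per-site form above only shows the arithmetic.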

  9. RNA Sequencing Identifies Upregulated Kyphoscoliosis Peptidase and Phosphatidic Acid Signaling Pathways in Muscle Hypertrophy Generated by Transgenic Expression of Myostatin Propeptide

    Directory of Open Access Journals (Sweden)

    Yuanxin Miao

    2015-04-01

    Myostatin (MSTN), a member of the transforming growth factor-β superfamily, plays a crucial negative role in muscle growth. MSTN mutations or inhibitions can dramatically increase muscle mass in most mammal species. Previously, we generated a transgenic mouse model of muscle hypertrophy via the transgenic expression of the MSTN N-terminal propeptide cDNA under the control of the skeletal muscle-specific MLC1 promoter. Here, we compare the mRNA profiles between transgenic mice and wild-type littermate controls with a high-throughput RNA sequencing method. The results show that 132 genes were significantly differentially expressed between transgenic mice and wild-type control mice; 97 of these genes were up-regulated, and 35 genes were down-regulated in the skeletal muscle. Several genes that had not been reported to be involved in muscle hypertrophy were identified, including up-regulated myosin binding protein H (mybph), and zinc metallopeptidase STE24 (Zmpste24). In addition, kyphoscoliosis peptidase (Ky), which plays a vital role in muscle growth, was also up-regulated in the transgenic mice. Interestingly, a pathway analysis based on grouping the differentially expressed genes uncovered that cardiomyopathy-related pathways and phosphatidic acid (PA) pathways (Dgki, Dgkz, Plcd4) were up-regulated. Increased PA signaling may increase mTOR signaling, resulting in skeletal muscle growth. The findings of the RNA sequencing analysis help to understand the molecular mechanisms of muscle hypertrophy caused by MSTN inhibition.

  10. RNA sequencing on Amomum villosum Lour. induced by MeJA identifies the genes of WRKY and terpene synthases involved in terpene biosynthesis.

    Science.gov (United States)

    He, Xueying; Wang, Huan; Yang, Jinfen; Deng, Ke; Wang, Teng

    2018-02-01

    Amomum villosum Lour. is an important Chinese medicinal plant that has diverse medicinal functions and mainly contains volatile terpenes. This study aims to explore the WRKY transcription factors (TFs) and terpene synthase (TPS) unigenes that might be involved in terpene biosynthesis in A. villosum, thereby providing new information on the regulation of terpenes in plants. RNA sequencing of A. villosum induced by methyl jasmonate (MeJA) revealed that the WRKY family was the second largest TF family in the transcriptome. Thirty-six complete WRKY domain sequences were expressed in response to MeJA. Further, six WRKY unigenes were highly correlated with eight deduced TPS unigenes. Ultimately, we combined the terpene abundance with the expression of candidate WRKY TFs and TPS unigenes to presume a possible model wherein AvWRKY61, AvWRKY28, and AvWRKY40 might coordinately trans-activate the AvNeoD promoter. We propose an approach to further investigate TF unigenes that might be involved in terpenoid biosynthesis, and identified four unigenes for further analyses.

  11. RNA sequencing identifies upregulated kyphoscoliosis peptidase and phosphatidic acid signaling pathways in muscle hypertrophy generated by transgenic expression of myostatin propeptide.

    Science.gov (United States)

    Miao, Yuanxin; Yang, Jinzeng; Xu, Zhong; Jing, Lu; Zhao, Shuhong; Li, Xinyun

    2015-04-09

    Myostatin (MSTN), a member of the transforming growth factor-β superfamily, plays a crucial negative role in muscle growth. MSTN mutations or inhibitions can dramatically increase muscle mass in most mammal species. Previously, we generated a transgenic mouse model of muscle hypertrophy via the transgenic expression of the MSTN N-terminal propeptide cDNA under the control of the skeletal muscle-specific MLC1 promoter. Here, we compare the mRNA profiles between transgenic mice and wild-type littermate controls with a high-throughput RNA sequencing method. The results show that 132 genes were significantly differentially expressed between transgenic mice and wild-type control mice; 97 of these genes were up-regulated, and 35 genes were down-regulated in the skeletal muscle. Several genes that had not been reported to be involved in muscle hypertrophy were identified, including up-regulated myosin binding protein H (mybph), and zinc metallopeptidase STE24 (Zmpste24). In addition, kyphoscoliosis peptidase (Ky), which plays a vital role in muscle growth, was also up-regulated in the transgenic mice. Interestingly, a pathway analysis based on grouping the differentially expressed genes uncovered that cardiomyopathy-related pathways and phosphatidic acid (PA) pathways (Dgki, Dgkz, Plcd4) were up-regulated. Increased PA signaling may increase mTOR signaling, resulting in skeletal muscle growth. The findings of the RNA sequencing analysis help to understand the molecular mechanisms of muscle hypertrophy caused by MSTN inhibition.

  12. Applying Unique Molecular Identifiers in Next Generation Sequencing Reveals a Constrained Viral Quasispecies Evolution under Cross-Reactive Antibody Pressure Targeting Long Alpha Helix of Hemagglutinin

    Science.gov (United States)

    Hauck, Nastasja C.; Kirpach, Josiane; Kiefer, Christina; Farinelle, Sophie; Morris, Stephen A.; Muller, Claude P.; Lu, I-Na

    2018-01-01

    To overcome yearly efforts and costs for the production of seasonal influenza vaccines, new approaches for the induction of broadly protective and long-lasting immune responses have been developed in the past decade. To warrant safety and efficacy of the emerging crossreactive vaccine candidates, it is critical to understand the evolution of influenza viruses in response to these new immune pressures. Here we applied unique molecular identifiers in next generation sequencing to analyze the evolution of influenza quasispecies under in vivo antibody pressure targeting the hemagglutinin (HA) long alpha helix (LAH). Our vaccine targeting LAH of hemagglutinin elicited significant seroconversion and protection against homologous and heterologous influenza virus strains in mice. The vaccine not only significantly reduced lung viral titers, but also induced a well-known bottleneck effect by decreasing virus diversity. In contrast to the classical bottleneck effect, here we showed a significant increase in the frequency of viruses with amino acid sequences identical to that of vaccine targeting LAH domain. No escape mutant emerged after vaccination. These results not only support the potential of a universal influenza vaccine targeting the conserved LAH domains, but also clearly demonstrate that the well-established bottleneck effect on viral quasispecies evolution does not necessarily generate escape mutants. PMID:29587397
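    The idea behind unique molecular identifiers (UMIs) is that reads sharing a UMI derive from the same original template molecule, so they can be collapsed into one consensus sequence before variant frequencies are counted. The Python sketch below shows that collapse with a per-position majority vote. It is an illustrative assumption, not the authors' workflow: the function name and the toy reads are invented, reads within a UMI group are assumed to be aligned and of equal length, and no correction for errors in the UMI itself is attempted.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse (umi, sequence) pairs into one majority-vote consensus per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        # zip(*seqs) walks the group column by column; the most common base wins.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy reads: three copies of one template (one with a sequencing error)
# and a single read from a second template carrying a real variant.
reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACTT"),  # error at position 3 is outvoted
    ("CCG", "ACGA"),
]

molecules = umi_consensus(reads)
print(molecules)
print(Counter(molecules.values()))  # variant counts per template molecule
```

    Counting consensus sequences rather than raw reads is what lets quasispecies frequencies be estimated per template molecule instead of per PCR copy.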

  13. De novo RNA sequencing transcriptome of Rhododendron obtusum identified the early heat response genes involved in the transcriptional regulation of photosynthesis.

    Directory of Open Access Journals (Sweden)

    Linchuan Fang

    Rhododendron spp. is an important ornamental species that is widely cultivated for landscape worldwide. Heat stress is a major obstacle for its cultivation in south China. Previous studies on rhododendron principally focused on its physiological and biochemical processes, which are involved in a series of stress tolerance. However, molecular or genetic properties of rhododendron's response to heat stress are still poorly understood. The phenotype and chlorophyll fluorescence kinetics parameters of four rhododendron cultivars were compared under normal or heat stress conditions, and a cultivar with highest heat tolerance, "Yanzhimi" (R. obtusum), was selected for transcriptome sequencing. A total of 325,429,240 high quality reads were obtained and assembled into 395,561 transcripts and 92,463 unigenes. Functional annotation showed that 38,724 unigenes had sequence similarity to known genes in at least one of the proteins or nucleotide databases used in this study. These 38,724 unigenes were categorized into 51 functional groups based on Gene Ontology classification and were blasted to 24 known cluster of orthologous groups. A total of 973 identified unigenes belonged to 57 transcription factor families, including the stress-related HSF, DREB, ZNF, and NAC genes. Photosynthesis was significantly enriched in the Kyoto Encyclopedia of Genes and Genomes pathway, and the changed expression pattern was illustrated. The key pathways and signaling components that contribute to heat tolerance in rhododendron were revealed. These results provide a potentially valuable resource that can be used for heat-tolerance breeding.

  14. Exome sequencing identifies a novel mutation of the GDI1 gene in a Chinese non-syndromic X-linked intellectual disability family

    Directory of Open Access Journals (Sweden)

    Yongheng Duan

    2017-08-01

    X-linked intellectual disability (XLID) has been associated with various genes. Diagnosis of XLID, especially for non-syndromic ones (NS-XLID), is often hampered by the heterogeneity of this disease. Here we report the case of a Chinese family in which three males suffer from intellectual disability (ID). The three patients shared the same phenotype: no typical clinical manifestation other than IQ score ≤ 70. For a genetic diagnosis for this family we carried out whole exome sequencing on the proband, and validated 16 variants of interest in the genomic DNA of all the family members. A missense mutation (c.710G>T), which mapped to exon 6 of the Rab GDP-Dissociation Inhibitor 1 (GDI1) gene, was found segregating with the ID phenotype, and this mutation changes the 237th position in the guanosine diphosphate dissociation inhibitor (GDI) protein from glycine to valine (p.Gly237Val). Through molecular dynamics simulations we found that this substitution results in a conformational change of GDI, possibly affecting the Rab-binding capacity of this protein. In conclusion, our study identified a novel GDI1 mutation that is possibly NS-XLID causative, and showed that whole exome sequencing provides advantages for detecting novel ID-associated variants and can greatly facilitate the genetic diagnosis of the disease.
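    The link between the coding-DNA position and the protein position reported here (c.710 and residue 237) follows from simple codon arithmetic, which the Python sketch below works through. The helper names are invented for illustration, and because the abstract does not state which glycine codon the reference carries, the check runs over all four glycine codons of the standard genetic code.

```python
def codon_of(cds_position):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    index = cds_position - 1  # switch to 0-based for the arithmetic
    return index // 3 + 1, index % 3 + 1


def substitute(codon, position_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


# Standard genetic code, restricted to the two amino acids involved.
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}
VALINE_CODONS = {"GTT", "GTC", "GTA", "GTG"}

codon_number, position_in_codon = codon_of(710)
print(codon_number, position_in_codon)

# A G>T change at that position turns every glycine codon into a valine codon.
print(all(
    substitute(codon, position_in_codon, "T") in VALINE_CODONS
    for codon in GLYCINE_CODONS
))
```

    Position 710 falls on the second base of codon 237, and a G to T change there converts any glycine codon (GGN) to a valine codon (GTN), consistent with the p.Gly237Val change given in the abstract.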

  15. Applying Unique Molecular Identifiers in Next Generation Sequencing Reveals a Constrained Viral Quasispecies Evolution under Cross-Reactive Antibody Pressure Targeting Long Alpha Helix of Hemagglutinin

    Directory of Open Access Journals (Sweden)

    Nastasja C. Hauck

    2018-03-01

    Full Text Available To overcome yearly efforts and costs for the production of seasonal influenza vaccines, new approaches for the induction of broadly protective and long-lasting immune responses have been developed in the past decade. To warrant safety and efficacy of the emerging cross-reactive vaccine candidates, it is critical to understand the evolution of influenza viruses in response to these new immune pressures. Here we applied unique molecular identifiers in next generation sequencing to analyze the evolution of influenza quasispecies under in vivo antibody pressure targeting the hemagglutinin (HA) long alpha helix (LAH). Our vaccine targeting the LAH of hemagglutinin elicited significant seroconversion and protection against homologous and heterologous influenza virus strains in mice. The vaccine not only significantly reduced lung viral titers, but also induced a well-known bottleneck effect by decreasing virus diversity. In contrast to the classical bottleneck effect, here we showed a significant increase in the frequency of viruses with amino acid sequences identical to that of the LAH domain targeted by the vaccine. No escape mutant emerged after vaccination. These results not only support the potential of a universal influenza vaccine targeting the conserved LAH domains, but also clearly demonstrate that the well-established bottleneck effect on viral quasispecies evolution does not necessarily generate escape mutants.
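
    The abstract does not describe how the unique molecular identifiers (UMIs) were processed. As a generic illustration of the idea only, the sketch below groups reads by UMI and collapses each group to a per-position majority consensus, so that isolated PCR or sequencing errors are outvoted. The function name `umi_consensus` and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Collapse (umi, sequence) pairs into one majority-vote consensus per UMI."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    consensus = {}
    for umi, seqs in groups.items():
        # Majority base at each position; assumes equal-length reads per UMI.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy data: three reads share UMI "AAC"; one of them has an error at the third base.
reads = [("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"), ("GGT", "ACGA")]
result = umi_consensus(reads)
print(result)
```

    In this toy input the erroneous read is outvoted, while the read with UMI "GGT" is kept as a separate molecule, which is what allows genuine low-frequency variants to be distinguished from errors.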

  16. De novo RNA sequencing transcriptome of Rhododendron obtusum identified the early heat response genes involved in the transcriptional regulation of photosynthesis

    Science.gov (United States)

    Tong, Jun; Dong, Yanfang; Xu, Dongyun; Mao, Jing; Zhou, Yuan

    2017-01-01

    Rhododendron spp. is an important ornamental species that is widely cultivated for landscape worldwide. Heat stress is a major obstacle for its cultivation in south China. Previous studies on rhododendron principally focused on its physiological and biochemical processes, which are involved in a series of stress tolerance. However, molecular or genetic properties of rhododendron’s response to heat stress are still poorly understood. The phenotype and chlorophyll fluorescence kinetics parameters of four rhododendron cultivars were compared under normal or heat stress conditions, and a cultivar with highest heat tolerance, “Yanzhimi” (R. obtusum) was selected for transcriptome sequencing. A total of 325,429,240 high quality reads were obtained and assembled into 395,561 transcripts and 92,463 unigenes. Functional annotation showed that 38,724 unigenes had sequence similarity to known genes in at least one of the proteins or nucleotide databases used in this study. These 38,724 unigenes were categorized into 51 functional groups based on Gene Ontology classification and were blasted to 24 known cluster of orthologous groups. A total of 973 identified unigenes belonged to 57 transcription factor families, including the stress-related HSF, DREB, ZNF, and NAC genes. Photosynthesis was significantly enriched in the Kyoto Encyclopedia of Genes and Genomes pathway, and the changed expression pattern was illustrated. The key pathways and signaling components that contribute to heat tolerance in rhododendron were revealed. These results provide a potentially valuable resource that can be used for heat-tolerance breeding. PMID:29059200

  17. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Background: Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results: We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads from Escherichia coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion: The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties

  18. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
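
    The base-calling quality scores mentioned above are commonly reported on the Phred scale, where Q = -10 * log10(P) and P is the estimated probability that a base call is wrong. The unit itself does not state which scale it uses, so the sketch below simply assumes the Phred convention; both function names are hypothetical.

```python
import math


def phred_quality(p_error):
    """Phred quality score for an estimated base-call error probability."""
    return -10 * math.log10(p_error)


def error_probability(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


# A 1-in-1000 error probability corresponds to Q30; Q20 implies a 1-in-100 error rate.
print(phred_quality(0.001))
print(error_probability(20))
```

    Because the scale is logarithmic, every 10-point increase in Q corresponds to a tenfold drop in the expected error rate.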

  19. Genome Sequencing Identifies Two Nearly Unchanged Strains of Persistent Listeria monocytogenes Isolated at Two Different Fish Processing Plants Sampled 6 Years Apart

    DEFF Research Database (Denmark)

    Holch, Anne; Webb, Kristen; Lukjancenko, Oksana

    2013-01-01

    Listeria monocytogenes is a food-borne human-pathogenic bacterium that can cause infections with a high mortality rate. It has a remarkable ability to persist in food processing facilities. Here we report the genome sequences for two L. monocytogenes strains (N53-1 and La111) that were isolated 6...... that has been isolated as a persistent subtype in several European countries. The purpose of this study was to use genome analyses to identify genes or proteins that could contribute to persistence. In a genome comparison, the two persistent strains were extremely similar and collectively differed from...... are required to determine if the absence of these genes promotes persistence. While the genome comparison did not point to a clear physiological explanation of the persistent phenotype, the remarkable similarity between the two strains indicates that subtypes with specific traits are selected for in the food...

  20. Complete re-sequencing of a 2Mb topological domain encompassing the FTO/IRXB genes identifies a novel obesity-associated region upstream of IRX5

    DEFF Research Database (Denmark)

    Hunt, Lilian E; Noyvert, Boris; Bhaw-Rosun, Leena

    2015-01-01

    BACKGROUND: Association studies have identified a number of loci that contribute to an increased body mass index (BMI), the strongest of which is in the first intron of the FTO gene on human chromosome 16q12.2. However, this region is both non-coding and under strong linkage disequilibrium, making...... it recalcitrant to functional interpretation. Furthermore, the FTO gene is located within a complex cis-regulatory landscape defined by a topologically associated domain that includes the IRXB gene cluster, a trio of developmental regulators. Consequently, at least three genes in this interval have been...... implicated in the aetiology of obesity. METHODS: Here, we sequence a 2 Mb region encompassing the FTO, RPGRIP1L and IRXB cluster genes in 284 individuals from a well-characterised study group of Danish men containing extremely overweight young adults and controls. We further replicate our findings both...

  1. Charcot-Marie-Tooth disease: The development of a diagnostic platform using next generation sequencing

    DEFF Research Database (Denmark)

    Christensen, Rikke; Væth, Signe; Thorsen, Kasper

    , Sanger sequencing of 4 genes has led to a diagnosis in approximately 30% of the patients. Aims: 1) Development of a targeted NGS platform containing 63 genes that currently are found to be associated with CMT. 2) Analysis of the increased diagnostic yield using this platform to analyze 200 CMT samples...... previously analyzed using Sanger sequencing without identification of a disease causing mutation. Materials and Methods: Libraries for 200 patient samples obtained for CMT diagnostics were prepared using Illumina TruSeq and target enrichment using SeqCap EZ Choice Library (Nimblegen). The libraries were...

  2. TBX1 mutation identified by exome sequencing in a Japanese family with 22q11.2 deletion syndrome-like craniofacial features and hypocalcemia.

    Directory of Open Access Journals (Sweden)

    Tsutomu Ogata

    Full Text Available BACKGROUND: Although TBX1 mutations have been identified in patients with 22q11.2 deletion syndrome (22q11.2DS)-like phenotypes including characteristic craniofacial features, cardiovascular anomalies, hypoparathyroidism, and thymic hypoplasia, the frequency of TBX1 mutations remains rare in deletion-negative patients. Thus, it would be reasonable to perform a comprehensive genetic analysis in deletion-negative patients with 22q11.2DS-like phenotypes. METHODOLOGY/PRINCIPAL FINDINGS: We studied three subjects with craniofacial features and hypocalcemia (group 1), two subjects with craniofacial features alone (group 2), and three subjects with normal phenotype within a single Japanese family. Fluorescence in situ hybridization analysis excluded chromosome 22q11.2 deletion, and genomewide array comparative genomic hybridization analysis revealed no copy number change specific to group 1 or groups 1+2. However, exome sequencing identified a heterozygous TBX1 frameshift mutation (c.1253delA, p.Y418fsX459) specific to groups 1+2, as well as six missense variants and two in-frame microdeletions specific to groups 1+2 and two missense variants specific to group 1. The TBX1 mutation resided at exon 9C and was predicted to produce a non-functional truncated protein missing the nuclear localization signal and most of the transactivation domain. CONCLUSIONS/SIGNIFICANCE: Clinical features in groups 1+2 are well explained by the TBX1 mutation, while the clinical effects of the remaining variants are largely unknown. Thus, the results exemplify the usefulness of exome sequencing in the identification of disease-causing mutations in familial disorders. Furthermore, the results, in conjunction with the previous data, imply that TBX1 isoform C is the biologically essential variant and that TBX1 mutations are associated with a wide phenotypic spectrum, including most of 22q11.2DS phenotypes.

  3. Exome Sequencing Identified a Splice Site Mutation in FHL1 that Causes Uruguay Syndrome, an X-Linked Disorder With Skeletal Muscle Hypertrophy and Premature Cardiac Death.

    Science.gov (United States)

    Xue, Yuan; Schoser, Benedikt; Rao, Aliz R; Quadrelli, Roberto; Vaglio, Alicia; Rupp, Verena; Beichler, Christine; Nelson, Stanley F; Schapacher-Tilp, Gudrun; Windpassinger, Christian; Wilcox, William R

    2016-04-01

    Previously, we reported a rare X-linked disorder, Uruguay syndrome, in a single family. The main features are pugilistic facies, skeletal deformities, muscular hypertrophy despite a lack of exercise, and cardiac ventricular hypertrophy leading to premature death. An ≈19 Mb critical region on the X chromosome was identified through identity-by-descent analysis of 3 affected males. Exome sequencing was conducted on one affected male to identify the disease-causing gene and variant. A splice site variant (c.502-2A>G) in the FHL1 gene was highly suspicious among other candidate genes and variants. FHL1A is the predominant isoform of FHL1 in cardiac and skeletal muscle. Sequencing cDNA showed the splice site variant led to skipping of exon 6 of the FHL1A isoform, equivalent to the FHL1C isoform. Targeted analysis showed that this splice site variant cosegregated with disease in the family. Western blot and immunohistochemical analysis of muscle from the proband showed a significant decrease in protein expression of FHL1A. Real-time polymerase chain reaction analysis of different isoforms of FHL1 demonstrated that FHL1C is markedly increased. Mutations in the FHL1 gene have been reported in disorders with skeletal and cardiac myopathy but none has the skeletal or facial phenotype seen in patients with Uruguay syndrome. Our data suggest that a novel FHL1 splice site variant results in the absence of FHL1A and the abundance of FHL1C, which may contribute to the complex and severe phenotype. Mutation screening of the FHL1 gene should be considered for patients with uncharacterized myopathies and cardiomyopathies. © 2016 American Heart Association, Inc.

  4. Whole exome sequencing of a consanguineous family identifies the possible modifying effect of a globally rare AK5 allelic variant in celiac disease development among Saudi patients.

    Directory of Open Access Journals (Sweden)

    Jumana Yousuf Al-Aama

    Full Text Available Celiac disease (CD), a multi-factorial auto-inflammatory disease of the small intestine, is known to occur in both sporadic and familial forms. Together HLA and non-HLA genes can explain up to 50% of CD's heritability. In order to discover the missing heritability due to rare variants, we have exome sequenced a consanguineous Saudi family presenting CD in an autosomal recessive (AR) pattern. We have identified a rare homozygous insertion, c.1683_1684insATT, in the conserved coding region of the AK5 gene that showed classical AR model segregation in this family. Sequence validation of 200 chromosomes each of sporadic CD cases and controls revealed that this extremely rare (ExAC MAF 0.000008) mutation is highly penetrant among general Saudi populations (MAF is 0.62). Genotype and allelic distribution analysis have indicated that this AK5 (c.1683_1684insATT) mutation is negatively selected among patient groups and positively selected in the control group, in whom it may modify the risk against CD development [p<0.002]. Our observation gains additional support from computational analysis which predicted that the Iso561 insertion shifts the existing H-bonds between the 400th and 556th amino acid residues lying near the functional domain of adenylate kinase. This shuffling of amino acids and their H-bond interactions is likely to disturb the secondary structure orientation of the polypeptide and induce a gain of function in the nucleoside phosphate kinase activity of AK5, which may eventually down-regulate the reactivity potential of CD4+ T-cells against gluten antigens. Our study underlines the need to have population-specific genome databases to avoid false leads and to identify true candidate causal genes for the familial form of celiac disease.

  5. Whole-exome sequencing of muscle-invasive bladder cancer identifies recurrent mutations of UNC5C and prognostic importance of DNA repair gene mutations on survival.

    Science.gov (United States)

    Yap, Kai Lee; Kiyotani, Kazuma; Tamura, Kenji; Antic, Tatjana; Jang, Miran; Montoya, Magdeline; Campanile, Alexa; Yew, Poh Yin; Ganshert, Cory; Fujioka, Tomoaki; Steinberg, Gary D; O'Donnell, Peter H; Nakamura, Yusuke

    2014-12-15

    Because of suboptimal outcomes in muscle-invasive bladder cancer even with multimodality therapy, determination of potential genetic drivers offers the possibility of improving therapeutic approaches and discovering novel prognostic indicators. Using pTN staging, we case-matched 81 patients with resected ≥pT2 bladder cancers for whom perioperative chemotherapy use and disease recurrence status were known. Whole-exome sequencing was conducted in 43 cases to identify recurrent somatic mutations and targeted sequencing of 10 genes selected from the initial screening in an additional 38 cases was completed. Mutational profiles along with clinicopathologic information were correlated with recurrence-free survival (RFS) in the patients. We identified recurrent novel somatic mutations in the gene UNC5C (9.9%), in addition to TP53 (40.7%), KDM6A (21.0%), and TSC1 (12.3%). Patients who were carriers of somatic mutations in DNA repair genes (one or more of ATM, ERCC2, FANCD2, PALB2, BRCA1, or BRCA2) had a higher overall number of somatic mutations (P = 0.011). Importantly, after a median follow-up of 40.4 months, carriers of somatic mutations (n = 25) in any of these six DNA repair genes had significantly enhanced RFS compared with noncarriers [median, 32.4 vs. 14.8 months; hazard ratio of 0.46, 95% confidence interval (CI), 0.22-0.98; P = 0.0435], after adjustment for pathologic pTN staging and independent of adjuvant chemotherapy usage. Better prognostic outcomes of individuals carrying somatic mutations in DNA repair genes suggest these mutations as favorable prognostic events in muscle-invasive bladder cancer. Additional mechanistic investigation into the previously undiscovered role of UNC5C in bladder cancer is warranted. ©2014 American Association for Cancer Research.

  6. A high-density genetic map for anchoring genome sequences and identifying QTLs associated with dwarf vine in pumpkin (Cucurbita maxima Duch.).

    Science.gov (United States)

    Zhang, Guoyu; Ren, Yi; Sun, Honghe; Guo, Shaogui; Zhang, Fan; Zhang, Jie; Zhang, Haiying; Jia, Zhangcai; Fei, Zhangjun; Xu, Yong; Li, Haizhen

    2015-12-24

    Pumpkin (Cucurbita maxima Duch.) is an economically important crop belonging to the Cucurbitaceae family. However, very few genomic and genetic resources are available for this species. As part of our ongoing efforts to sequence the pumpkin genome, a high-density genetic map is essential for anchoring and orienting the assembled scaffolds. In addition, a saturated genetic map can facilitate quantitative trait locus (QTL) mapping. A set of 186 F2 plants derived from the cross of pumpkin inbred lines Rimu and SQ026 were genotyped using the genotyping-by-sequencing approach. Using the SNPs we identified, a high-density genetic map containing 458 bin-markers was constructed, spanning a total genetic distance of 2,566.8 cM across the 20 linkage groups of C. maxima with a mean marker density of 5.60 cM. Using this map we were able to anchor 58 assembled scaffolds that covered about 194.5 Mb (71.7%) of the 271.4 Mb assembled pumpkin genome, of which 44 (183.0 Mb; 67.4%) were oriented. Furthermore, the high-density genetic map was used to identify genomic regions highly associated with an important agronomic trait, dwarf vine. Three QTLs on linkage groups (LGs) 1, 3 and 4, respectively, were recovered. One QTL, qCmB2, which was located in an interval of 0.42 Mb on LG 3, explained 21.4% of the phenotypic variation. Within qCmB2, one gene, Cma_004516, encoding the gibberellin (GA) 20-oxidase in the GA biosynthesis pathway, had a 1249-bp deletion in its promoter in bush type lines, and its expression level was significantly increased during the vine growth and higher in vine type lines than bush type lines, supporting Cma_004516 as a possible candidate gene controlling vine growth in pumpkin. A high-density pumpkin genetic map was constructed, which was used to successfully anchor and orient the assembled genome scaffolds, and to identify QTLs highly associated with pumpkin vine length. The map provided a valuable resource for gene cloning and marker assisted breeding in pumpkin and
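
    The summary figures quoted in this abstract can be re-derived from the raw numbers it reports. The sketch below assumes that the mean marker density is the total map length divided by the number of bin-markers and that both percentages are relative to the 271.4 Mb assembly; the variable names are illustrative only.

```python
# Values as reported in the abstract.
map_length_cm = 2566.8   # total genetic distance (cM)
bin_markers = 458        # number of bin-markers on the map
assembly_mb = 271.4      # assembled genome size (Mb)
anchored_mb = 194.5      # sequence in anchored scaffolds (Mb)
oriented_mb = 183.0      # sequence in oriented scaffolds (Mb)

mean_spacing_cm = map_length_cm / bin_markers
anchored_pct = 100 * anchored_mb / assembly_mb
oriented_pct = 100 * oriented_mb / assembly_mb

print(f"{mean_spacing_cm:.2f} cM per marker")
print(f"{anchored_pct:.1f}% anchored, {oriented_pct:.1f}% oriented")
```

    Under these assumptions the results round to 5.60 cM, 71.7% and 67.4%, matching the values given in the abstract.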

  7. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus.

    Science.gov (United States)

    Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

    2017-01-01

    The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples, and the

  8. Generic Amplicon Deep Sequencing to Determine Ilarvirus Species Diversity in Australian Prunus

    Directory of Open Access Journals (Sweden)

    Wycliff M. Kinoti

    2017-06-01

    Full Text Available The distribution of Ilarvirus species populations amongst 61 Australian Prunus trees was determined by next generation sequencing (NGS) of amplicons generated using a genus-based generic RT-PCR targeting a conserved region of the Ilarvirus RNA2 component that encodes the RNA dependent RNA polymerase (RdRp) gene. Presence of Ilarvirus sequences in each positive sample was further validated by Sanger sequencing of cloned amplicons of regions of each of RNA1, RNA2 and/or RNA3 that were generated by species specific PCRs and by metagenomic NGS. Prunus necrotic ringspot virus (PNRSV) was the most frequently detected Ilarvirus, occurring in 48 of the 61 Ilarvirus-positive trees and Prune dwarf virus (PDV) and Apple mosaic virus (ApMV) were detected in three trees and one tree, respectively. American plum line pattern virus (APLPV) was detected in three trees and represents the first report of APLPV detection in Australia. Two novel and distinct groups of Ilarvirus-like RNA2 amplicon sequences were also identified in several trees by the generic amplicon NGS approach. The high read depth from the amplicon NGS of the generic PCR products allowed the detection of distinct RNA2 RdRp sequence variant populations of PNRSV, PDV, ApMV, APLPV and the two novel Ilarvirus-like sequences. Mixed infections of ilarviruses were also detected in seven Prunus trees. Sanger sequencing of specific RNA1, RNA2, and/or RNA3 genome segments of each virus and total nucleic acid metagenomics NGS confirmed the presence of PNRSV, PDV, ApMV and APLPV detected by RNA2 generic amplicon NGS. However, the two novel groups of Ilarvirus-like RNA2 amplicon sequences detected by the generic amplicon NGS could not be associated to the presence of sequence from RNA1 or RNA3 genome segments or full Ilarvirus genomes, and their origin is unclear. This work highlights the sensitivity of genus-specific amplicon NGS in detection of virus sequences and their distinct populations in multiple samples

  9. Identification of a novel LMF1 nonsense mutation responsible for severe hypertriglyceridemia by targeted next-generation sequencing.

    Science.gov (United States)

    Cefalù, Angelo B; Spina, Rossella; Noto, Davide; Ingrassia, Valeria; Valenti, Vincenza; Giammanco, Antonina; Fayer, Francesca; Misiano, Gabriella; Cocorullo, Gianfranco; Scrimali, Chiara; Palesano, Ornella; Altieri, Grazia I; Ganci, Antonina; Barbagallo, Carlo M; Averna, Maurizio R

    Severe hypertriglyceridemia (HTG) may result from mutations in genes affecting the intravascular lipolysis of triglyceride (TG)-rich lipoproteins. The aim of this study was to develop a targeted next-generation sequencing panel for the molecular diagnosis of disorders characterized by severe HTG. We developed a targeted customized panel for next-generation sequencing on the Ion Torrent Personal Genome Machine to capture the coding exons and intron/exon boundaries of 18 genes affecting the main pathways of TG synthesis and metabolism. We sequenced 11 samples of patients with severe HTG (TG > 885 mg/dL, i.e. 10 mmol/L): 4 positive controls in whom pathogenic mutations had previously been identified by Sanger sequencing and 7 patients in whom the molecular defect was still unknown. The customized panel was accurate, and it confirmed the genetic variants previously identified in all positive controls with primary severe HTG. Only 1 patient of 7 with HTG was found to be a carrier of a homozygous pathogenic mutation, the third novel mutation of the LMF1 gene (c.1380C>G, p.Y460X). The clinical and molecular familial cascade screening allowed the identification of 2 additional affected siblings and 7 heterozygous carriers of the mutation. We showed that our targeted resequencing approach for genetic diagnosis of severe HTG appears to be accurate, less time consuming, and more economical compared with traditional Sanger resequencing. The identification of pathogenic mutations in candidate genes remains challenging, and clinical resequencing should mainly be intended for patients with strong clinical criteria for monogenic severe HTG. Copyright © 2017 National Lipid Association. Published by Elsevier Inc. All rights reserved.

  10. Comparison of the accuracy of two conventional phenotypic methods and two MALDI-TOF MS systems with that of DNA sequencing analysis for correctly identifying clinically encountered yeasts.

    Science.gov (United States)

    Chao, Qiao-Ting; Lee, Tai-Fen; Teng, Shih-Hua; Peng, Li-Yun; Chen, Ping-Hung; Teng, Lee-Jene; Hsueh, Po-Ren

    2014-01-01

    We assessed the accuracy of species-level identification of two commercially available matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) systems (Bruker Biotyper and Vitek MS) and two conventional phenotypic methods (Phoenix 100 YBC and Vitek 2 Yeast ID) with that of rDNA gene sequencing analysis among 200 clinical isolates of commonly encountered yeasts. The correct identification rates of the 200 yeast isolates to species or complex (Candida parapsilosis complex, C. guilliermondii complex and C. rugosa complex) levels by the Bruker Biotyper, Vitek MS (using in vitro devices [IVD] database), Phoenix 100 YBC and Vitek 2 Yeast ID (Sabouraud's dextrose agar) systems were 92.5%, 79.5%, 89%, and 74%, respectively. An additional 72 isolates of C. parapsilosis complex and 18 from the above 200 isolates (30 in each of C. parapsilosis, C. metapsilosis, and C. orthopsilosis) were also evaluated separately. Bruker Biotyper system could accurately identify all C. parapsilosis complex to species level. Using Vitek 2 MS (IVD) system, all C. parapsilosis but none of C. metapsilosis, or C. orthopsilosis could be accurately identified. Among the 89 yeasts misidentified by the Vitek 2 MS (IVD) system, 39 (43.8%), including 27 C. orthopsilosis isolates, could be correctly identified using the Vitek MS Plus SARAMIS database for research use only. This resulted in an increase in the rate of correct identification of all yeast isolates (87.5%) by Vitek 2 MS. The two species in C. guilliermondii complex (C. guilliermondii and C. fermentati) isolates were correctly identified by cluster analysis of spectra generated by the Bruker Biotyper system. Based on the results obtained in the current study, MALDI-TOF MS systems present a promising alternative for the routine identification of yeast species, including clinically commonly and rarely encountered yeast species and several species belonging to C. parapsilosis complex, C. guilliermondii complex

  11. Comparison of the accuracy of two conventional phenotypic methods and two MALDI-TOF MS systems with that of DNA sequencing analysis for correctly identifying clinically encountered yeasts.

    Directory of Open Access Journals (Sweden)

    Qiao-Ting Chao

    Full Text Available We assessed the accuracy of species-level identification of two commercially available matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) systems (Bruker Biotyper and Vitek MS) and two conventional phenotypic methods (Phoenix 100 YBC and Vitek 2 Yeast ID) with that of rDNA gene sequencing analysis among 200 clinical isolates of commonly encountered yeasts. The correct identification rates of the 200 yeast isolates to species or complex (Candida parapsilosis complex, C. guilliermondii complex and C. rugosa complex) levels by the Bruker Biotyper, Vitek MS (using in vitro devices [IVD] database), Phoenix 100 YBC and Vitek 2 Yeast ID (Sabouraud's dextrose agar) systems were 92.5%, 79.5%, 89%, and 74%, respectively. An additional 72 isolates of C. parapsilosis complex and 18 from the above 200 isolates (30 in each of C. parapsilosis, C. metapsilosis, and C. orthopsilosis) were also evaluated separately. Bruker Biotyper system could accurately identify all C. parapsilosis complex to species level. Using Vitek 2 MS (IVD) system, all C. parapsilosis but none of C. metapsilosis, or C. orthopsilosis could be accurately identified. Among the 89 yeasts misidentified by the Vitek 2 MS (IVD) system, 39 (43.8%), including 27 C. orthopsilosis isolates, could be correctly identified using the Vitek MS Plus SARAMIS database for research use only. This resulted in an increase in the rate of correct identification of all yeast isolates (87.5%) by Vitek 2 MS. The two species in C. guilliermondii complex (C. guilliermondii and C. fermentati) isolates were correctly identified by cluster analysis of spectra generated by the Bruker Biotyper system. Based on the results obtained in the current study, MALDI-TOF MS systems present a promising alternative for the routine identification of yeast species, including clinically commonly and rarely encountered yeast species and several species belonging to C. parapsilosis complex, C. guilliermondii

  12. Identification of rare paired box 3 variant in strabismus by whole exome sequencing

    Directory of Open Access Journals (Sweden)

    Hui-Min Gong

    2017-08-01

    Full Text Available AIM: To identify the potentially pathogenic gene variants that contribute to the etiology of strabismus. METHODS: A Chinese pedigree with strabismus was collected and the exomes of two affected individuals were sequenced using next-generation sequencing technology. The resulting variants from exome sequencing were filtered by subsequent bioinformatics methods and the candidate mutation was verified as heterozygous in the affected proposita and her mother by Sanger sequencing. RESULTS: Whole exome sequencing and filtering identified a nonsynonymous mutation, the c.434G>T transversion in paired box 3 (PAX3), in the two affected individuals, which was predicted to be deleterious by more than 4 bioinformatics programs. The altered amino acid residue is located in the conserved PAX domain of PAX3. This gene encodes a member of the PAX family of transcription factors, which play critical roles during fetal development. Mutations in PAX3 have been associated with Waardenburg syndrome with strabismus. CONCLUSION: Our results indicate that the c.434G>T mutation (p.R145L) in PAX3 may contribute to strabismus, expanding our understanding of the causally relevant genes for this disorder.

  13. Detection of genomic variation by selection of a 9 mb DNA region and high throughput sequencing.

    Directory of Open Access Journals (Sweden)

    Sergey I Nikolaev

    Full Text Available Detection of the rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal in order to understand genomic and phenotypic variability. We have interrogated repeat-masked regions of 8.9 Mb on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We have optimized a method of genomic selection for high throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. 83% of SNPs in the targeted region had at least 4-fold sequence coverage and 54% at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes are 91.3% concordant in regions with coverage ≥ 4-fold, and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15-20 fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.

  14. Machine Learned Replacement of N-Labels for Basecalled Sequences in DNA Barcoding.

    Science.gov (United States)

    Ma, Eddie Y T; Ratnasingham, Sujeevan; Kremer, Stefan C

    2018-01-01

    This study presents a machine learning method that increases the number of identified bases in Sanger Sequencing. The system post-processes a KB basecalled chromatogram. It selects a recoverable subset of N-labels in the KB-called chromatogram to replace with basecalls (A,C,G,T). An N-label correction is defined given an additional read of the same sequence, and a human finished sequence. Corrections are added to the dataset when an alignment determines the additional read and human agree on the identity of the N-label. KB must also rate the replacement with quality value of in the additional read. Corrections are only available during system training. Developing the system, nearly 850,000 N-labels are obtained from Barcode of Life Datasystems, the premier database of genetic markers called DNA Barcodes. Increasing the number of correct bases improves reference sequence reliability, increases sequence identification accuracy, and assures analysis correctness. Keeping with barcoding standards, our system maintains an error rate of percent. Our system only applies corrections when it estimates low rate of error. Tested on this data, our automation selects and recovers: 79 percent of N-labels from COI (animal barcode); 80 percent from matK and rbcL (plant barcodes); and 58 percent from non-protein-coding sequences (across eukaryotes).
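
    The definition of an N-label correction given above can be illustrated with a minimal Python sketch. It is not the authors' system: it assumes the three sequences are already aligned to equal length, the function and parameter names are hypothetical, and the quality cutoff is left as a required argument because the value (and the direction of the comparison) is not recoverable from the abstract.

```python
def n_label_corrections(kb_read, additional_read, finished, additional_quals, min_quality):
    """Collect candidate corrections for N-labels in a KB-called read.

    All inputs are assumed to be pre-aligned and of equal length
    (an assumption of this sketch). A correction is recorded at an N
    position only when the additional read and the human-finished
    sequence agree on a concrete base and the additional read's quality
    value reaches `min_quality` (hypothetical cutoff, caller-supplied).
    """
    corrections = {}
    for i, base in enumerate(kb_read):
        if base != "N":
            continue  # only N-labels are candidates
        a, h = additional_read[i], finished[i]
        if a == h and a in "ACGT" and additional_quals[i] >= min_quality:
            corrections[i] = a
    return corrections


# Hypothetical toy data: position 2 is recoverable, position 4 is not
# because the additional read and the finished sequence disagree.
example = n_label_corrections("ACNTN", "ACGTA", "ACGTC", [40] * 5, min_quality=30)
print(example)
```

    Such corrections would only serve as training labels; as the abstract notes, they are not available when the trained system is applied.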

  15. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Directory of Open Access Journals (Sweden)

    Isabel A S Bonatelli

    Full Text Available Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  16. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    Science.gov (United States)

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.
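
    As a rough illustration of how di- and trinucleotide repeats might be located in sequence reads, the following Python sketch scans a read with a back-referencing regular expression. The abstract does not name the detection tool used, so this is a stand-in technique; the function name and the minimum of five repeat units are assumptions chosen for the example.

```python
import re


def find_microsatellites(seq, motif_lengths=(2, 3), min_repeats=5):
    """Return (start, motif, repeat_count) tuples for simple tandem repeats.

    `min_repeats` is a hypothetical cutoff, not a value from the study.
    Homopolymer runs (e.g. AAAAAA read as AA x 3) are skipped.
    """
    hits = []
    for k in motif_lengths:
        # One motif of length k followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical read containing an (AT)6 and a (CAG)5 repeat.
read = "GGC" + "AT" * 6 + "CCT" + "CAG" * 5 + "T"
print(find_microsatellites(read))
```

    Because the scan is leftmost-first, the reported motif depends on where the repeat begins in the read (CAG, AGC and GCA describe the same locus in different phases), so a real pipeline would normalize motifs before counting them.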

  17. Older persons' worries expressed during home care visits: Exploring the content of cues and concerns identified by the Verona coding definitions of emotional sequences.

    Science.gov (United States)

    Hafskjold, Linda; Eide, Tom; Holmström, Inger K; Sundling, Vibeke; van Dulmen, Sandra; Eide, Hilde

    2016-12-01

    Little is known about how older persons in home care express their concerns. Emotional cues and concerns can be identified by the Verona coding definitions of emotional sequences (VR-CoDES), but the method gives no insight into what causes the distress and the emotions involved. The aims of this study are to explore (1) older persons' worries and (2) the content of these expressions. An observational exploratory two-step approach was used to investigate audiotaped recordings from 38 Norwegian home care visits with older persons and nurse assistants. First, 206 cues and concerns were identified using VR-CoDES. Second, the content and context of these expressions were analysed inductively. Four main categories emerged: worries about relationships with others, worries about health care-related issues, worries about aging and bodily impairment, and life narratives and value issues, with several subcategories showing the causes of worry and emotions involved. The two-step approach provides an in-depth knowledge of older persons' worries, causes of worries, and their related emotions. The subcategories described in a language close to the experience can be useful in practice development and communication training for students and health care providers. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  18. High-Throughput Sequencing Identifies MicroRNAs from Posterior Intestine of Loach (Misgurnus anguillicaudatus) and Their Response to Intestinal Air-Breathing Inhibition.

    Science.gov (United States)

    Huang, Songqian; Cao, Xiaojuan; Tian, Xianchang; Wang, Weimin

    2016-01-01

    MicroRNAs (miRNAs) play important roles in animal growth, immunity, and development, and regulate gene expression at the post-transcriptional level. The diversity of miRNAs and their roles in accessory air-breathing organs (ABOs) of fish remain unknown. In this work, we used high-throughput sequencing to identify known and novel miRNAs from the posterior intestine, an important ABO, in loach (Misgurnus anguillicaudatus) under normal and intestinal air-breathing inhibited conditions. A total of 204 known and 84 novel miRNAs were identified, while 47 miRNAs were differentially expressed between the two small RNA libraries (i.e. between the normal and intestinal air-breathing inhibited group). Potential miRNA target genes were predicted by combining our transcriptome data of the posterior intestine of the loach under the same conditions, and then annotated using COG, GO, KEGG, Swissprot and Nr databases. The regulatory networks of miRNAs and their target genes were analyzed. The abundances of nine known miRNAs were validated by qRT-PCR. The relative expression profiles of six known miRNAs and their eight corresponding target genes, and two novel potential miRNAs were also detected. Histological characteristics of the posterior intestines in both normal and air-breathing inhibited group were further analyzed. This study contributes to our understanding of the functions and molecular regulatory mechanisms of miRNAs in accessory air-breathing organs of fish.

  19. Germ-line variants identified by next generation sequencing in a panel of estrogen and cancer associated genes correlate with poor clinical outcome in Lynch syndrome patients.

    Science.gov (United States)

    Jóri, Balazs; Kamps, Rick; Xanthoulea, Sofia; Delvoux, Bert; Blok, Marinus J; Van de Vijver, Koen K; de Koning, Bart; Oei, Felicia Trups; Tops, Carli M; Speel, Ernst Jm; Kruitwagen, Roy F; Gomez-Garcia, Encarna B; Romano, Andrea

    2015-12-01

    The risk of developing colorectal and endometrial cancers among subjects testing positive for a pathogenic Lynch syndrome mutation varies, making risk prediction difficult. Genetic risk modifiers alter the risk conferred by inherited Lynch syndrome mutations, and their identification can improve genetic counseling. We aimed at identifying rare genetic modifiers of the risk of Lynch syndrome endometrial cancer. A family-based approach was used to assess the presence of genetic risk modifiers among 35 Lynch syndrome mutation carriers having either a poor clinical phenotype (early age of endometrial cancer diagnosis or multiple cancers) or a neutral clinical phenotype. Putative genetic risk modifiers were identified by Next Generation Sequencing among a panel of 154 genes involved in endometrial physiology and carcinogenesis. A simple pipeline, based on an allele frequency lower than 0.001 and on predicted non-conservative amino-acid substitutions, returned 54 variants that were considered putative risk modifiers. The presence of two or more risk modifying variants in women carrying a pathogenic Lynch syndrome mutation was associated with a poor clinical phenotype. A gene panel is proposed that comprises genes that can carry variants with putative modifying effects on the risk of Lynch syndrome endometrial cancer. Validation in further studies is warranted before considering the possible use of this tool in genetic counseling.
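
    The two filtering criteria named in the abstract (allele frequency below 0.001 and a predicted non-conservative amino-acid substitution) can be sketched in Python as follows. The record layout, field names and gene labels are hypothetical, and the non-conservative prediction is assumed to have been computed upstream by whatever tool the authors used, which the abstract does not specify.

```python
from dataclasses import dataclass

MAX_ALLELE_FREQUENCY = 0.001  # cutoff stated in the abstract


@dataclass
class Variant:
    gene: str                # hypothetical field names throughout
    allele_frequency: float
    non_conservative: bool   # assumed to be predicted upstream


def putative_modifiers(variants):
    """Keep rare variants predicted to cause non-conservative substitutions."""
    return [
        v for v in variants
        if v.allele_frequency < MAX_ALLELE_FREQUENCY and v.non_conservative
    ]


def carries_multiple_modifiers(variants, minimum=2):
    """True when a carrier has at least `minimum` putative risk modifiers."""
    return len(putative_modifiers(variants)) >= minimum


# Hypothetical carrier with three variants, two of which pass both filters.
carrier = [
    Variant("GENE_A", 0.0002, True),
    Variant("GENE_B", 0.0500, True),    # too common
    Variant("GENE_C", 0.0005, True),
]
print([v.gene for v in putative_modifiers(carrier)])
print(carries_multiple_modifiers(carrier))
```

    The second helper mirrors the reported association between carrying two or more risk-modifying variants and a poor clinical phenotype; it flags carriers and makes no statistical claim of its own.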

  20. Exome Capture and Massively Parallel Sequencing Identifies a Novel HPSE2 Mutation in a Saudi Arabian Child with Ochoa (Urofacial) Syndrome

    Science.gov (United States)

    Al Badr, Wisam; Al Bader, Suha; Otto, Edgar; Hildebrandt, Friedhelm; Ackley, Todd; Peng, Weiping; Xu, Jishu; Li, Jun; Owens, Kailey M.; Bloom, David; Innis, Jeffrey W.

    2011-01-01

    We describe a child of Middle Eastern descent by first-cousin mating with idiopathic neurogenic bladder and high grade vesicoureteral reflux at 1 year of age, whose characteristic facial grimace led to the diagnosis of Ochoa (Urofacial) syndrome at age 5 years. We used homozygosity mapping, exome capture and paired end sequencing to identify the disease causing mutation in the proband. We reviewed the literature with respect to the urologic manifestations of Ochoa syndrome. A large region of marker homozygosity was observed at 10q24, consistent with known autosomal recessive inheritance, family consanguinity and previous genetic mapping in other families with Ochoa syndrome. A homozygous mutation was identified in the proband in HPSE2: c.1374_1378delTGTGC, a deletion of 5 nucleotides in exon 10 that is predicted to lead to a frameshift followed by replacement of 132 C-terminal amino acids with 153 novel amino acids (p.Ala458Alafsdel132ins153). This mutation is novel relative to very recently published mutations in HPSE2 in other families. Early intervention and recognition of Ochoa syndrome with control of risk factors and close surveillance will decrease complications and renal failure. PMID:21450525

  1. Identifying Active Faults by Improving Earthquake Locations with InSAR Data and Bayesian Estimation: The 2004 Tabuk (Saudi Arabia) Earthquake Sequence

    KAUST Repository

    Xu, Wenbin

    2015-02-03

    A sequence of shallow earthquakes of magnitudes ≤5.1 took place in 2004 on the eastern flank of the Red Sea rift, near the city of Tabuk in northwestern Saudi Arabia. The earthquakes could not be well located due to the sparse distribution of seismic stations in the region, making it difficult to associate the activity with one of the many mapped faults in the area and thus to improve the assessment of seismic hazard in the region. We used Interferometric Synthetic Aperture Radar (InSAR) data from the European Space Agency’s Envisat and ERS‐2 satellites to improve the location and source parameters of the largest event of the sequence (Mw 5.1), which occurred on 22 June 2004. The mainshock caused a small but distinct ∼2.7 cm displacement signal in the InSAR data, which reveals where the earthquake took place and shows that seismic reports mislocated it by 3–16 km. With Bayesian estimation, we modeled the InSAR data using a finite‐fault model in a homogeneous elastic half‐space and found the mainshock activated a normal fault, roughly 70 km southeast of the city of Tabuk. The southwest‐dipping fault has a strike that is roughly parallel to the Red Sea rift, and we estimate the centroid depth of the earthquake to be ∼3.2 km. Projection of the fault model uncertainties to the surface indicates that one of the west‐dipping normal faults located in the area and oriented parallel to the Red Sea is a likely source for the mainshock. The results demonstrate how InSAR can be used to improve locations of moderate‐size earthquakes and thus to identify currently active faults.

  2. Identifying Active Faults by Improving Earthquake Locations with InSAR Data and Bayesian Estimation: The 2004 Tabuk (Saudi Arabia) Earthquake Sequence

    KAUST Repository

    Xu, Wenbin; Dutta, Rishabh; Jonsson, Sigurjon

    2015-01-01

    A sequence of shallow earthquakes of magnitudes ≤5.1 took place in 2004 on the eastern flank of the Red Sea rift, near the city of Tabuk in northwestern Saudi Arabia. The earthquakes could not be well located due to the sparse distribution of seismic stations in the region, making it difficult to associate the activity with one of the many mapped faults in the area and thus to improve the assessment of seismic hazard in the region. We used Interferometric Synthetic Aperture Radar (InSAR) data from the European Space Agency’s Envisat and ERS‐2 satellites to improve the location and source parameters of the largest event of the sequence (Mw 5.1), which occurred on 22 June 2004. The mainshock caused a small but distinct ∼2.7 cm displacement signal in the InSAR data, which reveals where the earthquake took place and shows that seismic reports mislocated it by 3–16 km. With Bayesian estimation, we modeled the InSAR data using a finite‐fault model in a homogeneous elastic half‐space and found the mainshock activated a normal fault, roughly 70 km southeast of the city of Tabuk. The southwest‐dipping fault has a strike that is roughly parallel to the Red Sea rift, and we estimate the centroid depth of the earthquake to be ∼3.2 km. Projection of the fault model uncertainties to the surface indicates that one of the west‐dipping normal faults located in the area and oriented parallel to the Red Sea is a likely source for the mainshock. The results demonstrate how InSAR can be used to improve locations of moderate‐size earthquakes and thus to identify currently active faults.

  3. Integrative analysis of deep sequencing data identifies estrogen receptor early response genes and links ATAD3B to poor survival in breast cancer.

    Directory of Open Access Journals (Sweden)

    Kristian Ovaska

    Full Text Available Identification of genes responsive to an extra-cellular cue enables characterization of pathophysiologically crucial biological processes. Deep sequencing technologies provide a powerful means to identify responsive genes, which creates a need for computational methods able to analyze dynamic and multi-level deep sequencing data. To answer this need we introduce here a data-driven algorithm, SPINLONG, which is designed to search for genes that match user-defined hypotheses or models. SPINLONG is applicable to various experimental setups measuring several molecular markers in parallel. To demonstrate the SPINLONG approach, we analyzed ChIP-seq data reporting PolII, estrogen receptor α (ERα), H3K4me3 and H2A.Z occupancy at five time points in the MCF-7 breast cancer cell line after estradiol stimulus. We obtained 777 ERα early responsive genes and compared the biological functions of the genes having ERα binding within 20 kb of the transcription start site (TSS) to those of genes without such a binding site. Our results show that the non-genomic action of ERα via the MAPK pathway, instead of direct ERα binding, may be responsible for early cell responses to ERα activation. Our results also indicate that the ERα responsive genes triggered by the genomic pathway are transcribed faster than those without ERα binding sites. The survival analysis of the 777 ERα responsive genes with 150 primary breast cancer tumors and in two independent validation cohorts indicated the ATAD3B gene, which does not have an ERα binding site within 20 kb of its TSS, to be significantly associated with poor patient survival.

  4. A safe an easy method for building consensus HIV sequences from 454 massively parallel sequencing data.

    Science.gov (United States)

    Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

    2018-02-01

    To show how to generate a consensus sequence from massively parallel sequencing data obtained in routine HIV antiretroviral resistance studies that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
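
    To make the role of the threshold concrete, here is a minimal Python sketch of threshold-based consensus calling. It is not the Mesquite procedure, whose details the abstract does not give: it assumes equal-length, pre-aligned reads and adopts one common convention, in which every base whose frequency at a position reaches the threshold is kept and several surviving bases collapse to an IUPAC ambiguity code. All names are hypothetical.

```python
from collections import Counter

# IUPAC nucleotide codes keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(aligned_reads, threshold):
    """Build a consensus from equal-length aligned reads.

    A base survives at a position when its share of the unambiguous
    calls there is at least `threshold` (e.g. 0.10, 0.15 or 0.20).
    Positions with no unambiguous call are reported as N.
    """
    out = []
    for column in zip(*aligned_reads):
        counts = Counter(b for b in column if b in "ACGT")
        total = sum(counts.values())
        kept = frozenset(b for b, n in counts.items() if n / total >= threshold)
        out.append(IUPAC.get(kept, "N"))
    return "".join(out)


# Hypothetical pile-up: a minority variant (T) in 2 of 10 reads at position 2.
reads = ["ACGT"] * 8 + ["ATGT"] * 2
print(consensus(reads, 0.10))  # minority base retained as an ambiguity code
print(consensus(reads, 0.25))  # minority base filtered out
```

    Under this convention a higher threshold suppresses more minority variants, so the consensus moves closer to the majority sequence, which is consistent with the trend reported above of better agreement with Sanger sequences as the threshold rises.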

  5. Maturity onset diabetes of youth (MODY) in Turkish children: sequence analysis of 11 causative genes by next generation sequencing.

    Science.gov (United States)

    Ağladıoğlu, Sebahat Yılmaz; Aycan, Zehra; Çetinkaya, Semra; Baş, Veysel Nijat; Önder, Aşan; Peltek Kendirci, Havva Nur; Doğan, Haldun; Ceylaner, Serdar

    2016-04-01

    Maturity-onset diabetes of the youth (MODY) is a genetically and clinically heterogeneous group of diseases and is often misdiagnosed as type 1 or type 2 diabetes. The aim of this study is to investigate both novel and proven mutations of 11 MODY genes in Turkish children by using targeted next generation sequencing. A panel of 11 MODY genes was screened in 43 children with MODY diagnosed by clinical criteria. Studies of index cases were done with MISEQ-ILLUMINA, and family screenings and confirmation studies of mutations were done by Sanger sequencing. We identified 28 (65%) point mutations among 43 patients. Eighteen patients have GCK mutations, four have HNF1A, one has HNF4A, one has HNF1B, two have NEUROD1, one has PDX1 gene variations and one patient has both HNF1A and HNF4A heterozygous mutations. This is the first study including molecular studies of 11 MODY genes in Turkish children. GCK is the most frequent type of MODY in our study population. The very high frequency of novel mutations (42%) in our study population supports the view that, in heterogeneous disorders like MODY, sequence analysis provides rapid, cost-effective and accurate genetic diagnosis.

  6. Implementation of Cloud based next generation sequencing data analysis in a clinical laboratory.

    Science.gov (United States)

    Onsongo, Getiria; Erdmann, Jesse; Spears, Michael D; Chilton, John; Beckman, Kenneth B; Hauge, Adam; Yohe, Sophia; Schomaker, Matthew; Bower, Matthew; Silverstein, Kevin A T; Thyagarajan, Bharat

    2014-05-23

    The introduction of next generation sequencing (NGS) has revolutionized molecular diagnostics, though several challenges still limit the widespread adoption of NGS testing into clinical practice. One such difficulty is the development of a robust bioinformatics pipeline that can handle the volume of data generated by high-throughput sequencing in a cost-effective manner. Analysis of sequencing data typically requires a substantial level of computing power that is often cost-prohibitive to most clinical diagnostics laboratories. To address this challenge, our institution has developed a Galaxy-based data analysis pipeline which relies on a web-based, cloud-computing infrastructure to process NGS data and identify genetic variants. It provides the additional flexibility needed to control storage costs, resulting in a pipeline that is cost-effective on a per-sample basis. It does not require the usage of EBS disk to run a sample. We demonstrate the validation and feasibility of implementing this bioinformatics pipeline in a molecular diagnostics laboratory. Four samples were analyzed in duplicate pairs and showed 100% concordance in mutations identified. This pipeline is currently being used in the clinic, and all identified pathogenic variants are confirmed using Sanger sequencing, further validating the software.

  7. Next-generation sequencing identifies deregulation of microRNAs involved in both innate and adaptive immune response in ALK+ ALCL.

    Directory of Open Access Journals (Sweden)

    Julia Steinhilber

    Full Text Available Anaplastic large cell lymphoma (ALCL) is divided into two systemic diseases according to the expression of the anaplastic lymphoma kinase (ALK). We investigated the differential expression of miRNAs between ALK+ ALCL, ALK- ALCL cells and normal T-cells using next generation sequencing (NGS). In addition, a C/EBPβ-dependent miRNA profile was generated. The data were validated in primary ALCL cases. NGS identified 106 miRNAs significantly differentially expressed between ALK+ and ALK- ALCL and 228 between ALK+ ALCL and normal T-cells. We identified a signature of 56 miRNAs distinguishing ALK+ ALCL, ALK- ALCL and T-cells. The top candidates significantly differentially expressed between ALK+ and ALK- ALCL included 5 upregulated miRNAs: miR-340, miR-203, miR-135b, miR-182, miR-183; and 7 downregulated: miR-196b, miR-155, miR-146a, miR-424, miR-503, miR-424*, miR-542-3p. The miR-17-92 cluster was also upregulated in ALK+ cells. Additionally, we identified a signature of 3 miRNAs significantly regulated by the transcription factor C/EBPβ, which is specifically overexpressed in ALK+ ALCL, including the miR-181 family. Of interest, miR-181a, which regulates T-cell differentiation and modulates TCR signalling strength, was significantly downregulated in ALK+ ALCL cases. In summary, our data reveal a miRNA signature linking ALK+ ALCL to a deregulated immune response and may reflect the abnormal TCR antigen expression known in ALK+ ALCL.

  8. UGT1A1 (TA)n genotyping in sickle-cell disease: high resolution melting (HRM) curve analysis or direct sequencing, what is the best way?

    Science.gov (United States)

    Thomas, Vincent; Mazard, Blandine; Garcia, Caroline; Lacan, Philippe; Gagnieu, Marie-Claude; Joly, Philippe

    2013-09-23

    In 2010, Minucci et al. proposed a rapid, simple and cost-effective HRM method on the LightCycler 480® apparatus (Roche) for the determination of the 6/6, 6/7 and 7/7 genotypes of the (TA)n UGT1A1 promoter polymorphism. However, they did not study the n=5 and n=8 alleles, which can be quite frequent in sickle-cell disease patients. The aim of our study was to test this HRM protocol on all 10 possible (TA)n UGT1A1 genotypes (i.e. 5/5, 5/6, 5/7, 5/8, 6/6, 6/7, 6/8, 7/7, 7/8 and 8/8) by using our SCD cohort of patients. All genotypes could be unambiguously identified except 6/7 and 6/8, which give a similar HRM profile. For those two genotypes, differentiation requires either direct Sanger sequencing or a second PCR protocol followed by a 3% agarose gel migration. For the (TA)n UGT1A1 promoter genotyping of African patients, each laboratory has to decide between (i) direct Sanger sequencing of all patients and (ii) the HRM protocol for all patients followed by a complementary analysis to differentiate the 6/7 and 6/8 genotypes. © 2013. Published by Elsevier B.V. All rights reserved.
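
    The count of 10 genotypes follows from choosing unordered pairs, with repetition, from the four alleles n = 5, 6, 7 and 8: 4 × 5 / 2 = 10. The short Python sketch below enumerates them and marks the two that, according to the abstract, HRM alone cannot separate; the variable and function names are illustrative only.

```python
from itertools import combinations_with_replacement

TA_ALLELES = [5, 6, 7, 8]            # (TA)n repeat counts named in the abstract
HRM_AMBIGUOUS = {"6/7", "6/8"}       # reported to give a similar HRM profile

# Unordered allele pairs, homozygotes included.
genotypes = [f"{a}/{b}" for a, b in combinations_with_replacement(TA_ALLELES, 2)]


def needs_complementary_analysis(genotype):
    """True when the HRM call must be resolved by sequencing or a second PCR."""
    return genotype in HRM_AMBIGUOUS


print(len(genotypes))
print(genotypes)
print([g for g in genotypes if needs_complementary_analysis(g)])
```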

  9. Limitations of variable number of tandem repeat typing identified through whole genome sequencing of Mycobacterium avium subsp. paratuberculosis on a national and herd level.

    Science.gov (United States)

    Ahlstrom, Christina; Barkema, Herman W; Stevenson, Karen; Zadoks, Ruth N; Biek, Roman; Kao, Rowland; Trewby, Hannah; Haupstein, Deb; Kelton, David F; Fecteau, Gilles; Labrecque, Olivia; Keefe, Greg P; McKenna, Shawn L B; De Buck, Jeroen

    2015-03-08

    Mycobacterium avium subsp. paratuberculosis (MAP), the causative bacterium of Johne's disease in dairy cattle, is widespread in the Canadian dairy industry and has significant economic and animal welfare implications. An understanding of the population dynamics of MAP can be used to identify introduction events, improve control efforts and target transmission pathways, although this requires an adequate understanding of MAP diversity and distribution between herds and across the country. Whole genome sequencing (WGS) offers a detailed assessment of the SNP-level diversity and genetic relationship of isolates, whereas several molecular typing techniques used to investigate the molecular epidemiology of MAP, such as variable number of tandem repeat (VNTR) typing, target relatively unstable repetitive elements in the genome that may be too unpredictable to draw accurate conclusions. The objective of this study was to evaluate the diversity of bovine MAP isolates in Canadian dairy herds using WGS and then determine if VNTR typing can distinguish truly related and unrelated isolates. Phylogenetic analysis based on 3,039 SNPs identified through WGS of 124 MAP isolates identified eight genetically distinct subtypes in dairy herds from seven Canadian provinces, with the dominant type including over 80% of MAP isolates. VNTR typing of 527 MAP isolates identified 12 types, including "bison type" isolates, from seven different herds. At a national level, MAP isolates differed from each other by 1-2 to 239-240 SNPs, regardless of whether they belonged to the same or different VNTR types. A herd-level analysis of MAP isolates demonstrated that VNTR typing may both over-estimate and under-estimate the relatedness of MAP isolates found within a single herd. The presence of multiple MAP subtypes in Canada suggests multiple introductions into the country including what has now become one dominant type, an important finding for Johne's disease control. VNTR typing often failed to

  10. Implementation of Targeted Next Generation Sequencing in Clinical Diagnostics

    DEFF Research Database (Denmark)

    Larsen, Martin Jakob; Burton, Mark; Thomassen, Mads

    Accurate mutation detection is essential in clinical genetic diagnostics of monogenic hereditary diseases. Targeted next generation sequencing (NGS) provides a promising and cost-effective alternative to Sanger sequencing and MLPA analysis currently used in most diagnostic laboratories. One...... of mutation positive controls previously characterized by Sanger/MLPA analysis. Agilent SureSelect Target-Enrichment kits were used for capturing a set of genes associated with hereditary breast and ovarian cancer syndrome and a compilation of genes involved in multiple rare single gene disorders......, respectively. For diagnostics, the sequencing coverage is essential, wherefore a minimum coverage of 30x per nucleotide in the coding regions was used as our primary quality criterion. For the majority of the included genes, we obtained adequate gene coverage, in which we were able to detect 100% of the known...

  11. ChimericSeq: An open-source, user-friendly interface for analyzing NGS data to identify and characterize viral-host chimeric sequences

    Science.gov (United States)

    Shieh, Fwu-Shan; Jongeneel, Patrick; Steffen, Jamin D.; Lin, Selena; Jain, Surbhi; Song, Wei

    2017-01-01

    Identification of viral integration sites has been important in understanding the pathogenesis and progression of diseases associated with particular viral infections. The advent of next-generation sequencing (NGS) has enabled researchers to understand the impact that viral integration has on the host, such as tumorigenesis. Current computational methods to analyze NGS data of virus-host junction sites have been limited in terms of their accessibility to a broad user base. In this study, we developed a software application (named ChimericSeq), that is the first program of its kind to offer a graphical user interface, compatibility with both Windows and Mac operating systems, and optimized for effectively identifying and annotating virus-host chimeric reads within NGS data. In addition, ChimericSeq’s pipeline implements custom filtering to remove artifacts and detect reads with quantitative analytical reporting to provide functional significance to discovered integration sites. The improved accessibility of ChimericSeq through a GUI interface in both Windows and Mac has potential to expand NGS analytical support to a broader spectrum of the scientific community. PMID:28829778

  12. A Sequence in the loop domain of hepatitis C virus E2 protein identified in silico as crucial for the selective binding to human CD81.

    Directory of Open Access Journals (Sweden)

    Chun-Chun Chang

    Full Text Available Hepatitis C virus (HCV) is a species-specific pathogenic virus that infects only humans and chimpanzees. Previous studies have indicated that interactions between the HCV E2 protein and CD81 on host cells are required for HCV infection. To determine the crucial factors for species-specific interactions at the molecular level, this study employed in silico molecular docking involving molecular dynamic simulations of the binding of HCV E2 onto human and rat CD81s. In vitro experiments including surface plasmon resonance measurements and cellular binding assays were applied for simple validations of the in silico results. The in silico studies identified two binding regions on the HCV E2 loop domain, namely E2-site1 and E2-site2, as being crucial for the interactions with CD81s, with the E2-site2 as the determinant factor for human-specific binding. Free energy calculations indicated that the E2/CD81 binding process might follow a two-step model involving (i) the electrostatic interaction-driven initial binding of human-specific E2-site2, followed by (ii) changes in the E2 orientation to facilitate the hydrophobic and van der Waals interaction-driven binding of E2-site1. The sequence of the human-specific, stronger-binding E2-site2 could serve as a candidate template for the future development of HCV-inhibiting peptide drugs.

  13. ChimericSeq: An open-source, user-friendly interface for analyzing NGS data to identify and characterize viral-host chimeric sequences.

    Directory of Open Access Journals (Sweden)

    Fwu-Shan Shieh

    Full Text Available Identification of viral integration sites has been important in understanding the pathogenesis and progression of diseases associated with particular viral infections. The advent of next-generation sequencing (NGS) has enabled researchers to understand the impact that viral integration has on the host, such as tumorigenesis. Current computational methods to analyze NGS data of virus-host junction sites have been limited in terms of their accessibility to a broad user base. In this study, we developed a software application (named ChimericSeq), that is the first program of its kind to offer a graphical user interface, compatibility with both Windows and Mac operating systems, and optimized for effectively identifying and annotating virus-host chimeric reads within NGS data. In addition, ChimericSeq's pipeline implements custom filtering to remove artifacts and detect reads with quantitative analytical reporting to provide functional significance to discovered integration sites. The improved accessibility of ChimericSeq through a GUI interface in both Windows and Mac has potential to expand NGS analytical support to a broader spectrum of the scientific community.

  14. Spatially resolved RNA-sequencing of the embryonic heart identifies a role for Wnt/β-catenin signaling in autonomic control of heart rate

    Science.gov (United States)

    Burkhard, Silja Barbara

    2018-01-01

    Development of specialized cells and structures in the heart is regulated by spatially restricted molecular pathways. Disruptions in these pathways can cause severe congenital cardiac malformations or functional defects. To better understand these pathways and how they regulate cardiac development we used tomo-seq, combining high-throughput RNA-sequencing with tissue-sectioning, to establish a genome-wide expression dataset with high spatial resolution for the developing zebrafish heart. Analysis of the dataset revealed over 1100 genes differentially expressed in sub-compartments. Pacemaker cells in the sinoatrial region induce heart contractions, but little is known about the mechanisms underlying their development. Using our transcriptome map, we identified spatially restricted Wnt/β-catenin signaling activity in pacemaker cells, which was controlled by Islet-1 activity. Moreover, Wnt/β-catenin signaling controls heart rate by regulating pacemaker cellular response to parasympathetic stimuli. Thus, this high-resolution transcriptome map incorporating all cell types in the embryonic heart can expose spatially restricted molecular pathways critical for specific cardiac functions. PMID:29400650

  15. POU4F3 mutation screening in Japanese hearing loss patients: Massively parallel DNA sequencing-based analysis identified novel variants associated with autosomal dominant hearing loss.

    Directory of Open Access Journals (Sweden)

    Tomohiro Kitano

    Full Text Available A variant in a transcription factor gene, POU4F3, is responsible for autosomal dominant nonsyndromic hereditary hearing loss, DFNA15. To date, 14 variants, including a whole deletion of POU4F3, have been reported to cause HL in various ethnic groups. In the present study, genetic screening for POU4F3 variants was carried out for a large series of Japanese hearing loss (HL) patients to clarify the prevalence and clinical characteristics of DFNA15 in the Japanese population. Massively parallel DNA sequencing of 68 target candidate genes was utilized in 2,549 unrelated Japanese HL patients (probands) to identify genomic variations responsible for HL. The detailed clinical features in patients with POU4F3 variants were collected from medical charts and analyzed. Twelve novel likely pathogenic POU4F3 variants (six missense variants, three frameshift variants, and three nonsense variants) were successfully identified in 15 probands (2.5%) among 602 families exhibiting autosomal dominant HL, whereas no variants were detected in the other 1,947 probands with autosomal recessive HL or HL of unknown inheritance pattern. To obtain the audiovestibular configuration of the patients harboring POU4F3 variants, we collected audiograms and vestibular symptoms of the probands and their affected family members. Audiovestibular phenotypes in a total of 24 individuals from the 15 families possessing variants were characterized by progressive HL, with a large variation in the onset age and severity with or without vestibular symptoms observed. Pure-tone audiograms indicated the most prevalent configuration as mid-frequency HL type followed by high-frequency HL type, with asymmetry observed in approximately 20% of affected individuals. Analysis of the relationship between age and pure-tone average suggested that individuals with truncating variants showed earlier onset and slower progression of HL than did those with non-truncating variants. The present study showed that variants

  16. Targeted sequencing identifies associations between IL7R-JAK mutations and epigenetic modulators in T-cell acute lymphoblastic leukemia

    Science.gov (United States)

    Vicente, Carmen; Schwab, Claire; Broux, Michaël; Geerdens, Ellen; Degryse, Sandrine; Demeyer, Sofie; Lahortiga, Idoya; Elliott, Alannah; Chilton, Lucy; La Starza, Roberta; Mecucci, Cristina; Vandenberghe, Peter; Goulden, Nicholas; Vora, Ajay; Moorman, Anthony V.; Soulier, Jean; Harrison, Christine J.; Clappier, Emmanuelle; Cools, Jan

    2015-01-01

    T-cell acute lymphoblastic leukemia is caused by the accumulation of multiple oncogenic lesions, including chromosomal rearrangements and mutations. To determine the frequency and co-occurrence of mutations in T-cell acute lymphoblastic leukemia, we performed targeted re-sequencing of 115 genes across 155 diagnostic samples (44 adult and 111 childhood cases). NOTCH1 and CDKN2A/B were mutated/deleted in more than half of the cases, while an additional 37 genes were mutated/deleted in 4% to 20% of cases. We found that IL7R-JAK pathway genes were mutated in 27.7% of cases, with JAK3 mutations being the most frequent event in this group. Copy number variations were also detected, including deletions of CREBBP or CTCF and duplication of MYB. FLT3 mutations were rare, but a novel extracellular mutation in FLT3 was detected and confirmed to be transforming. Furthermore, we identified complex patterns of pairwise associations, including a significant association between mutations in IL7R-JAK genes and epigenetic regulators (WT1, PRC2, PHF6). Our analyses showed that IL7R-JAK genetic lesions did not confer adverse prognosis in T-cell acute lymphoblastic leukemia cases enrolled in the UK ALL2003 trial. Overall, these results identify interconnections between the T-cell acute lymphoblastic leukemia genome and disease biology, and suggest a potential clinical application for JAK inhibitors in a significant proportion of patients with T-cell acute lymphoblastic leukemia. PMID:26206799

  17. High-Throughput Sequencing Identifies MicroRNAs from Posterior Intestine of Loach (Misgurnus anguillicaudatus) and Their Response to Intestinal Air-Breathing Inhibition.

    Directory of Open Access Journals (Sweden)

    Songqian Huang

    Full Text Available MicroRNAs (miRNAs) exert important roles in animal growth, immunity, and development, and regulate gene expression at the post-transcriptional level. The diversity of miRNAs and their roles in accessory air-breathing organs (ABOs) of fish remain unknown. In this work, we used high-throughput sequencing to identify known and novel miRNAs from the posterior intestine, an important ABO, in loach (Misgurnus anguillicaudatus) under normal and intestinal air-breathing inhibited conditions. A total of 204 known and 84 novel miRNAs were identified, while 47 miRNAs were differentially expressed between the two small RNA libraries (i.e. between the normal and intestinal air-breathing inhibited groups). Potential miRNA target genes were predicted by combining our transcriptome data of the posterior intestine of the loach under the same conditions, and then annotated using COG, GO, KEGG, Swissprot and Nr databases. The regulatory networks of miRNAs and their target genes were analyzed. The abundances of nine known miRNAs were validated by qRT-PCR. The relative expression profiles of six known miRNAs and their eight corresponding target genes, and two novel potential miRNAs were also detected. Histological characteristics of the posterior intestines in both normal and air-breathing inhibited groups were further analyzed. This study contributes to our understanding of the functions and molecular regulatory mechanisms of miRNAs in accessory air-breathing organs of fish.

  18. Exome sequencing in schizophrenic patients with high levels of homozygosity identifies novel and extremely rare mutations in the GABA/glutamatergic pathways.

    Directory of Open Access Journals (Sweden)

    Edoardo Giacopuzzi

    Full Text Available Inbreeding is a known risk factor for recessive Mendelian diseases and previous studies have suggested that it could also play a role in complex disorders, such as psychiatric diseases. Recent inbreeding results in the presence of long runs of homozygosity (ROHs) along the genome, which are also defined as autozygosity regions. Genetic variants in these regions have two alleles that are identical by descent, thus increasing the odds of bearing rare recessive deleterious mutations due to a homozygous state. A recent study showed a suggestive enrichment of long ROHs in schizophrenic patients, suggesting that recent inbreeding could play a role in the disease. To better understand the impact of autozygosity on schizophrenia risk, we selected, from a cohort of 180 Italian patients, seven subjects with extremely high numbers of large ROHs that were likely due to recent inbreeding and characterized the mutational landscape within their ROHs using Whole Exome Sequencing and gene set enrichment analysis. We identified a significant overlap (17%; empirical p-value = 0.0171) between genes inside ROHs affected by low frequency functional homozygous variants (107 genes) and the group of most promising candidate genes mutated in schizophrenia. Moreover, in four patients, we identified novel and extremely rare damaging mutations in the genes involved in neurodevelopment (MEGF8) and in GABA/glutamatergic synaptic transmission (GAD1, FMN1, ANO2). These results provide insights into the contribution of rare recessive mutations and inbreeding as risk factors for schizophrenia. ROHs that are likely due to recent inbreeding harbor a combination of predisposing low-frequency variants and extremely rare variants that have a high impact on pivotal biological pathways implicated in the disease. In addition, this study confirms that focusing on patients with high levels of homozygosity could be a useful prioritization strategy for discovering new high-impact mutations in

  19. Exome sequencing in schizophrenic patients with high levels of homozygosity identifies novel and extremely rare mutations in the GABA/glutamatergic pathways.

    Science.gov (United States)

    Giacopuzzi, Edoardo; Gennarelli, Massimo; Minelli, Alessandra; Gardella, Rita; Valsecchi, Paolo; Traversa, Michele; Bonvicini, Cristian; Vita, Antonio; Sacchetti, Emilio; Magri, Chiara

    2017-01-01

    Inbreeding is a known risk factor for recessive Mendelian diseases and previous studies have suggested that it could also play a role in complex disorders, such as psychiatric diseases. Recent inbreeding results in the presence of long runs of homozygosity (ROHs) along the genome, which are also defined as autozygosity regions. Genetic variants in these regions have two alleles that are identical by descent, thus increasing the odds of bearing rare recessive deleterious mutations due to a homozygous state. A recent study showed a suggestive enrichment of long ROHs in schizophrenic patients, suggesting that recent inbreeding could play a role in the disease. To better understand the impact of autozygosity on schizophrenia risk, we selected, from a cohort of 180 Italian patients, seven subjects with extremely high numbers of large ROHs that were likely due to recent inbreeding and characterized the mutational landscape within their ROHs using Whole Exome Sequencing and gene set enrichment analysis. We identified a significant overlap (17%; empirical p-value = 0.0171) between genes inside ROHs affected by low frequency functional homozygous variants (107 genes) and the group of most promising candidate genes mutated in schizophrenia. Moreover, in four patients, we identified novel and extremely rare damaging mutations in the genes involved in neurodevelopment (MEGF8) and in GABA/glutamatergic synaptic transmission (GAD1, FMN1, ANO2). These results provide insights into the contribution of rare recessive mutations and inbreeding as risk factors for schizophrenia. ROHs that are likely due to recent inbreeding harbor a combination of predisposing low-frequency variants and extremely rare variants that have a high impact on pivotal biological pathways implicated in the disease. In addition, this study confirms that focusing on patients with high levels of homozygosity could be a useful prioritization strategy for discovering new high-impact mutations in genetically

  20. Analyses of expressed sequence tags from the maize foliar pathogen Cercospora zeae-maydis identify novel genes expressed during vegetative, infectious, and reproductive growth

    Directory of Open Access Journals (Sweden)

    Kema Gert HJ

    2008-11-01

    Full Text Available Abstract Background The ascomycete fungus Cercospora zeae-maydis is an aggressive foliar pathogen of maize that causes substantial l