WorldWideScience

Sample records for array-based whole-genome survey

  1. Whole genome sequencing to complement tuberculosis drug resistance surveys in Uganda

    Science.gov (United States)

    Ssengooba, Willy; Meehan, Conor J.; Lukoye, Deus; Kasule, George William; Musisi, Kenneth; Joloba, Moses L.; Cobelens, Frank G.; de Jong, Bouke C.

    2016-01-01

    Understanding the circulating Mycobacterium tuberculosis resistance mutations is vital for better TB control strategies, especially to inform a new MDR-TB treatment programme. We complemented the phenotypic drug susceptibility testing (DST) based drug resistance surveys (DRSs) conducted in Uganda between 2008 and 2011 with Whole Genome Sequencing (WGS) of 90 Mycobacterium tuberculosis isolates phenotypically resistant to rifampicin and/or isoniazid to better understand the extent of drug resistance. A total of 31 (34.4%) patients had MDR-TB, 5 (5.6%) mono-rifampicin resistance and 54 (60.0%) mono-isoniazid resistance by phenotypic DST. Pyrazinamide resistance mutations were identified in 32.3% of the MDR-TB patients. Resistance to injectable agents was detected in 4/90 (4.4%), and none to fluoroquinolones or novel drugs. Compensatory mutations in rpoC were identified in two patients. The sensitivity and specificity of drug resistance mutations compared to phenotypic DST were for rpoB 88.6% and 98.1%, katG 60.0% and 100%, fabG1 16.5% and 100%, katG and/or fabG1 71.8% and 100%, embCAB 63.0% and 82.5%, rrs 11.4% and 100%, rpsL 20.5% and 95.7% and rrs and/or rpsL 31.8% and 95.7%. Phylogenetic analysis showed dispersed MDR-TB isolates, with only one cluster of three Beijing family isolates from South West Uganda. Among tuberculosis patients in Uganda, resistance beyond first-line drugs as well as compensatory mutations remain low, and MDR-TB isolates did not arise from a dominant clone. Our findings show the potential use of sequencing for complementing DRSs or surveillance in this setting, with good specificity compared to phenotypic DST. The reported high confidence mutations can be included in molecular assays, and population-based studies can track transmission of MDR-TB including the Beijing family strains in the South West of the country. PMID:26917365
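
The sensitivity and specificity figures in this abstract compare a genotypic call (resistance mutation present or absent) with the phenotypic DST result for each isolate. A minimal Python sketch of that calculation is given here; the function name and the toy input are illustrative assumptions, not the authors' code or data.

```python
def sensitivity_specificity(genotype_resistant, phenotype_resistant):
    """Return (sensitivity, specificity) of genotypic calls against phenotypic DST.

    Both arguments are equal-length sequences of booleans, one entry per isolate.
    Phenotypic DST is treated as the reference standard.
    """
    if len(genotype_resistant) != len(phenotype_resistant):
        raise ValueError("inputs must have one entry per isolate")

    tp = fp = tn = fn = 0
    for geno, pheno in zip(genotype_resistant, phenotype_resistant):
        if pheno and geno:
            tp += 1          # resistant by DST, mutation found
        elif pheno and not geno:
            fn += 1          # resistant by DST, no mutation found
        elif not pheno and geno:
            fp += 1          # susceptible by DST, mutation found
        else:
            tn += 1          # susceptible by DST, no mutation found

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical): five isolates
genotype = [True, True, False, False, True]
phenotype = [True, True, True, False, False]
sens, spec = sensitivity_specificity(genotype, phenotype)
```

A low sensitivity with 100% specificity, as reported for fabG1, corresponds to many false negatives and no false positives in this tally.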

  2. Whole genome sequencing to complement tuberculosis drug resistance surveys in Uganda.

    Science.gov (United States)

    Ssengooba, Willy; Meehan, Conor J; Lukoye, Deus; Kasule, George William; Musisi, Kenneth; Joloba, Moses L; Cobelens, Frank G; de Jong, Bouke C

    2016-06-01

    Understanding the circulating Mycobacterium tuberculosis resistance mutations is vital for better TB control strategies, especially to inform a new MDR-TB treatment programme. We complemented the phenotypic drug susceptibility testing (DST) based drug resistance surveys (DRSs) conducted in Uganda between 2008 and 2011 with Whole Genome Sequencing (WGS) of 90 Mycobacterium tuberculosis isolates phenotypically resistant to rifampicin and/or isoniazid to better understand the extent of drug resistance. A total of 31 (34.4%) patients had MDR-TB, 5 (5.6%) mono-rifampicin resistance and 54 (60.0%) mono-isoniazid resistance by phenotypic DST. Pyrazinamide resistance mutations were identified in 32.3% of the MDR-TB patients. Resistance to injectable agents was detected in 4/90 (4.4%), and none to fluoroquinolones or novel drugs. Compensatory mutations in rpoC were identified in two patients. The sensitivity and specificity of drug resistance mutations compared to phenotypic DST were for rpoB 88.6% and 98.1%, katG 60.0% and 100%, fabG1 16.5% and 100%, katG and/or fabG1 71.8% and 100%, embCAB 63.0% and 82.5%, rrs 11.4% and 100%, rpsL 20.5% and 95.7% and rrs and/or rpsL 31.8% and 95.7%. Phylogenetic analysis showed dispersed MDR-TB isolates, with only one cluster of three Beijing family isolates from South West Uganda. Among tuberculosis patients in Uganda, resistance beyond first-line drugs as well as compensatory mutations remain low, and MDR-TB isolates did not arise from a dominant clone. Our findings show the potential use of sequencing for complementing DRSs or surveillance in this setting, with good specificity compared to phenotypic DST. The reported high confidence mutations can be included in molecular assays, and population-based studies can track transmission of MDR-TB including the Beijing family strains in the South West of the country. PMID:26917365

  3. Whole Genome Sequencing

    Science.gov (United States)

    ... What is whole genome sequencing? Whole genome sequencing is the mapping out ...

  4. Whole Genome Selection

    Science.gov (United States)

    Whole genome selection (WGS) is an approach to using DNA markers that are distributed throughout the entire genome. Genes affecting most economically important traits are distributed throughout the genome and there are relatively few that have large effects with many more genes with progressively sm...

  5. Proficiency testing for bacterial whole genome sequencing: an end-user survey of current capabilities, requirements and priorities

    DEFF Research Database (Denmark)

    Moran-Gilad, Jacob; Sintchenko, Vitali; Karlsmose Pedersen, Susanne;

    2015-01-01

    ... costs. The priority pathogens reported by respondents reflected the key drivers for NGS use (high burden disease and 'high profile' pathogens). The performance of and participation in PT was perceived as important by most respondents. The wide range of sequencing and bioinformatics practices reported by end-users highlights the importance of standardisation and harmonisation of NGS in public health and underpins the use of PT as a means to assuring quality. The findings of this survey will guide the design of the GMI PT program in relation to the spectrum of pathogens included, testing frequency and volume as well as technical requirements. The PT program for external quality assurance will evolve and inform the introduction of NGS into clinical and public health microbiology practice in the post-genomic era.

  6. Proficiency Testing for Bacterial Whole Genome Sequencing: An End-User Survey of Current Capabilities, Requirements and Priorities

    DEFF Research Database (Denmark)

    Moran-Gilad, Jacob; Sintchenko, Vitali; Karlsmose Pedersen, Susanne;

    2015-01-01

    ... range of costs. The priority pathogens reported by respondents reflected the key drivers for NGS use (high burden disease and 'high profile' pathogens). The performance of and participation in PT was perceived as important by most respondents. The wide range of sequencing and bioinformatics practices reported by end-users highlights the importance of standardisation and harmonisation of NGS in public health and underpins the use of PT as a means to assuring quality. The findings of this survey will guide the design of the GMI PT program in relation to the spectrum of pathogens included, testing frequency and volume as well as technical requirements. The PT program for external quality assurance will evolve and inform the introduction of NGS into clinical and public health microbiology practice in the post-genomic era.

  7. Bioinformatics for whole-genome shotgun sequencing of microbial communities.

    Directory of Open Access Journals (Sweden)

    Kevin Chen

    2005-07-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.

  8. Interpreting Whole-Genome Marker Data

    OpenAIRE

    Weir, Bruce S.

    2013-01-01

    The challenges of whole-genome data, when genotypes are available from hundreds of thousands of genetic markers, are explored for four topics in statistical genetics: Hardy-Weinberg testing, estimating linkage disequilibrium from unphased genotypic data, association mapping and characterizing population structure.
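
Hardy-Weinberg testing, the first topic listed in this abstract, can be illustrated for a single biallelic marker with the textbook chi-square goodness-of-fit statistic. The sketch below is an assumption-laden illustration (the function name and counts are hypothetical), not the method developed in the cited article, which may rely on exact tests instead.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg proportions at one biallelic marker.

    n_aa, n_ab, n_bb are observed genotype counts (AA, AB, BB).
    The statistic has 1 degree of freedom; 3.84 is the 5% critical value.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    p = (2 * n_aa + n_ab) / (2 * n)   # estimated frequency of allele A
    q = 1.0 - p                       # estimated frequency of allele B

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    # Skip classes with zero expectation (monomorphic markers)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts showing a heterozygote deficit
statistic = hardy_weinberg_chi2(30, 40, 30)
```

With hundreds of thousands of markers, a per-marker threshold such as 3.84 would need a multiple-testing correction, which is one reason whole-genome data make this test harder to interpret.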

  9. Whole genome linkage disequilibrium maps in cattle

    Science.gov (United States)

    Bovine whole genome linkage disequilibrium maps were constructed for eight breeds of cattle. These data provide fundamental information concerning bovine genome organization which will allow the design of studies to associate genetic variation with economically important traits and also provides bac...

  10. Whole genome sequencing in drug discovery research: a one fits all solution?

    OpenAIRE

    Marc Sultan

    2015-01-01

    With the recent availability of Illumina's HiSeq X Ten sequencing platform, the cost of whole genome sequencing (WGS) has dropped to nearly $1,000 per genome. The affordability of WGS now has the potential to replace other genotyping platforms such as whole exome sequencing (WES) and array-based genotyping for (smaller) clinical study cohorts. In a recent pilot study, we compared the performance and genotyping quality of the HiSeq X WGS approach against WES and array-based genotyping with r...

  11. Whole genome phylogenies for multiple Drosophila species

    Directory of Open Access Journals (Sweden)

    Seetharam Arun

    2012-12-01

    Background: Reconstructing the evolutionary history of organisms using traditional phylogenetic methods may suffer from inaccurate sequence alignment. An alternative approach, particularly effective when whole genome sequences are available, is to employ methods that don’t use explicit sequence alignments. We extend a novel phylogenetic method based on Singular Value Decomposition (SVD) to reconstruct the phylogeny of 12 sequenced Drosophila species. SVD analysis provides accurate comparisons for a high fraction of sequences within whole genomes without the prior identification of orthologs or homologous sites. With this method all protein sequences are converted to peptide frequency vectors within a matrix that is decomposed to provide simplified vector representations for each protein of the genome in a reduced dimensional space. These vectors are summed together to provide a vector representation for each species, and the angle between these vectors provides distance measures that are used to construct species trees. Results: An unfiltered whole genome analysis (193,622 predicted proteins) strongly supports the currently accepted phylogeny for 12 Drosophila species at higher dimensions except for the generally accepted but difficult to discern sister relationship between D. erecta and D. yakuba. Also, in accordance with previous studies, many sequences appear to support alternative phylogenies. In this case, we observed grouping of D. erecta with D. sechellia when approximately 55% to 95% of the proteins were removed using a filter based on projection values or by reducing resolution by using fewer dimensions. Similar results were obtained when just the melanogaster subgroup was analyzed. Conclusions: These results indicate that using our novel phylogenetic method, it is possible to consult and interpret all predicted protein sequences within multiple whole genomes to produce accurate phylogenetic estimations of relatedness between ...
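
Part of the pipeline described in this abstract, turning proteins into peptide frequency vectors, summing them per species, and using the angle between species vectors as a distance, can be sketched in plain Python. This simplified sketch deliberately omits the SVD dimension-reduction step, so it is not the authors' method; the peptide length `k=2`, the function names, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def peptide_counts(protein, k=2):
    """Count overlapping peptides of length k in one protein sequence."""
    return Counter(protein[i:i + k] for i in range(len(protein) - k + 1))


def species_vector(proteins, k=2):
    """Sum the peptide count vectors of all proteins of one species."""
    total = Counter()
    for protein in proteins:
        total.update(peptide_counts(protein, k))
    return total


def angular_distance(u, v):
    """Angle in radians between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cannot compare an empty vector")
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding
    return math.acos(cosine)


# Hypothetical toy proteomes
species_a = species_vector(["MKVLA", "MKVIA"])
species_b = species_vector(["MKVLA", "MRVLA"])
distance = angular_distance(species_a, species_b)
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building method; in the published approach the vectors are first projected into a reduced space by SVD.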

  12. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
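
The idea of grouping genomes into species-level clusters by applying hard cut-offs to pairwise gANI and AF values can be sketched as follows. This is a simple single-linkage illustration using a union-find structure, not the clustering procedure used in the cited work; the function name is hypothetical, and the cut-off values in the example are placeholders, since the abstract does not state the actual thresholds.

```python
def cluster_by_cutoffs(genomes, pairs, min_gani, min_af):
    """Group genomes whose pairwise gANI and AF both meet the cut-offs.

    genomes: iterable of genome identifiers.
    pairs: dict mapping (genome_a, genome_b) -> (gani, af).
    Returns a sorted list of sorted clusters (single-linkage).
    """
    parent = {g: g for g in genomes}

    def find(x):
        # Find the cluster representative, compressing the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), (gani, af) in pairs.items():
        if gani >= min_gani and af >= min_af:
            parent[find(a)] = find(b)   # merge the two clusters

    clusters = {}
    for g in parent:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise values; the cut-offs below are placeholders
pairs = {
    ("A", "B"): (98.0, 0.90),
    ("B", "C"): (97.0, 0.80),
    ("C", "D"): (90.0, 0.70),
    ("A", "D"): (99.0, 0.30),   # high identity but low alignment fraction
}
clusters = cluster_by_cutoffs(["A", "B", "C", "D"], pairs, min_gani=96.0, min_af=0.6)
```

Requiring both values guards against pairs such as A and D in the example, where a high identity is measured over only a small aligned fraction of the genomes.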

  13. Strategies and tools for whole genome alignments

    Energy Technology Data Exchange (ETDEWEB)

    Couronne, Olivier; Poliakov, Alexander; Bray, Nicolas; Ishkhanov, Tigran; Ryaboy, Dmitriy; Rubin, Edward; Pachter, Lior; Dubchak, Inna

    2002-11-25

    The availability of the assembled mouse genome makes possible, for the first time, an alignment and comparison of two large vertebrate genomes. We have investigated different strategies of alignment for the subsequent analysis of conservation of genomes that are effective for different quality assemblies. These strategies were applied to the comparison of the working draft of the human genome with the Mouse Genome Sequencing Consortium assembly, as well as other intermediate mouse assemblies. Our methods are fast and the resulting alignments exhibit a high degree of sensitivity, covering more than 90 percent of known coding exons in the human genome. We have obtained such coverage while preserving specificity. With a view towards the end user, we have developed a suite of tools and websites for automatically aligning, and subsequently browsing and working with whole genome comparisons. We describe the use of these tools to identify conserved non-coding regions between the human and mouse genomes, some of which have not been identified by other methods.

  14. Whole Genome Epidemiological Typing of Escherichia coli

    DEFF Research Database (Denmark)

    Kaas, Rolf Sommer

    ... is in general expensive and to some extent unreliable. Next generation sequencing has quickly become a tool widely available and has enabled even smaller laboratories to do whole genome sequencing (WGS). Having the entire genome available provides the opportunity to create the ultimate typing method. This PhD thesis attempts to take the first steps toward such a method. In Kaas I all publicly available sequenced E. coli genomes (186) are analyzed. 1,702 core genes were found in all genomes. 3,051 genes were found in 95% of the genomes. The pan genome was found to consist of 16,373 genes. The overall phylogeny was inferred from the core genome and also set into context of the Escherichia genus. The variance within each gene cluster was calculated in order to compare the variance between genes and possibly identify typing targets for further study. The variance scores calculated were also used to compare the three...
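
The core-genome and pan-genome counts quoted in this record (genes found in all genomes, genes found in 95% of genomes, and the union of all genes) reduce to simple set arithmetic once each genome is represented as a set of gene-cluster identifiers. The sketch below assumes that clustering has already been done; the function name and toy data are hypothetical and do not reproduce the thesis pipeline.

```python
from collections import Counter


def core_and_pan_genome(genomes, core_fraction=1.0):
    """Return (core, pan) gene sets.

    genomes: dict mapping genome name -> iterable of gene-cluster identifiers.
    core_fraction: minimum fraction of genomes that must carry a gene for it
    to count as core (1.0 = strict core, 0.95 = present in 95% of genomes).
    """
    if not genomes:
        raise ValueError("no genomes supplied")

    presence = Counter()
    for genes in genomes.values():
        presence.update(set(genes))      # count each gene once per genome

    threshold = core_fraction * len(genomes)
    pan = set(presence)
    core = {gene for gene, count in presence.items() if count >= threshold}
    return core, pan


# Hypothetical toy input: three genomes, four gene clusters
genomes = {
    "genome1": {"geneA", "geneB", "geneC"},
    "genome2": {"geneA", "geneB"},
    "genome3": {"geneA", "geneD"},
}
strict_core, pan = core_and_pan_genome(genomes)
soft_core, _ = core_and_pan_genome(genomes, core_fraction=0.6)
```

Relaxing `core_fraction` enlarges the core set, which mirrors the jump from 1,702 to 3,051 genes reported above when the requirement drops from all genomes to 95% of them.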

  15. Whole Genome Epidemiological Typing of Salmonella

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas

    ... Technological advances and effective price in high throughput genome sequencing are making whole genome sequencing (WGS) available as a routine tool for bacterial typing. In typing of Salmonella, especially sub-typing within the same serotype or even the same clone, the genetic variation of the target genes being used for typing is crucial for successful discrimination. The core genes, that is, the genes that are conserved in all members of a genus or species, are potentially good candidates for investigating genomic variation in phylogeny and epidemiology. A total of 2,882 core genes have been observed among 73 available Salmonella enterica genomes (accessed in April 2011). A consensus tree based on variation of the core genes gives better resolution than 16S rRNA and MLST, which rarely provide separation between closely related strains. The performance of the pan-genome tree which is based on the presence...

  16. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby;

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome...... association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function....

  17. Whole genome analysis of a Vietnamese trio

    Indian Academy of Sciences (India)

    Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh

    2015-03-01

    We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with length ≥ 300 bp. There were 235 contigs from the child genome of which 199 (84.7%) were significantly matched with at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified from our study demonstrated the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.
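
Filtering trio variants by Mendelian inheritance, as mentioned in this abstract, amounts to checking that the child's two alleles can be explained by one allele from each parent. The sketch below covers only autosomal diploid genotypes and ignores de novo mutations and genotyping error; the function name and the genotype encoding (a pair of allele strings) are assumptions for illustration, not the authors' pipeline.

```python
def is_mendelian_consistent(child, father, mother):
    """Check whether a child's genotype is compatible with both parents.

    Each genotype is a pair of alleles, e.g. ("A", "G"), at an autosomal site.
    The child must be able to inherit one allele from each parent.
    """
    first, second = child
    return (first in father and second in mother) or (
        second in father and first in mother
    )


# Hypothetical genotypes at three sites: (child, father, mother)
sites = [
    (("A", "G"), ("A", "A"), ("G", "G")),   # consistent
    (("G", "G"), ("A", "A"), ("A", "G")),   # inconsistent: father has no G
    (("A", "A"), ("A", "G"), ("A", "G")),   # consistent
]
consistent_sites = [site for site in sites if is_mendelian_consistent(*site)]
```

Sites that fail such a check in real data usually indicate sequencing or calling errors, with a small remainder being genuine de novo mutations.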

  18. Small Sample Whole-Genome Amplification

    Energy Technology Data Exchange (ETDEWEB)

    Hara, C A; Nguyen, C P; Wheeler, E K; Sorensen, K J; Arroyo, E S; Vrankovich, G P; Christian, A T

    2005-09-20

    Many challenges arise when trying to amplify and analyze human samples collected in the field due to limitations in sample quantity, and contamination of the starting material. Tests such as DNA fingerprinting and mitochondrial typing require a certain sample size and are carried out in large volume reactions; in cases where insufficient sample is present whole genome amplification (WGA) can be used. WGA allows very small quantities of DNA to be amplified in a way that enables subsequent DNA-based tests to be performed. A limiting step to WGA is sample preparation. To minimize the necessary sample size, we have developed two modifications of WGA: the first allows for an increase in amplified product from small, nanoscale, purified samples with the use of carrier DNA while the second is a single-step method for cleaning and amplifying samples all in one column. Conventional DNA cleanup involves binding the DNA to silica, washing away impurities, and then releasing the DNA for subsequent testing. We have eliminated losses associated with incomplete sample release, thereby decreasing the required amount of starting template for DNA testing. Both techniques address the limitations of sample size by providing ample copies of genomic samples. Carrier DNA, included in our WGA reactions, can be used when amplifying samples with the standard purification method, or can be used in conjunction with our single-step DNA purification technique to potentially further decrease the amount of starting sample necessary for future forensic DNA-based assays.

  19. Whole Genome Sequencing: Cracking the Genetic Code for Foodborne Illness

    Science.gov (United States)

    ... Bacteria that cause disease have millions of different genomes, or sequences of genetic code, each as unique ...

  20. Comparative Copy Number Variation From Whole Genome Sequencing

    OpenAIRE

    Janevski, A.; Varadan, V.; Kamalakaran, S.; Banerjee, N.; Dimitrova, D

    2011-01-01

    Whole genome sequencing enables a high resolution view of the human genome and enables unique insights into copy number variations at an unprecedented scale. Numerous tools and studies have already been introduced that provide confirmatory and new genomic variability data in individuals and across populations. We investigate two such methods, CNV-seq and FREEC, and compare their outputs when applied to five whole genome sequences representing four populations. We focus on the ability of these tool...

  1. Mapping Challenging Mutations by Whole-Genome Sequencing

    OpenAIRE

    Smith, Harold E.; Fabritius, Amy S.; Aimee Jaramillo-Lambert; Andy Golden

    2016-01-01

    Whole-genome sequencing provides a rapid and powerful method for identifying mutations on a global scale, and has spurred a renewed enthusiasm for classical genetic screens in model organisms. The most commonly characterized category of mutation consists of monogenic, recessive traits, due to their genetic tractability. Therefore, most of the mapping methods for mutation identification by whole-genome sequencing are directed toward alleles that fulfill those criteria (i.e., single-gene, homoz...

  2. Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    OpenAIRE

    Chen, Kevin; Pachter, Lior

    2005-01-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fe...

  3. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains and MLST results showed that the four isolates studied and the reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  5. De novo mutations discovered in 8 Mexican American families through whole genome sequencing

    OpenAIRE

    Wang, Heming; Zhu, Xiaofeng

    2014-01-01

    De novo mutations enrich sequence diversity and carry clues to evolutionary selection. Recent studies suggest that de novo mutations could be one of the risk factors for complex diseases. We conducted a survey of de novo mutations using the whole genome sequence data, available only for the odd-numbered autosomes, of Mexican American families provided by Genetic Analysis Workshop 18. We extracted 8 three-generation families with available sequencing data from 20 large pedigrees. By comparing...

  6. Whole-Genome Sequencing of Two Bartonella bacilliformis Strains

    Science.gov (United States)

    Guillen, Yolanda; Casadellà, Maria; García-de-la-Guarda, Ruth; Espinoza-Culupú, Abraham; Paredes, Roger; Ruiz, Joaquim

    2016-01-01

    Bartonella bacilliformis is the causative agent of Carrion’s disease, a highly endemic human bartonellosis in Peru. We performed a whole-genome assembly of two B. bacilliformis strains isolated from the blood of infected patients in the acute phase of Carrion’s disease from the Cusco and Piura regions in Peru. PMID:27389274

  7. Whole-Genome Sequences of Three Symbiotic Endozoicomonas Bacteria

    KAUST Repository

    Neave, Matthew J.

    2014-08-14

    Members of the genus Endozoicomonas associate with a wide range of marine organisms. Here, we report on the whole-genome sequencing, assembly, and annotation of three Endozoicomonas type strains. These data will assist in exploring interactions between Endozoicomonas organisms and their hosts, and it will aid in the assembly of genomes from uncultivated Endozoicomonas spp.

  8. Whole genome amplification - Review of applications and advances

    Energy Technology Data Exchange (ETDEWEB)

    Hawkins, Trevor L.; Detter, J.C.; Richardson, Paul

    2001-11-15

    The concept of Whole Genome Amplification is something that has arisen in the past few years as modifications to the polymerase chain reaction (PCR) have been adapted to replicate regions of genomes which are of biological interest. The applications here are many: forensics, embryonic disease diagnosis, bioterrorism genome detection, "immortalization" of clinical samples, microbial diversity, and genotyping. The key question is whether DNA can be replicated a genome at a time without bias or non-random distribution of the target. Several papers published in the last year and currently in preparation may lead to the conclusion that whole genome amplification may indeed be possible and therefore open up a new avenue to molecular biology.

  9. Whole genome sequencing in clinical and public health microbiology

    OpenAIRE

    Kwong, J. C.; McCallum, N; Sintchenko, V.; Howden, B. P.

    2015-01-01

    Genomics and whole genome sequencing (WGS) have the capacity to greatly enhance knowledge and understanding of infectious diseases and clinical microbiology. The growth and availability of bench-top WGS analysers has facilitated the feasibility of genomics in clinical and public health microbiology. Given current resource and infrastructure limitations, WGS is most applicable to use in public health laboratories, reference laboratories, and hospital infection control-affiliated laborat...

  10. Whole genome and transcriptome sequencing of a B3 thymoma.

    Directory of Open Access Journals (Sweden)

    Iacopo Petrini

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.

  11. Whole Genome Sequencing: Innovation Dream or Privacy Nightmare?

    OpenAIRE

    De Cristofaro, Emiliano

    2012-01-01

    Over the past several years, DNA sequencing has emerged as one of the driving forces in life-sciences, paving the way for affordable and accurate whole genome sequencing. As genomes represent the entirety of an organism's hereditary information, the availability of complete human genomes prompts a wide range of revolutionary applications. The hope for improving modern healthcare and better understanding the human genome propels many interesting and challenging research frontiers. Unfortunatel...

  12. Whole genome amplification of DNA for genotyping pharmacogenetics candidate genes.

    Directory of Open Access Journals (Sweden)

    Santosh Philips

    2012-03-01

    Whole genome amplification (WGA) technologies can be used to amplify genomic DNA when only small amounts of DNA are available. The Multiple Displacement Amplification Phi polymerase based amplification has been shown to accurately amplify DNA for a variety of genotyping assays; however, it has not been tested for genotyping many of the clinically relevant genes important for pharmacogenetic studies, such as the cytochrome P450 genes, that are typically difficult to genotype due to multiple pseudogenes, copy number variations, and high similarity to other related genes. We evaluated whole genome amplified samples for Taqman™ genotyping of SNPs in a variety of pharmacogenetic genes. In 24 DNA samples from the Coriell human diversity panel, the call rates and concordance between amplified (~200-fold amplification) and unamplified samples were 100% for two SNPs in CYP2D6 and one in ESR1. In samples from a breast cancer clinical trial (Trial 1), we compared the genotyping results in samples before and after WGA for four SNPs in CYP2D6, one SNP in CYP2C19, one SNP in CYP19A1, two SNPs in ESR1, and two SNPs in ESR2. The concordance rates were all >97%. Finally, we compared the allele frequencies of 143 SNPs determined in Trial 1 (whole genome amplified DNA) to the allele frequencies determined in unamplified DNA samples from a separate trial (Trial 2) that enrolled a similar population. The call rates and allele frequencies between the two trials were 98% and 99.7%, respectively. We conclude that the whole genome amplified DNA is suitable for Taqman™ genotyping for a wide variety of pharmacogenetically relevant SNPs.
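
Call rate and concordance, the two quality metrics reported in this abstract, can be computed per SNP from paired genotype calls. The sketch below is illustrative only: the function names, the use of `None` for a failed call, and the `"A/G"`-style genotype strings (assumed to list alleles in a consistent order) are assumptions, not details taken from the study.

```python
def call_rate(genotypes):
    """Fraction of samples with a successful genotype call.

    genotypes: dict mapping sample ID -> genotype string, or None if no call.
    """
    if not genotypes:
        raise ValueError("no samples supplied")
    called = sum(1 for g in genotypes.values() if g is not None)
    return called / len(genotypes)


def concordance(amplified, unamplified):
    """Fraction of samples called in both sets whose genotypes agree."""
    shared = [
        sample
        for sample in amplified
        if amplified[sample] is not None and unamplified.get(sample) is not None
    ]
    if not shared:
        raise ValueError("no samples were called in both sets")
    matches = sum(1 for s in shared if amplified[s] == unamplified[s])
    return matches / len(shared)


# Hypothetical calls for one SNP before and after WGA
after_wga = {"s1": "A/G", "s2": "G/G", "s3": None, "s4": "A/A"}
before_wga = {"s1": "A/G", "s2": "A/G", "s3": "A/A", "s4": "A/A"}
rate = call_rate(after_wga)
agreement = concordance(after_wga, before_wga)
```

Concordance is computed only over samples called in both sets, so a low call rate and a high concordance can coexist; reporting both, as the abstract does, avoids that blind spot.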

  13. Whole Genome and Transcriptome Sequencing of a B3 Thymoma

    OpenAIRE

    Iacopo Petrini; Arun Rajan; Trung Pham; Donna Voeller; Sean Davis; James Gao; Yisong Wang; Giuseppe Giaccone

    2013-01-01

    Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomi...

  14. Whole genome sequencing of clinical isolates of Giardia lamblia

    OpenAIRE

    Hanevik, Kurt; Bakken, R.; Brattbakk, Hans-Richard; Saghaug, Christina Skår; Langeland, Nina

    2015-01-01

    Clinical isolates from protozoan parasites such as Giardia lamblia are at present practically impossible to culture. By using simple cyst purification methods, we show that Giardia whole genome sequencing of clinical stool samples is possible. Immunomagnetic separation after sucrose gradient flotation gave superior results compared to sucrose gradient flotation alone. The method enables detailed analysis of a wide range of genes of interest for genotyping, virulence and drug resistance.

  15. Physical map-assisted whole-genome shotgun sequence assemblies

    OpenAIRE

    Warren, René L.; Varabei, Dmitry; Platt, Darren; Huang, Xiaoqiu; Messina, David; Yang, Shiaw-Pyng; Kronstad, James W.; Krzywinski, Martin; Warren, Wesley C; Wallis, John W.; Hillier, LaDeana W.; Chinwalla, Asif T.; Schein, Jacqueline E.; Siddiqui, Asim S.; Marra, Marco A.

    2006-01-01

    We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the...

  16. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, S. [Univ. Wisc.-Madison; Zhou, S. [Univ. Wisc.-Madison; Place, M. [Univ. Wisc.-Madison; Zhang, Y. [Univ. Wisc.-Madison; Briska, A. [Univ. Wisc.-Madison; Goldstein, S. [Univ. Wisc.-Madison; Churas, C. [Univ. Wisc.-Madison; Runnheim, R. [Univ. Wisc.-Madison; Forrest, D. [Univ. Wisc.-Madison; Lim, A. [Univ. Wisc.-Madison; Lapidus, A. [Univ. Wisc.-Madison; Han, C. S. [Univ. Wisc.-Madison; Roberts, G. P. [Univ. Wisc.-Madison; Schwartz, D. C. [Univ. Wisc.-Madison

    2005-09-01

    Rhodospirillum rubrum is a phototrophic purple nonsulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems and as a source of hydrogen and biodegradable plastic production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction endonuclease maps (XbaI, NheI, and HindIII) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction endonuclease maps from randomly sheared genomic DNA molecules extracted from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the HindIII map acted as a scaffold for high-resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and confirmation of genome sequence, this work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.
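    Aligning sequence contigs to an optical map requires an ordered restriction map predicted from the sequence itself. The sketch below shows one way to compute such an in silico digest for the three enzymes named in the abstract. It is an illustration under stated assumptions, not the software used in the study: the function name is hypothetical, the molecule is treated as linear, and the toy sequence is invented.

```python
from typing import List

# Recognition sites of the three enzymes named in the abstract.
# Each enzyme cuts after the first base of its (palindromic) site,
# so scanning one strand is sufficient.
SITES = {"XbaI": "TCTAGA", "NheI": "GCTAGC", "HindIII": "AAGCTT"}
CUT_OFFSET = 1


def restriction_map(seq: str, enzyme: str) -> List[int]:
    """Return ordered fragment lengths from digesting a linear sequence."""
    site = SITES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy 19-base sequence containing two HindIII sites and no XbaI site.
toy = "GG" + "AAGCTT" + "CCCC" + "AAGCTT" + "G"
print(restriction_map(toy, "HindIII"))  # three ordered fragments
print(restriction_map(toy, "XbaI"))     # one uncut fragment
```

    The ordered list of fragment lengths is the quantity that would be compared with the fragment sizes measured on the optical map.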

  17. Whole-genome shotgun optical mapping of Rhodospirillum rubrum

    Energy Technology Data Exchange (ETDEWEB)

    Reslewic, Susan; Zhou, Shiguo; Place, Mike; Zhang, Yaoping; Briska, Adam; Goldstein, Steve; Churas, Chris; Runnheim, Rod; Forrest, Dan; Lim, Alex; Lapidus, Alla; Han, Cliff S.; Roberts, Gary P.; Schwartz, David C.

    2004-07-01

    Rhodospirillum rubrum is a phototrophic purple non-sulfur bacterium known for its unique and well-studied nitrogen fixation and carbon monoxide oxidation systems, and as a source of hydrogen and biodegradable plastics production. To better understand this organism and to facilitate assembly of its sequence, three whole-genome restriction maps (Xba I, Nhe I, and Hind III) of R. rubrum strain ATCC 11170 were created by optical mapping. Optical mapping is a system for creating whole-genome ordered restriction maps from randomly sheared genomic DNA molecules extracted directly from cells. During the sequence finishing process, all three optical maps confirmed a putative error in sequence assembly, while the Hind III map acted as a scaffold for high resolution alignment with sequence contigs spanning the whole genome. In addition to highlighting optical mapping's role in the assembly and validation of genome sequence, our work underscores the unique niche in resolution occupied by the optical mapping system. With a resolution ranging from 6.5 kb (previously published) to 45 kb (reported here), optical mapping advances a "molecular cytogenetics" approach to solving problems in genomic analysis.

  18. Genomic prediction using QTL derived from whole genome sequence data

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc;

    This study investigated the gain in accuracy of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k SNP data. Analyses were performed for Nordic Holstein and Danish Jersey animals, using either...... a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model, results showed increases in accuracy of up to two percentage points for production traits in both Holstein and Jersey animals by including the extra variants in the analysis, and an extra 1.5 percentage points...

  19. Whole-Genome Sequences of Thirteen Isolates of Borrelia burgdorferi

    Energy Technology Data Exchange (ETDEWEB)

    Schutzer, S. E.; Dunn, J.; Fraser-Liggett, C. M.; Casjens, S. R.; Qiu, W.-G.; Mongodin, E. F.; Luft, B. J.

    2011-02-01

    Borrelia burgdorferi is a causative agent of Lyme disease in North America and Eurasia. The first complete genome sequence of B. burgdorferi strain B31, available for more than a decade, has assisted research on the pathogenesis of Lyme disease. Because a single genome sequence is not sufficient to understand the relationship between genotypic and geographic variation and disease phenotype, we determined the whole-genome sequences of 13 additional B. burgdorferi isolates that span the range of natural variation. These sequences should allow improved understanding of pathogenesis and provide a foundation for novel detection, diagnosis, and prevention strategies.

  20. What are people willing to pay for whole-genome sequencing information, and who decides what they receive?

    OpenAIRE

    Marshall, DA; Gonzalez, JM; Johnson, FR; Macdonald, KV; Pugh, A; Douglas, MP; Phillips, KA

    2016-01-01

    Whole-genome sequencing (WGS) can be used as a powerful diagnostic tool as well as for screening, but it may lead to anxiety, unnecessary testing, and overtreatment. Current guidelines suggest reporting clinically actionable secondary findings when diagnostic testing is performed. We examined preferences for receiving WGS results. A US nationally representative survey (n = 410 adults) was used to rank preferences for who decides (an expert panel, your doctor, you) which WGS results are reporte...

  1. Whole-genome sequencing of a laboratory-evolved yeast strain

    Directory of Open Access Journals (Sweden)

    Dunham Maitreya J

    2010-02-01

    Background: Experimental evolution of microbial populations provides a unique opportunity to study evolutionary adaptation in response to controlled selective pressures. However, until recently it has been difficult to identify the precise genetic changes underlying adaptation at a genome-wide scale. New DNA sequencing technologies now allow the genome of parental and evolved strains of microorganisms to be rapidly determined. Results: We sequenced >93.5% of the genome of a laboratory-evolved strain of the yeast Saccharomyces cerevisiae and its ancestor at >28× depth. Both single nucleotide polymorphisms and copy number amplifications were found, with specific gains over array-based methodologies previously used to analyze these genomes. Applying a segmentation algorithm to quantify structural changes, we determined the approximate genomic boundaries of a 5× gene amplification. These boundaries guided the recovery of breakpoint sequences, which provide insights into the nature of a complex genomic rearrangement. Conclusions: This study suggests that whole-genome sequencing can provide a rapid approach to uncover the genetic basis of evolutionary adaptations, with further applications in the study of laboratory selections and mutagenesis screens. In addition, we show how single-end, short read sequencing data can provide detailed information about structural rearrangements, and generate predictions about the genomic features and processes that underlie genome plasticity.

  2. Whole Genome Re-Sequencing of Three Domesticated Chicken Breeds.

    Science.gov (United States)

    Oh, Dongyep; Son, Bongjun; Mun, Seyoung; Oh, Man Hwan; Oh, Sejong; Ha, Jaejung; Yi, Junkoo; Lee, Seunguk; Han, Kyudong

    2016-02-01

    Chicken is one of the most popular domesticated species worldwide, as it can serve an important role in agricultural as well as biomedical research fields. Because it inhabits almost every continent and presents diverse morphology and traits, the need for genetic markers for distinguishing each breed for various purposes has increased. The whole genome sequencing of three different breeds (White Leghorn, Korean domestic, and Araucana) that show similar coloring patterns, with the exception of the White Leghorn breed, has confirmed previously reported genomic alterations and identified many novel variants. Additionally, the Whole Genome Re-Sequencing (WGRS) approach identified an approximately 4 kb insert within SLCO1B3 responsible for blue egg shell color. Targeted investigation of pigment-related genes corroborated previously reported non-synonymous mutations, and provided deeper insight into chicken coloring, where not a single but a combination of non-synonymous mutations in the MC1R gene is likely to be responsible for altered feather coloring. PMID:26853871

  3. Whole-genome validation of high-information-content fingerprinting.

    Science.gov (United States)

    Nelson, William M; Bharti, Arvind K; Butler, Ed; Wei, Fusheng; Fuks, Galina; Kim, Hyeran; Wing, Rod A; Messing, Joachim; Soderlund, Carol

    2005-09-01

    Fluorescent-based high-information-content fingerprinting (HICF) techniques have recently been developed for physical mapping. These techniques make use of automated capillary DNA sequencing instruments to enable both high-resolution and high-throughput fingerprinting. In this article, we report the construction of a whole-genome HICF FPC map for maize (Zea mays subsp. mays cv B73), using a variant of HICF in which a type IIS restriction enzyme is used to generate the fluorescently labeled fragments. The HICF maize map was constructed from the same three maize bacterial artificial chromosome libraries as previously used for the whole-genome agarose FPC map, providing a unique opportunity for direct comparison of the agarose and HICF methods; as a result, it was found that HICF has substantially greater sensitivity in forming contigs. An improved assembly procedure is also described that uses automatic end-merging of contigs to reduce the effects of contamination and repetitive bands. Several new features in FPC v7.2 are presented, including shared-memory multiprocessing, which allows dramatically faster assemblies, and automatic end-merging, which permits more accurate assemblies. It is further shown that sequenced clones may be digested in silico and located accurately on the HICF assembly, despite size deviations that prevent the precise prediction of experimental fingerprints. Finally, repetitive bands are isolated, and their effect on the assembly is studied. PMID:16166258

  4. Concurrent array-based queue

    Energy Technology Data Exchange (ETDEWEB)

    Heidelberger, Philip; Steinmacher-Burow, Burkhard

    2015-01-06

    According to one embodiment, a method for implementing an array-based queue in memory of a memory system that includes a controller includes configuring, in the memory, metadata of the array-based queue. The configuring comprises defining, in metadata, an array start location in the memory for the array-based queue, defining, in the metadata, an array size for the array-based queue, defining, in the metadata, a queue top for the array-based queue and defining, in the metadata, a queue bottom for the array-based queue. The method also includes the controller serving a request for an operation on the queue, the request providing the location in the memory of the metadata of the queue.
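    The metadata described in this record (array start location, array size, queue top, queue bottom) can be pictured with a small single-threaded model. This is only a sketch under assumptions: the class and field names are hypothetical, "top" is taken to be the dequeue end and "bottom" the enqueue end, a Python list stands in for memory, and the controller and concurrency aspects of the record are not modeled.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class QueueMetadata:
    array_start: int  # index in memory where the queue's array begins
    array_size: int   # number of slots in the array
    top: int = 0      # assumed: running count of items dequeued so far
    bottom: int = 0   # assumed: running count of items enqueued so far


class ArrayQueue:
    """Circular queue laid out inside a shared 'memory' list."""

    def __init__(self, memory: List[Any], meta: QueueMetadata) -> None:
        self.memory = memory
        self.meta = meta

    def enqueue(self, item: Any) -> bool:
        m = self.meta
        if m.bottom - m.top == m.array_size:
            return False  # queue is full
        self.memory[m.array_start + m.bottom % m.array_size] = item
        m.bottom += 1
        return True

    def dequeue(self) -> Optional[Any]:
        m = self.meta
        if m.top == m.bottom:
            return None  # queue is empty
        item = self.memory[m.array_start + m.top % m.array_size]
        m.top += 1
        return item


# A three-slot queue occupying memory[2:5] of an eight-slot memory.
memory: List[Any] = [None] * 8
queue = ArrayQueue(memory, QueueMetadata(array_start=2, array_size=3))
results = [queue.enqueue(x) for x in "abcd"]  # fourth enqueue finds it full
first = queue.dequeue()
queue.enqueue("d")                            # wraps around into slot 2
```

    Keeping top and bottom as running counters and reducing them modulo the array size is one common design choice; it distinguishes a full queue from an empty one without a spare slot.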

  5. SNP annotation-based whole genomic prediction and selection

    DEFF Research Database (Denmark)

    Do, Duy Ngoc; Janss, Luc; Jensen, Just;

    2015-01-01

    into a training (968 pigs) and a validation dataset (304 pigs) by assigning records as before and after January 1, 2012, respectively. SNP were annotated by 14 different classes using Ensembl variant effect prediction. Predictive accuracy and prediction bias were calculated using Bayesian Power LASSO...... SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP groups. Genomic prediction has accuracy comparable to observed phenotype, and use of genomic prediction can be cost...... effective by replacing feed intake measurement. Genomic annotation had less impact on predictive accuracy traits considered here but may be different for other traits. It is the first study to provide useful insights into biological classes of SNP driving the whole genomic prediction for complex traits in...

  6. Whole-genome characterization of chemoresistant ovarian cancer.

    Science.gov (United States)

    Patch, Ann-Marie; Christie, Elizabeth L; Etemadmoghadam, Dariush; Garsed, Dale W; George, Joshy; Fereday, Sian; Nones, Katia; Cowin, Prue; Alsop, Kathryn; Bailey, Peter J; Kassahn, Karin S; Newell, Felicity; Quinn, Michael C J; Kazakoff, Stephen; Quek, Kelly; Wilhelm-Benartzi, Charlotte; Curry, Ed; Leong, Huei San; Hamilton, Anne; Mileshkin, Linda; Au-Yeung, George; Kennedy, Catherine; Hung, Jillian; Chiew, Yoke-Eng; Harnett, Paul; Friedlander, Michael; Quinn, Michael; Pyman, Jan; Cordner, Stephen; O'Brien, Patricia; Leditschke, Jodie; Young, Greg; Strachan, Kate; Waring, Paul; Azar, Walid; Mitchell, Chris; Traficante, Nadia; Hendley, Joy; Thorne, Heather; Shackleton, Mark; Miller, David K; Arnau, Gisela Mir; Tothill, Richard W; Holloway, Timothy P; Semple, Timothy; Harliwong, Ivon; Nourse, Craig; Nourbakhsh, Ehsan; Manning, Suzanne; Idrisoglu, Senel; Bruxner, Timothy J C; Christ, Angelika N; Poudel, Barsha; Holmes, Oliver; Anderson, Matthew; Leonard, Conrad; Lonie, Andrew; Hall, Nathan; Wood, Scott; Taylor, Darrin F; Xu, Qinying; Fink, J Lynn; Waddell, Nick; Drapkin, Ronny; Stronach, Euan; Gabra, Hani; Brown, Robert; Jewell, Andrea; Nagaraj, Shivashankar H; Markham, Emma; Wilson, Peter J; Ellul, Jason; McNally, Orla; Doyle, Maria A; Vedururu, Ravikiran; Stewart, Collin; Lengyel, Ernst; Pearson, John V; Waddell, Nicola; deFazio, Anna; Grimmond, Sean M; Bowtell, David D L

    2015-05-28

    Patients with high-grade serous ovarian cancer (HGSC) have experienced little improvement in overall survival, and standard treatment has not advanced beyond platinum-based combination chemotherapy, during the past 30 years. To understand the drivers of clinical phenotypes better, here we use whole-genome sequencing of tumour and germline DNA samples from 92 patients with primary refractory, resistant, sensitive and matched acquired resistant disease. We show that gene breakage commonly inactivates the tumour suppressors RB1, NF1, RAD51B and PTEN in HGSC, and contributes to acquired chemotherapy resistance. CCNE1 amplification was common in primary resistant and refractory disease. We observed several molecular events associated with acquired resistance, including multiple independent reversions of germline BRCA1 or BRCA2 mutations in individual patients, loss of BRCA1 promoter methylation, an alteration in molecular subtype, and recurrent promoter fusion associated with overexpression of the drug efflux pump MDR1. PMID:26017449

  7. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly
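    The abstract does not list the assembly statistics it used. As an illustration of the kind of sequence-level metric commonly reported for assemblies, the sketch below computes N50, the contig length at which the cumulative length of contigs sorted from longest to shortest first reaches half of the total assembly length. Choosing N50 is an assumption made here, and the contig lengths are invented.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths from two assemblies of the same reads.
assembly_a = [80, 70, 50, 40, 30, 20]
assembly_b = [40, 40, 40, 40, 40, 40, 30, 20]
print(n50(assembly_a), n50(assembly_b))
```

    Both toy assemblies total 290 bases, so the difference in N50 reflects contiguity alone; fidelity to the source genome would need the alignment-based measures the abstract mentions.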

  8. Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing

    Directory of Open Access Journals (Sweden)

    Plant Ramona N

    2006-08-01

    Background: Whole genome amplification is an increasingly common technique through which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis. Questions of amplification-induced error and template bias generated by these methods have previously been addressed through either small scale (SNPs) or large scale (CGH array, FISH) methodologies. Here we utilized whole genome sequencing to assess amplification-induced bias in both coding and non-coding regions of two bacterial genomes. Halobacterium species NRC-1 DNA and Campylobacter jejuni were amplified by several common, commercially available protocols: multiple displacement amplification, primer extension pre-amplification and degenerate oligonucleotide primed PCR. The amplification-induced bias of each method was assessed by sequencing both genomes in their entirety using the 454 Sequencing System technology and comparing the results with those obtained from unamplified controls. Results: All amplification methodologies induced statistically significant bias relative to the unamplified control. For the Halobacterium species NRC-1 genome, assessed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 119 times greater than those from unamplified material, 164.0 times greater for Repli-G, 165.0 times greater for PEP-PCR and 252.0 times greater than the unamplified controls for DOP-PCR. For Campylobacter jejuni, also analyzed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 15 times greater than those from unamplified material, 19.8 times greater for Repli-G, 61.8 times greater for PEP-PCR and 220.5 times greater than the unamplified controls for DOP-PCR. Conclusion: Of the amplification methodologies examined in this paper, the multiple displacement amplification products generated the least bias, and produced significantly higher yields of amplified DNA.

  9. Whole-genome sequence variation, population structure and demographic history of the Dutch population

    NARCIS (Netherlands)

    The Genome of the Netherlands Consortium; Marschall, T.; Schoenhuth, A.

    2014-01-01

    Whole-genome sequencing enables complete characterization of genetic variation, but geographic clustering of rare alleles demands many diverse populations be studied. Here we describe the Genome of the Netherlands (GoNL) Project, in which we sequenced the whole genomes of 250 Dutch parent-offspring

  10. Use of metaphors about exome and whole genome sequencing.

    Science.gov (United States)

    Nelson, Sarah C; Crouch, Julia M; Bamshad, Michael J; Tabor, Holly K; Yu, Joon-Ho

    2016-05-01

    Clinical and research uses of exome and whole genome sequencing (ES/WGS) are growing rapidly. An enhanced understanding of how individuals conceptualize and communicate about sequencing results is needed to ensure effective, mutual exchange of information between care providers and patients and between researchers and participants. Focus group and interview participants were recruited to discuss their attitudes and preferences for receiving hypothetical results from ES/WGS. African Americans were intentionally oversampled. We qualitatively analyzed participants' speech to identify unsolicited metaphorical language pertaining to genes and health, and grouped these occurrences into metaphorical concepts. Participants compared genetic information to physical objects including tools, weapons, contents of boxes, and formal documents or reports. These metaphorical concepts centered on several key themes, including locus of control; containment versus release of information; and desirability, usability, interpretability, and ownership of genetic results. Metaphorical language is often used intentionally or unintentionally in discussions about receiving results from ES/WGS in both clinical and research settings. Awareness of the use of metaphorical language and attention to its varied meanings facilitates effective communication about return of ES/WGS results. In turn, both should foster shared and informed decision-making and improve the translation of genetic information by clinicians and researchers. © 2016 Wiley Periodicals, Inc. PMID:26822973

  11. Information recovery from low coverage whole-genome bisulfite sequencing

    Science.gov (United States)

    Libertini, Emanuele; Heath, Simon C.; Hamoudi, Rifat A.; Gut, Marta; Ziller, Michael J.; Czyz, Agata; Ruotti, Victor; Stunnenberg, Hendrik G.; Frontini, Mattia; Ouwehand, Willem H.; Meissner, Alexander; Gut, Ivo G.; Beck, Stephan

    2016-01-01

    The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future. PMID:27346250

  12. Whole genome shotgun assembly in theory and practice

    Science.gov (United States)

    Chapman, Jarrod Andrew

    The subject of this dissertation is the development of novel analytical and algorithmic approaches to the fragment assembly problem in the context of the Whole Genome Shotgun (WGS) DNA sequencing strategy. A collection of analyses and methods centered on the computational reconstruction of genomic DNA sequence from randomly sampled genome fragments, with particular focus on applications to large, polymorphic, and inhomogeneous datasets are presented. Several novel pre-assembly WGS data analyses are described including assessment of genome size, sequence uniformity, and repetitive element content with particular emphasis on the establishment of standardized quality assurance metrics for large WGS sequencing projects. A theoretical framework for understanding the statistical properties of WGS assemblies in the presence of paired-end sequence data is discussed and the algorithmic sub-problems of quality-based sequence trimming, global pairwise alignment detection, and consensus sequence generation are treated. Finally, as a novel application of these analyses and methods, the results of a collaboration to produce the first WGS sequence reconstruction of a community sample from a natural environment are presented.

  13. Current Developments in Prokaryotic Single Cell Whole Genome Amplification

    Energy Technology Data Exchange (ETDEWEB)

    Goudeau, Danielle; Nath, Nandita; Ciobanu, Doina; Cheng, Jan-Fang; Malmstrom, Rex

    2014-03-14

    Our approach to prokaryotic single-cell Whole Genome Amplification at the JGI continues to evolve. To increase both the quality and number of single-cell genomes produced, we explore all aspects of the process from cell sorting to sequencing. For example, we now utilize specialized reagents, acoustic liquid handling, and reduced reaction volumes to eliminate non-target DNA contamination in WGA reactions. More specifically, we use a cleaner commercial WGA kit from Qiagen that employs a UV decontamination procedure initially developed at the JGI, and we use the Labcyte Echo for tip-less liquid transfer to set up 2 µL reactions. Acoustic liquid handling also dramatically reduces reagent costs. In addition, we are exploring new cell lysis methods including treatment with Proteinase K, lysozyme, and other detergents, in order to complement standard alkaline lysis and allow for more efficient disruption of a wider range of cells. Incomplete lysis represents a major hurdle for WGA on some environmental samples, especially rhizosphere, peatland, and other soils. Finding effective lysis strategies that are also compatible with WGA is challenging, and we are currently assessing the impact of various strategies on genome recovery.

  14. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks.

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S K; Mammel, Mark K; Tarr, Phillip I; Eppinger, Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  15. Whole genome sequencing of field isolates provides robust characterization of genetic diversity in Plasmodium vivax.

    Directory of Open Access Journals (Sweden)

    Ernest R Chan

    BACKGROUND: An estimated 2.85 billion people live at risk of Plasmodium vivax transmission. In endemic countries vivax malaria causes significant morbidity and its mortality is becoming more widely appreciated, drug-resistant strains are increasing in prevalence, and an increasing number of reports indicate that P. vivax is capable of breaking through the Duffy-negative barrier long considered to confer resistance to blood stage infection. Absence of robust in vitro propagation limits our understanding of fundamental aspects of the parasite's biology, including the determinants of its dormant hypnozoite phase, its virulence and drug susceptibility, and the molecular mechanisms underlying red blood cell invasion. METHODOLOGY/PRINCIPAL FINDINGS: Here, we report results from whole genome sequencing of five P. vivax isolates obtained from Malagasy and Cambodian patients, and of the monkey-adapted Belem strain. We obtained an average 70-400 X coverage of each genome, resulting in more than 93% of the Sal I reference sequence covered by 20 reads or more. Our study identifies more than 80,000 SNPs distributed throughout the genome which will allow designing association studies and population surveys. Analysis of the genome-wide genetic diversity in P. vivax also reveals considerable allele sharing among isolates from different continents. This observation could be consistent with a high level of gene flow among parasite strains distributed throughout the world. CONCLUSIONS: Our study shows that it is feasible to perform whole genome sequencing of P. vivax field isolates and rigorously characterize the genetic diversity of this parasite. The catalogue of polymorphisms generated here will enable large-scale genotyping studies and contribute to a better understanding of P. vivax traits such as drug resistance or erythrocyte invasion, partially circumventing the lack of laboratory culture that has hampered vivax research for years.
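    The two coverage figures in this abstract, average depth and the fraction of the reference covered by 20 reads or more, are simple summaries of a per-base depth profile. The sketch below is illustrative only: the function names are hypothetical and the depth values are invented, not taken from the study.

```python
from typing import Sequence


def mean_depth(depths: Sequence[int]) -> float:
    """Average number of reads covering each reference position."""
    if not depths:
        raise ValueError("empty depth profile")
    return sum(depths) / len(depths)


def breadth_at_depth(depths: Sequence[int], min_depth: int = 20) -> float:
    """Fraction of reference positions covered by at least min_depth reads."""
    if not depths:
        raise ValueError("empty depth profile")
    return sum(d >= min_depth for d in depths) / len(depths)


# Hypothetical per-base depths for a ten-base stretch of reference.
depths = [0, 5, 20, 25, 100, 400, 70, 19, 20, 300]
print(mean_depth(depths))
print(breadth_at_depth(depths))
```

    A high mean depth does not by itself guarantee a high breadth, because a few very deeply covered positions can raise the average while other positions remain below the threshold.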

  16. Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks

    Science.gov (United States)

    Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark

    2016-01-01

    Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power becomes of particular importance for the investigation of isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple locus variable number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation utilizing its significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and

  17. A whole genome association study on meat palatability in hanwoo.

    Science.gov (United States)

    Hyeong, K-E; Lee, Y-M; Kim, Y-S; Nam, K C; Jo, C; Lee, K-H; Lee, J-E; Kim, J-J

    2014-09-01

    A whole genome association (WGA) study was carried out to find quantitative trait loci (QTL) for sensory evaluation traits in Hanwoo. Carcass samples of 250 Hanwoo steers were collected from the National Agricultural Cooperative Livestock Research Institute, Ansung, Gyeonggi province, Korea, between 2011 and 2012 and genotyped with the Affymetrix Bovine Axiom Array 640K single nucleotide polymorphism (SNP) chip. Among the SNPs in the chip, a total of 322,160 SNPs were chosen after quality control tests. After adjusting for the effects of age, slaughter-year-season, and polygenic effects using a genome relationship matrix, the corrected phenotypes for the sensory evaluation measurements were regressed on each SNP using a simple linear regression additive based model. A total of 1,631 SNPs were detected for color, aroma, tenderness, juiciness and palatability at the 0.1% comparison-wise level. Among the significant SNPs, the best set of 52 SNP markers was chosen using a forward regression procedure at the 0.05 level, among which the sets of 8, 14, 11, 10, and 9 SNPs were determined for the respective sensory evaluation traits. The sets of significant SNPs explained 18% to 31% of phenotypic variance. Three SNPs were pleiotropic, i.e. AX-26703353 and AX-26742891 that were located at 101 and 110 Mb of BTA6, respectively, influencing tenderness, juiciness and palatability, while AX-18624743 at 3 Mb of BTA10 affected tenderness and palatability. Our results suggest that some QTL for sensory measures are segregating in a Hanwoo steer population. Additional WGA studies on fatty acid and nutritional components as well as the sensory panels are in progress to characterize genetic architecture of meat quality and palatability in Hanwoo. PMID:25178363

  18. Post-Fragmentation Whole Genome Amplification-Based Method

    Science.gov (United States)

    Benardini, James; LaDuc, Myron T.; Langmore, John

    2011-01-01

    This innovation is derived from a proprietary amplification scheme that is based upon random fragmentation of the genome into a series of short, overlapping templates. The resulting shorter DNA strands (DNA fragments with defined 3′ and 5′ termini. Specific primers to these termini are then used to isothermally amplify this library into potentially unlimited quantities that can be used immediately for multiple downstream applications including gel electrophoresis, quantitative polymerase chain reaction (QPCR), comparative genomic hybridization microarray, SNP analysis, and sequencing. The standard reaction can be performed with minimal hands-on time, and can produce amplified DNA in as little as three hours. Post-fragmentation whole genome amplification-based technology provides a robust and accurate method of amplifying femtogram levels of starting material into microgram yields with no detectable allele bias. The amplified DNA also facilitates the preservation of samples (spacecraft samples) by amplifying scarce amounts of template DNA into microgram concentrations in just a few hours. Based on further optimization of this technology, this could be a feasible technology to use in sample preservation for potential future sample return missions. The research and technology development described here can be pivotal in dealing with backward/forward biological contamination from planetary missions. Such efforts rely heavily on an increasing understanding of the burden and diversity of microorganisms present on spacecraft surfaces throughout assembly and testing. The development and implementation of these technologies could significantly improve the comprehensiveness and resolving power of spacecraft-associated microbial population censuses, and are important to the continued evolution and advancement of planetary protection capabilities. Current molecular procedures for assaying spacecraft-associated microbial burden and diversity have inherent sample loss issues at

  19. Whole-genome cartography of estrogen receptor alpha binding sites.

    Directory of Open Access Journals (Sweden)

    Chin-Yo Lin

    2007-06-01

    Using a chromatin immunoprecipitation-paired end diTag cloning and sequencing strategy, we mapped estrogen receptor alpha (ERalpha) binding sites in MCF-7 breast cancer cells. We identified 1,234 high confidence binding clusters of which 94% are projected to be bona fide ERalpha binding regions. Only 5% of the mapped estrogen receptor binding sites are located within 5 kb upstream of the transcriptional start sites of adjacent genes, regions containing the proximal promoters, whereas the vast majority of the sites are mapped to intronic or distal locations (>5 kb from 5' and 3' ends of adjacent transcript), suggesting transcriptional regulatory mechanisms over significant physical distances. Of all the identified sites, 71% harbored putative full estrogen response elements (EREs), 25% bore ERE half sites, and only 4% had no recognizable ERE sequences. Genes in the vicinity of ERalpha binding sites were enriched for regulation by estradiol in MCF-7 cells, and their expression profiles in patient samples segregate ERalpha-positive from ERalpha-negative breast tumors. The expression dynamics of the genes adjacent to ERalpha binding sites suggest a direct induction of gene expression through binding to ERE-like sequences, whereas transcriptional repression by ERalpha appears to be through indirect mechanisms. Our analysis also indicates a number of candidate transcription factor binding sites adjacent to occupied EREs at frequencies much greater than by chance, including the previously reported FOXA1 sites, and demonstrates the potential involvement of one such putative adjacent factor, Sp1, in the global regulation of ERalpha target genes. Unexpectedly, we found that only 22%-24% of the bona fide human ERalpha binding sites were overlapping conserved regions in whole genome vertebrate alignments, which suggests limited conservation of functional binding sites. Taken together, this genome-scale analysis suggests complex but definable rules governing ERalpha

  20. New wheat microRNA using whole-genome sequence.

    Science.gov (United States)

    Kurtoglu, Kuaybe Yucebilgili; Kantar, Melda; Budak, Hikmet

    2014-06-01

    MicroRNAs are post-transcriptional regulators of gene expression, taking roles in a variety of fundamental biological processes. Hence, their identification, annotation and characterization are of great significance, especially in bread wheat, one of the main food sources for humans. The recent availability of 5× coverage Triticum aestivum L. whole-genome sequence provided us with the opportunity to perform a systematic prediction of a complete catalogue of wheat microRNAs. Using an in silico homology-based approach, stem-loop coding regions were derived from two assemblies, constructed from wheat 454 reads. To avoid the presence of pseudo-microRNAs in the final data set, transposable element related stem-loops were eliminated by repeat analysis. Overall, 52 putative wheat microRNAs were predicted, including seven, which have not been previously published. Moreover, with distinct analysis of the two different assemblies, both variety and representation of putative microRNA-coding stem-loops were found to be predominant in the intergenic regions. By searching available expressed sequences and small RNA library databases, expression evidence for 39 (out of 52) putative wheat microRNAs was provided. Expression of three of the predicted microRNAs (miR166, miR396 and miR528) was also comparatively quantified with real-time quantitative reverse transcription PCR. This is the first report on in silico prediction of a whole repertoire of bread wheat microRNAs, supported by the wet-lab validation. PMID:24395439

  1. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12 percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9 percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90 percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  2. Assessment of Whole-Genome Mapping in a Well-Defined Outbreak of Salmonella enterica Serotype Saintpaul

    OpenAIRE

    Fey, P. D.; Iwen, P C; Zentz, E. B.; Briska, A. M.; Henkhaus, J. K.; Bryant, K.A.; Larson, M. A; Noel, R. K.; Hinrichs, S. H.

    2012-01-01

    We investigated the use of whole-genome mapping and pulsed-field gel electrophoresis (PFGE) with isolates from an outbreak of Salmonella enterica serotype Saintpaul. PFGE and whole-genome mapping were concordant with 22 of 23 isolates. Whole-genome mapping is a viable alternative tool for the epidemiological analysis of Salmonella food-borne disease investigations.

  3. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data

    NARCIS (Netherlands)

    Bryant, J.M.; Schürch, A.C.; Deutekom, van H.; Harris, S.R.; Beer, de J.L.; Jager, de V.C.L.; Kremer, K.; Hijum, van S.A.F.T.; Siezen, R.J.; Borgdorff, M.; Bentley, S.D.; Parkhill, J.; Soolingen, van D.

    2013-01-01

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of th

  4. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data

    NARCIS (Netherlands)

    J.M. Bryant (Josephine); A. Schürch (Anita); H. van Deutekom (Henk); S.R. Harris (Simon); J.L. de Beer (Jessica); V. de Jager (Victor); K. Kremer (Kristin); S.A.F.T. van Hijum (Sacha); R.J. Siezen (Roland); M.W. Borgdorff (Martien); S.D. Bentley (Stephen); J. Parkhill (Julian); D. van Soolingen (Dick)

    2013-01-01

    Background: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate kno

  5. Inferring patient to patient transmission of Mycobacterium tuberculosis from whole genome sequencing data.

    NARCIS (Netherlands)

    Bryant, J.M.; Schurch, A.C.; Deutekom, H. van; Harris, S.R.; Beer, J.L. de; Jager, V. de; Kremer, K.; Hijum, S.A.F.T. van; Siezen, R.J.; Borgdorff, M.; Bentley, S.D.; Parkhill, J.; Soolingen, D. van

    2013-01-01

    BACKGROUND: Mycobacterium tuberculosis is characterised by limited genomic diversity, which makes the application of whole genome sequencing particularly attractive for clinical and epidemiological investigation. However, in order to confidently infer transmission events, an accurate knowledge of th

  6. Whole-Genome Shotgun Sequence of Arthrospira platensis Strain Paraca, a Cultivated and Edible Cyanobacterium

    OpenAIRE

    Lefort, Francois; Calmin, Gautier; Crovadore, Julien; Falquet, Jacques; Hurni, Jean-Pierre; Osteras, Magne; Haldemann, Francois; Farinelli, Laurent

    2014-01-01

    Here we report the whole-genome shotgun sequence of a Peruvian strain of Arthrospira platensis (Paraca), a cultivated and edible haloalkaliphilic cyanobacterium of great scientific, technical, and economic potential.

  7. Whole-Genome Sequencing of Micrococcus luteus Strain Modasa, of Indian Origin

    OpenAIRE

    A. Ghosh; Chaudhary, S. A.; Apurva, S. R.; T. Tiwari; Gupta, S.; A.K. Singh; Katudia, K. H.; Patel, M.P.; Chikara, S. K.

    2013-01-01

    The hydrocarbon-degrading bacterium Micrococcus luteus strain Modasa was isolated from contaminated soil from Modasa, North Gujarat, India. Whole-genome sequencing and analysis provide an insight into the potentially important genes responsible for bioremediation.

  8. New perspectives on microbial community distortion after whole-genome amplification

    Science.gov (United States)

    Whole-genome amplification (WGA) has become an important tool to explore the genomic information of microorganisms in an environmental sample with limited biomass, however potential selective biases during the amplification processes are poorly understood. Here, we describe the e...

  9. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    OpenAIRE

    Jingsong Shi; Song Jiang; Dandan Qiu; Weibo Le; Xiao Wang; Yinhui Lu; Zhihong Liu

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify pote...

  10. Clinical Diagnosis by Whole-Genome Sequencing of a Prenatal Sample

    OpenAIRE

    Talkowski, Michael E.; Ordulu, Zehra; Pillalamarri, Vamsee; Benson, Carol B.; Blumenthal, Ian; Connolly, Susan; Hanscom, Carrie; Hussain, Naveed; Pereira, Shahrin; Picker, Jonathan; Rosenfeld, Jill A.; Shaffer, Lisa G.; Wilkins-Haug, Louise E.; Gusella, James F.; Morton, Cynthia C.

    2012-01-01

    Conventional cytogenetic testing offers low-resolution detection of balanced karyotypic abnormalities but cannot provide the precise, gene-level knowledge required to predict outcomes. The use of high-resolution whole-genome deep sequencing is currently impractical for the purpose of routine clinical care. We show here that whole-genome “jumping libraries” can offer an immediately applicable, nucleotide-level complement to conventional genetic diagnostics within a time frame that allows for c...

  11. Mining metagenomic whole genome sequences revealed subdominant but constant Lactobacillus population in the human gut microbiota.

    Science.gov (United States)

    Rossi, Maddalena; Martínez-Martínez, Daniel; Amaretti, Alberto; Ulrici, Alessandro; Raimondi, Stefano; Moya, Andrés

    2016-06-01

    The genus Lactobacillus includes over 215 species that colonize plants, foods, sewage and the gastrointestinal tract (GIT) of humans and animals. In the GIT, the Lactobacillus population can be made up of true inhabitants or of bacteria occasionally ingested with fermented or spoiled foods, or with probiotics. This study longitudinally surveyed Lactobacillus species and strains in the feces of a healthy subject through whole genome sequencing (WGS) data-mining, in order to identify members of the permanent or transient populations. In three time-points (0, 670 and 700 d), 58 different species were identified, 16 of them being retrieved for the first time in human feces. L. rhamnosus, L. ruminis, L. delbrueckii, L. plantarum, L. casei and L. acidophilus were the most represented, with estimated amounts ranging between 6 and 8 Log (cells g(-1)), while the others were detected at 4 or 5 Log (cells g(-1)). 86 Lactobacillus strains belonging to 52 species were identified. 43 seemingly occupied the GIT as true residents, since they were detected in a time span of almost 2 years in all the three samples or in 2 samples separated by 670 or 700 d. As a whole, a stable community of lactobacilli was disclosed, with wide and understudied biodiversity. PMID:27043715

  12. Whole-genome shotgun optical mapping of Rhodobacter sphaeroides strain 2.4.1 and its use for whole-genome shotgun sequence assembly

    Energy Technology Data Exchange (ETDEWEB)

    Shou, S. [Univ. Wisc.-Madison]; Kvikstad, E. [Univ. Wisc.-Madison]; Kile, A. [Univ. Wisc.-Madison]; Severin, J.; Forrest, D. [Univ. Wisc.-Madison]; Runnheim, R. [Univ. Wisc.-Madison]; Churas, C. [Univ. Wisc.-Madison]; Hickman, J. W. [Univ. Wisc.-Madison]; Mackenzie, C. [University of Texas–Houston Medical School]; Choudhary, M. [University of Texas–Houston Medical School]; Donohue, T. [Univ. Wisc.-Madison]; Kaplan, S. [University of Texas–Houston Medical School]; Schwartz, D. C. [Univ. Wisc.-Madison]

    2003-09-01

    Rhodobacter sphaeroides 2.4.1 is a facultative photoheterotrophic bacterium with tremendous metabolic diversity, which has significantly contributed to our understanding of the molecular genetics of photosynthesis, photoheterotrophy, nitrogen fixation, hydrogen metabolism, carbon dioxide fixation, taxis, and tetrapyrrole biosynthesis. To further understand this remarkable bacterium, and to accelerate an ongoing sequencing project, two whole-genome restriction maps (EcoRI and HindIII) of R. sphaeroides strain 2.4.1 were constructed using shotgun optical mapping. The approach directly mapped genomic DNA by the random mapping of single molecules. The two maps were used to facilitate sequence assembly by providing an optical scaffold for high-resolution alignment and verification of sequence contigs. Our results show that such maps facilitated the closure of sequence gaps by the early detection of nascent sequence contigs during the course of the whole-genome shotgun sequencing process.

  13. Whole-Genome de novo Sequencing Of Quail And Grey Partridge

    DEFF Research Database (Denmark)

    Holm, Lars-Erik; Panitz, Frank; Burt, Dave;

    2011-01-01

    The development in sequencing methods has made it possible to perform whole genome de novo sequencing of species without large commercial interests. Within the EU-financed QUANTOMICS project (KBBE-2A-222664), we have performed de novo sequencing of quail (Coturnix coturnix) and grey partridge...... (Perdix perdix) on a Genome Analyzer GAII (Illumina) using paired-end sequencing. The amount of generated sequences amounts to 8 to 9 Gb for each species. The analysis and assembly of the generated sequences is ongoing. Access to the whole genome sequence from these two species will enable enhanced...... comparative studies towards the chicken genome and will aid in identifying evolutionarily conserved sequences within the Galliformes. The obtained sequences from quail and partridge represent a beginning of generating the whole genome sequence for these species. The continuation of establishing the genome...

  14. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities. PMID:27006240

  15. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    OpenAIRE

    Huang, August Y.; Xu, Xiaojing; Ye, Adam Y.; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Han-qing ZHAO; Wang, Meng; Gao, Hua; Gao, Ge

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of eff...

  16. Whole-Genome Sequences of Two Borrelia afzelii and Two Borrelia garinii Lyme Disease Agent Isolates

    Energy Technology Data Exchange (ETDEWEB)

    Casjens, S.R.; Dunn, J.; Mongodin, E. F.; Qiu, W.-G.; Luft, B. J.; Fraser-Liggett, C. M.; Schutzer, S. E.

    2011-12-01

    Human Lyme disease is commonly caused by several species of spirochetes in the Borrelia genus. In Eurasia these species are largely Borrelia afzelii, B. garinii, B. burgdorferi, and B. bavariensis sp. nov. Whole-genome sequencing is an excellent tool for investigating and understanding the influence of bacterial diversity on the pathogenesis and etiology of Lyme disease. We report here the whole-genome sequences of four isolates from two of the Borrelia species that cause human Lyme disease, B. afzelii isolates ACA-1 and PKo and B. garinii isolates PBr and Far04.

  17. Whole genome association mapping by incompatibilities and local perfect phylogenies

    DEFF Research Database (Denmark)

    Mailund, Thomas; Besenbacher, Søren; Schierup, Mikkel Heide

    2006-01-01

    method. The method was also found to accurately localise the known susceptibility variants in an empirical data set—the ΔF508 mutation for cystic fibrosis—where the susceptibility variant is already known—and to find significant signals for association between the CYP2D6 gene and poor drug metabolism......, although for this dataset the highest association score is about 60kb from the CYP2D6 gene. Conclusions: Our method has been implemented in the Blossoc (BLOck aSSOCiation) software. Using Blossoc, genome wide chip-based surveys of 3 million SNPs in 1000 cases and 1000 controls can be analysed in less than...

  18. Whole-Genome Sequencing Detection of Ongoing Listeria Contamination at a Restaurant, Rhode Island, USA, 2014

    Science.gov (United States)

    Gosciminski, Michael; Miller, Adam

    2016-01-01

    In November 2014, the Rhode Island Department of Health investigated a cluster of 3 listeriosis cases. Using whole-genome sequencing to support epidemiologic, laboratory, and environmental investigations, the department identified 1 restaurant as the likely source of the outbreak and also linked the establishment to a listeriosis case that occurred in 2013. PMID:27434089

  19. Whole-genome sequence of “Candidatus Liberibacter solanacearum” strain R1 from California

    Science.gov (United States)

    The draft whole-genome sequence of “Candidatus Liberibacter solanacearum” strain R1, isolated from a tomato plant in California, United States, is reported. The R1 strain genome is 1,204,257 bp in size (G+C content of 35.3%), encoding 1,101 open reading frames and 57 RNA genes....

  20. Whole Genome Selection Project Involving 2,000 Industry AI Sires

    Science.gov (United States)

    Whole genome selection (WGS) uses markers spanning the genome to predict genetic merit for economically important traits. WGS may increase the rate of genetic progress through improved accuracy and reduced generation interval especially for traits that cannot be measured on breeding animals. In cont...

  1. Draft Whole-Genome Sequence of the Type Strain Bacillus horikoshii DSM 8719

    Science.gov (United States)

    Hernández-González, Ismael L.

    2016-01-01

    Members of the Bacillus genus have been extensively studied because of their ability to produce enzymes with high biotechnological value. Here, we report the draft of the whole-genome sequence of the type strain Bacillus horikoshii DSM 8719, an alkali-tolerant strain. PMID:27417833

  2. Tolerance of Whole-Genome Doubling Propagates Chromosomal Instability and Accelerates Cancer Genome Evolution

    DEFF Research Database (Denmark)

    Dewhurst, Sally M.; McGranahan, Nicholas; Burrell, Rebecca A.; Rowan, Andrew J.; Grönroos, Eva; Endesfelder, David; Joshi, Tejal; Mouradov, Dmitri; Gibbs, Peter; Ward, Robyn L.; Hawkins, Nicholas J.; Szallasi, Zoltan Imre; Sieber, Oliver M.; Swanton, Charles

    2014-01-01

    The contribution of whole-genome doubling to chromosomal instability (CIN) and tumor evolution is unclear. We use long-term culture of isogenic tetraploid cells from a stable diploid colon cancer progenitor to investigate how a genome-doubling event affects genome stability over time. Rare cells ...

  3. Whole-Genome Sequencing Detection of Ongoing Listeria Contamination at a Restaurant, Rhode Island, USA, 2014.

    Science.gov (United States)

    Barkley, Jonathan S; Gosciminski, Michael; Miller, Adam

    2016-08-01

    In November 2014, the Rhode Island Department of Health investigated a cluster of 3 listeriosis cases. Using whole-genome sequencing to support epidemiologic, laboratory, and environmental investigations, the department identified 1 restaurant as the likely source of the outbreak and also linked the establishment to a listeriosis case that occurred in 2013. PMID:27434089

  4. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    Directory of Open Access Journals (Sweden)

    Kok-Gan Chan

    2016-03-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  5. Whole-Genome Shotgun Sequencing of Lactobacillus rhamnosus MTCC 5462, a Strain with Probiotic Potential

    OpenAIRE

    Prajapati, J. B.; Khedkar, C. D.; Chitra, J.; Suja, Senan; V. Mishra; Sreeja, V.; Patel, R. K.; Ahir, V. B.; Bhatt, V. D.; Sajnani, M. R.; Jakhesara, S. J.; Koringa, P. G.; Joshi, C. G.

    2012-01-01

    Lactobacillus rhamnosus MTCC 5462 was isolated from infant gastrointestinal flora. The strain exhibited an ability to reduce cholesterol and stimulate immunity. The strain has exhibited positive results in alleviating gastrointestinal discomfort and good potential as a probiotic. We sequenced the whole genome of the strain and compared it to the published genome sequence of Lactobacillus rhamnosus GG (ATCC 53103).

  6. Capsular Typing Method for Streptococcus agalactiae Using Whole-Genome Sequence Data

    OpenAIRE

    Sheppard, AE; Vaughan, A; Jones, N.; Turner, P; Turner, C.; Efstratiou, A.; Patel, D.; MMM Informatics Group; Walker, AS; Berkley, J.; Crook, DW; Seale, AC

    2016-01-01

    Group B streptococcus (GBS) capsular serotype is a major determinant of virulence, and affects potential vaccine coverage. Here we report a whole genome sequencing-based method for GBS serotype assignment. This shows high agreement (kappa=0.92) with conventional methods, and increased serotype assignment (100%) to all ten capsular types.

  7. Capsular Typing Method for Streptococcus agalactiae Using Whole-Genome Sequence Data.

    Science.gov (United States)

    Sheppard, Anna E; Vaughan, Alison; Jones, Nicola; Turner, Paul; Turner, Claudia; Efstratiou, Androulla; Patel, Darshana; Walker, A Sarah; Berkley, James A; Crook, Derrick W; Seale, Anna C

    2016-05-01

    Group B streptococcus (GBS) capsular serotypes are major determinants of virulence and affect potential vaccine coverage. Here we report a whole-genome-sequencing-based method for GBS serotype assignment. This method shows strong agreement (kappa of 0.92) with conventional methods and increased serotype assignment (100%) to all 10 capsular types. PMID:26962081

  8. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    OpenAIRE

    Kok-Gan Chan; Wai-Fong Yin; Xin-Yue Chan

    2015-01-01

    Klebsiella pneumoniae T2-1-1 was isolated from human tongue debris, subjected to whole genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  9. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity.

    Science.gov (United States)

    Chan, Kok-Gan; Yin, Wai-Fong; Chan, Xin-Yue

    2016-03-01

    Klebsiella pneumoniae T2-1-1 was isolated from human tongue debris, subjected to whole genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000. PMID:26981378

  10. Whole-Genome Shotgun Sequence of Pseudomonas viridiflava, a Bacterium Species Pathogenic to Arabidopsis thaliana

    OpenAIRE

    Lefort, Francois; Calmin, Gautier; Crovadore, Julien; Osteras, Magne; Farinelli, Laurent

    2013-01-01

    We report here the first whole-genome shotgun sequence of Pseudomonas viridiflava strain UASWS38, a bacterium species pathogenic to the biological model plant Arabidopsis thaliana but also usable as a biological control agent and thus of great scientific interest for understanding the genetics of plant-microbe interactions.

  11. Whole-Genome Shotgun Sequence of Rhodococcus Species Strain JVH1

    OpenAIRE

    Brooks, Shannon L.; Van Hamme, Jonathan D.

    2012-01-01

    Here we present a whole-genome shotgun sequence of Rhodococcus species strain JVH1, an organism capable of degrading a variety of organosulfur compounds. In particular, JVH1 is able to selectively cleave carbon-sulfur bonds within alkyl chains. A large number of oxygenases were identified, consistent with other members of the genus.

  12. WIDE-CROSS WHOLE-GENOME RADIATION HYBRID MAPPING OF THE COTTON (GOSSYPIUM BARBADENSE L.) GENOME

    Science.gov (United States)

    Whole-genome radiation hybrid mapping has been applied extensively to human and certain animal species but little to plants. We recently demonstrated an alternative mapping approach in cotton (Gossypium hirsutum L.) based on segmentation by 5-krad gamma-irradiation and derivation of wide-cross whol...

  13. A whole-genome assembly of the domestic cow, Bos taurus

    Science.gov (United States)

    Background: The genome of the domestic cow, Bos taurus, was sequenced using a mixture of hierarchical and whole-genome shotgun sequencing methods. Results: We have assembled the 35 million sequence reads and applied a variety of assembly improvement techniques, creating an assembly of 2.86 billion b...

  14. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity

    OpenAIRE

    Kok-Gan Chan; Wai-Fong Yin; Xin-Yue Chan

    2016-01-01

    Klebsiella pneumoniae T2-1-1 was isolated from human tongue debris, subjected to whole genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000.

  15. Whole-genome analyses resolve early branches in the tree of life of modern birds

    DEFF Research Database (Denmark)

    Sicheritz-Pontén, Thomas; Li, Cai; Li, Bo;

    2014-01-01

    and confirm independent gains of vocal learning. Among Columbea, we identify pigeons and flamingoes as belonging to sister clades. Even with whole genomes, some of the earliest branches in Neoaves proved challenging to resolve, which was best explained by massive protein-coding sequence convergence...

  16. Whole-Genome Sequence of Aeromonas hydrophila Strain AH-1 (Serotype O11).

    Science.gov (United States)

    Forn-Cuní, Gabriel; Tomás, Juan M; Merino, Susana

    2016-01-01

    Aeromonas hydrophila is an emerging pathogen of aquatic and terrestrial animals, including humans. Here, we report the whole-genome sequence of the septicemic A. hydrophila AH-1 strain, belonging to the serotype O11, and the first mesophilic Aeromonas with surface layer (S-layer) to be sequenced. PMID:27587829

  17. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    Science.gov (United States)

    Singh, Pallavi; Springman, A. Cody; Davies, H. Dele

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources. PMID:23045509

  18. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation

    DEFF Research Database (Denmark)

    Zhao, Shancen; Zheng, Pingping; Dong, Shanshan;

    2013-01-01

    dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two...

  19. Whole-Genome Scans Provide Evidence of Adaptive Evolution in Malawian Plasmodium falciparum Isolates

    DEFF Research Database (Denmark)

    Ocholla, Harold; Preston, Mark D; Mipando, Mwapatsa;

    2014-01-01

    BACKGROUND:  Selection by host immunity and antimalarial drugs has driven extensive adaptive evolution in Plasmodium falciparum and continues to produce ever-changing landscapes of genetic variation. METHODS:  We performed whole-genome sequencing of 69 P. falciparum isolates from Malawi and used ...

  20. Whole-Genome Shotgun Sequencing of a Colonizing Multilocus Sequence Type 17 Streptococcus agalactiae Strain

    OpenAIRE

    Singh, Pallavi; Springman, A. Cody; Davies, H Dele; Manning, Shannon D.

    2012-01-01

    This report highlights the whole-genome shotgun draft sequence for a Streptococcus agalactiae strain representing multilocus sequence type (ST) 17, isolated from a colonized woman at 8 weeks postpartum. This sequence represents an important addition to the published genomes and will promote comparative genomic studies of S. agalactiae recovered from diverse sources.

  1. Genetics professionals' opinions of whole-genome sequencing in the newborn period.

    Science.gov (United States)

    Ulm, Elizabeth; Feero, W Gregory; Dineen, Richard; Charrow, Joel; Wicklund, Catherine

    2015-06-01

    Newborn screening (NBS) programs have been successful in identifying infants with rare, treatable, congenital conditions. While current programs rely largely on biochemical analysis, some predict that in the future, genome sequencing may be used as an adjunct. The purpose of this exploratory pilot study was to begin to characterize genetics professionals' opinions of the use of whole-genome sequencing (WGS) in NBS. We surveyed members of the American College of Medical Genetics and Genomics (ACMG) via an electronic survey distributed through email. The survey included questions about results disclosure, the current NBS paradigm, and the current criteria for adding a condition to the screening panel. The response rate was 7.3 % (n = 113/1549). The majority of respondents (85 %, n = 96/113) felt that WGS should not be currently used in NBS, and that if it were used, it should not be mandatory (86.5 %, n = 96/111). However, 75.7 % (n = 84/111) foresee it as a future use of WGS. Respondents felt that accurate interpretation of results (86.5 %, n = 83/96), a more extensive consent process (72.6 %, n = 69/95), pre- (79.2 %, n = 76/96) and post-test (91.6 %, n = 87/95) counseling, and comparable costs (70.8 %, n = 68/96) and turn-around-times (64.6 %, n = 62/96) to current NBS would be important for using WGS in NBS. Participants were in favor of disclosing most types of results at some point in the lifetime. However, the majority (87.3 %, n = 96/110) also indicated that parents should be able to choose what results are disclosed. Overall, respondents foresee NBS as a future use of WGS, but indicated that WGS should not occur within the framework of traditional NBS. They agreed with the current criteria for including a condition on the recommended uniform screening panel (RUSP). Further discussion about these criteria is needed in order to better understand how they could be utilized if WGS is incorporated into NBS. PMID:25348082

  2. Generation of physical map contig-specific sequences useful for whole genome sequence scaffolding.

    Directory of Open Access Journals (Sweden)

    Yanliang Jiang

    Along with the rapid advances in next-generation sequencing technologies, more and more species are being added to the list of organisms whose whole genomes have been sequenced. However, the assembled draft genomes of many organisms consist of numerous small contigs, owing to the short length of the reads generated by next-generation sequencing platforms. In order to improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences in each physical contig. A two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, which dramatically reduced the cost. A total of 94,111,841 100-bp reads and 315,277 assembled contigs were identified as containing physical map contig-specific tags. The physical map contig-specific sequences, along with the currently available BAC end sequences, were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of the whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources for linking the physical map, genetic linkage map and draft whole genome sequences; consequently, they can improve whole genome sequence assembly and scaffolding, as well as genome-wide comparative analysis. The strategy developed in this study could also be adopted in other species whose whole genome assembly still faces challenges.

  3. Whole genome association mapping by incompatibilities and local perfect phylogenies

    Directory of Open Access Journals (Sweden)

    Besenbacher Søren

    2006-10-01

    despite being significantly faster. For unphased genotype data, an initial step of estimating the phase only slightly decreases the power of the method. The method was also found to accurately localise the known susceptibility variant in an empirical data set (the ΔF508 mutation for cystic fibrosis) and to find significant signals for association between the CYP2D6 gene and poor drug metabolism, although for this dataset the highest association score is about 60 kb from the CYP2D6 gene. Conclusion: Our method has been implemented in the Blossoc (BLOck aSSOCiation) software. Using Blossoc, genome-wide chip-based surveys of 3 million SNPs in 1000 cases and 1000 controls can be analysed in less than two CPU hours.

  4. Transmission Clusters of Methicillin-Resistant Staphylococcus Aureus in Long-Term Care Facilities Based on Whole-Genome Sequencing.

    Science.gov (United States)

    Stine, O Colin; Burrowes, Shana; David, Sophia; Johnson, J Kristie; Roghmann, Mary-Claire

    2016-06-01

    OBJECTIVE: To define how often methicillin-resistant Staphylococcus aureus (MRSA) is spread from resident to resident in long-term care facilities using whole-genome sequencing. DESIGN: Prospective cohort study. SETTING: A long-term care facility. PARTICIPANTS: Elderly residents in a long-term care facility. METHODS: Cultures for MRSA were obtained weekly from multiple body sites from residents with known MRSA colonization over 12-week study periods. Simultaneously, cultures to detect MRSA acquisition were obtained weekly from 2 body sites in residents without known MRSA colonization. During the first 12-week cycle on a single unit, we sequenced 8 MRSA isolates per swab for 2 body sites from each of 6 residents. During the second 12-week cycle, we sequenced 30 MRSA isolates from 13 residents with known MRSA colonization and 3 residents who had acquired MRSA colonization. RESULTS: MRSA isolates from the same swab showed little genetic variation between isolates, with the exception of isolates from wounds. The genetic variation of isolates between body sites on an individual was greater than that within a single body site, with the exception of 1 sample, which had 2 unrelated strains among the 8 isolates. In the second cycle, 10 of 16 residents colonized with MRSA (63%) shared 1 of 3 closely related strains. Of the 3 residents with newly acquired MRSA, 2 residents harbored isolates that were members of these clusters. CONCLUSIONS: Point prevalence surveys with whole-genome sequencing of MRSA isolates may detect resident-to-resident transmission more accurately than routine surveillance cultures for MRSA in long-term care facilities. Infect Control Hosp Epidemiol 2016;37:685-691. PMID:26941060

  5. Downsizing genomic medicine: approaching the ethical complexity of whole-genome sequencing by starting small.

    Science.gov (United States)

    Sharp, Richard R

    2011-03-01

    As we look to a time when whole-genome sequencing is integrated into patient care, it is possible to anticipate a number of ethical challenges that will need to be addressed. The most intractable of these concern informed consent and the responsible management of very large amounts of genetic information. Given the range of possible findings, it remains unclear to what extent it will be possible to obtain meaningful patient consent to genomic testing. Equally unclear is how clinicians will disseminate the enormous volume of genetic information produced by whole-genome sequencing. Toward developing practical strategies for managing these ethical challenges, we propose a research agenda that approaches multiplexed forms of clinical genetic testing as natural laboratories in which to develop best practices for managing the ethical complexities of genomic medicine. PMID:21311340

  6. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D;

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls, for imputing sequence variant genotypes into reference sets for genomic prediction. Run 3.0 included 429 sequences, with 31.8 million variants detected. BayesRC, a new method for genomic prediction, addresses some challenges associated with using the sequence data, and takes advantage of biological information. In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant...

  7. Whole genome sequencing of Mycobacterium tuberculosis SB24 isolated from Sabah, Malaysia.

    Science.gov (United States)

    Philip, Noraini; Rodrigues, Kenneth Francis; William, Timothy; John, Daisy Vanitha

    2016-09-01

    Mycobacterium tuberculosis (M. tuberculosis) is the causative agent of tuberculosis (TB), which causes millions of deaths every year. We have sequenced the genome of M. tuberculosis isolated from the cerebrospinal fluid (CSF) of a patient diagnosed with tuberculous meningitis (TBM). The isolated strain was referred to as M. tuberculosis SB24. Genomic DNA of M. tuberculosis SB24 was extracted and subjected to whole genome sequencing using the PacBio platform. The draft genome size of M. tuberculosis SB24 was determined to be 4,452,489 bp with a G + C content of 65.6%. The whole genome shotgun project has been deposited in NCBI SRA under the accession number SRP076503. PMID:27556011

  8. Comparison of variations detection between whole-genome amplification methods used in single-cell resequencing

    DEFF Research Database (Denmark)

    Hou, Yong; Wu, Kui; Shi, Xulian;

    2015-01-01

    BACKGROUND: Single-cell resequencing (SCRS) provides many biomedical advances in variations detection at the single-cell level, but it currently relies on whole genome amplification (WGA). Three methods are commonly used for WGA: multiple displacement amplification (MDA), degenerate-oligonucleotide-primed PCR (DOP-PCR) and multiple annealing and looping-based amplification cycles (MALBAC). However, a comprehensive comparison of variations detection performance between these WGA methods has not yet been performed. RESULTS: We systematically compared the advantages and disadvantages of different WGA methods, focusing particularly on variations detection. Low-coverage whole-genome sequencing revealed that DOP-PCR had the highest duplication ratio, but an even read distribution and the best reproducibility and accuracy for detection of copy-number variations (CNVs). However, MDA had significantly...

  9. Applications of the double-barreled data in whole-genome shotgun sequence assembly and analysis

    Institute of Scientific and Technical Information of China (English)

    HAN Yujun; WANG Jing; GU Xiaocheng; YU Jun; LI Songgang; NI Peixiang; LÜ Hong; YE Jia; HU Jianfei; CHEN Chen; HUANG Xiangang; CONG Lijuan; LI Guangyuan

    2005-01-01

    Double-barreled (DB) data have been widely used for the assembly of large genomes. Based on the experience of building the whole-genome working draft of Oryza sativa L. ssp. indica, we present here the prevailing and improved uses of DB data in the assembly procedure and report on novel applications during the subsequent data-mining processes, such as acquiring precise insert fragment information for each clone across the genome and a new kind of low-cost whole-genome microarray. With the increasing number of organisms being sequenced, we believe that DB data will play an important role both in other assembly procedures and in future genomic studies.

  10. Unique Features of a Japanese ‘Candidatus Liberibacter asiaticus’ Strain Revealed by Whole Genome Sequencing

    OpenAIRE

    Hiroshi Katoh; Shin-Ichi Miyata; Hiromitsu Inoue; Toru Iwanami

    2014-01-01

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, 'Candidatus Liberibacter asiaticus', 'Ca. L. americanus', and 'Ca. L. africanus'. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative 'C...

  11. A Model for Carbohydrate Metabolism in the Diatom Phaeodactylum tricornutum Deduced from Comparative Whole Genome Analysis

    OpenAIRE

    Kroth, Peter G.; Chiovitti, Anthony; Gruber, Ansgar; Martin-jezequel, Veronique; Mock, Thomas; Schnitzler Parker, Micaela; Michele S. Stanley; Kaplan, Aaron; Caron, Lise; Weber, Till; Maheswari, Uma; Armbrust, Elisabeth Virginia; Bowler, Chris

    2008-01-01

    Background: Diatoms are unicellular algae responsible for approximately 20% of global carbon fixation. Their evolution by secondary endocytobiosis resulted in a complex cellular structure and metabolism compared to algae with primary plastids. Methodology/Principal Findings: The whole genome sequence of the diatom Phaeodactylum tricornutum has recently been completed. We identified and annotated genes for enzymes involved in carbohydrate pathways based on extensive EST support and comparison to ...

  12. Effective Normalization for Copy Number Variation Detection from Whole Genome Sequencing

    OpenAIRE

    Janevski Angel; Varadan Vinay; Kamalakaran Sitharthan; Banerjee Nilanjana; Dimitrova Nevenka

    2012-01-01

    Abstract Background Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV e...

  13. Using Mendelian inheritance errors as quality control criteria in whole genome sequencing data set

    OpenAIRE

    Pilipenko, Valentina V; He, Hua; Kurowski, Brad G.; Alexander, Eileen S.; Zhang, Xue; Ding, Lili; Mersha, Tesfaye B.; Kottyan, Leah; Fardo, David W.; Martin, Lisa J.

    2014-01-01

    Although the technical and analytic complexity of whole genome sequencing is generally appreciated, best practices for data cleaning and quality control have not been defined. Family based data can be used to guide the standardization of specific quality control metrics in nonfamily based data. Given the low mutation rate, Mendelian inheritance errors are likely to result from erroneous genotype calls. Thus, our goal was to identify the characteristics that determine Mendelian inheritance err...
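
    To make the notion of a Mendelian inheritance error concrete, the following is a minimal sketch of how one might flag such errors in father-mother-child trios. It assumes autosomal, diploid genotypes with no missing calls; the function names and the toy genotypes are hypothetical and do not reproduce the cited study's pipeline.

```python
from itertools import product


def is_mendelian_error(father, mother, child):
    """True if the child's genotype cannot be formed from one allele of each parent.

    Genotypes are 2-tuples of allele labels, e.g. ("A", "G").
    """
    # Every unordered genotype obtainable by pairing one paternal and one maternal allele.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) not in possible


def mendelian_error_rate(trios):
    """Fraction of (father, mother, child) genotype triples that are inconsistent."""
    trios = list(trios)
    if not trios:
        return 0.0
    errors = sum(is_mendelian_error(f, m, c) for f, m, c in trios)
    return errors / len(trios)


# Toy data: the first site is inconsistent (no parent carries G), the second is fine.
sites = [
    (("A", "A"), ("A", "A"), ("A", "G")),
    (("A", "G"), ("A", "G"), ("G", "G")),
]
print(mendelian_error_rate(sites))
```

    A per-variant or per-sample error rate computed this way could then be compared against other call-quality metrics when choosing quality control thresholds.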

  14. Pathway Processor: A Tool for Integrating Whole-Genome Expression Results into Metabolic Networks

    OpenAIRE

    Grosu, Paul; Townsend, Jeffrey P.; Hartl, Daniel L.; Cavalieri, Duccio

    2002-01-01

    We have developed a new tool to visualize expression data on metabolic pathways and to evaluate which metabolic pathways are most affected by transcriptional changes in whole-genome expression experiments. Using the Fisher Exact Test, the method scores biochemical pathways according to the probability that as many or more genes in a pathway would be significantly altered in a given experiment by chance alone. This method has been validated on diauxic shift experiments and reproduces well know...
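
    The scoring idea described in the record above, the probability that as many or more genes in a pathway would be altered by chance alone, corresponds to the upper tail of a hypergeometric distribution (a one-sided Fisher exact test). The following is a minimal sketch using only the Python standard library; the function name, argument names, and the toy numbers are assumptions for illustration and are not the tool's actual interface.

```python
from math import comb


def pathway_enrichment_p(total_genes, total_altered, pathway_genes, pathway_altered):
    """One-sided Fisher exact p-value: P(X >= pathway_altered).

    X is the number of altered genes falling in a pathway of size pathway_genes
    when total_altered genes are drawn at random from total_genes.
    """
    upper = min(pathway_genes, total_altered)
    denominator = comb(total_genes, total_altered)
    # Sum the hypergeometric probabilities from the observed count up to the maximum.
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, total_altered - k)
        for k in range(pathway_altered, upper + 1)
    )
    return tail / denominator


# Toy example: 10 genes measured, 3 altered overall, 4 genes in the pathway,
# all 3 altered genes fall inside the pathway.
print(pathway_enrichment_p(10, 3, 4, 3))
```

    Smaller values indicate pathways whose share of altered genes is less likely to be explained by chance.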

  15. Whole-Genome Sequence of Rummeliibacillus stabekisii Strain PP9 Isolated from Antarctic Soil

    Science.gov (United States)

    da Mota, Fábio Faria; Vollú, Renata Estebanez; Jurelevicius, Diogo

    2016-01-01

    The whole genome of Rummeliibacillus stabekisii PP9, isolated from a soil sample from Antarctica, consists of a circular chromosome of 3,412,092 bp and a circular plasmid of 8,647 bp, with 3,244 protein-coding genes, 12 copies of the 16S-23S-5S rRNA operon, 101 tRNA genes, and 6 noncoding RNAs (ncRNAs). PMID:27231360

  16. Self-organizing Approach for Automated Gene Identification in Whole Genomes

    CERN Document Server

    Gorban, Alexander N.; Zinovyev, Andrey Yu.; Popova, Tatyana G.

    2001-01-01

    An approach that uses the idea of a distinguished coding phase in explicit form to identify protein-coding regions (exons) in whole genomes has been proposed. For several genomes, an optimal window length for averaging the GC-content function and calculating codon frequencies has been found. A self-training procedure based on clustering in the multidimensional space of triplet frequencies is proposed. To visualize data in the space of triplet frequencies, the method of elastic maps was applied.
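
    The two quantities named in the record above, windowed GC content and phase-specific triplet frequencies, can be sketched as follows. This covers only the feature computation, not the clustering or elastic-map steps; the function names, window sizes, and toy sequences are assumptions for illustration.

```python
from collections import Counter


def gc_content(window):
    """Fraction of G and C bases in a non-empty sequence window."""
    window = window.upper()
    return sum(base in "GC" for base in window) / len(window)


def sliding_gc(seq, window, step):
    """GC content of each full-length window, moving along seq by step bases."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def triplet_frequencies(seq, phase=0):
    """Relative frequencies of non-overlapping triplets read from the given phase (0, 1 or 2)."""
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(phase, len(seq) - 2, 3)]
    counts = Counter(triplets)
    total = len(triplets)
    return {triplet: n / total for triplet, n in counts.items()}


print(sliding_gc("GGCCAATT", window=4, step=4))
print(triplet_frequencies("ATGATGCCC"))
```

    Computing the triplet frequencies separately for each of the three phases yields the kind of frequency vectors that a clustering step could then operate on.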

  17. Clinical Application of Whole-Genome Sequencing To Inform Treatment for Multidrug-Resistant Tuberculosis Cases

    OpenAIRE

    Witney, Adam A.; Gould, Katherine A.; Arnold, Amber; Coleman, David; Delgado, Rachel; Dhillon, Jasvir; Pond, Marcus J.; Pope, Cassie F.; Planche, Tim D.; Stoker, Neil G.; Cosgrove, Catherine A.; Butcher, Philip D.; Harrison, Thomas S; Hinds, Jason

    2015-01-01

    The treatment of drug-resistant tuberculosis cases is challenging, as drug options are limited, and the existing diagnostics are inadequate. Whole-genome sequencing (WGS) has been used in a clinical setting to investigate six cases of suspected extensively drug-resistant Mycobacterium tuberculosis (XDR-TB) encountered at a London teaching hospital between 2008 and 2014. Sixteen isolates from six suspected XDR-TB cases were sequenced; five cases were analyzed in a clinically relevant time fram...

  18. Whole-genome fingerprint of the DNA methylome during human B cell differentiation

    OpenAIRE

    Kulis, Marta; Merkel, Angelika; Heath, Simon; Queirós, Ana C; Schuyler, Ronald P.; Castellano, Giancarlo; Beekman, Renée; Raineri, Emanuele; Esteve, Anna; Clot, Guillem; Verdaguer-Dot, Néria; Duran-Ferrer, Martí; Russiñol, Nuria; Vilarrasa-Blasi, Roser; Ecker, Simone

    2015-01-01

    We analyzed the DNA methylome of ten subpopulations spanning the entire B cell differentiation program by whole-genome bisulfite sequencing and high-density microarrays. We observed that non-CpG methylation disappeared upon B cell commitment, whereas CpG methylation changed extensively during B cell maturation, showing an accumulative pattern and affecting around 30% of all measured CpG sites. Early differentiation stages mainly displayed enhancer demethylation, whic...

  19. Prospective Whole-Genome Sequencing Enhances National Surveillance of Listeria monocytogenes

    OpenAIRE

    Kwong, Jason C.; Mercoulia, Karolina; Tomita, Takehiro; Easton, Marion; Li, Hua Y.; Bulach, Dieter M.; Stinear, Timothy P.; Seemann, Torsten; Benjamin P Howden

    2016-01-01

    Whole-genome sequencing (WGS) has emerged as a powerful tool for comparing bacterial isolates in outbreak detection and investigation. Here we demonstrate that WGS performed prospectively for national epidemiologic surveillance of Listeria monocytogenes has the capacity to be superior to our current approaches using pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), multilocus variable-number tandem-repeat analysis (MLVA), binary typing, and serotyping. Initially 423 ...

  20. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples

    OpenAIRE

    Chun-nan Dong; Ya-dong Yang; Shu-jin Li; Ya-ran Yang; Xiao-jing Zhang; Xiang-dong Fang; Jiang-wei Yan; Bin Cong

    2016-01-01

    In cases of mass disasters, missing persons and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and ident...

  1. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples

    OpenAIRE

    Minsoung Rhee; Yooli K Light; Meagher, Robert J.; Anup K. Singh

    2016-01-01

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template D...

  2. Whole-Genome Sequence of Chlamydia gallinacea Type Strain 08-1274/3

    Science.gov (United States)

    Hölzer, Martin; Laroucau, Karine; Creasy, Heather Huot; Ott, Sandra; Vorimore, Fabien; Bavoil, Patrik M.; Marz, Manja

    2016-01-01

    The recently introduced bacterial species Chlamydia gallinacea is known to occur in domestic poultry and other birds. Its potential as an avian pathogen and zoonotic agent is under investigation. The whole-genome sequence of its type strain, 08-1274/3, consists of a 1,059,583-bp chromosome with 914 protein-coding sequences (CDSs) and a plasmid (p1274) comprising 7,619 bp with 9 CDSs. PMID:27445388

  3. Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity

    OpenAIRE

    Dulak, Austin M.; Stojanov, Petar; Peng, Shouyong; Lawrence, Michael S; Fox, Cameron; Stewart, Chip; Bandla, Santhoshi; Imamura, Yu; Schumacher, Steven E; Shefler, Erica; McKenna, Aaron; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L.; Saksena, Gordon

    2012-01-01

    The incidence of esophageal adenocarcinoma (EAC) has risen 600% over the last 30 years. With a five-year survival rate of 15%, identification of new therapeutic targets for EAC is greatly important. We analyze the mutation spectra from whole exome sequencing of 149 EAC tumor/normal pairs, 15 of which have also been subjected to whole genome sequencing. We identify a mutational signature defined by a high prevalence of A to C transversions at AA dinucleotides. Statistical analysis of exome da...

  4. Systematic Pharmacogenomics Analysis of a Malay Whole Genome: Proof of Concept for Personalized Medicine

    OpenAIRE

    Salleh, Mohd Zaki; Teh, Lay Kek; Lee, Lian Shien; Ismet, Rose Iszati; Patowary, Ashok; Joshi, Kandarp; Pasha, Ayesha; Ahmed, Azni Zain; Janor, Roziah Mohd; Hamzah, Ahmad Sazali; Adam, Aishah; Yusoff, Khalid; Hoh, Boon Peng; Hatta, Fazleen Haslinda Mohd; Ismail, Mohamad Izwan

    2013-01-01

    Background With a higher throughput and lower cost in sequencing, second generation sequencing technology has immense potential for translation into clinical practice and in the realization of pharmacogenomics based patient care. The systematic analysis of whole genome sequences to assess patient to patient variability in pharmacokinetics and pharmacodynamics responses towards drugs would be the next step in future medicine in line with the vision of personalizing medicine. Methods Genomic DN...

  5. Whole-genome sequencing and analysis of the Malaysian cynomolgus macaque (Macaca fascicularis) genome

    OpenAIRE

    Higashino, Atsunori; Sakate, Ryuichi; Kameoka, Yosuke; Takahashi, Ichiro; Hirata, Makoto; Tanuma, Reiko; Masui, Tohru; Yasutomi, Yasuhiro; Osada, Naoki

    2012-01-01

    Background The genetic background of the cynomolgus macaque (Macaca fascicularis) is made complex by the high genetic diversity, population structure, and gene introgression from the closely related rhesus macaque (Macaca mulatta). Herein we report the whole-genome sequence of a Malaysian cynomolgus macaque male with more than 40-fold coverage, which was determined using a resequencing method based on the Indian rhesus macaque genome. Results We identified approximately 9.7 million single nuc...

  6. Analytical validation of whole exome and whole genome sequencing for clinical applications

    OpenAIRE

    Linderman, Michael D.; Brandt, Tracy; Edelmann, Lisa; Jabado, Omar; Kasai, Yumi; Kornreich, Ruth; Mahajan, Milind; Shah, Hardik; Kasarskis, Andrew; Eric E Schadt

    2014-01-01

    Background Whole exome and genome sequencing (WES/WGS) is now routinely offered as a clinical test by a growing number of laboratories. As part of the test design process each laboratory must determine the performance characteristics of the platform, test and informatics pipeline. This report documents one such characterization of WES/WGS. Methods Whole exome and whole genome sequencing was performed on multiple technical replicates of five reference samples using the Illumina HiSeq 2000/2500...

  7. Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

    OpenAIRE

    Nora Rieber; Marc Zapatka; Bärbel Lasitschka; David Jones; Paul Northcott; Barbara Hutter; Natalie Jäger; Marcel Kool; Michael Taylor; Peter Lichter; Stefan Pfister; Stephan Wolf; Benedikt Brors; Roland Eils

    2013-01-01

    The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina's HiSeq2000, Life Technologies' SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics' technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms...

  8. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing

    OpenAIRE

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-01-01

    Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from la...

  9. Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

    OpenAIRE

    Rieber, Nora; Zapatka, Marc; Lasitschka, Bärbel; Jones, David; Northcott, Paul; Hutter, Barbara; Jäger, Natalie; Kool, Marcel; Taylor, Michael; Lichter, Peter; Pfister, Stefan; Wolf, Stephan; Brors, Benedikt; Eils, Roland

    2013-01-01

    The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms...

  10. Identification of emergent blaCMY-2-carrying Proteus mirabilis lineages by whole-genome sequencing

    OpenAIRE

    Mac Aogáin, M.; Rogers, T.R.; Crowley, B.

    2016-01-01

    Whole-genome sequencing of 24 Proteus mirabilis isolates revealed the clonal expansion of two cefoxitin-resistant strains among patients with community-onset infection. These strains harboured blaCMY-2 within a chromosomally located integrative and conjugative element and exhibited multidrug resistance phenotypes. A predominant strain, identified in 18 patients, also harboured the PGI-1 genomic island and associated resistance genes, accounting for its broader antibiotic resistance profile. ...

  11. Organization and Evolution of Primate Centromeric DNA from Whole-Genome Shotgun Sequence Data

    OpenAIRE

    Alkan, Can; Eichler, Evan E.; Ventura, Mario; Archidiacono, Nicoletta; Rocchi, Mariano; Sahinalp, S Cenk

    2007-01-01

    Author Summary Centromeric DNA has been described as the last frontier of genomic sequencing; such regions are typically poorly assembled during the whole-genome shotgun sequence assembly process due to their repetitive complexity. This paper develops a computational algorithm to systematically extract data regarding primate centromeric DNA structure and organization from that ∼5% of sequence that is not included as part of standard genome sequence assemblies. Using this computational approac...

  12. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity

    OpenAIRE

    Lee, Kyung-Tai; Chung, Won-Hyong; Lee, Sung-Yeoun; Choi, Jung-Woo; Kim, Jiwoong; Lim, Dajeong; Lee, Seunghwan; Jang, Gul-Won; Kim, Bumsoo; Choy, Yun Ho; Liao, Xiaoping; Stothard, Paul; Moore, Stephen S; Lee, Sang-Heon; Ahn, Sungmin

    2013-01-01

    Background Hanwoo (Korean cattle), which originated from natural crossbreeding between taurine and zebu cattle, migrated to the Korean peninsula through North China. Hanwoo were raised as draft animals until the 1970s without the introduction of foreign germplasm. Since 1979, Hanwoo has been bred as beef cattle. Genetic variation was analyzed by whole-genome deep resequencing of a Hanwoo bull. The Hanwoo genome was compared to that of two other breeds, Black Angus and Holstein, and genes with...

  13. How accurately is ncRNA aligned within whole-genome multiple alignments?

    OpenAIRE

    Ruzzo Walter L; Wang Adrienne X; Tompa Martin

    2007-01-01

    Abstract Background Multiple alignment of homologous DNA sequences is of great interest to biologists since it provides a window into evolutionary processes. At present, the accuracy of whole-genome multiple alignments, particularly in noncoding regions, has not been thoroughly evaluated. Results We evaluate the alignment accuracy of certain noncoding regions using noncoding RNA alignments from Rfam as a reference. We inspect the MULTIZ 17-vertebrate alignment from the UCSC Genome Browser for...

  14. Screening of whole genome sequences identified high-impact variants for stallion fertility

    OpenAIRE

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-01-01

    Background Stallion fertility is an economically important trait due to the increase of artificial insemination in horses. The availability of whole genome sequence data facilitates identification of rare high-impact variants contributing to stallion fertility. The aim of our study was to genotype rare high-impact variants retrieved from next-generation sequencing (NGS)-data of 11 horses in order to unravel harmful genetic variants in large samples of stallions. Methods Gene ontology (GO) ter...

  15. Rapid Whole-Genome Sequencing for Surveillance of Salmonella enterica Serovar Enteritidis

    OpenAIRE

    den Bakker, Henk C.; Allard, Marc W.; Bopp, Dianna; Brown, Eric W.; Fontana, John; Iqbal, Zamin; Kinney, Aristea; Limberger, Ronald; Musser, Kimberlee A.; Shudt, Matthew; Strain, Errol; Wiedmann, Martin; Wolfgang, William J.

    2014-01-01

    For Salmonella enterica serovar Enteritidis, 85% of isolates can be classified into 5 pulsed-field gel electrophoresis (PFGE) types. However, PFGE has limited discriminatory power for outbreak detection. Although whole-genome sequencing has been found to improve discrimination of outbreak clusters, whether this procedure can be used in real-time in a public health laboratory is not known. Therefore, we conducted a retrospective and prospective analysis. The retrospective study investigated is...

  16. Microarray-based whole-genome hybridization as a tool for determining procaryotic species relatedness

    Energy Technology Data Exchange (ETDEWEB)

    Wu, L.; Liu, X.; Fields, M.W.; Thompson, D.K.; Bagwell, C.E.; Tiedje, J. M.; Hazen, T.C.; Zhou, J.

    2008-01-15

    The definition and delineation of microbial species are of great importance and challenge due to the extent of evolution and diversity. Whole-genome DNA-DNA hybridization is the cornerstone for defining procaryotic species relatedness, but obtaining pairwise DNA-DNA reassociation values for a comprehensive phylogenetic analysis of procaryotes is tedious and time consuming. A previously described microarray format containing whole-genomic DNA (the community genome array or CGA) was rigorously evaluated as a high-throughput alternative to the traditional DNA-DNA reassociation approach for delineating procaryotic species relationships. DNA similarities for multiple bacterial strains obtained with the CGA-based hybridization were comparable to those obtained with various traditional whole-genome hybridization methods (r=0.87, P<0.01). Significant linear relationships were also observed between the CGA-based genome similarities and those derived from small subunit (SSU) rRNA gene sequences (r=0.79, P<0.0001), gyrB sequences (r=0.95, P<0.0001) or REP- and BOX-PCR fingerprinting profiles (r=0.82, P<0.0001). The CGA hybridization-revealed species relationships in several representative genera, including Pseudomonas, Azoarcus and Shewanella, were largely congruent with previous classifications based on various conventional whole-genome DNA-DNA reassociation, SSU rRNA and/or gyrB analyses. These results suggest that CGA-based DNA-DNA hybridization could serve as a powerful, high-throughput format for determining species relatedness among microorganisms.

  17. Heritability of pulmonary function estimated from pedigree and whole-genome markers

    OpenAIRE

    Klimentidis, Yann C.; Vazquez, Ana I; de los Campos, Gustavo; Allison, David B.; Dransfield, Mark T.; Thannickal, Victor J.

    2013-01-01

    Asthma and chronic obstructive pulmonary disease (COPD) are major worldwide health problems. Pulmonary function testing is a useful diagnostic tool for these diseases, and is known to be influenced by genetic and environmental factors. Previous studies have demonstrated that a substantial proportion of the variation in pulmonary function phenotypes can be explained by familial relationships. The availability of whole-genome single nucleotide polymorphism (SNP) data enables us to further evalu...

  18. A whole-genome, radiation hybrid mapping resource of hexaploid wheat.

    Science.gov (United States)

    Tiwari, Vijay K; Heesacker, Adam; Riera-Lizarazu, Oscar; Gunn, Hilary; Wang, Shichen; Wang, Yi; Gu, Young Q; Paux, Etienne; Koo, Dal-Hoe; Kumar, Ajay; Luo, Ming-Cheng; Lazo, Gerard; Zemetra, Robert; Akhunov, Eduard; Friebe, Bernd; Poland, Jesse; Gill, Bikram S; Kianian, Shahryar; Leonard, Jeffrey M

    2016-04-01

    Generating a contiguous, ordered reference sequence of a complex genome such as hexaploid wheat (2n = 6x = 42; approximately 17 Gb) is a challenging task due to its large, highly repetitive, and allopolyploid genome. In wheat, ordering of whole-genome or hierarchical shotgun sequencing contigs is primarily based on recombination and comparative genomics-based approaches. However, comparative genomics approaches are limited to syntenic inference, and recombination is suppressed within the pericentromeric regions of wheat chromosomes; thus, precise ordering of physical maps and sequenced contigs across the whole genome using these approaches is nearly impossible. We developed a whole-genome radiation hybrid (WGRH) resource and tested it by genotyping a set of 115 randomly selected lines on a high-density single nucleotide polymorphism (SNP) array. At the whole-genome level, 26 299 SNP markers were mapped on the RH panel and provided an average mapping resolution of approximately 248 Kb/cR1500 with a total map length of 6866 cR1500. The 7296 unique mapping bins provided a five- to eight-fold higher resolution than genetic maps used in similar studies. Most strikingly, the RH map had uniform bin resolution across the entire chromosome(s), including pericentromeric regions. Our research provides a valuable and low-cost resource for anchoring and ordering sequenced BAC and next generation sequencing (NGS) contigs. The WGRH developed for reference wheat line Chinese Spring (CS-WGRH) will be useful for anchoring and ordering sequenced BAC and NGS-based contigs for assembling a high-quality reference sequence of hexaploid wheat. Additionally, this study provides an excellent model for developing similar resources for other polyploid species. PMID:26945524

  19. Whole genome transcript profiling from fingerstick blood samples: a comparison and feasibility study

    OpenAIRE

    Williams Adam R; Mondala Tony S; Robison Elizabeth H; Head Steven R; Salomon Daniel R; Kurian Sunil M

    2009-01-01

    Abstract Background Whole genome gene expression profiling has revolutionized research in the past decade especially with the advent of microarrays. Recently, there have been significant improvements in whole blood RNA isolation techniques which, through stabilization of RNA at the time of sample collection, avoid bias and artifacts introduced during sample handling. Despite these improvements, current human whole blood RNA stabilization/isolation kits are limited by the requirement of a veno...

  20. Comparison of Whole-Genome Sequencing and Molecular-Epidemiological Techniques for Clostridium difficile Strain Typing.

    Science.gov (United States)

    Dominguez, Samuel R; Anderson, Lydia J; Kotter, Cassandra V; Littlehorn, Cynthia A; Arms, Lesley E; Dowell, Elaine; Todd, James K; Frank, Daniel N

    2016-09-01

    We analyzed in parallel 27 pediatric Clostridium difficile isolates by repetitive sequence-based polymerase chain reaction (RepPCR), pulsed-field gel electrophoresis (PFGE), and whole-genome next-generation sequencing. Next-generation sequencing distinguished 3 groups of isolates that were indistinguishable by RepPCR and 1 isolate that clustered in the same PFGE group as other isolates. PMID:26407257

  1. Whole genome expression profile in neuroblastoma cells exposed to 1-methyl-4-phenylpyridine

    OpenAIRE

    Mazzio, E; Soliman, KFA

    2012-01-01

    Mitochondrial dysfunction and subsequent energy failure is a contributing factor to degeneration of the substantia nigra pars compacta associated with Parkinson’s disease (PD). In this study, we investigate molecular events triggered by 1-methyl-4-phenylpyridine (MPP+) using whole genome-expression microarray, western blot and HPLC quantification of metabolites. The data show that MPP+ (500 μM) evokes obstruction of mitochondrial respiration/oxidative phosphorylation (OXPHOS) in mouse neuroblast...

  2. Inference of Gorilla Demographic and Selective History from Whole-Genome Sequence Data

    OpenAIRE

    McManus, Kimberly F; Kelley, Joanna L.; Song, Shiya; Veeramah, Krishna R; Woerner, August E.; Stevison, Laurie S.; Ryder, Oliver A.; Great Ape Genome Project; Kidd, Jeffrey M.; Wall, Jeffrey D.; Bustamante, Carlos D.; Hammer, Michael F.

    2015-01-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 west...

  3. Reducing INDEL calling errors in whole genome and exome sequencing data

    OpenAIRE

    Fang, Han; Wu, Yiyang; Narzisi, Giuseppe; O’Rawe, Jason A; Barrón, Laura T Jimenez; Rosenbaum, Julie; Ronemus, Michael; Iossifov, Ivan; Schatz, Michael C.; Lyon, Gholson J

    2014-01-01

    Background INDELs, especially those disrupting protein-coding regions of the genome, have been strongly associated with human diseases. However, there are still many errors with INDEL variant calling, driven by library preparation, sequencing biases, and algorithm artifacts. Methods We characterized whole genome sequencing (WGS), whole exome sequencing (WES), and PCR-free sequencing data from the same samples to investigate the sources of INDEL errors. We also developed a classification schem...

  4. Multiple Mutations in Heterogeneous Miltefosine-Resistant Leishmania major Population as Determined by Whole Genome Sequencing

    OpenAIRE

    Adriano C Coelho; Sébastien Boisvert; Angana Mukherjee; Philippe Leprohon; Jacques Corbeil; Marc Ouellette

    2012-01-01

    BACKGROUND: Miltefosine (MF) is the first oral compound used in the chemotherapy against leishmaniasis. Since the mechanism of action of this drug and the targets of MF in Leishmania are unclear, we generated in a step-by-step manner Leishmania major promastigote mutants highly resistant to MF. Two of the mutants were submitted to a short-read whole genome sequencing for identifying potential genes associated with MF resistance. METHODS/PRINCIPAL FINDINGS: Analysis of the genome assemblies re...

  5. Colorectal cancer and the human gut microbiome: reproducibility with whole-genome shotgun sequencing

    OpenAIRE

    Emily Vogtmann; Xing Hua; Georg Zeller; Shinichi Sunagawa; Voigt, Anita Y.; Rajna Hercog; Goedert, James J.; Jianxin Shi; Peer Bork; Rashmi Sinha

    2016-01-01

    Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously pub...

  6. Self-organizing Approach for Automated Gene Identification in Whole Genomes

    OpenAIRE

    Gorban, Alexander N; Zinovyev, Andrey Yu.; Popova, Tatyana G.

    2001-01-01

    An approach that explicitly uses the idea of a distinguished coding phase to identify protein-coding regions (exons) in whole genomes has been proposed. For several genomes, an optimal window length for averaging the GC-content function and calculating codon frequencies has been found. A self-training procedure based on clustering in the multidimensional space of triplet frequencies is proposed. For visualization of data in the space of triplet frequencies, the method of elastic maps was ap...

  7. Analysis on n-gram statistics and linguistic features of whole genome protein sequences

    Institute of Scientific and Technical Information of China (English)

    DONG Qi-wen; WANG Xiao-long; LIN Lei

    2008-01-01

    To obtain a statistical sequence analysis of the large number of genomic and proteomic sequences available for different organisms, the n-grams of whole genome protein sequences from 20 organisms were extracted. Their linguistic features were analyzed by two tests, the Zipf power law and Shannon entropy, which were developed for the analysis of natural languages and symbolic sequences. The natural genome proteins and the artificial genome proteins were compared with each other, and some statistical features of n-grams were discovered. The results show that: the n-grams of whole genome protein sequences approximately follow the Zipf law when n is larger than 4; the Shannon n-gram entropy of natural genome proteins is lower than that of artificial proteins; a simple unigram model can distinguish different organisms; and there exist organism-specific usages of "phrases" in protein sequences. It is suggested that further detailed analysis of n-grams of whole genome protein sequences will result in a powerful model for mapping the relationship of protein sequence, structure and function.
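    The two tests named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names (ngram_counts, shannon_entropy, zipf_slope) are hypothetical, the entropy is computed in bits over the observed n-gram distribution, and the Zipf exponent is estimated here by an ordinary least-squares fit of log frequency against log rank, which is one common choice and an assumption about the method.

```python
import math
from collections import Counter


def ngram_counts(sequences, n):
    """Count overlapping n-grams across a collection of protein sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the empirical n-gram distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    A slope near -1 is the classic Zipf-law signature.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct n-grams to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not data from the study.
    toy_proteome = ["MKVLAAGIVGL", "MKVLSAGIAGL", "MAVLAAGLVGL"]
    bigrams = ngram_counts(toy_proteome, 2)
    print(shannon_entropy(bigrams), zipf_slope(bigrams))
```

    On real proteomes the same functions would be applied per organism and per value of n, and the entropy of natural sequences compared with that of shuffled or artificial ones.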

  8. Rapid Identification of Potential Drugs for Diabetic Nephropathy Using Whole-Genome Expression Profiles of Glomeruli

    Directory of Open Access Journals (Sweden)

    Jingsong Shi

    2016-01-01

    Objective. To investigate potential drugs for diabetic nephropathy (DN) using whole-genome expression profiles and the Connectivity Map (CMAP). Methodology. Eighteen Chinese Han DN patients and six normal controls were included in this study. Whole-genome expression profiles of microdissected glomeruli were measured using the Affymetrix human U133 plus 2.0 chip. Differentially expressed genes (DEGs) between late stage and early stage DN samples and the CMAP database were used to identify potential drugs for DN using bioinformatics methods. Results. (1) A total of 1065 DEGs (FDR 1.5) were found in late stage DN patients compared with early stage DN patients. (2) Piperlongumine, 15d-PGJ2 (15-delta prostaglandin J2), vorinostat, and trichostatin A were predicted to be the most promising potential drugs for DN, acting as NF-κB inhibitors, histone deacetylase inhibitors (HDACIs), PI3K pathway inhibitors, or PPARγ agonists, respectively. Conclusion. Using whole-genome expression profiles and the CMAP database, we rapidly predicted potential DN drugs, and therapeutic potential was confirmed by previously published studies. Animal experiments and clinical trials are needed to confirm both the safety and efficacy of these drugs in the treatment of DN.

  9. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data.

    Science.gov (United States)

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation-sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, a physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. A method model was created and parameters were inspected by simulations using the Arabidopsis genome sequence. In the simulations, when a ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb sequences were integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences. PMID:27611682

  10. Development and preliminary evaluation of an online educational video about whole-genome sequencing for research participants, patients, and the general public

    Science.gov (United States)

    Sanderson, Saskia C.; Suckiel, Sabrina A.; Zweig, Micol; Bottinger, Erwin P.; Jabs, Ethylin Wang; Richardson, Lynne D.

    2016-01-01

    Background: As whole-genome sequencing (WGS) increases in availability, WGS educational aids are needed for research participants, patients, and the general public. Our aim was therefore to develop an accessible and scalable WGS educational aid. Genet Med 18 5, 501–512. Methods: We engaged multiple stakeholders in an iterative process over a 1-year period culminating in the production of a novel 10-minute WGS educational animated video, “Whole Genome Sequencing and You” (https://goo.gl/HV8ezJ). We then presented the animated video to 281 online-survey respondents (the video-information group). There were also two comparison groups: a written-information group (n = 281) and a no-information group (n = 300). Results: In the video-information group, 79% reported the video was easy to understand, satisfaction scores were high (mean 4.00 on 1–5 scale, where 5 = high satisfaction), and knowledge increased significantly. There were significant differences in knowledge compared with the no-information group but few differences compared with the written-information group. Intention to receive personal results from WGS and decisional conflict in response to a hypothetical scenario did not differ between the three groups. Conclusions: The educational animated video, “Whole Genome Sequencing and You,” was well received by this sample of online-survey respondents. Further work is needed to evaluate its utility as an aid to informed decision making about WGS in other populations. PMID:26334178

  11. Allelic imbalance analysis by high-density single-nucleotide polymorphic allele (SNP) array with whole genome amplified DNA

    OpenAIRE

    Wong, Kwong-Kwok; Tsang, Yvonne T.M.; Shen, Jianhe; Cheng, Rita S.; Chang, Yi-Mieng; Man, Tsz-Kwong; Lau, Ching C.

    2004-01-01

    Besides their use in mRNA expression profiling, oligonucleotide microarrays have also been applied to single-nucleotide polymorphism (SNP) and loss of heterozygosity (LOH) or allelic imbalance studies. In this report, we evaluate the reliability of using whole genome amplified DNA for analysis with an oligonucleotide microarray containing 11 560 SNPs to detect allelic imbalance and chromosomal copy number abnormalities. Whole genome SNP analyses were performed with DNA extracted from osteosar...

  12. Whole-Genome Sequencing Allows for Improved Identification of Persistent Listeria monocytogenes in Food-Associated Environments

    OpenAIRE

    Stasiewicz, Matthew J.; Oliver, Haley F; Wiedmann, Martin; den Bakker, Henk C

    2015-01-01

    While the food-borne pathogen Listeria monocytogenes can persist in food-associated environments, there are no whole-genome sequence (WGS) based methods to differentiate persistent from sporadic strains. Whole-genome sequencing of 188 isolates from a longitudinal study of L. monocytogenes in retail delis was used to (i) apply single-nucleotide polymorphism (SNP)-based phylogenetics for subtyping of L. monocytogenes, (ii) use SNP counts to differentiate persistent from repeatedly reintroduced ...

  13. Accuracy of genomic prediction using imputed whole-genome sequence data in white layers.

    Science.gov (United States)

    Heidaritabar, M; Calus, M P L; Megens, H-J; Vereijken, A; Groenen, M A M; Bastiaansen, J W M

    2016-06-01

    There is an increasing interest in using whole-genome sequence data in genomic selection breeding programmes. Prediction of breeding values is expected to be more accurate when whole-genome sequence is used, because the causal mutations are assumed to be in the data. We performed genomic prediction for the number of eggs in white layers using imputed whole-genome resequence data including ~4.6 million SNPs. The prediction accuracies based on sequence data were compared with the accuracies from the 60 K SNP panel. Predictions were based on genomic best linear unbiased prediction (GBLUP) as well as a Bayesian variable selection model (BayesC). Moreover, the prediction accuracy from using different types of variants (synonymous, non-synonymous and non-coding SNPs) was evaluated. Genomic prediction using the 60 K SNP panel resulted in a prediction accuracy of 0.74 when GBLUP was applied. With sequence data, there was a small increase (~1%) in prediction accuracy over the 60 K genotypes. With both 60 K SNP panel and sequence data, GBLUP slightly outperformed BayesC in predicting the breeding values. Selection of SNPs more likely to affect the phenotype (i.e. non-synonymous SNPs) did not improve the accuracy of genomic prediction. The fact that sequence data were based on imputation from a small number of sequenced animals may have limited the potential to improve the prediction accuracy. A small reference population (n = 1004) and possible exclusion of many causal SNPs during quality control can be other possible reasons for limited benefit of sequence data. We expect, however, that the limited improvement is because the 60 K SNP panel was already sufficiently dense to accurately determine the relationships between animals in our data. PMID:26776363

  14. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    International Nuclear Information System (INIS)

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay

  15. High depth, whole-genome sequencing of cholera isolates from Haiti and the Dominican Republic

    Directory of Open Access Journals (Sweden)

    Sealfon Rachel

    2012-09-01

    Abstract Background Whole-genome sequencing is an important tool for understanding microbial evolution and identifying the emergence of functionally important variants over the course of epidemics. In October 2010, a severe cholera epidemic began in Haiti, with additional cases identified in the neighboring Dominican Republic. We used whole-genome approaches to sequence four Vibrio cholerae isolates from Haiti and the Dominican Republic and three additional V. cholerae isolates to a high depth of coverage (>2000x); four of the seven isolates were previously sequenced. Results Using these sequence data, we examined the effect of depth of coverage and sequencing platform on genome assembly and identification of sequence variants. We found that 50x coverage is sufficient to construct a whole-genome assembly and to accurately call most variants from 100 base pair paired-end sequencing reads. Phylogenetic analysis between the newly sequenced and thirty-three previously sequenced V. cholerae isolates indicates that the Haitian and Dominican Republic isolates are closest to strains from South Asia. The Haitian and Dominican Republic isolates form a tight cluster, with only four variants unique to individual isolates. These variants are located in the CTX region, the SXT region, and the core genome. Of the 126 mutations identified that separate the Haiti-Dominican Republic cluster from the V. cholerae reference strain (N16961), 73 are non-synonymous changes, and a number of these changes cluster in specific genes and pathways. Conclusions Sequence variant analyses of V. cholerae isolates, including multiple isolates from the Haitian outbreak, identify coverage-specific and technology-specific effects on variant detection, and provide insight into genomic change and functional evolution during an epidemic.
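    The depth-of-coverage figures in this abstract follow the usual back-of-the-envelope relation, mean depth = (number of reads × read length) / genome size. The sketch below shows that arithmetic only; the function names are hypothetical, and the roughly 4 Mb genome size used in the example is an approximate, assumed figure for V. cholerae rather than a value given in the abstract.

```python
import math


def mean_depth(num_reads, read_length, genome_size):
    """Expected mean depth of coverage: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(genome_size, read_length, target_depth):
    """Minimum number of reads required to reach a target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    GENOME_SIZE = 4_000_000  # assumed approximate genome size in bp (not from the abstract)
    READ_LENGTH = 100        # 100 bp reads, as stated in the abstract

    n_reads = reads_needed(GENOME_SIZE, READ_LENGTH, target_depth=50)
    print(n_reads, "reads, i.e.", n_reads // 2, "read pairs for paired-end data")
    print(mean_depth(n_reads, READ_LENGTH, GENOME_SIZE), "x mean depth")
```

    This gives only the average depth; real coverage is uneven across a genome, which is why the study evaluates assembly and variant calling empirically at each depth rather than relying on the mean alone.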

  16. Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing

    Directory of Open Access Journals (Sweden)

    Nelson Rex T

    2007-09-01

    Abstract Background Soybean, Glycine max (L.) Merr., is a well documented paleopolyploid. What remains relatively undercharacterized is the level of sequence identity in retained homeologous regions of the genome. Recently, the Department of Energy Joint Genome Institute and United States Department of Agriculture jointly announced the sequencing of the soybean genome. One of the initial concerns is what effect sequence identity in homeologous regions would have on whole genome shotgun sequence assembly. Results Seventeen BACs representing ~2.03 Mb were sequenced as representative potential homeologous regions from the soybean genome. Genetic mapping of each BAC shows that 11 of the 20 chromosomes are represented. Sequence comparisons between homeologous BACs show that the soybean genome is a mosaic of retained paleopolyploid regions. Some regions appear to be highly conserved while other regions have diverged significantly. Large-scale "batch" reassembly of all 17 BACs combined showed that even the most homeologous BACs with upwards of 95% sequence identity resolve into their respective homeologous sequences. Potential assembly errors were generated by tandemly duplicated pentatricopeptide repeat containing genes and long simple sequence repeats. Analysis of a whole-genome shotgun assembly of 80,000 randomly chosen JGI-DOE sequence traces reveals some new soybean-specific repeat sequences. Conclusion This analysis investigated both the structure of the paleopolyploid soybean genome and the potential effects retained homeology will have on assembling the whole genome shotgun sequence. Based upon these results, homeologous regions similar to those characterized here will not cause major assembly issues.

  17. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    Science.gov (United States)

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  18. Use of whole genome expression analysis in the toxicity screening of nanoparticles

    Energy Technology Data Exchange (ETDEWEB)

    Fröhlich, Eleonore, E-mail: eleonore.froehlich@medunigraz.at [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria)]; Meindl, Claudia; Wagner, Karin [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria)]; Leitinger, Gerd [Center for Medical Research, Medical University of Graz, Stiftingtalstr. 24, 8010 Graz (Austria); Institute for Cell Biology, Histology and Embryology, Medical University of Graz, Harrachgasse 21, 8010 Graz (Austria)]; Roblegg, Eva [Institute of Pharmaceutical Sciences, Department of Pharmaceutical Technology, Karl-Franzens-University of Graz, Universitätsplatz 1, 8010 Graz (Austria)]

    2014-10-15

    The use of nanoparticles (NPs) offers exciting new options in technical and medical applications provided they do not cause adverse cellular effects. Cellular effects of NPs depend on particle parameters and exposure conditions. In this study, whole genome expression arrays were employed to identify the influence of particle size, cytotoxicity, protein coating, and surface functionalization of polystyrene particles as model particles and for short carbon nanotubes (CNTs) as particles with potential interest in medical treatment. Another aim of the study was to find out whether screening by microarray would identify other or additional targets than commonly used cell-based assays for NP action. Whole genome expression analysis and assays for cell viability, interleukin secretion, oxidative stress, and apoptosis were employed. Similar to conventional assays, microarray data identified inflammation, oxidative stress, and apoptosis as affected by NP treatment. Application of lower particle doses and presence of protein decreased the total number of regulated genes but did not markedly influence the top regulated genes. Cellular effects of CNTs were small; only carboxyl-functionalized single-walled CNTs caused appreciable regulation of genes. It can be concluded that regulated functions correlated well with results in cell-based assays. Presence of protein mitigated cytotoxicity but did not cause a different pattern of regulated processes. - Highlights: • Regulated functions were screened using whole genome expression assays. • Polystyrene particles regulated more genes than short carbon nanotubes. • Protein coating of polystyrene particles did not change regulation pattern. • Functions regulated by microarray were confirmed by cell-based assay.

  19. Genomic Epidemiology: Whole-Genome-Sequencing–Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens

    DEFF Research Database (Denmark)

    Deng, Xiangyu; den Bakker, Henk C.; Hendriksen, Rene S.

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon...

  20. Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma

    OpenAIRE

    Gartner, Jared J; Stephen C. J. Parker; Prickett, Todd D.; Dutton-Regester, Ken; Stitzel, Michael L.; Lin, Jimmy C.; Davis, Sean; Simhadri, Vijaya L.; Jha, Sujata; Katagiri, Nobuko; Gotea, Valer; Jamie K. Teer; Wei, Xiaomu; Morken, Mario A; Umesh K Bhanot

    2013-01-01

    Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683–691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This muta...

  1. Mycobacterial DNA Extraction for Whole-Genome Sequencing from Early Positive Liquid (MGIT) Cultures

    OpenAIRE

    Votintseva, Antonina A.; Pankhurst, Louise J.; Anson, Luke W.; Morgan, Marcus R.; Gascoyne-Binzi, Deborah; Walker, Timothy M; Quan, T. Phuong; Wyllie, David H; Del Ojo Elias, Carlos; Wilcox, Mark; Walker, A. Sarah; Peto, Tim E A; Crook, Derrick W.

    2015-01-01

    We developed a low-cost and reliable method of DNA extraction from as little as 1 ml of early positive mycobacterial growth indicator tube (MGIT) cultures that is suitable for whole-genome sequencing to identify mycobacterial species and predict antibiotic resistance in clinical samples. The DNA extraction method is based on ethanol precipitation supplemented by pretreatment steps with a MolYsis kit or saline wash for the removal of human DNA and a final DNA cleanup step with solid-phase reve...

  2. Mycobacterial DNA extraction for whole-genome sequencing from early positive liquid (MGIT) cultures

    OpenAIRE

    Votintseva, AA; Pankhurst, LJ; Anson, LW; Morgan, *; Gascoyne-Binzi, D.; Walker, TM; Quan, TP; Wyllie, DH; Del Ojo Elias, C; Wilcox, M; Walker, AS; Peto, TE; Crook, DW

    2015-01-01

    We developed a low-cost and reliable method of DNA extraction from as little as 1 ml of early positive mycobacterial growth indicator tube (MGIT) cultures that is suitable for whole-genome sequencing to identify mycobacterial species and predict antibiotic resistance in clinical samples. The DNA extraction method is based on ethanol precipitation supplemented by pretreatment steps with a MolYsis kit or saline wash for the removal of human DNA and a final DNA cleanup step with solid-phase reve...

  3. Unique Features of a Japanese ‘Candidatus Liberibacter asiaticus’ Strain Revealed by Whole Genome Sequencing

    OpenAIRE

    Katoh, Hiroshi; Miyata, Shin-ichi; Inoue, Hiromitsu; Iwanami, Toru

    2014-01-01

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, ‘Candidatus Liberibacter asiaticus’, ‘Ca. L. americanus’, and ‘Ca. L. africanus’. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative ‘C...

  4. Refining QTL with high-density SNP genotyping and whole genome sequence in three cattle breeds

    DEFF Research Database (Denmark)

    Sahana, Goutam; Guldbrandtsen, Bernt; Lund, Mogens Sandø

    2012-01-01

    A genome-wide association study was carried out in Nordic Holsteins, Nordic Red and Jersey breeds for functional traits using the BovineHD Genotyping BeadChip (Illumina, San Diego, CA). The association analyses were carried out using both a linear mixed model approach and a Bayesian variable selection method. Principal components were used to account for population structure. The QTL segregating in all three breeds were selected and a few of the most significant ones were followed in further analyses. The polymorphisms in the identified QTL regions were imputed using 90 whole genome sequences...

  5. Two Methods of Whole-Genome Amplification Enable Accurate Genotyping Across a 2320-SNP Linkage Panel

    OpenAIRE

    Barker, David L.; Hansen, Mark S. T.; Faruqi, A. Fawad; Giannola, Diane; Irsula, Orlando R.; Lasken, Roger S; Latterich, Martin; Makarov, Vladimir; Oliphant, Arnold; Pinter, Jonathon H.; Shen, Richard; Sleptsova, Irina; Ziehler, William; Lai, Eric

    2004-01-01

    Comprehensive genome scans involving many thousands of SNP assays will require significant amounts of genomic DNA from each sample. We report two successful methods for amplifying whole-genomic DNA prior to SNP analysis, multiple displacement amplification, and OmniPlex technology. We determined the coverage of amplification by analyzing a SNP linkage marker set that contained 2320 SNP markers spread across the genome at an average distance of 2.5 cM. We observed a concordance of >99.8% in ge...

  6. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin; Li, Dong; Yan, Long; Li, Jun; Qi, Xiao-tian; Guo, Xiao-sen; Zhang, Le; He, Wei-ming; Chang, Ru-zhen; Liang, Qin-si; Guo, Yong; Ye, Chen; Wang, Xiao-bo; Tao, Yong; Guan, Rong-xia; Wang, Jun-yi; Liu, Yu-lin; Jin, Long-guo; Zhang, Xiu-qing; Liu, Zhang-xiong; Zhang, Li-juan; Chen, Jie; Wang, Ke-jing; Nielsen, Rasmus; Li, Rui-qiang; Chen, Peng-yin; Li, Wen-bin; Reif, Jochen C.; Purugganan, Michael; Wang, Jian; Zhang, Meng-chen; Wang, Jun; Qiu, Li-juan

    2013-01-01

    artificial selection during domestication led to more pronounced reduction in the genetic diversity of soybean than the switch from landraces to elite cultivars. Only a small proportion (2.99%) of the whole genomic regions appear to be affected by artificial selection for preferred agricultural traits. The...... and genetic improvement were identified. CONCLUSIONS: Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes...

  7. Whole genome sequence of Pantoea ananatis R100, an antagonistic bacterium isolated from rice seed.

    Science.gov (United States)

    Wu, Liwen; Liu, Ruifang; Niu, Yaofang; Lin, Haiyan; Ye, Weijun; Guo, Longbiao; Hu, Xingming

    2016-05-10

    Pantoea ananatis is a group of bacteria that was first reported as a plant pathogen. Recently, several papers have also described its biocontrol ability. In 2003, P. ananatis R100, which showed strong antagonism against several plant pathogens, was isolated from rice seeds. In this study, the whole genome sequence of this strain was determined by SMRT Cell technology. The total genome size of R100 is 4,857,861 bp with 4659 coding genes (CDS), 82 tRNAs and 22 rRNAs. The genome sequence of R100 may shed light on research into antagonistic P. ananatis. PMID:26965742

  8. Whole genome amplification from a single cell: implications for genetic analysis.

    OpenAIRE

    Zhang, L.; Cui, X.; Schmitt, K.; Hubert, R.; Navidi, W.; Arnheim, N.

    1992-01-01

    We have developed an in vitro method for amplifying a large fraction of the DNA sequences present in a single haploid cell by repeated primer extensions using a mixture of 15-base random oligonucleotides. We studied 12 genetic loci and estimate that the probability of amplifying any sequence in the genome to a minimum of 30 copies is not less than 0.78 (95% confidence). Whole genome amplification beginning with a single cell, or other samples with very small amounts of DNA, has significant im...

  9. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango

    Directory of Open Access Journals (Sweden)

    Purvi M. Rakhashiya

    2015-12-01

    Full Text Available The actinobacterium Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with a G + C content of 69.80%, and the genome contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000.

  10. Identification of emergent bla CMY-2 -carrying Proteus mirabilis lineages by whole-genome sequencing.

    Science.gov (United States)

    Mac Aogáin, M; Rogers, T R; Crowley, B

    2016-01-01

    Whole-genome sequencing of 24 Proteus mirabilis isolates revealed the clonal expansion of two cefoxitin-resistant strains among patients with community-onset infection. These strains harboured bla CMY-2 within a chromosomally located integrative and conjugative element and exhibited multidrug resistance phenotypes. A predominant strain, identified in 18 patients, also harboured the PGI-1 genomic island and associated resistance genes, accounting for its broader antibiotic resistance profile. The identification of these novel multidrug-resistant strains among community-onset infections suggests that they are endemic to this region and represent emergent P. mirabilis lineages of clinical significance. PMID:26865983

  11. Sequence Determination from Overlapping Fragments: A Simple Model of Whole-Genome Shotgun Sequencing

    Science.gov (United States)

    Derrida, Bernard; Fink, Thomas M.

    2002-02-01

    Assembling fragments randomly sampled from along a sequence is the basis of whole-genome shotgun sequencing, a technique used to map the DNA of the human and other genomes. We calculate the probability that a random sequence can be recovered from a collection of overlapping fragments. We provide an exact solution for an infinite alphabet and in the case of constant overlaps. For the general problem we apply two assembly strategies and give the probability that the assembly puzzle can be solved in the limit of infinitely many fragments.
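
    The question posed in this abstract, whether a sequence can be recovered from randomly placed overlapping fragments, can be explored numerically with a toy Monte Carlo sketch. This is not the authors' exact solution. It assumes a circular sequence, fragments of one fixed length, and that any overlap of at least min_overlap positions is unambiguous (loosely in the spirit of the infinite-alphabet case). All names and parameter values are hypothetical.

```python
import random

def assemblable(seq_len, frag_len, n_frags, min_overlap, rng):
    """True if consecutive fragments (sorted by start) always overlap enough.

    The sequence is treated as circular so that the two ends need no
    special handling; fragment starts are uniform on 0..seq_len-1.
    """
    starts = sorted(rng.randrange(seq_len) for _ in range(n_frags))
    max_step = frag_len - min_overlap  # largest allowed distance between starts
    wrapped = starts[1:] + [starts[0] + seq_len]  # close the circle
    return all(b - a <= max_step for a, b in zip(starts, wrapped))

def estimate_probability(seq_len, frag_len, n_frags, min_overlap,
                         trials=1000, seed=0):
    """Fraction of random fragment sets that can be chained end to end."""
    rng = random.Random(seed)
    hits = sum(assemblable(seq_len, frag_len, n_frags, min_overlap, rng)
               for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for n in (20, 40, 80, 160):
        print(n, estimate_probability(1000, 100, n, 10))
```

    Sweeping n_frags shows how the estimated success probability changes as more fragments are sampled, which is the limit the abstract refers to.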

  12. Kuwaiti population subgroup of nomadic Bedouin ancestry—Whole genome sequence and analysis

    OpenAIRE

    Sumi Elsa John; Gaurav Thareja; Prashantha Hebbar; Kazem Behbehani; Thangavel Alphonse Thanaraj; Osama Alsmadi

    2015-01-01

    The Kuwaiti native population comprises three distinct genetic subgroups of Persian, “city-dwelling” Saudi Arabian tribe, and nomadic “tent-dwelling” Bedouin ancestry. The Bedouin subgroup is characterized by the presence of 17% African ancestry; it owes its origin to nomadic tribes of the deserts of the Arabian Peninsula and North Africa. By sequencing the whole genome of a Kuwaiti male from this subgroup at 41X coverage, we report 3,752,878 SNPs, 411,839 indels, and 8451 structural variations. Neighbor-joining ...

  13. Comparing Platforms for C. elegans Mutant Identification Using High-Throughput Whole-Genome Sequencing

    OpenAIRE

    Shen, Yufeng; Sarin, Sumeet; Liu, Ye; Hobert, Oliver; Pe'er, Itsik

    2008-01-01

    Background Whole-genome sequencing represents a promising approach to pinpoint chemically induced mutations in genetic model organisms, thereby short-cutting time-consuming genetic mapping efforts. Principal Findings We compare here the ability of two leading high-throughput platforms for paired-end deep sequencing, SOLiD (ABI) and Genome Analyzer (Illumina; “Solexa”), to achieve the goal of mutant detection. As a test case we used a mutant C. elegans strain that harbors a mutation in the lsy...

  14. A Bivariate Whole Genome Linkage Study Identified Genomic Regions Influencing Both BMD and Bone Structure

    OpenAIRE

    Liu, Xiao-Gang; Liu, Yong-Jun; Liu, Jianfeng; Pei, Yufang; Xiong, Dong-Hai; Shen, Hui; Deng, Hong-Yi; Papasian, Christopher J.; Drees, Betty M.; Hamilton, James J.; Recker, Robert R.; Deng, Hong-Wen

    2008-01-01

    Areal BMD (aBMD) and areal bone size (ABS) are biologically correlated traits and are each important determinants of bone strength and risk of fractures. Studies showed that aBMD and ABS are genetically correlated, indicating that they may share some common genetic factors, which, however, are largely unknown. To study the genetic factors influencing both aBMD and ABS, bivariate whole genome linkage analyses were conducted for aBMD-ABS at the femoral neck (FN), lumbar spine (LS), and ultradis...

  15. The First Kazakh Whole Genomes: The First Report of NGS Data

    Directory of Open Access Journals (Sweden)

    Ainur Akilzhanova

    2014-12-01

    Full Text Available Introduction: The human genome sequence will underpin human biology and medicine in the next century, providing a single, essential reference to all genetic information. Extraordinary technological advances and decreases in the cost of DNA sequencing have made whole genome sequencing (WGS) feasible as a highly accessible test for numerous indications. The international project “Genetic architecture of Kazakh population” is well underway to determine the complete DNA. Next generation sequencing is a powerful tool for genetic analysis, which will enable us to uncover the association of loci at specific sites in the genome with disease. The aim of this study was to introduce the first data on WGS of 6 Kazakh individuals. Methods: This pilot study is among the first WGS performed on 6 healthy Kazakh individuals, using the next generation sequencing platform HiSeq2000 (Illumina) according to the manufacturer’s protocols. All generated *.bcl files were simultaneously converted and demultiplexed using the bcl2fasta application. Alignment of sequence reads was performed using bwa-mem against the human hg19 reference genome. Sorting, removal of intermediate files, *.bam file assembly, and marking of duplicates were performed using the PicardTools package. The GATK haplotype caller tool was used for variant calling. The ClinVar, SNPedia, and Cosmic databases were processed to identify clinical genomic variants in the 6 Kazakh whole genomes. Java Runtime Environment and R Bioconductor packages were installed to perform raw data processing and run program scripts. Results: The sequence alignment and mapping procedures on reference genome hg19 for each of the 6 healthy Kazakh individuals were completed. Between 87,308,581,400 and 107,526,741,301 total base pairs were sequenced, with an average coverage of 29.85x. Between 98.85% and 99.58% of base pairs were mapped and on average 96.07% were properly paired. Het/Hom and Ti/Tv ratios for each whole genome ranged from 1.35 to 1.52 and
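
    The Het/Hom and Ti/Tv ratios reported at the end of this abstract are standard quality summaries of a variant call set. As a minimal sketch of how such ratios are computed, the code below works on hand-written toy variants rather than on the study's data; the function names and the genotype encoding (0 = reference allele, 1 = alternate allele) are assumptions made for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine<->purine or pyrimidine<->pyrimidine."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES

def titv_ratio(snvs):
    """Ratio of transitions to transversions over (ref, alt) SNV pairs."""
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    if tv == 0:
        raise ValueError("no transversions: Ti/Tv is undefined")
    return ti / tv

def het_hom_ratio(genotypes):
    """Ratio of heterozygous to homozygous-alternate diploid genotypes."""
    het = sum(1 for a, b in genotypes if a != b)
    hom_alt = sum(1 for a, b in genotypes if a == b and a != 0)
    if hom_alt == 0:
        raise ValueError("no homozygous-alternate calls: Het/Hom is undefined")
    return het / hom_alt

# Toy data, not from the study.
toy_snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "G")]
toy_genotypes = [(0, 1), (0, 1), (1, 1), (0, 1), (1, 1), (0, 0)]
print(titv_ratio(toy_snvs), het_hom_ratio(toy_genotypes))
```

    In practice these counts would be read from the VCF produced by the variant caller; parsing is omitted here to keep the sketch self-contained.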

  16. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    DEFF Research Database (Denmark)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang;

    2015-01-01

    The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been...... provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population....

  17. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango.

    Science.gov (United States)

    Rakhashiya, Purvi M; Patel, Pooja P; Thaker, Vrinda S

    2015-12-01

    The actinobacterium Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with a G + C content of 69.80%, and the genome contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000. PMID:26697318

  18. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango

    OpenAIRE

    Rakhashiya, Purvi M.; Patel, Pooja P.; Thaker, Vrinda S.

    2015-01-01

    The actinobacterium Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with a G + C content of 69.80%, and the genome contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000.

  19. When aging meets microgravity: whole genome promoters and enhancers transcription landscape in zebrafish onboard ISS

    Science.gov (United States)

    Arshanovskii, Kirill; Gusev, Oleg; Sychev, Vladimir; Poddubko, Svetlana; Deviatiiarov, Ruslan

    2016-07-01

    In order to gain new insights into gene regulation changes under conditions of real spaceflight, we conducted a whole-genome analysis of the dynamics of promoter and enhancer transcriptional changes in zebrafish during prolonged exposure to real spaceflight. In the frame of the Russia-Japan joint experiments "Aquatic Habitat"-"Aquarium", we conducted a Cap Analysis of Gene Expression (CAGE) assay of zebrafish in the range from 7 to 40 days of real spaceflight onboard the ISS. The analysis showed that both gene expression patterns and the architecture of shapes and types of the promoters are affected by the spaceflight environment.

  20. Whole-Genome Sequencing for Routine Pathogen Surveillance in Public Health: a Population Snapshot of Invasive Staphylococcus aureus in Europe

    Science.gov (United States)

    Aanensen, David M.; Feil, Edward J.; Holden, Matthew T. G.; Dordel, Janina; Yeats, Corin A.; Fedosejev, Artemij; Goater, Richard; Castillo-Ramírez, Santiago; Corander, Jukka; Colijn, Caroline; Chlebowicz, Monika A.; Schouls, Leo; Heck, Max; Pluister, Gerlinde; Ruimy, Raymond; Kahlmeter, Gunnar; Åhman, Jenny; Matuschek, Erika; Friedrich, Alexander W.; Bentley, Stephen D.; Spratt, Brian G.

    2016-01-01

    ABSTRACT The implementation of routine whole-genome sequencing (WGS) promises to transform our ability to monitor the emergence and spread of bacterial pathogens. Here we combined WGS data from 308 invasive Staphylococcus aureus isolates corresponding to a pan-European population snapshot, with epidemiological and resistance data. Geospatial visualization of the data is made possible by a generic software tool designed for public health purposes that is available at the project URL (http://www.microreact.org/project/EkUvg9uY?tt=rc). Our analysis demonstrates that high-risk clones can be identified on the basis of population level properties such as clonal relatedness, abundance, and spatial structuring and by inferring virulence and resistance properties on the basis of gene content. We also show that in silico predictions of antibiotic resistance profiles are at least as reliable as phenotypic testing. We argue that this work provides a comprehensive road map illustrating the three vital components for future molecular epidemiological surveillance: (i) large-scale structured surveys, (ii) WGS, and (iii) community-oriented database infrastructure and analysis tools. PMID:27150362

  1. Whole genome duplication affects evolvability of flowering time in an autotetraploid plant.

    Directory of Open Access Journals (Sweden)

    Sara L Martin

    Full Text Available Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids (b̂_T = 0.31) than diploids (b̂_T = 0.40). Neotetraploids exhibited the highest evolutionary response (b̂_T = 0.55). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes.
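
    Realized heritability, the evolvability measure named in this abstract, is commonly estimated as the slope of cumulative response to selection regressed on cumulative selection differential (the multi-generation form of the breeder's equation R = h²S). The sketch below shows that calculation on invented numbers. Whether the study used exactly this estimator is an assumption, since the abstract does not say, and all values and names are hypothetical.

```python
def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Invented example: generation 0 plus four generations of selection.
# Units are days of reduction in time to flowering.
cumulative_selection_differential = [0.0, 2.0, 4.0, 6.0, 8.0]
cumulative_response = [0.0, 0.8, 1.6, 2.4, 3.2]

realized_h2 = regression_slope(cumulative_selection_differential,
                               cumulative_response)
print(round(realized_h2, 2))
```

    A steeper slope means a larger fraction of the selection applied to parents shows up in their offspring, which is how the three ploidy groups are being compared.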

  2. Independent Evolution of Winner Traits without Whole Genome Duplication in Dekkera Yeasts

    Science.gov (United States)

    Dai, Shao-Xing; Li, Wen-Xing; Zheng, Jun-Juan; Li, Gong-Hua; Huang, Jing-Fei

    2016-01-01

    Dekkera yeasts have often been considered as alternative sources of ethanol production that could compete with S. cerevisiae. The two lineages of yeasts independently evolved traits that include high glucose and ethanol tolerance, aerobic fermentation, and a rapid ethanol fermentation rate. The Saccharomyces yeasts attained these traits mainly through whole genome duplication approximately 100 million years ago (Mya). However, the Dekkera yeasts, which were separated from S. cerevisiae approximately 200 Mya, did not undergo whole genome duplication (WGD) but still occupy a niche similar to S. cerevisiae. Upon analysis of two Dekkera yeasts and five closely related non-WGD yeasts, we found that a massive loss of cis-regulatory elements occurred in an ancestor of the Dekkera yeasts, which led to improved mitochondrial functions similar to the S. cerevisiae yeasts. The evolutionary analysis indicated that genes involved in the transcription and translation process exhibited faster evolution in the Dekkera yeasts. We detected 90 positively selected genes, suggesting that the Dekkera yeasts evolved an efficient translation system to facilitate adaptive evolution. Moreover, we identified 12 vacuolar H+-ATPase (V-ATPase) function genes that were under positive selection, which assist in developing tolerance to high alcohol and high sugar stress. We also revealed that the enzyme PGK1 is responsible for the increased rate of glycolysis in the Dekkera yeasts. These results provide important insights to understand the independent adaptive evolution of the Dekkera yeasts and provide tools for genetic modification promoting industrial usage. PMID:27152421

  3. Epigenetic regulation of subgenome dominance following whole genome triplication in Brassica rapa.

    Science.gov (United States)

    Cheng, Feng; Sun, Chao; Wu, Jian; Schnable, James; Woodhouse, Margaret R; Liang, Jianli; Cai, Chengcheng; Freeling, Michael; Wang, Xiaowu

    2016-07-01

    Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species. PMID:26871271

  4. Integration of transcriptome and whole genomic resequencing data to identify key genes affecting swine fat deposition.

    Directory of Open Access Journals (Sweden)

    Kai Xing

    Full Text Available Fat deposition is highly correlated with the growth, meat quality, reproductive performance and immunity of pigs. Fatty acid synthesis takes place mainly in the adipose tissue of pigs; therefore, in this study, a high-throughput massively parallel sequencing approach was used to generate adipose tissue transcriptomes from two groups of Songliao black pigs that had opposite backfat thickness phenotypes. The total number of paired-end reads produced for each sample was in the range of 39.29-49.36 million. Approximately 188 genes were differentially expressed in adipose tissue and were enriched for metabolic processes, such as fatty acid biosynthesis, lipid synthesis, metabolism of fatty acids, retinol, caffeine and arachidonic acid, and immunity. Additionally, many genetic variations were detected between the two groups through pooled whole-genome resequencing. Integration of transcriptome and whole-genome resequencing data revealed important genomic variations among the differentially expressed genes for fat deposition, for example, the lipogenic genes. Further studies are required to investigate the roles of candidate genes in fat deposition to improve pig breeding programs.

  5. Kernel-based whole-genome prediction of complex traits: a review

    Directory of Open Access Journals (Sweden)

    Gota Morota

    2014-10-01

    Full Text Available Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
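
    As an illustration of the kind of kernel-based whole-genome regression this record describes, the following is a minimal sketch of ridge regression with a Gaussian kernel on marker genotypes. It is not taken from the review: the function names, the bandwidth scaling, the penalty value and the simulated genotypes and phenotypes are all assumptions made for the example.

```python
import numpy as np


def gaussian_kernel(X1, X2, bandwidth=1.0):
    """Gaussian kernel between rows of X1 and X2 (individuals x markers).

    Squared distances are divided by the number of markers so that the
    bandwidth is roughly comparable across marker panels (an assumption).
    """
    sq = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    sq = np.maximum(sq, 0.0)  # guard against tiny negative rounding errors
    return np.exp(-bandwidth * sq / X1.shape[1])


def fit_kernel_ridge(K, y, lam):
    """Solve (K + lam * I) alpha = y - mean(y) and return (mean, alpha)."""
    mu = y.mean()
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y - mu)
    return mu, alpha


def predict(K_new, mu, alpha):
    """Predict genetic values from a kernel between new and training rows."""
    return mu + K_new @ alpha


# Simulated data (hypothetical): 20 individuals, 50 markers coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(20, 50)).astype(float)
y = X[:, :5] @ np.array([0.5, -0.3, 0.8, 0.2, -0.6]) + rng.normal(0, 0.1, 20)

K = gaussian_kernel(X, X)
mu, alpha = fit_kernel_ridge(K, y, lam=1.0)
fitted = predict(K, mu, alpha)
```

    In practice the bandwidth and the penalty would be tuned, for example by cross-validation, and predictive performance would be judged on individuals held out from the fit.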

  6. Comparison of microbial DNA enrichment tools for metagenomic whole genome sequencing.

    Science.gov (United States)

    Thoendel, Matthew; Jeraldo, Patricio R; Greenwood-Quaintance, Kerryl E; Yao, Janet Z; Chia, Nicholas; Hanssen, Arlen D; Abdel, Matthew P; Patel, Robin

    2016-08-01

    Metagenomic whole genome sequencing for detection of pathogens in clinical samples is an exciting new area for discovery and clinical testing. A major barrier to this approach is the overwhelming ratio of human to pathogen DNA in samples with low pathogen abundance, which is typical of most clinical specimens. Microbial DNA enrichment methods offer the potential to relieve this limitation by improving this ratio. Two commercially available enrichment kits, the NEBNext Microbiome DNA Enrichment Kit and the Molzym MolYsis Basic kit, were tested for their ability to enrich for microbial DNA from resected arthroplasty component sonicate fluids from prosthetic joint infections or uninfected sonicate fluids spiked with Staphylococcus aureus. Using spiked uninfected sonicate fluid there was a 6-fold enrichment of bacterial DNA with the NEBNext kit and 76-fold enrichment with the MolYsis kit. Metagenomic whole genome sequencing of sonicate fluid revealed 13- to 85-fold enrichment of bacterial DNA using the NEBNext enrichment kit. The MolYsis approach achieved 481- to 9580-fold enrichment, resulting in 7 to 59% of sequencing reads being from the pathogens known to be present in the samples. These results demonstrate the usefulness of these tools when testing clinical samples with low microbial burden using next generation sequencing. PMID:27237775

  7. Construction of a phylogenetic tree of photosynthetic prokaryotes based on average similarities of whole genome sequences.

    Directory of Open Access Journals (Sweden)

    Soichirou Satoh

    Full Text Available Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not yet been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

  8. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Directory of Open Access Journals (Sweden)

    Can Alkan

    2007-09-01

    Full Text Available The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  9. Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa

    Energy Technology Data Exchange (ETDEWEB)

    McCluskey, Kevin; Wiest, Aric E.; Grigoriev, Igor V.; Lipzen, Anna; Martin, Joel; Schackwitz, Wendy; Baker, Scott E.

    2011-06-02

    Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.

  10. Targeted or whole genome sequencing of formalin fixed tissue samples: potential applications in cancer genomics.

    Science.gov (United States)

    Munchel, Sarah; Hoang, Yen; Zhao, Yue; Cottrell, Joseph; Klotzle, Brandy; Godwin, Andrew K; Koestler, Devin; Beyerlein, Peter; Fan, Jian-Bing; Bibikova, Marina; Chien, Jeremy

    2015-09-22

    Current genomic studies are limited by the poor availability of fresh-frozen tissue samples. Although formalin-fixed diagnostic samples are in abundance, they are seldom used in current genomic studies because of the concern of formalin-fixation artifacts. Better characterization of these artifacts will allow the use of archived clinical specimens in translational and clinical research studies. To provide a systematic analysis of formalin-fixation artifacts on Illumina sequencing, we generated 26 DNA sequencing data sets from 13 pairs of matched formalin-fixed paraffin-embedded (FFPE) and fresh-frozen (FF) tissue samples. The results indicate a high rate of concordant calls between matched FF/FFPE pairs at reference and variant positions in three commonly used sequencing approaches (whole genome, whole exome, and targeted exon sequencing). Global mismatch rates and C · G > T · A substitutions were comparable between matched FF/FFPE samples, and discordant rates were low (<0.26%) in all samples. Finally, low-pass whole genome sequencing produces a similar pattern of copy number alterations between FF/FFPE pairs. The results from our studies suggest the potential use of diagnostic FFPE samples for cancer genomic studies to characterize and catalog variations in cancer genomes. PMID:26305677

  11. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    Science.gov (United States)

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution [1,2]. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes [3,4]. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  12. Comparative whole genome sequence analysis of wild-type and cidofovir-resistant monkeypoxvirus

    Directory of Open Access Journals (Sweden)

    Huggins John

    2010-05-01

    Full Text Available We performed whole genome sequencing of a cidofovir {[(S)-1-(3-hydroxy-2-phosphonylmethoxy-propyl) cytosine] [HPMPC]}-resistant (CDV-R) strain of Monkeypoxvirus (MPV). Whole-genome comparison with the wild-type (WT) strain revealed 55 single-nucleotide polymorphisms (SNPs) and one tandem-repeat contraction. Over one-third of all identified SNPs were located within genes comprising the poxvirus replication complex, including the DNA polymerase, RNA polymerase, mRNA capping methyltransferase, DNA processivity factor, and poly-A polymerase. Four polymorphic sites were found within the DNA polymerase gene. DNA polymerase mutations observed at positions 314 and 684 in MPV were consistent with CDV-R loci previously identified in Vaccinia virus (VACV). These data suggest the mechanism of CDV resistance may be highly conserved across Orthopoxvirus (OPV) species. SNPs were also identified within virulence genes such as the A-type inclusion protein, serine protease inhibitor-like protein SPI-3, Schlafen ATPase and thymidylate kinase, among others. Aberrant chain extension induced by CDV may lead to diverse alterations in gene expression and viral replication that may result in both adaptive and attenuating mutations. Defining the potential contribution of substitutions in the replication complex and RNA processing machinery reported here may yield further insight into CDV resistance and may augment current therapeutic development strategies.

  13. Multiple mutations in heterogeneous miltefosine-resistant Leishmania major population as determined by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Adriano C Coelho

    Full Text Available BACKGROUND: Miltefosine (MF) is the first oral compound used in the chemotherapy against leishmaniasis. Since the mechanism of action of this drug and the targets of MF in Leishmania are unclear, we generated in a step-by-step manner Leishmania major promastigote mutants highly resistant to MF. Two of the mutants were submitted to a short-read whole genome sequencing for identifying potential genes associated with MF resistance. METHODS/PRINCIPAL FINDINGS: Analysis of the genome assemblies revealed several independent point mutations in a P-type ATPase involved in phospholipid translocation. Mutations in two other proteins, pyridoxal kinase and α-adaptin like protein, were also observed in independent mutants. The role of these proteins in the MF resistance was evaluated by gene transfection and gene disruption and both the P-type ATPase and pyridoxal kinase were implicated in MF susceptibility. The study also highlighted that resistance can be highly heterogeneous at the population level, with individual clones derived from this population differing in both genotypes and susceptibility phenotypes. CONCLUSIONS/SIGNIFICANCE: Whole genome sequencing was used to pinpoint known and new resistance markers associated with MF resistance in the protozoan parasite Leishmania. The study also demonstrated the polyclonal nature of a resistant population with individual cells with varying susceptibilities and genotypes.

  14. Whole-Genome Mapping as a Novel High-Resolution Typing Tool for Legionella pneumophila

    Science.gov (United States)

    Euser, Sjoerd M.; Landman, Fabian; Bruin, Jacob P.; IJzerman, Ed P.; den Boer, Jeroen W.; Schouls, Leo M.

    2015-01-01

    Legionella is the causative agent for Legionnaires' disease (LD) and is responsible for several large outbreaks in the world. More than 90% of LD cases are caused by Legionella pneumophila, and studies on the origin and transmission routes of this pathogen rely on adequate molecular characterization of isolates. Current typing of L. pneumophila mainly depends on sequence-based typing (SBT). However, studies have shown that in some outbreak situations, SBT does not have sufficient discriminatory power to distinguish between related and nonrelated L. pneumophila isolates. In this study, we used a novel high-resolution typing technique, called whole-genome mapping (WGM), to differentiate between epidemiologically related and nonrelated L. pneumophila isolates. Assessment of the method by various validation experiments showed highly reproducible results, and WGM was able to confirm two well-documented Dutch L. pneumophila outbreaks. Comparison of whole-genome maps of the two outbreaks together with WGMs of epidemiologically nonrelated L. pneumophila isolates showed major differences between the maps, and WGM yielded a higher discriminatory power than SBT. In conclusion, WGM can be a valuable alternative to perform outbreak investigations of L. pneumophila in real time since the turnaround time from culture to comparison of the L. pneumophila maps is less than 24 h. PMID:26202110

  15. Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing.

    Science.gov (United States)

    Ranjan, Ravi; Rani, Asha; Metwally, Ahmed; McGee, Halvor S; Perkins, David L

    2016-01-22

    The human microbiome has emerged as a major player in regulating human health and disease. Translational studies of the microbiome have the potential to indicate clinical applications such as fecal transplants and probiotics. However, one major issue is accurate identification of microbes constituting the microbiota. Studies of the microbiome have frequently utilized sequencing of the conserved 16S ribosomal RNA (rRNA) gene. We present a comparative study of an alternative approach using whole genome shotgun sequencing (WGS). In the present study, we analyzed the human fecal microbiome compiling a total of 194.1 × 10^6 reads from a single sample using multiple sequencing methods and platforms. Specifically, after establishing the reproducibility of our methods with extensive multiplexing, we compared: 1) The 16S rRNA amplicon versus the WGS method, 2) the Illumina HiSeq versus MiSeq platforms, 3) the analysis of reads versus de novo assembled contigs, and 4) the effect of shorter versus longer reads. Our study demonstrates that whole genome shotgun sequencing has multiple advantages compared with the 16S amplicon method including enhanced detection of bacterial species, increased detection of diversity and increased prediction of genes. In addition, increased length, either due to longer reads or the assembly of contigs, improved the accuracy of species detection. PMID:26718401

  16. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    Science.gov (United States)

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of next generation sequencing technology has enabled the fast and low-cost detection of genetic variants spread across the entire human genome, making the application of whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of the functionally damaging effects of human genetic variants is still lacking, though it is well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. PMID:26989155

  17. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background: Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods: Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results: Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions: Our study demonstrated true levels of genetic diversity
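
    The read frequency cut-off mentioned in this record can be pictured with a short sketch. This is not the authors' in-house pipeline, which the abstract does not describe: the `VariantCall` type, the function names, the choice of an inclusive threshold and the example read counts are all hypothetical, and only the 30 % value comes from the abstract.

```python
from typing import List, NamedTuple


class VariantCall(NamedTuple):
    """A candidate variant with its supporting read counts (hypothetical type)."""
    position: int      # genome coordinate, illustrative only
    alt_reads: int     # reads supporting the variant allele
    total_reads: int   # total reads covering the position


def read_frequency(call: VariantCall) -> float:
    """Fraction of covering reads that support the variant allele."""
    if call.total_reads == 0:
        return 0.0
    return call.alt_reads / call.total_reads


def filter_by_frequency(calls: List[VariantCall], cutoff: float = 0.30) -> List[VariantCall]:
    """Keep calls whose read frequency is at or above the cut-off.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    return [c for c in calls if read_frequency(c) >= cutoff]


# Made-up example calls, not data from the study.
calls = [
    VariantCall(position=1000, alt_reads=45, total_reads=50),  # 0.90
    VariantCall(position=2000, alt_reads=15, total_reads=50),  # 0.30
    VariantCall(position=3000, alt_reads=5, total_reads=50),   # 0.10
    VariantCall(position=4000, alt_reads=0, total_reads=0),    # no coverage
]
kept = filter_by_frequency(calls)
kept_positions = [c.position for c in kept]
```

    A real pipeline would typically also apply minimum depth, base quality and strand-bias filters before trusting a low frequency variant.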

  18. Use of Whole Genome Sequencing and Patient Interviews To Link a Case of Sporadic Listeriosis to Consumption of Prepackaged Lettuce.

    Science.gov (United States)

    Jackson, K A; Stroika, S; Katz, L S; Beal, J; Brandt, E; Nadon, C; Reimer, A; Major, B; Conrad, A; Tarr, C; Jackson, B R; Mody, R K

    2016-05-01

    We report on a case of listeriosis in a patient who probably consumed a prepackaged romaine lettuce-containing product recalled for Listeria monocytogenes contamination. Although definitive epidemiological information demonstrating exposure to the specific recalled product was lacking, the patient reported consumption of a prepackaged romaine lettuce-containing product of either the recalled brand or a different brand. A multinational investigation found that patient and food isolates from the recalled product were indistinguishable by pulsed-field gel electrophoresis and were highly related by whole genome sequencing, differing by four alleles by whole genome multilocus sequence typing and by five high-quality single nucleotide polymorphisms, suggesting a common source. To our knowledge, this is the first time prepackaged lettuce has been identified as a likely source for listeriosis. This investigation highlights the power of whole genome sequencing, as well as the continued need for timely and thorough epidemiological exposure data to identify sources of foodborne infections. PMID:27296429
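
    To make the notion of isolates "differing by four alleles" concrete, the sketch below counts allele differences between two whole genome multilocus sequence typing profiles. The profile representation (locus name mapped to an allele number), the locus names, the allele numbers and the rule of skipping loci missing from either isolate are assumptions for illustration, not the scheme used in the investigation.

```python
from typing import Dict, Optional

# A profile maps a locus name to an allele number, or None if not called.
Profile = Dict[str, Optional[int]]


def allele_differences(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ.

    Loci missing or uncalled in either profile are ignored (an assumption;
    typing schemes differ in how they treat missing data).
    """
    shared = set(a) & set(b)
    return sum(
        1
        for locus in shared
        if a[locus] is not None and b[locus] is not None and a[locus] != b[locus]
    )


# Made-up profiles with hypothetical locus names.
patient_isolate: Profile = {
    "locus_01": 3, "locus_02": 7, "locus_03": 1,
    "locus_04": 12, "locus_05": 5, "locus_06": None,
}
food_isolate: Profile = {
    "locus_01": 3, "locus_02": 8, "locus_03": 2,
    "locus_04": 14, "locus_05": 6, "locus_06": 9,
}

n_diff = allele_differences(patient_isolate, food_isolate)
```

    With these invented profiles the count is four, since `locus_06` is uncalled in one isolate and is therefore skipped.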

  19. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    DEFF Research Database (Denmark)

    Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Bellod Cisneros, Jose Luis;

    2016-01-01

    and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the...... web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes...... platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely...

  20. Identification of genomic regions associated with female fertility in Danish Jersey using whole genome sequence data

    DEFF Research Database (Denmark)

    Höglund, Johanna; Guldbrandtsen, Bernt; Lund, Mogens Sandø;

    2015-01-01

    Background: Female fertility is an important trait in cattle breeding programs. In the Nordic countries selection is based on a fertility index (FTI). The fertility index is a weighted combination of four female fertility traits estimated breeding values for number of inseminations per conception...... sires from Denmark with official breeding values for female fertility traits. The association analyses were carried out in two steps: first the cattle genome was scanned for quantitative trait loci using a sire model for FTI using imputed whole genome sequence variants; second the significant...... cows on BTA20, BTA23 and BTA25, IFL for heifers on BTA7 and QTL9-2 on BTA9, NRR for heifers on BTA7 and BTA23, and NRR for cows on BTA23. Conclusion: The genome wide association study presented here revealed 6 genomic regions associated with FTI. Screening these 6 QTL regions for the underlying female...

  1. Hepatitis C virus whole genome sequencing: Current methods/issues and future challenges.

    Science.gov (United States)

    Trémeaux, Pauline; Caporossi, Alban; Thélu, Marie-Ange; Blum, Michael; Leroy, Vincent; Morand, Patrice; Larrat, Sylvie

    2016-10-01

    Therapy for hepatitis C is currently undergoing a revolution. The arrival of new antiviral agents targeting viral proteins reinforces the need for a better knowledge of the viral strains infecting each patient. Hepatitis C virus (HCV) whole genome sequencing provides essential information for precise typing, study of the viral natural history or identification of resistance-associated variants. Initially performed with Sanger sequencing, HCV whole genome sequencing has been simplified by the arrival of next-generation sequencing (NGS), which also provides more detailed data on the nature and evolution of viral quasi-species. We will review the different techniques used for HCV complete genome sequencing and their applications, both before and after the advent of NGS. The progress brought by new and future technologies will also be discussed, as well as the remaining difficulties, largely due to the genomic variability. PMID:27068766

  2. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    Science.gov (United States)

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L). PMID:27617098

  3. Practical Value of Food Pathogen Traceability through Building a Whole-Genome Sequencing Network and Database.

    Science.gov (United States)

    Allard, Marc W; Strain, Errol; Melka, David; Bunning, Kelly; Musser, Steven M; Brown, Eric W; Timme, Ruth

    2016-08-01

    The FDA has created a United States-based open-source whole-genome sequencing network of state, federal, international, and commercial partners. The GenomeTrakr network represents a first-of-its-kind distributed genomic food shield for characterizing and tracing foodborne outbreak pathogens back to their sources. The GenomeTrakr network is leading investigations of outbreaks of foodborne illnesses and compliance actions with more accurate and rapid recalls of contaminated foods as well as more effective monitoring of preventive controls for food manufacturing environments. An expanded network would serve to provide an international rapid surveillance system for pathogen traceback, which is critical to support an effective public health response to bacterial outbreaks. PMID:27008877

  4. Multiplex SNP analysis on whole genome amplified DNA from archived dried bloodspots, a validation study

    DEFF Research Database (Denmark)

    Tvedegaard, Kristine C.; Parner, Erik; Hooper, Craig W.;

    Multiplex SNP analysis on whole genome amplified DNA from archived dried bloodspots, a validation study Kristine C. Tvedegaard,1 Erik Parner,1 Craig W. Hooper,2 Jørn Atterman,1 Niels Gregersen3, Poul Thorsen,1 1Institute of Public Health, NANEA at Department of Epidemiology, University of Aarhus...... further development of allele specific primer extension (ASPE) for multiplex SNP analysis based on the Luminex 100 IS platform. It uses isobases (isoC and isoG) and the software MultiCode-PLx platform for data analysis and data handling. We validate the EraGen multicode system in two 6-plex assays used on.......3-100%, repeatability ranged from 99.2-99.7% and robustness ranged from 94.1-99.3%. CONCLUSION: The Multi-Code System is a highly sensitive and specific method for multiplex SNP analysis on WGA DNA from archived dried bloodspots....

  5. Rapid high resolution genotyping of Francisella tularensis by whole genome sequence comparison of annotated genes ("MLST+").

    Directory of Open Access Journals (Sweden)

    Markus H Antwerpen

    Full Text Available The zoonotic disease tularemia is caused by the bacterium Francisella tularensis. This pathogen is considered as a category A select agent with potential to be misused in bioterrorism. Molecular typing based on DNA-sequence like canSNP-typing or MLVA has become the accepted standard for this organism. Due to the organism's highly clonal nature, the current typing methods have reached their limit of discrimination for classifying closely related subpopulations within the subspecies F. tularensis ssp. holarctica. We introduce a new gene-by-gene approach, MLST+, based on whole genome data of 15 sequenced F. tularensis ssp. holarctica strains and apply this approach to investigate an epidemic of lethal tularemia among non-human primates in two animal facilities in Germany. Due to the high resolution of MLST+ we are able to demonstrate that three independent clones of this highly infectious pathogen were responsible for these spatially and temporally restricted outbreaks.

  6. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle

    DEFF Research Database (Denmark)

    Daetwyler, Hans D; Capitan, Aurélien; Pausch, Hubert;

    2014-01-01

    The 1000 bull genomes project supports the goal of accelerating the rates of genetic gain in domestic cattle while at the same time considering animal health and welfare by providing the annotated sequence variants and genotypes of key ancestor bulls. In the first phase of the 1000 bull genomes...... project, we sequenced the whole genomes of 234 cattle to an average of 8.3-fold coverage. This sequencing includes data for 129 individuals from the global Holstein-Friesian population, 43 individuals from the Fleckvieh breed and 15 individuals from the Jersey breed. We identified a total of 28.3 million...... variants, with an average of 1.44 heterozygous sites per kilobase for each individual. We demonstrate the use of this database in identifying a recessive mutation underlying embryonic death and a dominant mutation underlying lethal chondrodysplasia. We also performed genome-wide association studies for...

  7. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing

    Science.gov (United States)

    Alioto, Tyler S.; Buchhalter, Ivo; Derdak, Sophia; Hutter, Barbara; Eldridge, Matthew D.; Hovig, Eivind; Heisler, Lawrence E.; Beck, Timothy A.; Simpson, Jared T.; Tonon, Laurie; Sertier, Anne-Sophie; Patch, Ann-Marie; Jäger, Natalie; Ginsbach, Philip; Drews, Ruben; Paramasivam, Nagarajan; Kabbe, Rolf; Chotewutmontri, Sasithorn; Diessl, Nicolle; Previti, Christopher; Schmidt, Sabine; Brors, Benedikt; Feuerbach, Lars; Heinold, Michael; Gröbner, Susanne; Korshunov, Andrey; Tarpey, Patrick S.; Butler, Adam P.; Hinton, Jonathan; Jones, David; Menzies, Andrew; Raine, Keiran; Shepherd, Rebecca; Stebbings, Lucy; Teague, Jon W.; Ribeca, Paolo; Giner, Francesc Castro; Beltran, Sergi; Raineri, Emanuele; Dabad, Marc; Heath, Simon C.; Gut, Marta; Denroche, Robert E.; Harding, Nicholas J.; Yamaguchi, Takafumi N.; Fujimoto, Akihiro; Nakagawa, Hidewaki; Quesada, Víctor; Valdés-Mas, Rafael; Nakken, Sigve; Vodák, Daniel; Bower, Lawrence; Lynch, Andrew G.; Anderson, Charlotte L.; Waddell, Nicola; Pearson, John V.; Grimmond, Sean M.; Peto, Myron; Spellman, Paul; He, Minghui; Kandoth, Cyriac; Lee, Semin; Zhang, John; Létourneau, Louis; Ma, Singer; Seth, Sahil; Torrents, David; Xi, Liu; Wheeler, David A.; López-Otín, Carlos; Campo, Elías; Campbell, Peter J.; Boutros, Paul C.; Puente, Xose S.; Gerhard, Daniela S.; Pfister, Stefan M.; McPherson, John D.; Hudson, Thomas J.; Schlesner, Matthias; Lichter, Peter; Eils, Roland; Jones, David T. W.; Gut, Ivo G.

    2015-01-01

    As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100 × shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy. PMID:26647970

  8. Whole-genome sequencing of a malignant granular cell tumor with metabolic response to pazopanib

    Science.gov (United States)

    Wei, Lei; Liu, Song; Conroy, Jeffrey; Wang, Jianmin; Papanicolau-Sengos, Antonios; Glenn, Sean T.; Murakami, Mitsuko; Liu, Lu; Hu, Qiang; Conroy, Jacob; Miles, Kiersten Marie; Nowak, David E.; Liu, Biao; Qin, Maochun; Bshara, Wiam; Omilian, Angela R.; Head, Karen; Bianchi, Michael; Burgher, Blake; Darlak, Christopher; Kane, John; Merzianu, Mihai; Cheney, Richard; Fabiano, Andrew; Salerno, Kilian; Talati, Chetasi; Khushalani, Nikhil I.; Trump, Donald L.; Johnson, Candace S.; Morrison, Carl D.

    2015-01-01

    Granular cell tumors are an uncommon soft tissue neoplasm. Malignant granular cell tumors comprise T transitions, particularly when immediately preceded by a 5′ G. A loss-of-function mutation was detected in a newly recognized tumor suppressor candidate, BRD7. No mutations were found in known targets of pazopanib. However, we identified a receptor tyrosine kinase pathway mutation in GFRA2 that warrants further evaluation. To the best of our knowledge, this is only the second reported case of a malignant granular cell tumor exhibiting a response to pazopanib, and the first whole-genome sequencing of this uncommon tumor type. The findings provide insight into the genetic basis of malignant granular cell tumors and identify potential targets for further investigation. PMID:27148567

  9. The effect of whole genome amplification on samples originating from more than one donor

    DEFF Research Database (Denmark)

    Thacker, C.R.; Balogh, M.K.; Børsting, Claus;

    2006-01-01

    In this study, the GenomiPhi™ DNA Amplification Kit (Amersham Biosciences) was used to investigate the potential of whole genome amplification (WGA) when considering samples originating from more than one donor. DNA was extracted from blood samples, quantified and normalised before being mixed...... ratios were found to match the expected peak ratios regardless of the starting concentration of DNA. With samples mixed in the ratio of 1:7 and 1:15, and when the concentration of starting material was at the manufacturer's lower limit, too few minor component peaks were found to allow for statistical...... analysis. With an initial template exceeding 1 ng/µL there was an increase in problems associated with profile interpretation but the results obtained indicated that mixture proportions could be quantifiably maintained...

  10. A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome.

    Science.gov (United States)

    Chapman, Jarrod A; Mascher, Martin; Buluç, Aydın; Barry, Kerrie; Georganas, Evangelos; Session, Adam; Strnadova, Veronika; Jenkins, Jerry; Sehgal, Sunish; Oliker, Leonid; Schmutz, Jeremy; Yelick, Katherine A; Scholz, Uwe; Waugh, Robbie; Poland, Jesse A; Muehlbauer, Gary J; Stein, Nils; Rokhsar, Daniel S

    2015-01-01

    Polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly is comparable to or even exceeds that of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population. PMID:25637298

  11. Identification of cancer-driver genes in focal genomic alterations from whole genome sequencing data.

    Science.gov (United States)

    Jang, Ho; Hur, Youngmi; Lee, Hyunju

    2016-01-01

    DNA copy number alterations (CNAs) are the main genomic events that occur during the initiation and development of cancer. Distinguishing driver aberrant regions from passenger regions, which might contain candidate target genes for cancer therapies, is an important issue. Several methods for identifying cancer-driver genes from multiple cancer patients have been developed for single nucleotide polymorphism (SNP) arrays. However, for NGS data, methods for the SNP array cannot be directly applied because of different characteristics of NGS such as higher resolutions of data without predefined probes and incorrectly mapped reads to reference genomes. In this study, we developed a wavelet-based method for identification of focal genomic alterations for sequencing data (WIFA-Seq). We applied WIFA-Seq to whole genome sequencing data from glioblastoma multiforme, ovarian serous cystadenocarcinoma and lung adenocarcinoma, and identified focal genomic alterations, which contain candidate cancer-related genes as well as previously known cancer-driver genes. PMID:27156852

  12. Whole genome sequencing provides an unambiguous link between Salmonella Dublin outbreak strain and a historical isolate.

    Science.gov (United States)

    Mohammed, M; Delappe, N; O'Connor, J; McKeown, P; Garvey, P; Cormican, M

    2016-02-01

    Salmonella enterica subsp. enterica serovar Dublin is an uncommon cause of human salmonellosis; however, a relatively high proportion of cases are associated with invasive disease. The serotype is associated with cattle. A geographically diffuse outbreak of S. Dublin involving nine patients occurred in Ireland in 2013. The source of infection was not identified. Typing of outbreak associated isolates by pulsed-field gel electrophoresis (PFGE) was of limited value because PFGE has limited discriminatory power for S. Dublin. Whole genome sequencing (WGS) showed conclusively that the isolates were closely related to each other, to an apparently unrelated isolate from 2011 and distinct from other isolates that were not readily distinguishable by PFGE. PMID:26165314

  13. Mapping genomic features to functional traits through microbial whole genome sequences.

    Science.gov (United States)

    Zhang, Wei; Zeng, Erliang; Liu, Dan; Jones, Stuart E; Emrich, Scott

    2014-01-01

    Recently, the utility of trait-based approaches for microbial communities has been identified. Increasing availability of whole genome sequences provides the opportunity to explore the genetic foundations of a variety of functional traits. We proposed a machine learning framework to quantitatively link the genomic features with functional traits. Genes from bacterial genomes belonging to different functional traits were grouped to Cluster of Orthologs (COGs), and were used as features. Then, the TF-IDF technique from the text mining domain was applied to transform the data to accommodate the abundance and importance of each COG. After TF-IDF processing, COGs were ranked using feature selection methods to identify their relevance to the functional trait of interest. Extensive experimental results demonstrated that functional trait related genes can be detected using our method. Further, the method has the potential to provide novel biological insights. PMID:24989863
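
    The TF-IDF step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which TF-IDF variant was used, so the standard form (term frequency times log of inverse document frequency) is assumed, with each genome treated as a "document" and each COG as a "term". The function name `cog_tfidf` and the example COG identifiers are hypothetical.

```python
import math
from collections import Counter


def cog_tfidf(genomes):
    """Weight COGs per genome with standard TF-IDF (an assumed variant).

    genomes: dict mapping a genome id to the list of COG ids assigned
             to its genes (one entry per gene, repeats allowed).
    Returns a dict: genome id -> {COG id: TF-IDF weight}.
    """
    n_genomes = len(genomes)
    counts = {g: Counter(cogs) for g, cogs in genomes.items()}

    # Document frequency: in how many genomes does each COG occur?
    doc_freq = Counter()
    for per_genome in counts.values():
        doc_freq.update(per_genome.keys())

    weights = {}
    for g, per_genome in counts.items():
        total = sum(per_genome.values())
        weights[g] = {
            # tf = share of the genome's genes in this COG (abundance);
            # idf = log(N / df) down-weights COGs present in every genome.
            cog: (k / total) * math.log(n_genomes / doc_freq[cog])
            for cog, k in per_genome.items()
        }
    return weights


# Hypothetical toy input: two genomes, three COGs.
example = {
    "genome_a": ["COG0001", "COG0001", "COG0002"],
    "genome_b": ["COG0002", "COG0003"],
}
weights = cog_tfidf(example)
```

    A COG found in every genome receives weight zero under this variant, which is why the transform is said to capture importance as well as abundance; the feature-selection ranking that follows in the abstract is not sketched here.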

  14. Genomic Epidemiology: Whole-Genome-Sequencing-Powered Surveillance and Outbreak Investigation of Foodborne Bacterial Pathogens.

    Science.gov (United States)

    Deng, Xiangyu; den Bakker, Henk C; Hendriksen, Rene S

    2016-01-01

    As we are approaching the twentieth anniversary of PulseNet, a network of public health and regulatory laboratories that has changed the landscape of foodborne illness surveillance through molecular subtyping, public health microbiology is undergoing another transformation brought about by so-called next-generation sequencing (NGS) technologies that have made whole-genome sequencing (WGS) of foodborne bacterial pathogens a realistic and superior alternative to traditional subtyping methods. Routine, real-time, and widespread application of WGS in food safety and public health is on the horizon. Technological, operational, and policy challenges are still present and being addressed by an international and multidisciplinary community of researchers, public health practitioners, and other stakeholders. PMID:26772415

  15. How accurately is ncRNA aligned within whole-genome multiple alignments?

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2007-10-01

    Background: Multiple alignment of homologous DNA sequences is of great interest to biologists since it provides a window into evolutionary processes. At present, the accuracy of whole-genome multiple alignments, particularly in noncoding regions, has not been thoroughly evaluated. Results: We evaluate the alignment accuracy of certain noncoding regions using noncoding RNA alignments from Rfam as a reference. We inspect the MULTIZ 17-vertebrate alignment from the UCSC Genome Browser for all the human sequences in the Rfam seed alignments. In particular, we find 638 instances of chimeric and partial alignments to human noncoding RNA elements, of which at least 225 can be improved by straightforward means. As a byproduct of our procedure, we predict many novel instances of known ncRNA families that are suggested by the alignment. Conclusion: MULTIZ does a fairly accurate job of aligning these genomes in these difficult regions. However, our experiments indicate that better alignments exist in some regions.

  16. Diversity through duplication: whole-genome sequencing reveals novel gene retrocopies in the human population.

    Science.gov (United States)

    Richardson, Sandra R; Salvador-Palomeque, Carmen; Faulkner, Geoffrey J

    2014-05-01

    Gene retrocopies are generated by reverse transcription and genomic integration of mRNA. As such, retrocopies present an important exception to the central dogma of molecular biology, and have substantially impacted the functional landscape of the metazoan genome. While an estimated 8,000-17,000 retrocopies exist in the human genome reference sequence, the extent of variation between individuals in terms of retrocopy content has remained largely unexplored. Three recent studies by Abyzov et al., Ewing et al. and Schrider et al. have exploited 1,000 Genomes Project Consortium data, as well as other sources of whole-genome sequencing data, to uncover novel gene retrocopies. Here, we compare the methods and results of these three studies, highlight the impact of retrocopies in human diversity and genome evolution, and speculate on the potential for somatic gene retrocopies to impact cancer etiology and genetic diversity among individual neurons in the mammalian brain. PMID:24615986

  17. Clinical decision support for whole genome sequence information leveraging a service-oriented architecture: a prototype.

    Science.gov (United States)

    Welch, Brandon M; Rodriguez-Loya, Salvador; Eilbeck, Karen; Kawamoto, Kensaku

    2014-01-01

    Whole genome sequence (WGS) information could soon be routinely available to clinicians to support the personalized care of their patients. At such time, clinical decision support (CDS) integrated into the clinical workflow will likely be necessary to support genome-guided clinical care. Nevertheless, developing CDS capabilities for WGS information presents many unique challenges that need to be overcome for such approaches to be effective. In this manuscript, we describe the development of a prototype CDS system that is capable of providing genome-guided CDS at the point of care and within the clinical workflow. To demonstrate the functionality of this prototype, we implemented a clinical scenario of a hypothetical patient at high risk for Lynch Syndrome based on his genomic information. We demonstrate that this system can effectively use service-oriented architecture principles and standards-based components to deliver point of care CDS for WGS information in real-time. PMID:25954430

  20. Lysis of a Single Cyanobacterium for Whole Genome Amplification

    Directory of Open Access Journals (Sweden)

    Richard N. Zare

    2013-08-01

    Bacterial species from natural environments, exhibiting a great degree of genetic diversity that has yet to be characterized, pose a specific challenge to whole genome amplification (WGA) from single cells. A major challenge is establishing an effective, compatible, and controlled lysis protocol. We present a novel lysis protocol that can be used to extract genomic information from a single cyanobacterium of Synechocystis sp. PCC 6803 known to have multilayer cell wall structures that resist conventional lysis methods. Simple but effective strategies for releasing genomic DNA from captured cells while retaining cellular identities for single-cell analysis are presented. Successful sequencing of genetic elements from single-cell amplicons prepared by multiple displacement amplification (MDA) is demonstrated for selected genes (15 loci) nearly equally spaced throughout the main chromosome.

  1. Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences

    KAUST Repository

    Coll, Francesc

    2015-05-27

    Mycobacterium tuberculosis drug resistance (DR) challenges effective tuberculosis disease control. Current molecular tests examine limited numbers of mutations, and although whole genome sequencing approaches could fully characterise DR, data complexity has restricted their clinical application. A library (1,325 mutations) predictive of DR for 15 anti-tuberculosis drugs was compiled and validated for 11 of them using genomic-phenotypic data from 792 strains. A rapid online ‘TB-Profiler’ tool was developed to report DR and strain-type profiles directly from raw sequences. Using our DR mutation library, in silico diagnostic accuracy was superior to some commercial diagnostics and alternative databases. The library will facilitate sequence-based drug-susceptibility testing.
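
    The library-based prediction described in this abstract amounts to looking up observed mutations in a curated table. The sketch below is only an illustration of that idea, not the TB-Profiler code or its data format: the function name `predict_resistance` and the tuple-keyed table are assumptions, and the two table entries are well-known resistance mutations chosen as examples rather than copied from the 1,325-mutation library.

```python
# Illustrative mutation library: (gene, amino-acid change) -> drug.
# These two entries are widely reported resistance mutations used here
# as examples only; the real library is far larger.
LIBRARY = {
    ("rpoB", "S450L"): "rifampicin",
    ("katG", "S315T"): "isoniazid",
}


def predict_resistance(observed_mutations, library=LIBRARY):
    """Return the sorted list of drugs with at least one library hit.

    observed_mutations: iterable of (gene, change) tuples called from
                        a sample's sequence data.
    Mutations absent from the library are ignored, so the prediction
    is only as complete as the library itself.
    """
    drugs = set()
    for mutation in observed_mutations:
        drug = library.get(mutation)
        if drug is not None:
            drugs.add(drug)
    return sorted(drugs)


# Hypothetical sample: one library mutation plus one unknown variant.
sample = [("katG", "S315T"), ("geneX", "A1B")]
profile = predict_resistance(sample)
```

    A sample carrying both example mutations would be reported as resistant to isoniazid and rifampicin, which corresponds to the MDR-TB definition used elsewhere in these records.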

  2. Overview of HBV whole genome data in public repositories and the Chinese HBV reference sequences

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The number of Hepatitis B virus (HBV) whole genomic sequences in public nucleotide databases (GenBank, EMBL, and DDBJ) had reached 866 by January 1, 2007. Coming from 46 countries and regions, these sequences were categorized as eight genotypes (A-H). Using statistical and phylogenetic analysis of all available complete genomic data of HBV, we here present an overview of HBV sequences in public databases. From all 229 registered HBV genomes in Chinese regions as well as 59 sequences from our research group, we report the establishment of reference sequences of HBV strains prevailing in China. These analyses provide clues for the effects of HBV genotypes in host clinical progressions, geographic distribution of the infection, and the viral evolutionary history. Moreover, the viral sequence reference would be helpful in the identification of various HBV mutations. Based on the analysis of various public databases, we suggest that a Chinese HBV database with clinical information should be constructed.

  3. A strategic stakeholder approach for addressing further analysis requests in whole genome sequencing research.

    Science.gov (United States)

    Thornock, Bradley Steven O

    2016-01-01

    Whole genome sequencing (WGS) can be a cost-effective and efficient means of diagnosis for some children, but it also raises a number of ethical concerns. One such concern is how researchers derive and communicate results from WGS, including future requests for further analysis of stored sequences. The purpose of this paper is to think about what is at stake, and for whom, in any solution that is developed to deal with such requests. To accomplish this task, this paper will utilize stakeholder theory, a common method used in business ethics. Several scenarios that connect stakeholder concerns and WGS will also be posited and analyzed. This paper concludes by developing criteria composed of a series of questions that researchers can answer in order to more effectively address requests for further analysis of stored sequences. PMID:27091475

  4. Bioinformatics Workflow for Clinical Whole Genome Sequencing at Partners HealthCare Personalized Medicine

    Directory of Open Access Journals (Sweden)

    Ellen A. Tsai

    2016-02-01

    Effective implementation of precision medicine will be enhanced by a thorough understanding of each patient’s genetic composition to better treat his or her presenting symptoms or mitigate the onset of disease. This ideally includes the sequence information of a complete genome for each individual. At Partners HealthCare Personalized Medicine, we have developed a clinical process for whole genome sequencing (WGS) with application in both healthy individuals and those with disease. In this manuscript, we will describe our bioinformatics strategy to efficiently process and deliver genomic data to geneticists for clinical interpretation. We describe the handling of data from FASTQ to the final variant list for clinical review for the final report. We will also discuss our methodology for validating this workflow and the cost implications of running WGS.

  5. Phylogenetics and differentiation of Salmonella Newport lineages by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Guojie Cao

    Salmonella Newport has ranked in the top three Salmonella serotypes associated with foodborne outbreaks from 1995 to 2011 in the United States. In the current study, we selected 26 S. Newport strains isolated from diverse sources and geographic locations and then conducted 454 shotgun pyrosequencing procedures to obtain 16-24 × coverage of high quality draft genomes for each strain. Comparative genomic analysis of 28 S. Newport strains (including 2 reference genomes) and 15 outgroup genomes identified more than 140,000 informative SNPs. A resulting phylogenetic tree consisted of four sublineages and indicated that S. Newport had a clear geographic structure. Strains from Asia were divergent from those from the Americas. Our findings demonstrated that analysis using whole genome sequencing data resulted in a more accurate picture of phylogeny compared to that using single genes or small sets of genes. We selected loci around the mutS gene of S. Newport to differentiate distinct lineages, including those between invH and mutS genes at the 3' end of Salmonella Pathogenicity Island 1 (SPI-1), ste fimbrial operon, and Clustered, Regularly Interspaced, Short Palindromic Repeats (CRISPR) associated-proteins (cas). These genes in the outgroup genomes held high similarity with either S. Newport Lineage II or III at the same loci. S. Newport Lineages II and III have different evolutionary histories in this region and our data demonstrated genetic flow and homologous recombination events around mutS. The findings suggested that S. Newport Lineages II and III diverged early in the serotype evolution and have evolved largely independently. Moreover, we identified genes that could delineate sublineages within the phylogenetic tree and that could be used as potential biomarkers for trace-back investigations during outbreaks. Thus, whole genome sequencing data enabled us to better understand the genetic background of pathogenicity and evolutionary history of S

  6. A model for carbohydrate metabolism in the diatom Phaeodactylum tricornutum deduced from comparative whole genome analysis.

    Directory of Open Access Journals (Sweden)

    Peter G Kroth

    BACKGROUND: Diatoms are unicellular algae responsible for approximately 20% of global carbon fixation. Their evolution by secondary endocytobiosis resulted in a complex cellular structure and metabolism compared to algae with primary plastids. METHODOLOGY/PRINCIPAL FINDINGS: The whole genome sequence of the diatom Phaeodactylum tricornutum has recently been completed. We identified and annotated genes for enzymes involved in carbohydrate pathways based on extensive EST support and comparison to the whole genome sequence of a second diatom, Thalassiosira pseudonana. Protein localization to mitochondria was predicted based on identified similarities to mitochondrial localization motifs in other eukaryotes, whereas protein localization to plastids was based on the presence of signal peptide motifs in combination with plastid localization motifs previously shown to be required in diatoms. We identified genes potentially involved in a C4-like photosynthesis in P. tricornutum and, on the basis of sequence-based putative localization of relevant proteins, discuss possible differences in carbon concentrating mechanisms and CO2 fixation between the two diatoms. We also identified genes encoding enzymes involved in photorespiration with one interesting exception: glycerate kinase was not found in either P. tricornutum or T. pseudonana. Various Calvin cycle enzymes were found in up to five different isoforms, distributed between plastids, mitochondria and the cytosol. Diatoms store energy either as lipids or as chrysolaminaran (a beta-1,3-glucan) outside of the plastids. We identified various beta-glucanases and large membrane-bound glucan synthases. Interestingly most of the glucanases appear to contain C-terminal anchor domains that may attach the enzymes to membranes. CONCLUSIONS/SIGNIFICANCE: Here we present a detailed synthesis of carbohydrate metabolism in diatoms based on the genome sequences of Thalassiosira pseudonana and Phaeodactylum tricornutum

  7. Navigating Microbiological Food Safety in the Era of Whole-Genome Sequencing.

    Science.gov (United States)

    Ronholm, J; Nasheri, Neda; Petronella, Nicholas; Pagotto, Franco

    2016-10-01

    The epidemiological investigation of a foodborne outbreak, including identification of related cases, source attribution, and development of intervention strategies, relies heavily on the ability to subtype the etiological agent at a high enough resolution to differentiate related from nonrelated cases. Historically, several different molecular subtyping methods have been used for this purpose; however, emerging techniques, such as single nucleotide polymorphism (SNP)-based techniques, that use whole-genome sequencing (WGS) offer a resolution that was previously not possible. With WGS, unlike traditional subtyping methods that lack complete information, data can be used to elucidate phylogenetic relationships and disease-causing lineages can be tracked and monitored over time. The subtyping resolution and evolutionary context provided by WGS data allow investigators to connect related illnesses that would be missed by traditional techniques. The added advantage of data generated by WGS is that these data can also be used for secondary analyses, such as virulence gene detection, antibiotic resistance gene profiling, synteny comparisons, mobile genetic element identification, and geographic attribution. In addition, several software packages are now available to generate in silico results for traditional molecular subtyping methods from the whole-genome sequence, allowing for efficient comparison with historical databases. Metagenomic approaches using next-generation sequencing have also been successful in the detection of nonculturable foodborne pathogens. This review addresses state-of-the-art techniques in microbial WGS and analysis and then discusses how this technology can be used to help support food safety investigations. Retrospective outbreak investigations using WGS are presented to provide organism-specific examples of the benefits, and challenges, associated with WGS in comparison to traditional molecular subtyping techniques. PMID:27559074

  8. Light whole genome sequence for SNP discovery across domestic cat breeds

    Directory of Open Access Journals (Sweden)

    Driscoll Carlos

    2010-06-01

    Background: The domestic cat has offered enormous genomic potential in the veterinary description of over 250 hereditary disease models as well as the occurrence of several deadly feline viruses (feline leukemia virus, FeLV; feline coronavirus, FECV; feline immunodeficiency virus, FIV) that are homologues to human scourges (cancer, SARS, and AIDS, respectively). However, to realize this bio-medical potential, a high density single nucleotide polymorphism (SNP) map is required in order to accomplish disease and phenotype association discovery. Description: To remedy this, we generated 3,178,297 paired fosmid-end Sanger sequence reads from seven cats, and combined these data with the publicly available 2X cat whole genome sequence. All sequence reads were assembled together to form a 3X whole genome assembly allowing the discovery of over three million SNPs. To reduce potential false positive SNPs due to the low coverage assembly, a low upper-limit was placed on sequence coverage and a high lower-limit on the quality of the discrepant bases at a potential variant site. In all, domestic cats of different breeds (female Abyssinian, female American shorthair, male Cornish Rex, female European Burmese, female Persian, female Siamese, and a male Ragdoll) and a female African wildcat were sequenced lightly. We report a total of 964 k common SNPs suitable for a domestic cat SNP genotyping array and an additional 900 k SNPs detected between African wildcat and domestic cat breeds. An empirical sampling of 94 discovered SNPs was tested in the sequenced cats, resulting in a SNP validation rate of 99%. Conclusions: These data provide a large collection of mapped feline SNPs across the cat genome that will allow for the development of SNP genotyping platforms for mapping feline diseases.
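
    The false-positive filter described in this abstract (an upper limit on coverage and a lower limit on the quality of the discrepant bases) can be sketched as below. This is a minimal illustration under stated assumptions: the abstract gives no threshold values, so `MAX_COVERAGE` and `MIN_BASE_QUALITY` are hypothetical, as are the `CandidateSNP` record and its field names.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the abstract does not report the values used.
MAX_COVERAGE = 6        # upper limit on reads covering the site
MIN_BASE_QUALITY = 30   # lower limit on quality of discrepant bases


@dataclass
class CandidateSNP:
    position: int
    coverage: int                 # number of reads overlapping the site
    min_discrepant_quality: int   # lowest quality among discrepant bases


def passes_filter(snp, max_cov=MAX_COVERAGE, min_q=MIN_BASE_QUALITY):
    """Keep a site only if coverage is not excessive and the bases that
    disagree are all of high quality."""
    return snp.coverage <= max_cov and snp.min_discrepant_quality >= min_q


def filter_snps(candidates, max_cov=MAX_COVERAGE, min_q=MIN_BASE_QUALITY):
    """Return the candidates that survive both limits, in input order."""
    return [s for s in candidates if passes_filter(s, max_cov, min_q)]


# Hypothetical candidates: one good site, one over-covered, one low quality.
candidates = [
    CandidateSNP(position=100, coverage=3, min_discrepant_quality=40),
    CandidateSNP(position=200, coverage=12, min_discrepant_quality=45),
    CandidateSNP(position=300, coverage=4, min_discrepant_quality=20),
]
kept = filter_snps(candidates)
```

    In a roughly 3X assembly, a site with far more reads than expected is presumably a collapsed repeat or paralogous region, which is one plausible reading of why the coverage limit is an upper bound rather than a lower one.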

  9. Environmental Whole-Genome Amplification to Access Microbial Diversity in Contaminated Sediments

    Energy Technology Data Exchange (ETDEWEB)

    Abulencia, C.B.; Wyborski, D.L.; Garcia, J.; Podar, M.; Chen, W.; Chang, S.H.; Chang, H.W.; Watson, D.; Brodie, E.I.; Hazen, T.C.; Keller, M.

    2005-12-10

    Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2 percent genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9 percent of the sequences had significant similarities to known proteins, and "clusters of orthologous groups" (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.
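    A genome-coverage figure such as the 99.2 percent reported above is, in general terms, the fraction of reference positions hit by at least one read. The sketch below shows that calculation on a toy per-base depth list. The function name, the `min_depth` parameter and the data are assumptions for illustration, not the authors' pipeline.

```python
def genome_coverage(depths, min_depth=1):
    """Fraction of reference positions whose read depth is at least min_depth.

    `depths` holds one integer per reference base (a hypothetical input format).
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    # Toy 10-base "genome": two positions received no reads.
    toy_depths = [0, 3, 5, 0, 1, 2, 7, 1, 4, 9]
    print(f"{genome_coverage(toy_depths):.1%} of positions covered")
```

    Raising `min_depth` gives a stricter breadth-of-coverage figure. Because amplification bias makes depth uneven, the result after MDA can depend strongly on that threshold.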

  10. Whole genome investigation of a divergent clade of the pathogen Streptococcus suis

    Directory of Open Access Journals (Sweden)

    Abiyad Baig

    2015-11-01

    Full Text Available Streptococcus suis is a major porcine and zoonotic pathogen responsible for significant economic losses in the pig industry and an increasing number of human cases. Multiple isolates of S. suis show marked genomic diversity. Here we report the analysis of whole genome sequences of nine pig isolates that caused disease typical of S. suis and had phenotypic characteristics of S. suis, but their genomes were divergent from those of many other S. suis isolates. Comparison of protein sequences predicted from divergent genomes with those from normal S. suis reduced the size of core genome from 793 to only 397 genes. Divergence was clear if phylogenetic analysis was performed on reduced core genes and MLST alleles. Phylogenies based on certain other genes (16S rRNA, sodA, recN and cpn60) did not show divergence for all isolates, suggesting recombination between some divergent isolates with normal S. suis for these genes. Indeed, there is evidence of recent recombination between the divergent and normal S. suis genomes for 249 of 397 core genes. In addition, phylogenetic analysis based on the 16S rRNA gene and 132 genes that were conserved between the divergent isolates and representatives of the broader Streptococcus genus showed that divergent isolates were more closely related to S. suis. Six out of nine divergent isolates possessed a S. suis-like capsule region with variation in capsular gene sequences but the remaining three did not have a discrete capsule locus. The majority (40/70) of virulence-associated genes in normal S. suis were present in the divergent genomes. Overall, the divergent isolates extend the current diversity of S. suis species but the phenotypic similarities and the large amount of gene exchange with normal S. suis gives insufficient evidence to assign these isolates to a new species or subspecies. Further sampling and whole genome analysis of more isolates is warranted to understand the diversity of the species.

  11. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer.

    Science.gov (United States)

    Bova, G Steven; Kallio, Heini M L; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-05-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merits further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials. PMID:27148588

  12. Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Full Text Available Salmonella enterica is a common cause of minor and large foodborne outbreaks. To achieve successful and nearly 'real-time' monitoring and identification of outbreaks, reliable sub-typing is essential. Whole genome sequencing (WGS) shows great promise for use as a routine epidemiological typing tool. Here we evaluate WGS for typing of S. Typhimurium including different approaches for analyzing and comparing the data. A collection of 34 S. Typhimurium isolates was sequenced. This consisted of 18 isolates from six outbreaks and 16 epidemiologically unrelated background strains. In addition, 8 S. Enteritidis and 5 S. Derby were also sequenced and used for comparison. A number of different bioinformatics approaches were applied to the data, including pan-genome tree, k-mer tree, nucleotide difference tree and SNP tree. The outcome of each approach was evaluated in relation to the association of the isolates to specific outbreaks. The pan-genome tree clustered 65% of the S. Typhimurium isolates according to the pre-defined epidemiology, the k-mer tree 88%, the nucleotide difference tree 100% and the SNP tree 100% of the strains within S. Typhimurium. The outcomes of the four phylogenetic analyses were also compared to PFGE, revealing that WGS typing achieved greater performance than the traditional method. In conclusion, for S. Typhimurium, SNP analysis and the nucleotide difference approach to WGS data seem to be the superior methods for epidemiological typing compared to other phylogenetic analytic approaches that may be used on WGS. These approaches were also superior to the more classical typing method, PFGE. Our study also indicates that WGS alone is insufficient to determine whether strains are related or unrelated to outbreaks. This still requires the combination of epidemiological data and whole genome sequencing results.
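    The idea behind a nucleotide-difference comparison can be sketched as a pairwise count of mismatching positions between aligned genomes. The Python below is a toy illustration under stated assumptions: the isolate names and sequences are invented, the input is assumed to be a pre-computed alignment of equal-length strings, and positions with ambiguous bases or gaps are simply skipped. It is not the pipeline used in the study.

```python
from itertools import combinations


def nucleotide_differences(seq_a, seq_b):
    """Count aligned positions where both sequences carry an unambiguous base
    (A, C, G or T) and the two bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT" and a != b
    )


def distance_matrix(aligned):
    """Return a symmetric {(name_a, name_b): differences} mapping for all isolates."""
    dist = {(name, name): 0 for name in aligned}
    for a, b in combinations(sorted(aligned), 2):
        d = nucleotide_differences(aligned[a], aligned[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


def nearest_neighbor(name, dist, names):
    """Name of the other isolate with the fewest differences from `name`."""
    others = [n for n in names if n != name]
    return min(others, key=lambda n: dist[(name, n)])


if __name__ == "__main__":
    # Invented toy alignment: two isolates from one outbreak and one background strain.
    toy = {
        "outbreak1_a": "ACGTACGTAC",
        "outbreak1_b": "ACGTACGTAT",
        "background": "ATGTTCGAAC",
    }
    dist = distance_matrix(toy)
    print(dist[("outbreak1_a", "outbreak1_b")], dist[("outbreak1_a", "background")])
    print(nearest_neighbor("outbreak1_b", dist, list(toy)))
```

    A tree-building method such as neighbor joining would normally take a matrix like this as input. As the abstract notes, small distances alone do not establish an outbreak link without supporting epidemiological data.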

  13. Integrated clinical, whole-genome, and transcriptome analysis of multisampled lethal metastatic prostate cancer

    Science.gov (United States)

    Bova, G. Steven; Kallio, Heini M.L.; Annala, Matti; Kivinummi, Kati; Högnäs, Gunilla; Häyrynen, Sergei; Rantapero, Tommi; Kivinen, Virpi; Isaacs, William B.; Tolonen, Teemu; Nykter, Matti; Visakorpi, Tapio

    2016-01-01

    We report the first combined analysis of whole-genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole-genome and transcriptome sequence was obtained from nine anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 yr before death. Transcriptome analysis revealed increased expression of androgen receptor (AR)-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only one of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG that was present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today, given this knowledge, the use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations may be critical for effective actionability and merits further study. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials.

  14. Whole-genome sequencing identifies emergence of a quinolone resistance mutation in a case of Stenotrophomonas maltophilia bacteremia.

    Science.gov (United States)

    Pak, Theodore R; Altman, Deena R; Attie, Oliver; Sebra, Robert; Hamula, Camille L; Lewis, Martha; Deikus, Gintaras; Newman, Leah C; Fang, Gang; Hand, Jonathan; Patel, Gopi; Wallach, Fran; Schadt, Eric E; Huprikar, Shirish; van Bakel, Harm; Kasarskis, Andrew; Bashir, Ali

    2015-11-01

    Whole-genome sequences for Stenotrophomonas maltophilia serial isolates from a bacteremic patient before and after development of levofloxacin resistance were assembled de novo and differed by one single-nucleotide variant in smeT, a repressor for multidrug efflux operon smeDEF. Along with sequenced isolates from five contemporaneous cases, they displayed considerable diversity compared against all published complete genomes. Whole-genome sequencing and complete assembly can conclusively identify resistance mechanisms emerging in S. maltophilia strains during clinical therapy. PMID:26324280

  15. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang; Wu, Honglong; Zhu, Hongmei; Li, Ruiqiang; Cao, Hongzhi; Wu, Boxin; Huang, Shujia; Shao, Haojing; Ma, Hanzhou; Zhang, Fan; Feng, Shuijian; Zhang, Wei; Du, Hongli; Tian, Geng; Li, Jingxiang; Zhang, Xiuqing; Li, Songgang; Bolund, Lars; Kristiansen, Karsten; de Smith, Adam J; Blakemore, Alexandra I F; Coin, Lachlan J M; Yang, Huanming; Wang, Jian; Wang, Jun

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  16. Direct DNA Extraction from Mycobacterium tuberculosis Frozen Stocks as a Reculture-Independent Approach to Whole-Genome Sequencing

    DEFF Research Database (Denmark)

    Bjorn-Mortensen, K; Zallet, J; Lillebaek, T;

    2015-01-01

    Culturing before DNA extraction represents a major time-consuming step in whole-genome sequencing of slow-growing bacteria, such as Mycobacterium tuberculosis. We report a workflow to extract DNA from frozen isolates without reculturing. Prepared libraries and sequence data were comparable with...

  17. Whole genome sequencing of Candidatus Liberibacter asiaticus strain A4 from Guangdong, China, and strain HHCA from California

    Science.gov (United States)

    “Candidatus Liberibacter asiaticus” is associated with citrus Huanglongbing (HLB) in both China and the United States. While HLB has been known for over a century in Guangdong, China, the disease was first discovered in California in 2012. To better study the “old” and “new” HLBs, whole genomes of “...

  18. Understanding the Quorum-Sensing Bacterium Pantoea stewartii Strain M009 with Whole-Genome Sequencing Analysis.

    Science.gov (United States)

    Tan, Wen-Si; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Kok-Gan

    2015-01-01

    Pantoea stewartii is known to be the causative agent of Stewart's wilt, which usually affects sweet corn (Zea mays) with the corn flea beetle as the transmission vector. In this work, we present the whole-genome sequence of Pantoea stewartii strain M009, isolated from a Malaysian tropical rainforest waterfall. PMID:25635007

  19. Understanding the Quorum-Sensing Bacterium Pantoea stewartii Strain M009 with Whole-Genome Sequencing Analysis

    OpenAIRE

    Tan, Wen-Si; Chang, Chien-Yi; Yin, Wai-Fong; Chan, Kok-Gan

    2015-01-01

    Pantoea stewartii is known to be the causative agent of Stewart’s wilt, which usually affects sweet corn (Zea mays) with the corn flea beetle as the transmission vector. In this work, we present the whole-genome sequence of Pantoea stewartii strain M009, isolated from a Malaysian tropical rainforest waterfall.

  20. Detection and Whole-Genome Sequencing of Carbapenemase-Producing Aeromonas hydrophila Isolates from Routine Perirectal Surveillance Culture.

    Science.gov (United States)

    Hughes, Heather Y; Conlan, Sean P; Lau, Anna F; Dekker, John P; Michelin, Angela V; Youn, Jung-Ho; Henderson, David K; Frank, Karen M; Segre, Julia A; Palmore, Tara N

    2016-04-01

    Perirectal surveillance cultures and a stool culture grew Aeromonas species from three patients over a 6-week period and were without epidemiological links. Detection of the blaKPC-2 gene in one isolate prompted inclusion of non-Enterobacteriaceae in our surveillance culture workup. Whole-genome sequencing confirmed that the isolates were unrelated and provided data for Aeromonas reference genomes. PMID:26888898

  1. Identifying Rare Variation in Cases of Schizophrenia in the Isolated Population of the Faroe Islands using Whole-genome Sequencing

    DEFF Research Database (Denmark)

    Als, Thomas Damm; Lescai, Francesco; Dahl, Hans;

    studies aiming to map risk variants involved in complex traits. We aim at utilizing samples of cases and controls of the isolated population of the Faroe Islands to conduct whole-genome-sequence analysis in order to identify rare genetic variants associated with schizophrenia. We will search for rare...

  2. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    OpenAIRE

    Yookyung Lee; Sooyeon Lim; Moon-Soo Rhee; Dong-Ho Chang; Byoung-Chan Kim

    2016-01-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  3. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    Science.gov (United States)

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. PMID:26981432

  4. Whole-Genome Shotgun Sequence of Bacillus amyloliquefaciens Strain UASWS BA1, a Bacterium Antagonistic to Plant Pathogenic Fungi

    OpenAIRE

    Lefort, F; Calmin, G.; Pelleteret, P.; Farinelli, L.; Osteras, M; Crovadore, J.

    2014-01-01

    We report here the whole-genome shotgun sequence of Bacillus amyloliquefaciens strain UASWS BA1, isolated from inner wood tissues of a decaying Platanus × acerifolia tree. This strain proved to be antagonistic to several plant pathogenic fungi and oomycetes and can be developed as a biological control agent in agriculture.

  5. Whole genome amplification induced bias in the detection of KRAS-mutated cell populations during colorectal carcinoma tissue testing.

    Science.gov (United States)

    Stranska, Jana; Jancik, Sylwia; Slavkovsky, Rastislav; Holinkova, Veronika; Rabcanova, Miroslava; Vojta, Petr; Hajduch, Marian; Drabek, Jiri

    2015-03-01

    Whole genome amplification replicates the entire DNA content of a sample and can thus help to circumvent material limitations when insufficient DNA is available for planned genetic analyses. However, there are conflicting data in the literature whether whole genome amplification introduces bias or reflects precisely the spectrum of starting DNA. We analyzed the origins of discrepancies in KRAS (Kirsten rat sarcoma viral oncogene homolog gene) mutation detection in six of ten samples amplified using the GenomePlex® Tissue Whole Genome Amplification kit 5 (WGA5; Sigma-Aldrich, St. Louis, MO, USA) and KRAS StripAssay® (KRAS SA; ViennaLab Diagnostics, Vienna, Austria). We undertook reextraction, reamplification, retyping, authentication, reanalysis, and reinterpretation to determine whether the discrepancies originated during the preanalytical, analytical, and/or interpretative phase of genotyping. We conclude that a combination of glass slide/sample heterogeneity and biased amplification due to stochastic effects in the early phases of whole genome amplification (WGA) may have adversely affected the results obtained. Our findings are relevant for both forensic genetics testing and massively parallel sequencing using preamplification. PMID:25655305

  6. Whole-genome amplified DNA from stored dried blood spots is reliable in high resolution melting curve and sequencing analysis

    DEFF Research Database (Denmark)

    Winkel, Bo G; Hollegaard, Mads Vilhelm; Olesen, Morten S; Svendsen, Jesper H; Haunsø, Stig; Hougaard, David M; Tfelt-Hansen, Jacob

    2011-01-01

    The use of dried blood spot (DBS) samples in genomic workup has been limited by the relatively low amounts of genomic DNA (gDNA) they contain. It remains to be proven that whole genome amplified DNA (wgaDNA) from stored DBS samples constitutes a reliable alternative to gDNA. We wanted to compare m...

  7. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Science.gov (United States)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T; Rosenqvist Lund, Birthe S; Ameh, James A; Ambali, Abdul G; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M; Hendriksen, Rene S

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  8. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Thorup Nielsen, Mette;

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely...

  9. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    Directory of Open Access Journals (Sweden)

    Pimlapas Leekitcharoenphon

    Full Text Available Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections.

  10. Comparing Whole-Genome Sequencing with Sanger Sequencing for spa Typing of Methicillin-Resistant Staphylococcus aureus

    DEFF Research Database (Denmark)

    Bartels, Mette Damkjaer; Petersen, Andreas; Worning, Peder;

    2014-01-01

    spa typing of methicillin-resistant Staphylococcus aureus (MRSA) has traditionally been done by PCR amplification and Sanger sequencing of the spa repeat region. At Hvidovre Hospital, Denmark, whole-genome sequencing (WGS) of all MRSA isolates has been performed routinely since January 2013, and an...

  11. Whole-genome gene expression profiling of formalin-fixed, paraffin-embedded tissue samples.

    Directory of Open Access Journals (Sweden)

    Craig April

    Full Text Available BACKGROUND: We have developed a gene expression assay (Whole-Genome DASL), capable of generating whole-genome gene expression profiles from degraded samples such as formalin-fixed, paraffin-embedded (FFPE) specimens. METHODOLOGY/PRINCIPAL FINDINGS: We demonstrated a similar level of sensitivity in gene detection between matched fresh-frozen (FF) and FFPE samples, with the number and overlap of probes detected in the FFPE samples being approximately 88% and 95% of that in the corresponding FF samples, respectively; 74% of the differentially expressed probes overlapped between the FF and FFPE pairs. The WG-DASL assay is also able to detect 1.3-1.5 and 1.5-2-fold changes in intact and FFPE samples, respectively. The dynamic range for the assay is approximately 3 logs. Comparing the WG-DASL assay with an in vitro transcription-based labeling method yielded fold-change correlations of R² ≈ 0.83, while fold-change comparisons with quantitative RT-PCR assays yielded R² ≈ 0.86 and R² ≈ 0.55 for intact and FFPE samples, respectively. Additionally, the WG-DASL assay yielded high self-correlations (R² > 0.98) with low intact RNA inputs ranging from 1 ng to 100 ng; reproducible expression profiles were also obtained with 250 pg total RNA (R² ≈ 0.92), with approximately 71% of the probes detected in 100 ng total RNA also detected at the 250 pg level. When FFPE samples were assayed, 1 ng total RNA yielded self-correlations of R² ≈ 0.80, while still maintaining a correlation of R² ≈ 0.75 with standard FFPE inputs (200 ng). CONCLUSIONS/SIGNIFICANCE: Taken together, these results show that the WG-DASL assay provides a reliable platform for genome-wide expression profiling in archived materials. It also possesses utility within clinical settings where only limited quantities of samples may be available (e.g. microdissected material) or when minimally invasive procedures are performed (e

  12. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    Science.gov (United States)

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; 
Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

    2016-01-01

    SUMMARY The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants (refs. 1–8), and rare, population-specific, coding variants (ref. 9). Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000Genomes reference panel (n = 26,534), and de-novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD (ref. 8) (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], P_meta = 2×10^−14), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2×10^−11; n_cases = 98,742 and n_controls = 409,511). Using an En1Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turn-over. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, P_meta = 1×10^−11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of

  13. Whole genome sequencing of Saccharomyces cerevisiae: from genotype to phenotype for improved metabolic engineering applications

    Directory of Open Access Journals (Sweden)

    Asadollahi Mohammad A

    2010-12-01

    Full Text Available Abstract Background: Rapid and efficient microbial cell factory design and construction are possible through the enabling technology of metabolic engineering, which is now being facilitated by systems biology approaches. Metabolic engineering is often complemented by directed evolution, where selective pressure is applied to a partially genetically engineered strain to confer a desirable phenotype. The exact genetic modification or resulting genotype that leads to the improved phenotype is often not identified or understood well enough to enable further metabolic engineering. Results: In this work we performed whole genome high-throughput sequencing and annotation to identify single nucleotide polymorphisms (SNPs) between Saccharomyces cerevisiae strains S288c and CEN.PK113-7D. The yeast strain S288c was the first eukaryote sequenced, serving as the reference genome for the Saccharomyces Genome Database, while CEN.PK113-7D is a preferred laboratory strain for industrial biotechnology research. A total of 13,787 high-quality SNPs were detected between both strains (reference strain: S288c). Considering only metabolic genes (782 of 5,596 annotated genes), a total of 219 metabolism-specific SNPs are distributed across 158 metabolic genes, with 85 of the SNPs being nonsynonymous (e.g., encoding amino acid modifications). Amongst metabolic SNPs detected, there was pathway enrichment in the galactose uptake pathway (GAL1, GAL10) and ergosterol biosynthetic pathway (ERG8, ERG9). Physiological characterization confirmed a strong deficiency in galactose uptake and metabolism in S288c compared to CEN.PK113-7D, and similarly, ergosterol content in CEN.PK113-7D was significantly higher in both glucose and galactose supplemented cultivations compared to S288c. Furthermore, DNA microarray profiling of S288c and CEN.PK113-7D in both glucose and galactose batch cultures did not provide a clear hypothesis for major phenotypes observed, suggesting that
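    The distinction between synonymous and nonsynonymous SNPs drawn in this abstract comes down to whether a base change alters the encoded amino acid. The Python sketch below is illustrative only: it assumes the standard genetic code, and the function name and example codons are hypothetical, not taken from the study's annotation pipeline.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(codon, offset, alt_base):
    """Classify a single-base substitution within one reference codon.

    codon    -- three-letter reference codon on the coding strand
    offset   -- 0, 1 or 2: position of the SNP inside the codon
    alt_base -- the alternative base observed at that position
    """
    codon, alt_base = codon.upper(), alt_base.upper()
    if codon not in CODON_TABLE or offset not in (0, 1, 2) or alt_base not in BASES:
        raise ValueError("expected an ACGT codon, an offset of 0-2 and an ACGT base")
    if codon[offset] == alt_base:
        raise ValueError("alternative base equals the reference base")
    mutated = codon[:offset] + alt_base + codon[offset + 1:]
    same_residue = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same_residue else "nonsynonymous"


if __name__ == "__main__":
    print(classify_snp("GAA", 2, "G"))  # GAA and GAG both encode glutamate
    print(classify_snp("GAA", 0, "A"))  # GAA (Glu) becomes AAA (Lys)
```

    In this simple scheme a change that creates or removes a stop codon is also reported as nonsynonymous. A fuller annotation tool would usually report such nonsense changes as a separate class.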

  14. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction.

    Science.gov (United States)

    Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S

    2015-06-01

    This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When using the genomic BLUP model including the QTL markers in the analysis, reliability was increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included. Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome

  15. Impacts of Whole-Genome Triplication on MIRNA Evolution in Brassica rapa.

    Science.gov (United States)

    Sun, Chao; Wu, Jian; Liang, Jianli; Schnable, James C; Yang, Wencai; Cheng, Feng; Wang, Xiaowu

    2015-11-01

    MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play essential roles in eukaryotes. Although the influence of whole-genome triplication (WGT) on protein-coding genes has been well documented in Brassica rapa, little is known about its impacts on MIRNAs. In this study, through generating a comprehensive annotation of 680 MIRNAs for B. rapa, we analyzed the evolutionary characteristics of these MIRNAs from different aspects. First, while MIRNAs and genes show similar patterns of biased distribution among subgenomes of B. rapa, we found that MIRNAs are much more overretained than genes following fractionation after WGT. Second, multiple-copy MIRNAs show significantly higher sequence conservation than single-copy MIRNAs, which is opposite to the pattern seen for genes. This indicates that increased purifying selection is acting upon these highly retained multiple-copy MIRNAs and points to their functional importance over singleton MIRNAs. Furthermore, we found extensive divergence between pairs of miRNAs and their target genes following the WGT in B. rapa. In summary, our study provides a valuable resource for exploring MIRNA in B. rapa and highlights the impacts of WGT on the evolution of MIRNA. PMID:26527651

  16. PKS and NRPS gene clusters from microbial symbiont cells of marine sponges by whole genome amplification.

    Science.gov (United States)

    Siegl, Alexander; Hentschel, Ute

    2010-08-01

    Whole genome amplification (WGA) approaches provide genomic information on single microbial cells and hold great promise for the field of environmental microbiology. Here, the microbial consortia of the marine sponge Aplysina aerophoba were sorted by fluorescence-activated cell sorting (FACS) and then subjected to WGA. A cosmid library was constructed from the WGA product of a sample containing two bacterial cells, one a member of the candidate phylum Poribacteria and one of a sponge-specific clade of Chloroflexi. Library screening led to the genomic characterization of three cosmid clones, encoding a polyketide synthase (PKS), a non-ribosomal peptide synthetase (NRPS) and the Chloroflexi 16S rRNA gene. PCR screening of WGA products from additional, FACS-sorted single bacterial symbiont cells supports the assignment of the Sup-PKS gene to the Poribacteria and the novel NRPS gene to the Chloroflexi. This promising single-cell genomics approach has permitted cloning of entire gene clusters from single microbial cells of known phylogenetic origin and thus provides a sought-after link between phylogeny and function. PMID:23766222

  17. Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma

    Science.gov (United States)

    Gartner, Jared J.; Parker, Stephen C. J.; Prickett, Todd D.; Dutton-Regester, Ken; Stitzel, Michael L.; Lin, Jimmy C.; Davis, Sean; Simhadri, Vijaya L.; Jha, Sujata; Katagiri, Nobuko; Gotea, Valer; Teer, Jamie K.; Morken, Mario A.; Bhanot, Umesh K.; Chen, Guo; Elnitski, Laura L.; Davies, Michael A.; Gershenwald, Jeffrey E.; Carter, Hannah; Karchin, Rachel; Robinson, William; Robinson, Steven; Rosenberg, Steven A.; Collins, Francis S.; Parmigiani, Giovanni; Komar, Anton A.; Kimchi-Sarfaty, Chava; Hayward, Nicholas K.; Margulies, Elliott H.; Samuels, Yardena

    2013-01-01

    Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683–691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This mutation led to increased BCL2L12 mRNA and protein levels because of differential targeting of WT and mutant BCL2L12 by hsa-miR-671–5p. Protein made from mutant BCL2L12 transcript bound p53, inhibited UV-induced apoptosis more efficiently than WT BCL2L12, and reduced endogenous p53 target gene transcription. This report shows selection of a recurrent somatic synonymous mutation in cancer. Our data indicate that silent alterations have a role to play in human cancer, emphasizing the importance of their investigation in future cancer genome studies. PMID:23901115

  18. Neuropeptide evolution: Chelicerate neurohormone and neuropeptide genes may reflect one or more whole genome duplications.

    Science.gov (United States)

    Veenstra, Jan A

    2016-04-01

    Four genomes and two transcriptomes from six Chelicerate species were analyzed for the presence of neuropeptide and neurohormone precursors and their GPCRs. The genome from the spider Stegodyphus mimosarum yielded 87 neuropeptide precursors and 120 neuropeptide GPCRs. Many neuropeptide transcripts were also found in the transcriptomes of three other spiders, Latrodectus hesperus, Parasteatoda tepidariorum and Acanthoscurria geniculata. For the scorpion Mesobuthus martensii the numbers are 79 and 93 respectively. The very small genome of the house dust mite, Dermatophagoides farinae, on the other hand contains a much smaller number of such genes. A few new putative Arthropod neuropeptide genes were discovered. Thus, both spiders and the scorpion have an achatin gene and in spiders there are two different genes encoding myosuppressin-like peptides while spiders also have two genes encoding novel LGamides. Another finding is the presence of trissin in spiders and scorpions, while neuropeptide genes that seem to be orthologs of Lottia LFRYamide and Platynereis CCRFamide were also found. Such genes were also found in various insect species, but seem to be lacking from the Holometabola. The Chelicerate neuropeptide and neuropeptide GPCR genes often have paralogs. As the large majority of these are probably not due to local gene duplications, it is plausible that they reflect the effects of one or more ancient whole genome duplications. PMID:26928473

  19. Mycobacterial DNA extraction for whole-genome sequencing from early positive liquid (MGIT) cultures.

    Science.gov (United States)

    Votintseva, Antonina A; Pankhurst, Louise J; Anson, Luke W; Morgan, Marcus R; Gascoyne-Binzi, Deborah; Walker, Timothy M; Quan, T Phuong; Wyllie, David H; Del Ojo Elias, Carlos; Wilcox, Mark; Walker, A Sarah; Peto, Tim E A; Crook, Derrick W

    2015-04-01

    We developed a low-cost and reliable method of DNA extraction from as little as 1 ml of early positive mycobacterial growth indicator tube (MGIT) cultures that is suitable for whole-genome sequencing to identify mycobacterial species and predict antibiotic resistance in clinical samples. The DNA extraction method is based on ethanol precipitation supplemented by pretreatment steps with a MolYsis kit or saline wash for the removal of human DNA and a final DNA cleanup step with solid-phase reversible immobilization beads. The protocol yielded ≥0.2 ng/μl of DNA for 90% (MolYsis kit) and 83% (saline wash) of positive MGIT cultures. A total of 144 (94%) of the 154 samples sequenced on the MiSeq platform (Illumina) achieved the target of 1 million reads, with 90% coverage achieved. The DNA extraction protocol, therefore, will facilitate fast and accurate identification of mycobacterial species and resistance using a range of bioinformatics tools. PMID:25631807

  20. Whole-genome bisulfite DNA sequencing of a DNMT3B mutant patient

    Science.gov (United States)

    Heyn, Holger; Vidal, Enrique; Sayols, Sergi; Sanchez-Mut, Jose V.; Moran, Sebastian; Medina, Ignacio; Sandoval, Juan; Simó-Riudalbas, Laia; Szczesna, Karolina; Huertas, Dori; Gatto, Sole; Matarazzo, Maria R.; Dopazo, Joaquin; Esteller, Manel

    2012-01-01

    The immunodeficiency, centromere instability and facial anomalies (ICF) syndrome is associated with mutations of the DNA methyltransferase DNMT3B, resulting in a reduction of enzyme activity. Aberrant expression of immune system genes and hypomethylation of pericentromeric regions accompanied by chromosomal instability were determined as alterations driving the disease phenotype. However, so far only technologies capable of analyzing single loci have been applied to determine epigenetic alterations in ICF patients. In the current study, we performed whole-genome bisulfite sequencing to assess alterations in DNA methylation at base pair resolution. Genome-wide, we detected a decrease in methylation level of 42%, with the most profound changes occurring in inactive heterochromatic regions, satellite repeats and transposons. Interestingly, transcriptionally active loci and ribosomal RNA repeats escaped global hypomethylation. Despite a genome-wide loss of DNA methylation, the epigenetic landscape and crucial regulatory structures were conserved. Remarkably, we revealed a mislocated activity of mutant DNMT3B to H3K4me1 loci resulting in hypermethylation of active promoters. Functionally, we could associate alterations in promoter methylation with the ICF syndrome immunodeficient phenotype by detecting changes in genes related to the B-cell receptor mediated maturation pathway. PMID:22595875

  1. Whole genome assessment of the retinal response to diabetes reveals a progressive neurovascular inflammatory response

    Directory of Open Access Journals (Sweden)

    Brucklacher Robert M

    2008-06-01

    Background: Despite advances in the understanding of diabetic retinopathy, the nature and time course of molecular changes in the retina with diabetes are incompletely described. This study characterized the functional and molecular phenotype of the retina with increasing durations of diabetes. Results: Using the streptozotocin-induced rat model of diabetes, levels of retinal permeability, caspase activity, and gene expression were examined after 1 and 3 months of diabetes. Gene expression changes were identified by whole genome microarray and confirmed by qPCR in the same set of animals as used in the microarray analyses and subsequently validated in independent sets of animals. Increased levels of vascular permeability and caspase-3 activity were observed at 3 months of diabetes, but not 1 month. Significantly more and larger magnitude gene expression changes were observed after 3 months than after 1 month of diabetes. Quantitative PCR validation of selected genes related to inflammation, microvasculature and neuronal function confirmed gene expression changes in multiple independent sets of animals. Conclusion: These changes in permeability, apoptosis, and gene expression provide further evidence of progressive retinal malfunction with increasing duration of diabetes. The specific gene expression changes confirmed in multiple sets of animals indicate that pro-inflammatory, anti-vascular barrier, and neurodegenerative changes occur in tandem with functional increases in apoptosis and vascular permeability. These responses are shared with the clinically documented inflammatory response in diabetic retinopathy, suggesting that this model may be used to test anti-inflammatory therapeutics.

  2. The "most wanted" taxa from the human microbiome for whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Anthony A Fodor

    The goal of the Human Microbiome Project (HMP) is to generate a comprehensive catalog of human-associated microorganisms including reference genomes representing the most common species. Toward this goal, the HMP has characterized the microbial communities at 18 body habitats in a cohort of over 200 healthy volunteers using 16S rRNA gene (16S) sequencing and has generated nearly 1,000 reference genomes from human-associated microorganisms. To determine how well current reference genome collections capture the diversity observed among the healthy microbiome and to guide isolation and future sequencing of microbiome members, we compared the HMP's 16S data sets to several reference 16S collections to create a 'most wanted' list of taxa for sequencing. Our analysis revealed that the diversity of commonly occurring taxa within the HMP cohort microbiome is relatively modest, that few novel taxa are represented by these OTUs, and that many common taxa among HMP volunteers recur across different populations of healthy humans. Taken together, these results suggest that it should be possible to perform whole-genome sequencing on a large fraction of the human microbiome, including the 'most wanted', and that these sequences should serve to support microbiome studies across multiple cohorts. Also, in stark contrast to other taxa, the 'most wanted' organisms are poorly represented among culture collections, suggesting that novel culture- and single-cell-based methods will be required to isolate these organisms for sequencing.

  3. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  4. Landscape of somatic mutations in 560 breast cancer whole-genome sequences.

    Science.gov (United States)

    Nik-Zainal, Serena; Davies, Helen; Staaf, Johan; Ramakrishna, Manasa; Glodzik, Dominik; Zou, Xueqing; Martincorena, Inigo; Alexandrov, Ludmil B; Martin, Sancha; Wedge, David C; Van Loo, Peter; Ju, Young Seok; Smid, Marcel; Brinkman, Arie B; Morganella, Sandro; Aure, Miriam R; Lingjærde, Ole Christian; Langerød, Anita; Ringnér, Markus; Ahn, Sung-Min; Boyault, Sandrine; Brock, Jane E; Broeks, Annegien; Butler, Adam; Desmedt, Christine; Dirix, Luc; Dronov, Serge; Fatima, Aquila; Foekens, John A; Gerstung, Moritz; Hooijer, Gerrit K J; Jang, Se Jin; Jones, David R; Kim, Hyung-Yong; King, Tari A; Krishnamurthy, Savitri; Lee, Hee Jin; Lee, Jeong-Yeon; Li, Yilong; McLaren, Stuart; Menzies, Andrew; Mustonen, Ville; O'Meara, Sarah; Pauporté, Iris; Pivot, Xavier; Purdie, Colin A; Raine, Keiran; Ramakrishnan, Kamna; Rodríguez-González, F Germán; Romieu, Gilles; Sieuwerts, Anieta M; Simpson, Peter T; Shepherd, Rebecca; Stebbings, Lucy; Stefansson, Olafur A; Teague, Jon; Tommasi, Stefania; Treilleux, Isabelle; Van den Eynden, Gert G; Vermeulen, Peter; Vincent-Salomon, Anne; Yates, Lucy; Caldas, Carlos; van't Veer, Laura; Tutt, Andrew; Knappskog, Stian; Tan, Benita Kiat Tee; Jonkers, Jos; Borg, Åke; Ueno, Naoto T; Sotiriou, Christos; Viari, Alain; Futreal, P Andrew; Campbell, Peter J; Span, Paul N; Van Laere, Steven; Lakhani, Sunil R; Eyfjord, Jorunn E; Thompson, Alastair M; Birney, Ewan; Stunnenberg, Hendrik G; van de Vijver, Marc J; Martens, John W M; Børresen-Dale, Anne-Lise; Richardson, Andrea L; Kong, Gu; Thomas, Gilles; Stratton, Michael R

    2016-06-01

    We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, and a third whose cause is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer. PMID:27135926

  5. Kuwaiti population subgroup of nomadic Bedouin ancestry—Whole genome sequence and analysis

    Directory of Open Access Journals (Sweden)

    Sumi Elsa John

    2015-03-01

    The Kuwaiti native population comprises three distinct genetic subgroups of Persian, “city-dwelling” Saudi Arabian tribe, and nomadic “tent-dwelling” Bedouin ancestry. The Bedouin subgroup is characterized by the presence of 17% African ancestry; it owes its origin to nomadic tribes of the deserts of the Arabian Peninsula and North Africa. By sequencing the whole genome of a Kuwaiti male from this subgroup at 41X coverage, we report 3,752,878 SNPs, 411,839 indels, and 8451 structural variations. A neighbor-joining tree, based on shared variant positions carrying disease-risk alleles between the Bedouin and other continental genomes, places the Bedouin genome at the nexus of African, Asian, and European genomes, in concordance with the geographical location of Kuwait and the Peninsula. In congruence with the participant's medical history of morbid obesity and bronchial asthma, risk alleles are seen at deleterious SNPs associated with obesity and asthma. Many of the observed deleterious ‘novel’ variants lie in genes associated with autosomal recessive disorders characteristic of the region.

  6. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU

    Directory of Open Access Journals (Sweden)

    Ruibang Luo

    2014-06-01

    This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA’s speed is rooted in its parallel algorithms, which effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates app development for downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
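    The "50-fold" figure quoted in this record can be checked with simple coverage arithmetic. The sketch below is only an illustration: it assumes a human genome of roughly 3.0 Gb and reads "750 million 100 bp paired-end reads" as 750 million read pairs (two 100 bp mates each). Neither assumption is stated in the abstract, and the function name is hypothetical.

```python
def fold_coverage(read_pairs: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Mean fold coverage = total sequenced bases / genome size."""
    total_bases = read_pairs * 2 * read_length_bp  # two mates per read pair
    return total_bases / genome_size_bp


# Assumed inputs (not given in the abstract): ~3.0 Gb genome, counts are read pairs.
wgs_coverage = fold_coverage(750_000_000, 100, 3.0e9)
print(f"Approximate coverage: {wgs_coverage:.0f}-fold")
```

    Under these assumptions the result is about 50-fold; if the 750 million were individual reads rather than pairs, the same formula would give about half that.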

  7. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples.

    Directory of Open Access Journals (Sweden)

    Minsoung Rhee

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase, thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent to ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and a more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology.
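    The relationship between the quoted template concentration (0.1 pg/μL) and the quoted loading (~3/1000 of a genome per droplet) can be worked through as a rough consistency check. This sketch assumes an E. coli genome of about 4.6 Mb and an average mass of about 650 g/mol per double-stranded base pair; both values, and all function names, are illustrative assumptions rather than figures from the record.

```python
AVOGADRO = 6.022e23         # molecules per mole
BP_MASS_G_PER_MOL = 650.0   # assumed average mass of one double-stranded base pair
ECOLI_GENOME_BP = 4.6e6     # assumed E. coli genome size in base pairs


def genome_mass_fg(genome_bp: float) -> float:
    """Mass of one genome copy in femtograms."""
    grams = genome_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e15


def implied_droplet_volume_nl(conc_pg_per_ul: float, genomes_per_droplet: float) -> float:
    """Droplet volume (nL) implied by a DNA concentration and a per-droplet genome load."""
    conc_fg_per_nl = conc_pg_per_ul  # 1 pg/uL = 1000 fg per 1000 nL = 1 fg/nL
    mass_per_droplet_fg = genomes_per_droplet * genome_mass_fg(ECOLI_GENOME_BP)
    return mass_per_droplet_fg / conc_fg_per_nl


mass_fg = genome_mass_fg(ECOLI_GENOME_BP)
volume_nl = implied_droplet_volume_nl(0.1, 3 / 1000)
print(f"One genome copy: about {mass_fg:.1f} fg")
print(f"Implied droplet volume: about {volume_nl:.2f} nL")
```

    With these assumed constants the implied droplet volume comes out below one nanoliter, which agrees with the abstract's description of sub-nanoliter droplets.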

  8. Whole genome and transcriptome sequencing of matched primary and peritoneal metastatic gastric carcinoma.

    Science.gov (United States)

    Zhang, J; Huang, J Y; Chen, Y N; Yuan, F; Zhang, H; Yan, F H; Wang, M J; Wang, G; Su, M; Lu, G; Huang, Y; Dai, H; Ji, J; Zhang, J; Zhang, J N; Jiang, Y N; Chen, S J; Zhu, Z G; Yu, Y Y

    2015-01-01

    Gastric cancer is one of the most aggressive cancers and is the second leading cause of cancer death worldwide. Approximately 40% of global gastric cancer cases occur in China, with peritoneal metastasis being the prevalent form of recurrence and metastasis in advanced disease. Currently, there are limited clinical approaches for predicting and treating peritoneal metastasis, resulting in a 6-month average survival time. Comprehensive genome analysis will help uncover the pathogenesis of peritoneal metastasis. Here we describe a comprehensive whole-genome and transcriptome sequencing analysis of one advanced gastric cancer case, including non-cancerous mucosa, primary cancer and matched peritoneal metastatic cancer. The peripheral blood is used as normal control. We identified 27 mutated genes, of which 19 genes are reported in the COSMIC database (ZNF208, CRNN, ATXN3, DCTN1, RP1L1, PRB4, PRB1, MUC4, HS6ST3, MUC17, JAM2, ITGAD, IREB2, IQUB, CORO1B, CCDC121, AKAP2, ACAN and ACADL), and eight genes have not previously been described in gastric cancer (CCDC178, ARMC4, TUBB6, PLIN4, PKLR, PDZD2, DMBT1 and DAB1). Additionally, GPX4 and MPND, in the 19q13.3-13.4 region, are characterized as a novel fusion gene. This study disclosed novel biological markers and tumorigenic pathways that would predict peritoneal metastasis of gastric cancer. PMID:26330360

  9. Single Cell Analysis of Dystrophin and SRY Gene by Using Whole Genome Amplification

    Institute of Scientific and Technical Information of China (English)

    徐晨明; 金帆; 黄荷凤; 陶冶; 叶英辉

    2001-01-01

    Objective: To develop a reliable and sensitive method for detecting sex and multiple loci of the Duchenne muscular dystrophy (DMD) gene in single cells. Materials & methods: The whole genome of single cells was amplified using 15-base random primers (primer extension preamplification, PEP), then a small aliquot of the PEP product was analyzed by locus-specific nested PCR amplification. The procedure was evaluated by detecting dystrophin exons 8, 17, 19, 44, 45, 48 and the human testis-determining gene (SRY) in single lymphocytes from known sources and single blastomeres from couples with no family history of DMD. Results: The amplification efficiency rates of six dystrophin exons from single lymphocytes and single blastomeres were 97.2% (175/180) and 100% (60/60), respectively. SRY showed 100% (15/15) amplification in single male-derived lymphocytes and 0% (0/15) amplification in single female-derived lymphocytes. Conclusion: The technique of single-cell PEP-nested PCR for dystrophin exons 8, 17, 19, 44, 45, 48 and SRY is highly specific. PEP-nested PCR is suitable for preimplantation genetic diagnosis (PGD) of DMD at the single-cell level.

  10. Discovery of new Mycoplasma pneumoniae antigens by use of a whole-genome lambda display library.

    Science.gov (United States)

    Beghetto, Elisa; De Paolis, Francesca; Montagnani, Francesca; Cellesi, Carla; Gargano, Nicola

    2009-01-01

    Mycoplasma pneumoniae is the leading cause of atypical pneumonia in children and young adults. Bacterial colonization can occur in both the upper and the lower respiratory tracts and take place both endemically and epidemically worldwide. Characteristically, the infection is chronic in onset and recovery and both humoral and cell-mediated mechanisms are involved in the response to bacterial colonization. To identify bacterial proteins recognized by host antibody responses, a whole-genome M. pneumoniae library was created and displayed on lambda bacteriophage. The challenge of such a library with sera from individuals hospitalized for mycoplasmal pneumonia allowed the identification of a panel of recombinant bacteriophages carrying B-cell epitopes. Among the already known M. pneumoniae B-cell antigens, our results confirmed the immunogenicity of P1 and P30 adhesins. Also, the data presented in this study localized, within their sequences, the immunodominant epitopes recognized by human immunoglobulins. Furthermore, library screening allowed the identification of four novel immunogenic polypeptides, respectively, encoded by fragments of the MPN152, MPN426, MPN456 and MPN-500 open reading frames, highlighting and further confirming the potential of lambda display technology in antigen and epitope discovery. PMID:18992837

  11. Whole genome sequencing provides insights into the genetic determinants of invasiveness in Salmonella Dublin.

    Science.gov (United States)

    Mohammed, M; Cormican, M

    2016-08-01

    Salmonella enterica subsp. enterica serovar Dublin (S. Dublin) is one of the non-typhoidal Salmonella (NTS); however, a relatively high proportion of human infections are associated with invasive disease. We applied whole genome sequencing to representative invasive and non-invasive clinical isolates of S. Dublin to determine the genomic variations among them and to investigate the underlying genetic determinants associated with invasiveness in S. Dublin. Although no particular genomic variation was found to differentiate invasive and non-invasive isolates, four virulence factors were detected within the genome of all isolates: two different type VI secretion systems (T6SS) encoded on two Salmonella pathogenicity islands (SPI), SPI-6 (T6SSSPI-6) and SPI-19 (T6SSSPI-19); an intact lambdoid prophage (Gifsy-2-like prophage) that contributes significantly to the virulence and pathogenesis of Salmonella serotypes; and a virulence plasmid. These four virulence factors may all contribute to the potential of S. Dublin to cause invasive disease in humans. PMID:26996313

  12. Prospective Whole-Genome Sequencing Enhances National Surveillance of Listeria monocytogenes.

    Science.gov (United States)

    Kwong, Jason C; Mercoulia, Karolina; Tomita, Takehiro; Easton, Marion; Li, Hua Y; Bulach, Dieter M; Stinear, Timothy P; Seemann, Torsten; Howden, Benjamin P

    2016-02-01

    Whole-genome sequencing (WGS) has emerged as a powerful tool for comparing bacterial isolates in outbreak detection and investigation. Here we demonstrate that WGS performed prospectively for national epidemiologic surveillance of Listeria monocytogenes has the capacity to be superior to our current approaches using pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), multilocus variable-number tandem-repeat analysis (MLVA), binary typing, and serotyping. Initially 423 L. monocytogenes isolates underwent WGS, and comparisons uncovered a diverse genetic population structure derived from three distinct lineages. MLST, binary typing, and serotyping results inferred in silico from the WGS data were highly concordant (>99%) with laboratory typing performed in parallel. However, WGS was able to identify distinct nested clusters within groups of isolates that were otherwise indistinguishable using our current typing methods. Routine WGS was then used for prospective epidemiologic surveillance on a further 97 L. monocytogenes isolates over a 12-month period, which provided a greater level of discrimination than that of conventional typing for inferring linkage to point source outbreaks. A risk-based alert system based on WGS similarity was used to inform epidemiologists required to act on the data. Our experience shows that WGS can be adopted for prospective L. monocytogenes surveillance and investigated for other pathogens relevant to public health. PMID:26607978

  13. Whole Genome Sequencing Identifies a Novel Factor Required for Secretory Granule Maturation in Tetrahymena thermophila

    Science.gov (United States)

    Kontur, Cassandra; Kumar, Santosh; Lan, Xun; Pritchard, Jonathan K.; Turkewitz, Aaron P.

    2016-01-01

    Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded to a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies, in part, on ancestral lysosomal sorting machinery, but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis, and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation—a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wild-type copy of MMA1, and disrupting MMA1 in an otherwise wild-type strain phenocopies UC620. The product of MMA1, characterized as a CFP-tagged copy, encodes a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation. PMID:27317773

  14. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples.

    Science.gov (United States)

    Dong, Chun-Nan; Yang, Ya-Dong; Li, Shu-Jin; Yang, Ya-Ran; Zhang, Xiao-Jing; Fang, Xiang-Dong; Yan, Jiang-Wei; Cong, Bin

    2016-01-01

    In the case of mass disasters, missing persons and forensic caseworks, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these "nucleosome protected STRs" (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types. PMID:27189082

  15. A Recent Whole-Genome Duplication Divides Populations of a Globally Distributed Microsporidian.

    Science.gov (United States)

    Williams, Tom A; Nakjang, Sirintra; Campbell, Scott E; Freeman, Mark A; Eydal, Matthías; Moore, Karen; Hirt, Robert P; Embley, T Martin; Williams, Bryony A P

    2016-08-01

    The Microsporidia are a major group of intracellular fungi and important parasites of animals including insects, fish, and immunocompromised humans. Microsporidian genomes have undergone extreme reductive evolution but there are major differences in genome size and structure within the group: some are prokaryote-like in size and organisation (marine microsporidian infecting goosefish worldwide. Our analysis revealed that population structure across the Atlantic Ocean is associated with a conserved difference in ploidy, with American and Canadian isolates sharing an ancestral whole genome duplication that was followed by widespread pseudogenisation and sorting-out of paralogue pairs. While past analyses have suggested de novo gene formation of microsporidian-specific genes, we found evidence for the origin of new genes from noncoding sequence since the divergence of these populations. Some of these genes experience selective constraint, suggesting the evolution of new functions and local host adaptation. Combining our data with published microsporidian genomes, we show that nucleotide composition across the phylum is shaped by a mutational bias favoring A and T nucleotides, which is opposed by an evolutionary force favoring an increase in genomic GC content. This study reveals ongoing dramatic reorganization of genome structure and the evolution of new gene functions in modern microsporidians despite extensive genomic streamlining in their common ancestor. PMID:27189558

  16. Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity.

    Science.gov (United States)

    Dulak, Austin M; Stojanov, Petar; Peng, Shouyong; Lawrence, Michael S; Fox, Cameron; Stewart, Chip; Bandla, Santhoshi; Imamura, Yu; Schumacher, Steven E; Shefler, Erica; McKenna, Aaron; Carter, Scott L; Cibulskis, Kristian; Sivachenko, Andrey; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Auclair, Daniel; Thompson, Kristin; Sougnez, Carrie; Onofrio, Robert C; Guiducci, Candace; Beroukhim, Rameen; Zhou, Zhongren; Lin, Lin; Lin, Jules; Reddy, Rishindra; Chang, Andrew; Landrenau, Rodney; Pennathur, Arjun; Ogino, Shuji; Luketich, James D; Golub, Todd R; Gabriel, Stacey B; Lander, Eric S; Beer, David G; Godfrey, Tony E; Getz, Gad; Bass, Adam J

    2013-05-01

    The incidence of esophageal adenocarcinoma (EAC) has risen 600% over the last 30 years. With a 5-year survival rate of ~15%, the identification of new therapeutic targets for EAC is greatly important. We analyze the mutation spectra from whole-exome sequencing of 149 EAC tumor-normal pairs, 15 of which have also been subjected to whole-genome sequencing. We identify a mutational signature defined by a high prevalence of A>C transversions at AA dinucleotides. Statistical analysis of exome data identified 26 significantly mutated genes. Of these genes, five (TP53, CDKN2A, SMAD4, ARID1A and PIK3CA) have previously been implicated in EAC. The new significantly mutated genes include chromatin-modifying factors and candidate contributors SPG20, TLR4, ELMO1 and DOCK2. Functional analyses of EAC-derived mutations in ELMO1 identify increased cellular invasion. Therefore, we suggest the potential activation of the RAC1 pathway as a contributor to EAC tumorigenesis. PMID:23525077

  17. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples

    Science.gov (United States)

    Dong, Chun-nan; Yang, Ya-dong; Li, Shu-jin; Yang, Ya-ran; Zhang, Xiao-jing; Fang, Xiang-dong; Yan, Jiang-wei; Cong, Bin

    2016-01-01

    In the case of mass disasters, missing persons and forensic caseworks, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these “nucleosome protected STRs” (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types. PMID:27189082

  18. Whole genome sequencing and complete genetic analysis reveals novel pathways to glycopeptide resistance in Staphylococcus aureus.

    Directory of Open Access Journals (Sweden)

    Adriana Renzoni

    The precise mechanisms leading to the emergence of low-level glycopeptide resistance in Staphylococcus aureus are poorly understood. In this study, we used whole genome deep sequencing to detect differences between two isogenic strains: a parental strain and a stable derivative selected stepwise for survival on 4 µg/ml teicoplanin, but which grows at higher drug concentrations (MIC 8 µg/ml). We uncovered only three single nucleotide changes in the selected strain. Nonsense mutations occurred in stp1, encoding a serine/threonine phosphatase, and in yjbH, encoding a post-transcriptional negative regulator of the redox/thiol stress sensor and global transcriptional regulator, Spx. A missense mutation (G45R) occurred in the histidine kinase sensor of cell wall stress, VraS. Using genetic methods, all single, pairwise combinations, and a fully reconstructed triple mutant were evaluated for their contribution to low-level glycopeptide resistance. We found a synergistic cooperation between dual phospho-signalling systems and a subtle contribution from YjbH, suggesting the activation of oxidative stress defences via Spx. To our knowledge, this is the first genetic demonstration of multiple sensor and stress pathways contributing simultaneously to glycopeptide resistance development. The multifactorial nature of glycopeptide resistance in this strain suggests a complex reprogramming of cell physiology to survive in the face of drug challenge.

  19. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  20. A human genome-wide library of local phylogeny predictions for whole-genome inference problems

    Directory of Open Access Journals (Sweden)

    Schwartz Russell

    2008-08-01

    Background: Many common inference problems in computational genetics depend on inferring aspects of the evolutionary history of a data set given a set of observed modern sequences. Detailed predictions of the full phylogenies are therefore of value in improving our ability to make further inferences about population history and sources of genetic variation. Making phylogenetic predictions on the scale needed for whole-genome analysis is, however, extremely computationally demanding. Results: In order to facilitate phylogeny-based predictions on a genomic scale, we develop a library of maximum parsimony phylogenies within local regions spanning all autosomal human chromosomes based on Haplotype Map variation data. We demonstrate the utility of this library for population genetic inferences by examining a tree statistic we call 'imperfection,' which measures the reuse of variant sites within a phylogeny. This statistic is significantly predictive of recombination rate, shows additional regional and population-specific conservation, and allows us to identify outlier genes likely to have experienced unusual amounts of variation in recent human history. Conclusion: Recent theoretical advances in algorithms for phylogenetic tree reconstruction have made it possible to perform large-scale inferences of local maximum parsimony phylogenies from single nucleotide polymorphism (SNP) data. As results from the imperfection statistic demonstrate, phylogeny predictions encode substantial information useful for detecting genomic features and population history. This data set should serve as a platform for many kinds of inferences one may wish to make about human population history and genetic variation.

  1. Whole-genome sequencing of multidrug-resistant Mycobacterium tuberculosis isolates from Myanmar.

    Science.gov (United States)

    Aung, Htin Lin; Tun, Thanda; Moradigaravand, Danesh; Köser, Claudio U; Nyunt, Wint Wint; Aung, Si Thu; Lwin, Thandar; Thinn, Kyi Kyi; Crump, John A; Parkhill, Julian; Peacock, Sharon J; Cook, Gregory M; Hill, Philip C

    2016-09-01

    Drug-resistant tuberculosis (TB) is a major health threat in Myanmar. An initial study was conducted to explore the potential utility of whole-genome sequencing (WGS) for the diagnosis and management of drug-resistant TB in Myanmar. Fourteen multidrug-resistant Mycobacterium tuberculosis isolates were sequenced. Known resistance genes for a total of nine antibiotics commonly used in the treatment of drug-susceptible and multidrug-resistant TB (MDR-TB) in Myanmar were interrogated through WGS. All 14 isolates were MDR-TB, consistent with the results of phenotypic drug susceptibility testing (DST), and the Beijing lineage predominated. Based on the results of WGS, 9 of the 14 isolates were potentially resistant to at least one of the drugs used in the standard MDR-TB regimen but for which phenotypic DST is not conducted in Myanmar. This study highlights a need for the introduction of second-line DST as part of routine TB diagnosis in Myanmar as well as new classes of TB drugs to construct effective regimens. PMID:27530852

  2. Use of Whole Genome Sequencing for Diagnosis and Discovery in the Cancer Genetics Clinic

    Directory of Open Access Journals (Sweden)

    Samantha B. Foley

    2015-01-01

    Despite the potential of whole-genome sequencing (WGS) to improve patient diagnosis and care, the empirical value of WGS in the cancer genetics clinic is unknown. We performed WGS on members of two cohorts of cancer genetics patients: those with BRCA1/2 mutations (n = 176) and those without (n = 82). Initial analysis of potentially pathogenic variants (PPVs, defined as nonsynonymous variants with allele frequency < 1% in ESP6500) in 163 clinically-relevant genes suggested that WGS will provide useful clinical results. This is despite the fact that a majority of PPVs were novel missense variants likely to be classified as variants of unknown significance (VUS). Furthermore, previously reported pathogenic missense variants did not always associate with their predicted diseases in our patients. This suggests that the clinical use of WGS will require large-scale efforts to consolidate WGS and patient data to improve accuracy of interpretation of rare variants. While loss-of-function (LoF) variants represented only a small fraction of PPVs, WGS identified additional cancer risk LoF PPVs in patients with known BRCA1/2 mutations and led to cancer risk diagnoses in 21% of non-BRCA cancer genetics patients after expanding our analysis to 3209 ClinVar genes. These data illustrate how WGS can be used to improve our ability to discover patients' cancer genetic risks.
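
    The PPV definition in this abstract (nonsynonymous, allele frequency below 1% in ESP6500, located in a gene of interest) reads naturally as a filter. The following is a minimal sketch of such a filter, not the authors' pipeline. The `Variant` record, the consequence labels counted as nonsynonymous, the example gene panel, and the choice to treat variants absent from ESP6500 as rare are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Hypothetical consequence labels treated as "nonsynonymous" in this sketch.
NONSYNONYMOUS: Set[str] = {"missense", "nonsense", "frameshift"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a real file format)."""
    gene: str
    consequence: str
    esp_af: Optional[float]  # allele frequency in ESP6500; None if not observed


def is_ppv(variant: Variant, panel: Set[str], max_af: float = 0.01) -> bool:
    """Return True if the variant meets the abstract's PPV criteria."""
    if variant.gene not in panel:
        return False
    if variant.consequence not in NONSYNONYMOUS:
        return False
    # Assumption: a variant never seen in ESP6500 is treated as rare (novel).
    return variant.esp_af is None or variant.esp_af < max_af


def select_ppvs(variants: Iterable[Variant], panel: Set[str]) -> List[Variant]:
    """Keep only the variants that pass the PPV filter."""
    return [v for v in variants if is_ppv(v, panel)]


example_panel = {"BRCA1", "BRCA2", "TP53"}  # illustrative, not the 163-gene list
example_variants = [
    Variant("BRCA1", "missense", None),      # novel missense: kept
    Variant("BRCA2", "synonymous", 0.001),   # synonymous: dropped
    Variant("TP53", "nonsense", 0.02),       # too common: dropped
    Variant("TTN", "missense", 0.0005),      # outside the panel: dropped
    Variant("BRCA2", "frameshift", 0.0002),  # rare loss-of-function: kept
]
ppvs = select_ppvs(example_variants, example_panel)
```

    Expanding the analysis to a larger gene list, as the abstract describes for the 3209 ClinVar genes, would only change the `panel` argument in this sketch.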

  3. Whole-Genome Screening of Newborns? The Constitutional Boundaries of State Newborn Screening Programs

    Science.gov (United States)

    King, Jaime S.; Smith, Monica E.

    2016-01-01

    State newborn screening (NBS) programs routinely screen nearly all of the 4 million newborns in the United States each year for ~30 primary conditions and a number of secondary conditions. NBS could be on the cusp of an unprecedented expansion as a result of advances in whole-genome sequencing (WGS). As WGS becomes cheaper and easier and as our knowledge and understanding of human genetics expand, the question of whether WGS has a role to play in state NBS programs becomes increasingly relevant and complex. As geneticists and state public health officials begin to contemplate the technical and procedural details of whether WGS could benefit existing NBS programs, this is an opportune time to revisit the legal framework of state NBS programs. In this article, we examine the constitutional underpinnings of state-mandated NBS and explore the range of current state statutes and regulations that govern the programs. We consider the legal refinements that will be needed to keep state NBS programs within constitutional bounds, focusing on 2 areas of concern: consent procedures and the criteria used to select new conditions for NBS panels. We conclude by providing options for states to consider when contemplating the use of WGS for NBS. PMID:26729704

  4. Whole genome amplification and de novo assembly of single bacterial cells.

    Directory of Open Access Journals (Sweden)

    Sébastien Rodrigue

    BACKGROUND: Single-cell genome sequencing has the potential to allow the in-depth exploration of the vast genetic diversity found in uncultured microbes. We used the marine cyanobacterium Prochlorococcus as a model system for addressing important challenges facing high-throughput whole genome amplification (WGA) and complete genome sequencing of individual cells. METHODOLOGY/PRINCIPAL FINDINGS: We describe a pipeline that enables single-cell WGA on hundreds of cells at a time while virtually eliminating non-target DNA from the reactions. We further developed a post-amplification normalization procedure that mitigates extreme variations in sequencing coverage associated with multiple displacement amplification (MDA), and demonstrated that the procedure increased sequencing efficiency and facilitated genome assembly. We report genome recovery as high as 99.6% with reference-guided assembly, and 95% with de novo assembly starting from a single cell. We also analyzed the impact of chimera formation during MDA on de novo assembly, and discuss strategies to minimize the presence of incorrectly joined regions in contigs. CONCLUSIONS/SIGNIFICANCE: The methods described in this paper will be useful for sequencing genomes of individual cells from a variety of samples.

  5. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

    Science.gov (United States)

    Croucher, Nicholas J; Page, Andrew J; Connor, Thomas R; Delaney, Aidan J; Keane, Jacqueline A; Bentley, Stephen D; Parkhill, Julian; Harris, Simon R

    2015-02-18

    The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates' recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X. PMID:25414349
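
    The core idea in this abstract, flagging loci whose density of base substitutions is too high to be explained by point mutation alone, can be illustrated with a simple window scan. The sketch below is not the Gubbins implementation, which works iteratively and per branch of the phylogeny. It only assumes a list of substitution positions on one genome, uses fixed non-overlapping windows, and applies a binomial null with a Bonferroni correction. The function names, the window size and the significance level are all hypothetical.

```python
from math import comb
from typing import List, Sequence, Tuple


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X < k)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


def dense_windows(
    snp_positions: Sequence[int],
    genome_length: int,
    window: int = 1000,
    alpha: float = 0.05,
) -> List[Tuple[int, int, int]]:
    """Return (start, end, count) for windows with an unexpectedly high
    number of substitutions under a uniform per-site substitution rate."""
    rate = len(snp_positions) / genome_length  # null per-site rate
    n_windows = genome_length // window
    threshold = alpha / n_windows  # Bonferroni correction over windows

    # Count substitutions falling in each non-overlapping window.
    counts = [0] * n_windows
    for pos in snp_positions:
        idx = pos // window
        if idx < n_windows:
            counts[idx] += 1

    flagged = []
    for i, k in enumerate(counts):
        if k > 0 and binomial_tail(k, window, rate) < threshold:
            flagged.append((i * window, (i + 1) * window, k))
    return flagged


# Toy data: 20 scattered substitutions plus a tight cluster of 15.
scattered = list(range(500, 100_000, 5_000))
cluster = [42_000 + 50 * j for j in range(15)]
flagged = dense_windows(sorted(scattered + cluster), genome_length=100_000)
```

    In this toy input only the window holding the cluster is flagged. A recombination-aware phylogeny would then be built from the substitutions outside the flagged regions, which is the step the abstract says Gubbins performs concurrently.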

  6. A proposed clinical decision support architecture capable of supporting whole genome sequence information.

    Science.gov (United States)

    Welch, Brandon M; Loya, Salvador Rodriguez; Eilbeck, Karen; Kawamoto, Kensaku

    2014-04-01

    Whole genome sequence (WGS) information may soon be widely available to help clinicians personalize the care and treatment of patients. However, considerable barriers exist, which may hinder the effective utilization of WGS information in a routine clinical care setting. Clinical decision support (CDS) offers a potential solution to overcome such barriers and to facilitate the effective use of WGS information in the clinic. However, genomic information is complex and will require significant considerations when developing CDS capabilities. As such, this manuscript lays out a conceptual framework for a CDS architecture designed to deliver WGS-guided CDS within the clinical workflow. To handle the complexity and breadth of WGS information, the proposed CDS framework leverages service-oriented capabilities and orchestrates the interaction of several independently-managed components. These independently-managed components include the genome variant knowledge base, the genome database, the CDS knowledge base, a CDS controller and the electronic health record (EHR). A key design feature is that genome data can be stored separately from the EHR. This paper describes in detail: (1) each component of the architecture; (2) the interaction of the components; and (3) how the architecture attempts to overcome the challenges associated with WGS information. We believe that service-oriented CDS capabilities will be essential to using WGS information for personalized medicine. PMID:25411644

  7. Whole-genome transcriptional analysis of heavy metal stresses in Caulobacter crescentus

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Ping; Brodie, Eoin L.; Suzuki, Yohey; McAdams, Harley H.; Andersen, Gary L.

    2005-09-21

    The bacterium Caulobacter crescentus and related stalk bacterial species are known for their distinctive ability to live in low nutrient environments, a characteristic of most heavy metal contaminated sites. Caulobacter crescentus is a model organism for studying cell cycle regulation with well developed genetics. We have identified the pathways responding to heavy metal toxicity in C. crescentus to provide insights for possible application of Caulobacter to environmental restoration. We exposed C. crescentus cells to four heavy metals (chromium, cadmium, selenium and uranium) and analyzed genome wide transcriptional activities post exposure using an Affymetrix GeneChip microarray. C. crescentus showed surprisingly high tolerance to uranium, a possible mechanism for which may be formation of extracellular calcium-uranium-phosphate precipitates. The principal response to these metals was protection against oxidative stress (up-regulation of manganese-dependent superoxide dismutase, sodA). Glutathione S-transferase, thioredoxin, glutaredoxins and DNA repair enzymes responded most strongly to cadmium and chromate. The cadmium and chromium stress response also focused on reducing the intracellular metal concentration, with multiple efflux pumps employed to remove cadmium while a sulfate transporter was down-regulated to reduce non-specific uptake of chromium. Membrane proteins were also up-regulated in response to most of the metals tested. A two-component signal transduction system involved in the uranium response was identified. Several differentially regulated transcripts from regions previously not known to encode proteins were identified, demonstrating the advantage of evaluating the transcriptome using whole genome microarrays.

  8. Allele-specific copy-number discovery from whole-genome and whole-exome sequencing.

    Science.gov (United States)

    Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J; Szatkiewicz, Jin P

    2015-08-18

    Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/. PMID:25883151

  9. Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir; Boore, Jeffrey L.

    2005-04-12

    The hypothesis that the relatively large and complex vertebrate genome was created by two ancient, whole genome duplications has been hotly debated, but remains unresolved. We reconstructed the evolutionary relationships of all gene families from the complete gene sets of a tunicate, fish, mouse, and human, then determined when each gene duplicated relative to the evolutionary tree of the organisms. We confirmed the results of earlier studies that there remains little signal of these events in numbers of duplicated genes, gene tree topology, or the number of genes per multigene family. However, when we plotted the genomic map positions of only the subset of paralogous genes that were duplicated prior to the fish-tetrapod split, their global physical organization provides unmistakable evidence of two distinct genome duplication events early in vertebrate evolution indicated by clear patterns of 4-way paralogous regions covering a large part of the human genome. Our results highlight the potential for these large-scale genomic events to have driven the evolutionary success of the vertebrate lineage.

  10. A Recent Whole-Genome Duplication Divides Populations of a Globally Distributed Microsporidian

    Science.gov (United States)

    Williams, Tom A.; Nakjang, Sirintra; Campbell, Scott E.; Freeman, Mark A.; Eydal, Matthías; Moore, Karen; Hirt, Robert P.; Embley, T. Martin; Williams, Bryony A. P.

    2016-01-01

    The Microsporidia are a major group of intracellular fungi and important parasites of animals including insects, fish, and immunocompromised humans. Microsporidian genomes have undergone extreme reductive evolution but there are major differences in genome size and structure within the group: some are prokaryote-like in size and organisation (difference in ploidy, with American and Canadian isolates sharing an ancestral whole genome duplication that was followed by widespread pseudogenisation and sorting-out of paralogue pairs. While past analyses have suggested de novo gene formation of microsporidian-specific genes, we found evidence for the origin of new genes from noncoding sequence since the divergence of these populations. Some of these genes experience selective constraint, suggesting the evolution of new functions and local host adaptation. Combining our data with published microsporidian genomes, we show that nucleotide composition across the phylum is shaped by a mutational bias favoring A and T nucleotides, which is opposed by an evolutionary force favoring an increase in genomic GC content. This study reveals ongoing dramatic reorganization of genome structure and the evolution of new gene functions in modern microsporidians despite extensive genomic streamlining in their common ancestor. PMID:27189558

  11. Digital Droplet Multiple Displacement Amplification (ddMDA) for Whole Genome Sequencing of Limited DNA Samples

    Science.gov (United States)

    Rhee, Minsoung; Light, Yooli K.; Meagher, Robert J.; Singh, Anup K.

    2016-01-01

    Multiple displacement amplification (MDA) is a widely used technique for amplification of DNA from samples containing limited amounts of DNA (e.g., uncultivable microbes or clinical samples) before whole genome sequencing. Despite its advantages of high yield and fidelity, it suffers from high amplification bias and non-specific amplification when amplifying sub-nanogram amounts of template DNA. Here, we present a microfluidic digital droplet MDA (ddMDA) technique where partitioning of the template DNA into thousands of sub-nanoliter droplets, each containing a small number of DNA fragments, greatly reduces the competition among DNA fragments for primers and polymerase, thereby greatly reducing amplification bias. Consequently, the ddMDA approach enabled a more uniform coverage of amplification over the entire length of the genome, with significantly lower bias and non-specific amplification than conventional MDA. For a sample containing 0.1 pg/μL of E. coli DNA (equivalent of ~3/1000 of an E. coli genome per droplet), ddMDA achieves a 65-fold increase in coverage in de novo assembly, and more than 20-fold increase in specificity (percentage of reads mapping to E. coli) compared to the conventional tube MDA. ddMDA offers a powerful method useful for many applications including medical diagnostics, forensics, and environmental microbiology. PMID:27144304
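
    The stated equivalence between 0.1 pg/μL and ~3/1000 of a genome per droplet can be checked with a short back-of-the-envelope calculation. The abstract does not give the droplet volume, so the sketch below works backwards to the volume that the two figures imply. The E. coli genome size (about 4.6 Mb) and the average mass of a double-stranded base pair (about 650 g/mol) are assumed textbook values that are not taken from the record, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # assumed average mass of one double-stranded base pair
ECOLI_GENOME_BP = 4.6e6       # assumed approximate E. coli genome size in base pairs


def genome_mass_g(genome_bp: float) -> float:
    """Mass in grams of a single genome copy."""
    return genome_bp * BP_MASS_G_PER_MOL / AVOGADRO


def genomes_per_microliter(conc_pg_per_ul: float, genome_bp: float) -> float:
    """Genome copies per microliter at a given DNA concentration."""
    return conc_pg_per_ul * 1e-12 / genome_mass_g(genome_bp)


def implied_droplet_volume_nl(
    conc_pg_per_ul: float, genomes_per_droplet: float, genome_bp: float
) -> float:
    """Droplet volume (nL) implied by a concentration and a per-droplet load."""
    genomes_per_nl = genomes_per_microliter(conc_pg_per_ul, genome_bp) / 1000.0
    return genomes_per_droplet / genomes_per_nl


copies_per_ul = genomes_per_microliter(0.1, ECOLI_GENOME_BP)
droplet_nl = implied_droplet_volume_nl(0.1, 3 / 1000, ECOLI_GENOME_BP)
```

    Under these assumptions the sample holds roughly 20 genome copies per microliter, and the implied droplet volume is a fraction of a nanoliter, which is consistent with the "sub-nanoliter droplets" described in the abstract.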

  12. Isolation and whole genome sequencing of a Ruminococcus-like bacterium, associated with irritable bowel syndrome.

    Science.gov (United States)

    Hynönen, Ulla; Rasinkangas, Pia; Satokari, Reetta; Paulin, Lars; de Vos, Willem M; Pietilä, Taija E; Kant, Ravi; Palva, Airi

    2016-06-01

    In our previous studies on the intestinal microbiota in irritable bowel syndrome (IBS), we identified a bacterial phylotype with higher abundance in patients suffering from diarrhea than in healthy controls. In the present work, we have isolated in pure culture strain RT94, belonging to this phylotype, determined its whole genome sequence and performed an extensive genomic analysis and phenotypical testing. This revealed strain RT94 to be a strict anaerobe apparently belonging to a novel species with only 94% similarity in the 16S rRNA gene sequence to the closest relatives Ruminococcus torques and Ruminococcus lactaris. The G + C content of strain RT94 is 45.2 mol% and the major long-chain cellular fatty acids are C16:0, C18:0 and C14:0. The isolate is metabolically versatile but not a mucus or cellulose utilizer. It produces acetate, ethanol, succinate, lactate and formate, but very little butyrate, as end products of glucose metabolism. The mechanisms underlying the association of strain RT94 with diarrhea-type IBS are discussed. PMID:26946362

  13. Unique features of a Japanese 'Candidatus Liberibacter asiaticus' strain revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Hiroshi Katoh

    Citrus greening (huanglongbing) is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with phloem-limited bacteria of three species of α-Proteobacteria, namely, 'Candidatus Liberibacter asiaticus', 'Ca. L. americanus', and 'Ca. L. africanus'. Recent findings suggested that some Japanese strains lack the bacteriophage-type DNA polymerase region (DNA pol), in contrast to the Floridian psy62 strain. The whole genome sequence of the pol-negative 'Ca. L. asiaticus' Japanese isolate Ishi-1 was determined by metagenomic analysis of DNA extracted from 'Ca. L. asiaticus'-infected psyllids and leaf midribs. The 1.19-Mb genome has an average 36.32% GC content. Annotation revealed 13 operons encoding rRNA and 44 tRNA genes, but no typical bacterial pathogenesis-related genes were located within the genome, similar to the Floridian psy62 and Chinese gxpsy. In contrast to other 'Ca. L. asiaticus' strains, the genome of the Japanese Ishi-1 strain lacks a prophage-related region.

  14. Bonus Organisms in High-Throughput Eukaryotic Whole-Genome Shotgun Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank; Platt, Darren

    2006-02-06

    The DOE Joint Genome Institute has sequenced over 50 eukaryotic genomes, ranging in size from 15 MB to 1.6 GB, over a wide range of organism types. In the course of doing so, it has become clear that a substantial fraction of these data sets contains bonus organisms, usually prokaryotes, in addition to the desired genome. While some of these additional organisms are extraneous contamination, they are sometimes symbionts, and so can be of biological interest. Therefore, it is desirable to assemble the bonus organisms along with the main genome. This transforms the problem into one of metagenomic assembly, which is considerably more challenging than traditional whole-genome shotgun (WGS) assembly. The different organisms will usually be present at different sequence depths, which is difficult to handle in most WGS assemblers. In addition, with multiple distinct genomes present, chimerism can produce cross-organism combinations. Finally, there is no guarantee that only a single bonus organism will be present. For example, one JGI project contained at least two different prokaryotic contaminants, plus a 145 KB plasmid of unknown origin. We have developed techniques to routinely identify and handle such bonus organisms in a high-throughput sequencing environment. Approaches include screening and partitioning the unassembled data, and iterative subassemblies. These methods are applicable not only to bonus organisms, but also to desired components such as organelles. These procedures have the additional benefit of identifying, and allowing for the removal of, cloning artifacts such as E. coli and spurious vector inclusions.

  15. Whole-genome sequencing to establish relapse or re-infection with Mycobacterium tuberculosis: a retrospective observational study

    Science.gov (United States)

    Bryant, Josephine M; Harris, Simon R; Parkhill, Julian; Dawson, Rodney; Diacon, Andreas H; van Helden, Paul; Pym, Alex; Mahayiddin, Aziah A; Chuchottaworn, Charoen; Sanne, Ian M; Louw, Cheryl; Boeree, Martin J; Hoelscher, Michael; McHugh, Timothy D; Bateson, Anna L C; Hunt, Robert D; Mwaigwisya, Solomon; Wright, Laura; Gillespie, Stephen H; Bentley, Stephen D

    2013-01-01

    Summary Background Recurrence of tuberculosis after treatment makes management difficult and is a key factor for determining treatment efficacy. Two processes can cause recurrence: relapse of the primary infection or re-infection with an exogenous strain. Although re-infection can and does occur, its importance to tuberculosis epidemiology and its biological basis is still debated. We used whole-genome sequencing—which is more accurate than conventional typing used to date—to assess the frequency of recurrence and to gain insight into the biological basis of re-infection. Methods We assessed patients from the REMoxTB trial—a randomised controlled trial of tuberculosis treatment that enrolled previously untreated participants with Mycobacterium tuberculosis infection from Malaysia, South Africa, and Thailand. We did whole-genome sequencing and mycobacterial interspersed repetitive unit-variable number of tandem repeat (MIRU-VNTR) typing of pairs of isolates taken by sputum sampling: one from before treatment and another from either the end of failed treatment at 17 weeks or later or from a recurrent infection. We compared the number and location of SNPs between isolates collected at baseline and recurrence. Findings We assessed 47 pairs of isolates. Whole-genome sequencing identified 33 cases with little genetic distance (0–6 SNPs) between strains, deemed relapses, and three cases for which the genetic distance ranged from 1306 to 1419 SNPs, deemed re-infections. Six cases of relapse and six cases of mixed infection were classified differently by whole-genome sequencing and MIRU-VNTR. We detected five single positive isolates (positive culture followed by at least two negative cultures) without clinical evidence of disease. Interpretation Whole-genome sequencing enables the differentiation of relapse and re-infection cases with greater resolution than do genotyping methods used at present, such as MIRU-VNTR, and provides insights into the biology of
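
    The relapse versus re-infection call described in the record above rests on the SNP distance between the baseline and recurrence isolates. A minimal sketch of such a rule follows. The cutoffs are assumptions taken from the ranges reported in the abstract (0–6 SNPs for relapse, 1306 SNPs or more for re-infection), not thresholds stated by the authors, and the function name is hypothetical.

```python
# Illustrative cutoffs only: they mirror the ranges observed in the abstract
# and are not thresholds defined by the study.
RELAPSE_MAX_SNPS = 6
REINFECTION_MIN_SNPS = 1306


def classify_recurrence(snp_distance: int) -> str:
    """Label a baseline/recurrence isolate pair by its pairwise SNP distance."""
    if snp_distance < 0:
        raise ValueError("SNP distance cannot be negative")
    if snp_distance <= RELAPSE_MAX_SNPS:
        return "relapse"
    if snp_distance >= REINFECTION_MIN_SNPS:
        return "re-infection"
    # Distances between the two observed ranges are left undecided here.
    return "indeterminate"


if __name__ == "__main__":
    for distance in (0, 6, 250, 1306, 1419):
        print(distance, classify_recurrence(distance))
```

    The gap between the two ranges is returned as "indeterminate" rather than forced into either class, since the abstract reports no pairs in that interval.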

  16. Construction and Evaluation of Desulfovibrio vulgaris Whole-Genome Oligonucleotide Microarrays

    Energy Technology Data Exchange (ETDEWEB)

    Z. He; Q. He; L. Wu; M.E. Clark; J.D. Wall; Jizhong Zhou; Matthew W. Fields

    2004-03-17

    Desulfovibrio vulgaris Hildenborough has been the focus of biochemical and physiological studies in the laboratory, and the metabolic versatility of this organism has been largely recognized, particularly the reduction of sulfate, fumarate, iron, uranium and chromium. In addition, a Desulfovibrio sp. has been shown to utilize uranium as the sole electron acceptor. D. vulgaris is a δ-Proteobacterium with a genome size of 3.6 Mb and 3584 ORFs. The whole-genome microarrays of D. vulgaris have been constructed using 70mer oligonucleotides. All ORFs in the genome were represented with 3471 (97.1%) unique probes and 103 (2.9%) non-specific probes that may have cross-hybridization with other ORFs. In preparation for use of the experimental microarrays, artificial probes and targets were designed to assess specificity and sensitivity and identify optimal hybridization conditions for oligonucleotide microarrays. The results indicated that for 50mer and 70mer oligonucleotide arrays, hybridization at 45 °C to 50 °C, washing at 37 °C and a wash time of 2.5 to 5 minutes obtained specific and strong hybridization signals. In order to evaluate the performance of the experimental microarrays, growth conditions were selected that were expected to give significant hybridization differences for different sets of genes. The initial evaluations were performed using D. vulgaris cells grown at logarithmic and stationary phases. Transcriptional analysis of D. vulgaris cells sampled during logarithmic phase growth indicated that 25% of annotated ORFs were up-regulated and 3% of annotated ORFs were downregulated compared to stationary phase cells. The up-regulated genes included ORFs predicted to be involved with acyl chain biosynthesis, amino acid ABC transporter, translational initiation factors, and ribosomal proteins. In the stationary phase growth cells, the two most up-regulated ORFs (70-fold) were annotated as a carboxynorspermidine decarboxylase and a 2C-methyl-D-erythritol-2

  17. Whole genome protein microarrays for serum profiling of immunodominant antigens of Bacillus anthracis

    Directory of Open Access Journals (Sweden)

    Karen Elizabeth Kempsell

    2015-08-01

    A commercial Bacillus anthracis (Anthrax) whole genome protein microarray has been used to identify immunogenic Anthrax proteins using sera from groups of donors with (a) confirmed B. anthracis naturally acquired cutaneous infection, (b) confirmed B. anthracis intravenous drug use-acquired infection, (c) occupational exposure in a wool-sorters factory, (d) humans and rabbits vaccinated with the UK Anthrax protein vaccine and compared to naïve unexposed controls. Anti-IAP responses were observed for both IgG and IgA in the challenged groups; however the anti-IAP IgG response was more evident in the vaccinated group and the anti-IAP IgA response more evident in the B. anthracis-infected groups. Infected individuals appeared somewhat suppressed for their general IgG response, compared with other challenged groups. Immunogenic protein antigens were identified in all groups, some of which were shared between groups whilst others were specific for individual groups. The toxin proteins were immunodominant in all vaccinated, infected or other challenged groups. However a number of other chromosomally-located and plasmid encoded open reading frames were also recognised by infected or exposed groups in comparison to controls. Some of these antigens e.g. BA4182 are not recognised by vaccinated individuals, suggesting that there are proteins more specifically expressed by live Anthrax spores in vivo and are not currently found in the UK licensed Anthrax Vaccine (AVP). These may perhaps be preferentially expressed during infection and represent expression of alternative pathways in the B. anthracis ‘infectome’. These may make highly attractive candidates for diagnostic and vaccine biomarker development as they may be more specifically associated with the infectious phase of the pathogen. A number of B. anthracis small hypothetical protein targets have been synthesised, tested in mouse immunogenicity studies and validated in parallel using human sera from the

  18. Whole genome evaluation of horizontal transfers in the pathogenic fungus Aspergillus fumigatus

    Directory of Open Access Journals (Sweden)

    Deschavanne Patrick

    2010-03-01

    Abstract Background Numerous cases of horizontal transfers (HTs) have been described for eukaryote genomes, but in contrast to prokaryote genomes, no whole genome evaluation of HTs has been carried out. This is mainly due to a lack of parametric methods specially designed to take the intrinsic heterogeneity of eukaryote genomes into account. We applied a simple and tested method based on local variations of genomic signatures to analyze the genome of the pathogenic fungus Aspergillus fumigatus. Results We detected 189 atypical regions containing 214 genes, accounting for about 1 Mb of DNA sequences. However, the fraction of atypical DNA detected was smaller than the average amount detected in the same conditions in prokaryote genomes (3.1% vs 5.6%). It appeared that about one third of these regions contained no annotated genes, a proportion far greater than in prokaryote genomes. When analyzing the origin of these HTs by comparing their signatures to a home-made database of species signatures, 3 groups of donor species emerged: bacteria (40%), fungi (25%), and viruses (22%). Notably, although inter-domain exchanges are confirmed, we found evidence of only very few exchanges between eukaryotic kingdoms. Conclusions In conclusion, we demonstrated that HTs are not negligible in eukaryote genomes, bearing in mind that in our stringent conditions this amount is a floor value, though of a lesser extent than in prokaryote genomes. The biological mechanisms underlying those transfers remain to be elucidated as well as the biological functions of the transferred genes.

  19. Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing

    Science.gov (United States)

    Hua, Xing; Zeller, Georg; Sunagawa, Shinichi; Voigt, Anita Y.; Hercog, Rajna; Goedert, James J.; Shi, Jianxin; Bork, Peer; Sinha, Rashmi

    2016-01-01

    Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. Future studies should have larger sample sizes or pool data across studies to have sufficient power to detect

  20. Validation of Pooled Whole-Genome Re-Sequencing in Arabidopsis lyrata.

    Directory of Open Access Journals (Sweden)

    Marco Fracassetti

    Sequencing pooled DNA of multiple individuals from a population instead of sequencing individuals separately has become popular due to its cost-effectiveness and simple wet-lab protocol, although some criticism of this approach remains. Here we validated a protocol for pooled whole-genome re-sequencing (Pool-seq) of Arabidopsis lyrata libraries prepared with low amounts of DNA (1.6 ng per individual). The validation was based on comparing single nucleotide polymorphism (SNP) frequencies obtained by pooling with those obtained by individual-based Genotyping By Sequencing (GBS). Furthermore, we investigated the effect of sample number, sequencing depth per individual and variant caller on population SNP frequency estimates. For Pool-seq data, we compared frequency estimates from two SNP callers, VarScan and Snape; the former employs a frequentist SNP calling approach while the latter uses a Bayesian approach. Results revealed concordance correlation coefficients well above 0.8, confirming that Pool-seq is a valid method for acquiring population-level SNP frequency data. Higher accuracy was achieved by pooling more samples (25 compared to 14) and working with higher sequencing depth (4.1× per individual compared to 1.4× per individual), which increased the concordance correlation coefficient to 0.955. The Bayesian-based SNP caller produced somewhat higher concordance correlation coefficients, particularly at low sequencing depth. We recommend pooling at least 25 individuals combined with sequencing at a depth of 100× to produce satisfactory frequency estimates for common SNPs (minor allele frequency above 0.05).
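
    The concordance correlation coefficient used in the record above measures how closely paired estimates fall on the identity line, which is stricter than ordinary correlation. The sketch below uses Lin's standard definition with population variances; whether the study used exactly this estimator is an assumption, and the allele frequencies in the example are invented for illustration.

```python
from statistics import fmean
from typing import Sequence


def concordance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (divide-by-n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("undefined for two identical constant series")
    return 2 * cov_xy / denominator


if __name__ == "__main__":
    # Invented SNP frequencies: individual-based versus pooled estimates.
    individual = [0.10, 0.20, 0.40, 0.80]
    pooled = [0.12, 0.18, 0.43, 0.77]
    print(round(concordance_correlation(individual, pooled), 3))
```

    Unlike Pearson's r, the squared difference of the means in the denominator penalises a systematic offset between the two methods, so a pooled estimate that is consistently too high scores below 1 even if it is perfectly correlated.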

  1. Whole Genome Analysis of Leptospira licerasiae Provides Insight into Leptospiral Evolution and Pathogenicity

    Science.gov (United States)

    Selengut, Jeremy D.; Harkins, Derek M.; Patra, Kailash P.; Moreno, Angelo; Lehmann, Jason S.; Purushe, Janaki; Sanka, Ravi; Torres, Michael; Webster, Nicholas J.; Vinetz, Joseph M.; Matthias, Michael A.

    2012-01-01

    The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggests that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010T and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprised of a 6-gene cluster that is unexpectedly short compared with L. interrogans in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for its infectiousness.

  2. Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing.

    Science.gov (United States)

    Vogtmann, Emily; Hua, Xing; Zeller, Georg; Sunagawa, Shinichi; Voigt, Anita Y; Hercog, Rajna; Goedert, James J; Shi, Jianxin; Bork, Peer; Sinha, Rashmi

    2016-01-01

    Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. Future studies should have larger sample sizes or pool data across studies to have sufficient power to detect

  3. Whole-genome SNP association analysis of reproduction traits in the Finnish Landrace pig breed

    Directory of Open Access Journals (Sweden)

    Uimari Pekka

    2011-12-01

    Abstract Background Good genetic progress for pig reproduction traits has been achieved using a quantitative genetics-based multi-trait BLUP evaluation system. At present, whole-genome single nucleotide polymorphism (SNP) panels provide a new tool for pig selection. The purpose of this study was to identify SNP associated with reproduction traits in the Finnish Landrace pig breed using the Illumina PorcineSNP60 BeadChip. Methods Association of each SNP with different traits was tested with a weighted linear model, using SNP genotype as a covariate and animal as a random variable. Deregressed estimated breeding values of the progeny tested boars were used as the dependent variable and weights were based on their reliabilities. Statistical significance of the associations was based on Bonferroni-corrected P-values. Results Deregressed estimated breeding values were available for 328 genotyped boars. Of the 62 163 SNP in the chip, 57 868 SNP had a call rate > 0.9 and 7 632 SNP were monomorphic. Statistically significant results (P-value P-value P-value = 1.69E-08 more than unfavourable double homozygote animals. A region on chromosome 9 (66 Mb) was statistically significant for piglet mortality between birth and weaning in later parity (0.44 piglets between homozygotes, P-value = 6.94E-08). Conclusions Three separate regions on chromosome 9 gave significant results for litter size and pig mortality. The frequencies of favourable alleles of the significant SNP are moderate in the Finnish Landrace population and these SNP are thus valuable candidates for possible marker-assisted selection.
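
    The record above bases significance on Bonferroni-corrected P-values, which amounts to dividing the nominal significance level by the number of tests. The sketch below shows that arithmetic. The number of SNP actually tested is not recoverable from the abstract, so the count of 50 236 (57 868 minus 7 632) and the 0.05 level are assumptions for illustration only, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if the raw P-value survives Bonferroni correction."""
    return p_value < bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    # Assumed values, not taken from the study's methods.
    alpha = 0.05
    n_tests = 57_868 - 7_632
    print(f"threshold = {bonferroni_threshold(alpha, n_tests):.3e}")
    for p in (1.69e-08, 6.94e-08):  # P-values quoted in the abstract
        print(p, is_significant(p, alpha, n_tests))
```

    Comparing the raw P-value with alpha divided by the number of tests is equivalent to multiplying the P-value by the number of tests and comparing it with alpha.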

  4. Whole-Genome Saliva and Blood DNA Methylation Profiling in Individuals with a Respiratory Allergy

    Science.gov (United States)

    Declerck, Ken; Traen, Sophie; Koppen, Gudrun; Van Camp, Guy; Schoeters, Greet; Vanden Berghe, Wim; De Boever, Patrick

    2016-01-01

    The etiology of respiratory allergies (RA) can be partly explained by DNA methylation changes caused by adverse environmental and lifestyle factors experienced early in life. Longitudinal, prospective studies can aid in the unravelment of the epigenetic mechanisms involved in the disease development. High compliance rates can be expected in these studies when data is collected using non-invasive and convenient procedures. Saliva is an attractive biofluid to analyze changes in DNA methylation patterns. We investigated in a pilot study the differential methylation in saliva of RA (n = 5) compared to healthy controls (n = 5) using the Illumina Methylation 450K BeadChip platform. We evaluated the results against the results obtained in mononuclear blood cells from the same individuals. Differences in methylation patterns from saliva and mononuclear blood cells were clearly distinguishable (PAdj0.2), though the methylation status of about 96% of the cg-sites was comparable between peripheral blood mononuclear cells and saliva. When comparing RA cases with healthy controls, the number of differentially methylated sites (DMS) in saliva and blood were 485 and 437 (P0.1), respectively, of which 216 were in common. The methylation levels of these sites were significantly correlated between blood and saliva. The absolute levels of methylation in blood and saliva were confirmed for 3 selected DMS in the PM20D1, STK32C, and FGFR2 genes using pyrosequencing analysis. The differential methylation could only be confirmed for DMS in PM20D1 and STK32C genes in saliva. We show that saliva can be used for genome-wide methylation analysis and that it is possible to identify DMS when comparing RA cases and healthy controls. The results were replicated in blood cells of the same individuals and confirmed by pyrosequencing analysis. This study provides proof-of-concept for the applicability of saliva-based whole-genome methylation analysis in the field of respiratory allergy. PMID

  5. Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

    Directory of Open Access Journals (Sweden)

    Shepherd Lara D

    2010-09-01

    Abstract Background Complete chloroplast genome sequences provide a valuable source of molecular markers for studies in molecular ecology and evolution of plants. To obtain complete genome sequences, recent studies have made use of the polymerase chain reaction to amplify overlapping fragments from conserved gene loci. However, this approach is time consuming and can be more difficult to implement where gene organisation differs among plants. An alternative approach is to first isolate chloroplasts and then use the capacity of high-throughput sequencing to obtain complete genome sequences. We report our findings from studies of the latter approach, which used a simple chloroplast isolation procedure, multiply-primed rolling circle amplification of chloroplast DNA, Illumina Genome Analyzer II sequencing, and de novo assembly of paired-end sequence reads. Results A modified rapid chloroplast isolation protocol was used to obtain plant DNA that was enriched for chloroplast DNA, but nevertheless contained nuclear and mitochondrial DNA. Multiply-primed rolling circle amplification of this mixed template produced sufficient quantities of chloroplast DNA, even when the amount of starting material was small, and improved the template quality for Illumina Genome Analyzer II (hereafter Illumina GAII) sequencing. We demonstrate, using independent samples of karaka (Corynocarpus laevigatus), that there is high fidelity in the sequence obtained from this template. Although less than 20% of our sequenced reads could be mapped to the chloroplast genome, it was relatively easy to assemble complete chloroplast genome sequences from the mixture of nuclear, mitochondrial and chloroplast reads. Conclusions We report successful whole genome sequencing of chloroplast DNA from karaka, obtained efficiently and with high fidelity.

  6. Whole-Genome Sequencing Analysis Accurately Predicts Antimicrobial Resistance Phenotypes in Campylobacter spp.

    Science.gov (United States)

    Zhao, S; Tyson, G H; Chen, Y; Li, C; Mukherjee, S; Young, S; Lam, C; Folster, J P; Whichard, J M; McDermott, P F

    2016-01-01

    The objectives of this study were to identify antimicrobial resistance genotypes for Campylobacter and to evaluate the correlation between resistance phenotypes and genotypes using in vitro antimicrobial susceptibility testing and whole-genome sequencing (WGS). A total of 114 Campylobacter species isolates (82 C. coli and 32 C. jejuni) obtained from 2000 to 2013 from humans, retail meats, and cecal samples from food production animals in the United States as part of the National Antimicrobial Resistance Monitoring System were selected for study. Resistance phenotypes were determined using broth microdilution of nine antimicrobials. Genomic DNA was sequenced using the Illumina MiSeq platform, and resistance genotypes were identified using assembled WGS sequences through blastx analysis. Eighteen resistance genes, including tet(O), blaOXA-61, catA, lnu(C), aph(2″)-Ib, aph(2″)-Ic, aph(2')-If, aph(2″)-Ig, aph(2″)-Ih, aac(6')-Ie-aph(2″)-Ia, aac(6')-Ie-aph(2″)-If, aac(6')-Im, aadE, sat4, ant(6'), aad9, aph(3')-Ic, and aph(3')-IIIa, and mutations in two housekeeping genes (gyrA and 23S rRNA) were identified. There was a high degree of correlation between phenotypic resistance to a given drug and the presence of one or more corresponding resistance genes. Phenotypic and genotypic correlation was 100% for tetracycline, ciprofloxacin/nalidixic acid, and erythromycin, and correlations ranged from 95.4% to 98.7% for gentamicin, azithromycin, clindamycin, and telithromycin. All isolates were susceptible to florfenicol, and no genes associated with florfenicol resistance were detected. There was a strong correlation (99.2%) between resistance genotypes and phenotypes, suggesting that WGS is a reliable indicator of resistance to the nine antimicrobial agents assayed in this study. WGS has the potential to be a powerful tool for antimicrobial resistance surveillance programs. PMID:26519386

  7. Whole Genome Duplications Shaped the Receptor Tyrosine Kinase Repertoire of Jawed Vertebrates.

    Science.gov (United States)

    Brunet, Frédéric G; Volff, Jean-Nicolas; Schartl, Manfred

    2016-01-01

    The receptor tyrosine kinase (RTK) gene family, involved primarily in cell growth and differentiation, comprises proteins with a common enzymatic tyrosine kinase intracellular domain adjacent to a transmembrane region. The amino-terminal portion of RTKs is extracellular and made of different domains, the combination of which characterizes each of the 20 RTK subfamilies among mammals. We analyzed a total of 7,376 RTK sequences among 143 vertebrate species to provide here the first comprehensive census of the jawed vertebrate repertoire. We ascertained the 58 genes previously described in the human and mouse genomes and established their phylogenetic relationships. We also identified five additional RTKs amounting to a total of 63 genes in jawed vertebrates. We found that the vertebrate RTK gene family has been shaped by the two successive rounds of whole genome duplications (WGD) called 1R and 2R (1R/2R) that occurred at the base of the vertebrates. In addition, the Vegfr and Ephrin receptor subfamilies were expanded by single gene duplications. In teleost fish, 23 additional RTK genes have been retained after another expansion through the fish-specific third round (3R) of WGD. Several lineage-specific gene losses were observed. For instance, birds have lost three RTKs, and different genes are missing in several fish sublineages. The RTK gene family presents an unusually high gene retention rate from the vertebrate WGDs (58.75% after 1R/2R, 64.4% after 3R), resulting in an expansion that might be correlated with the evolution of complexity of vertebrate cellular communication and intracellular signaling. PMID:27260203

  8. Genome-wide association study for longevity with whole-genome sequencing in 3 cattle breeds.

    Science.gov (United States)

    Zhang, Qianqian; Guldbrandtsen, Bernt; Thomasen, Jørn Rind; Lund, Mogens Sandø; Sahana, Goutam

    2016-09-01

    Longevity is an important economic trait in dairy production. Improvements in longevity could increase the average number of lactations per cow, thereby affecting the profitability of the dairy cattle industry. Improved longevity for cows reduces the replacement cost of stock and enables animals to achieve the highest production period. Moreover, longevity is an indirect indicator of animal welfare. Using whole-genome sequencing variants in 3 dairy cattle breeds, we carried out an association study and identified 7 genomic regions in Holstein and 5 regions in Red Dairy Cattle that were associated with longevity. Meta-analyses of 3 breeds revealed 2 significant genomic regions, located on chromosomes 6 (META-CHR6-88MB) and 18 (META-CHR18-58MB). META-CHR6-88MB overlaps with 2 known genes: neuropeptide G-protein coupled receptor (NPFFR2; 89,052,210-89,059,348 bp) and vitamin D-binding protein precursor (GC; 88,695,940-88,739,180 bp). The NPFFR2 gene was previously identified as a candidate gene for mastitis resistance. META-CHR18-58MB overlaps with zinc finger protein 717 (ZNF717; 58,130,465-58,141,877 bp) and zinc finger protein 613 (ZNF613; 58,115,782-58,117,110 bp), which have been associated with calving difficulties. Information on longevity-associated genomic regions could be used to find causal genes/variants influencing longevity and exploited to improve the reliability of genomic prediction. PMID:27289149

  9. Whole-genome expression analysis reveals genes associated with treatment response to escitalopram in major depression.

    Science.gov (United States)

    Pettai, Kristi; Milani, Lili; Tammiste, Anu; Võsa, Urmo; Kolde, Raivo; Eller, Triin; Nutt, David; Metspalu, Andres; Maron, Eduard

    2016-09-01

    The reasons for variability in treatment response in major depressive disorder (MDD) are not fully understood, but there is accumulating evidence suggesting that therapeutic outcomes of antidepressants can be influenced by genetic factors. In the present study we applied the microarray Illumina platform for whole genome expression profiling in depressive patients treated with escitalopram medication in order to identify genes underlying response to antidepressant treatment. The initial study sample consisted of 135 outpatients with major depressive disorder (mean age 31.1±11.6 years, 68% females) treated with escitalopram 10-20mg/day for 12 weeks, from which 87 patients (55 females) were included in the gene expression analysis. The gene expression profiles were measured in peripheral blood cells at baseline, at week 4 and at the end of treatment (week 12) using Illumina BeadChips. The fold change was used to demonstrate the rate of change in average gene expression between the studied groups. Statistical analyses were performed using the false discovery rate (FDR). The most interesting gene, which showed a predictive effect on treatment outcome by delineating low-dose responders and treatment-resistant patients at the beginning of medication, was NLGN2, which belongs to a family of neuronal cell surface proteins and is involved in synapse formation. In addition, several gene clusters related to immune response, signal transduction and the neurotrophin pathway distinguished responders from non-responders at week 4 of treatment. After 4 weeks of escitalopram treatment (10mg/day), the YWHAZ gene showed the highest transcriptional change in responders as compared with non-responders. Finally, at the end of the treatment we noticed that at least three genes (NR2C2, ZNF641, FKBP1A) were strongly associated with resistance to escitalopram. Thus the results of this study support that exploration of peripheral gene expression is a useful tool in the further

  10. Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease

    Science.gov (United States)

    Ellingford, Jamie M.; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G.; Sergouniotis, Panagiotis I.; O'Sullivan, James; Lamb, Janine A.; Perveen, Rahat; Hall, Georgina; Newman, William G.; Bishop, Paul N.; Roberts, Stephen A.; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C.; Nemeth, Andrea H.; Black, Graeme C.M.

    2016-01-01

    Purpose To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Design Case series. Participants A total of 562 patients diagnosed with IRD. Methods We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Main Outcome Measures Diagnostic yield of genomic testing. Results Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15–45) uplift in diagnostic yield. Conclusions We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. PMID:26872967

  11. Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing.

    Directory of Open Access Journals (Sweden)

    Emily Vogtmann

    Full Text Available Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. Future studies should have larger sample sizes or pool data across studies to have sufficient
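
    The 39% figure quoted in the abstract above can be checked from the three reported counts (four of 10 genes, three of nine modules, seven of 17 pathways). The short sketch below pools them; the variable names are illustrative only.

```python
from fractions import Fraction

# (replicated, tested) counts as reported in the abstract.
replicated = {"genes": (4, 10), "modules": (3, 9), "pathways": (7, 17)}

hits = sum(r for r, _ in replicated.values())
total = sum(t for _, t in replicated.values())
share = Fraction(hits, total)

# Pooled share of French-study features replicated in Washington, DC.
print(f"{hits}/{total} = {float(share):.1%}")
```

    Pooling gives 14 of 36 features, which rounds to the 39% stated in the abstract.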

  12. Whole genome analysis of Leptospira licerasiae provides insight into leptospiral evolution and pathogenicity.

    Directory of Open Access Journals (Sweden)

    Jessica N Ricaldi

    Full Text Available The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggests that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010(T) and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprised of a 6-gene cluster that is unexpectedly short compared with L. interrogans in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for

  13. Quantification of trace-level DNA by real-time whole genome amplification.

    Directory of Open Access Journals (Sweden)

    Min-Jung Kang

    Full Text Available Quantification of trace amounts of DNA is a challenge in analytical applications where the concentration of a target DNA is very low or only limited amounts of samples are available for analysis. PCR-based methods including real-time PCR are highly sensitive and widely used for quantification of low-level DNA samples. However, ordinary PCR methods require at least one copy of a specific gene sequence for amplification and may not work for a sub-genomic amount of DNA. We suggest a real-time whole genome amplification method adopting the degenerate oligonucleotide primed PCR (DOP-PCR) for quantification of sub-genomic amounts of DNA. This approach enabled quantification of sub-picogram amounts of DNA independently of their sequences. When the method was applied to human placental DNA, the amount of which was accurately determined by inductively coupled plasma-optical emission spectroscopy (ICP-OES), an accurate and stable quantification capability for DNA samples ranging from 80 fg to 8 ng was obtained. In blind tests of laboratory-prepared DNA samples, measurement accuracies of 7.4%, -2.1%, and -13.9% with analytical precisions around 15% were achieved for 400-pg, 4-pg, and 400-fg DNA samples, respectively. A similar quantification capability was also observed for other DNA species from calf, E. coli, and lambda phage. Therefore, when provided with an appropriate standard DNA, the suggested real-time DOP-PCR method can be used as a universal method for quantification of trace amounts of DNA.

  14. Gene evolution and gene expression after whole genome duplication in fish: the PhyloFish database.

    Science.gov (United States)

    Pasquier, Jeremy; Cabau, Cédric; Nguyen, Thaovi; Jouanno, Elodie; Severac, Dany; Braasch, Ingo; Journot, Laurent; Pontarotti, Pierre; Klopp, Christophe; Postlethwait, John H; Guiguen, Yann; Bobe, Julien

    2016-01-01

    With more than 30,000 species, ray-finned fish represent approximately half of vertebrates. The evolution of ray-finned fish was impacted by several whole genome duplication (WGD) events including a teleost-specific WGD event (TGD) that occurred at the root of the teleost lineage about 350 million years ago (Mya) and more recent WGD events in salmonids, carps, suckers and others. In plants and animals, WGD events are associated with adaptive radiations and evolutionary innovations. WGD-spurred innovation may be especially relevant in the case of teleost fish, which colonized a wide diversity of habitats on earth, including many extreme environments. Fish biodiversity, the use of fish models for human medicine and ecological studies, and the importance of fish in human nutrition, fuel an important need for the characterization of gene expression repertoires and corresponding evolutionary histories of ray-finned fish genes. To this aim, we performed transcriptome analyses and developed the PhyloFish database to provide (i) de novo assembled gene repertoires in 23 different ray-finned fish species including two holosteans (i.e. a group that diverged from teleosts before TGD) and 21 teleosts (including six salmonids), and (ii) gene expression levels in ten different tissues and organs (and embryos for many) in the same species. This resource was generated using a common deep RNA sequencing protocol to obtain the most exhaustive gene repertoire possible in each species that allows between-species comparisons to study the evolution of gene expression in different lineages. The PhyloFish database described here can be accessed and searched using RNAbrowse, a simple and efficient solution to give access to RNA-seq de novo assembled transcripts. PMID:27189481

  15. Microbiota present in cystic fibrosis lungs as revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Philippe M Hauser

    Full Text Available Determination of the precise composition and variation of microbiota in cystic fibrosis lungs is crucial since chronic inflammation due to microorganisms leads to lung damage and ultimately, death. However, this constitutes a major technical challenge. Culturing of microorganisms does not provide a complete representation of a microbiota, even when using culturomics (high-throughput culture). So far, only PCR-based metagenomics have been investigated. However, these methods are biased towards certain microbial groups, and suffer from uncertain quantification of the different microbial domains. We have explored whole genome sequencing (WGS) using the Illumina high-throughput technology applied directly to DNA extracted from sputa obtained from two cystic fibrosis patients. To detect all microorganism groups, we used four procedures for DNA extraction, each with a different lysis protocol. We avoided biases due to whole DNA amplification thanks to the high efficiency of current Illumina technology. Phylogenomic classification of the reads by three different methods produced similar results. Our results suggest that WGS provides, in a single analysis, a better qualitative and quantitative assessment of microbiota compositions than cultures and PCRs. WGS identified a high quantity of Haemophilus spp. (patient 1) or Staphylococcus spp. plus Streptococcus spp. (patient 2) together with low amounts of anaerobic (Veillonella, Prevotella, Fusobacterium) and aerobic bacteria (Gemella, Moraxella, Granulicatella). WGS suggested that fungal members represented very low proportions of the microbiota, which were detected by cultures and PCRs because of their selectivity. The future increase of reads' sizes and decrease in cost should ensure the usefulness of WGS for the characterisation of microbiota.

  16. Inference of gorilla demographic and selective history from whole-genome sequence data.

    Science.gov (United States)

    McManus, Kimberly F; Kelley, Joanna L; Song, Shiya; Veeramah, Krishna R; Woerner, August E; Stevison, Laurie S; Ryder, Oliver A; Great Ape Genome Project; Kidd, Jeffrey M; Wall, Jeffrey D; Bustamante, Carlos D; Hammer, Michael F

    2015-03-01

    Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection. PMID:25534031
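
    The demographic inference described above is fitted to the genome-wide site frequency spectrum (SFS). The sketch below only tabulates an unfolded SFS from per-site derived-allele counts; it does not perform the diffusion-approximation fit, and the function name and input layout are assumptions made for illustration.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS for a sample of n_chromosomes.

    Entry i-1 of the result is the number of sites at which the derived
    allele is carried by exactly i chromosomes (1 <= i <= n-1). Sites
    that are fixed or absent in the sample are ignored.
    """
    tally = Counter(derived_counts)
    return [tally.get(i, 0) for i in range(1, n_chromosomes)]
```

    For example, with four sampled chromosomes and derived-allele counts of 1, 1, 2, 3 and 1 at five segregating sites, the spectrum has three singletons, one doubleton and one tripleton.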

  17. Examining phylogenetic relationships of Erwinia and Pantoea species using whole genome sequence data.

    Science.gov (United States)

    Zhang, Yucheng; Qiu, Sai

    2015-11-01

    The genera Erwinia and Pantoea contain species that are devastating plant pathogens, non-pathogen epiphytes, and opportunistic human pathogens. However, some controversies persist in the taxonomic classification of these two closely related genera. The phylogenomic analysis of these two genera was investigated via a comprehensive analysis of 25 Erwinia genomes and 23 Pantoea genomes. Single-copy orthologs could be extracted from the Erwinia/Pantoea core-genome to reconstruct the Erwinia/Pantoea phylogeny. This tree has strong bootstrap support for almost all branches. We also estimated the in silico DNA-DNA hybridization (isDDH) and the average nucleotide identity (ANI) values between each genome; strains from the same species showed ANI values ≥96% and isDDH values >70%. These data confirm that whole genome sequence data provides a powerful tool to resolve the complex taxonomic questions of Erwinia/Pantoea, e.g. Pantoea agglomerans 299R was not clustered into a single group with other P. agglomerans strains, and the ANI values and isDDH values between them were [values lost in extraction], indicating that P. agglomerans 299R should not be classified into the P. agglomerans species. In addition, another strain (Pantoea sp. At_9b) was identified that may represent a novel Pantoea species. We also evaluated the performance of six commonly used housekeeping genes (atpD, carA, gyrB, infB, recA, and rpoB) in phylogenetic inference. A single gene was not enough to obtain a reliable species tree, and it was necessary to use the multilocus sequence analysis of the six marker genes to recover the Erwinia/Pantoea phylogeny. PMID:26296376
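
    The same-species criterion reported above (ANI ≥96% and isDDH >70%) can be written as a simple check. This is a sketch only: the function name is hypothetical, and treating the two observed ranges as a joint decision rule is an assumption rather than the authors' stated method.

```python
ANI_THRESHOLD = 96.0    # percent; same-species pairs showed ANI >= 96%
ISDDH_THRESHOLD = 70.0  # percent; same-species pairs showed isDDH > 70%


def same_species(ani, isddh):
    """True when a genome pair meets both same-species thresholds."""
    return ani >= ANI_THRESHOLD and isddh > ISDDH_THRESHOLD
```

    A pair that falls short on either measure would be flagged for taxonomic review rather than assigned to the same species.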

  18. Economic evidence on identifying clinically actionable findings with whole-genome sequencing: a scoping review.

    Science.gov (United States)

    Douglas, Michael P; Ladabaum, Uri; Pletcher, Mark J; Marshall, Deborah A; Phillips, Kathryn A

    2016-02-01

    The American College of Medical Genetics and Genomics (ACMG) recommends that mutations in 56 genes for 24 conditions are clinically actionable and should be reported as secondary findings after whole-genome sequencing (WGS). Our aim was to identify published economic evaluations of detecting mutations in these genes among the general population or among targeted/high-risk populations and conditions and identify gaps in knowledge. A targeted PubMed search from 1994 through November 2014 was performed, and we included original, English-language articles reporting cost-effectiveness or a cost-to-utility ratio or net benefits/benefit-cost focused on screening (not treatment) for conditions and genes listed by the ACMG. Articles were screened, classified as targeting a high-risk or general population, and abstracted by two reviewers. General population studies were evaluated for actual cost-effectiveness measures (e.g., incremental cost-effectiveness ratios (ICER)), whereas studies of targeted populations were evaluated for whether at least one scenario proposed was cost-effective (e.g., ICER of ≤$100,000 per life-year or quality-adjusted life-year gained). A total of 607 studies were identified, and 32 relevant studies were included. Identified studies addressed fewer than one-third (7 of 24; 29%) of the ACMG conditions. The cost-effectiveness of screening in the general population was examined for only 2 of 24 conditions (8%). The cost-effectiveness of most genetic findings that the ACMG recommends for return has not been evaluated in economic studies or in the context of screening in the general population. The individual studies do not directly address the cost-effectiveness of WGS. PMID:25996638
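
    The review above judges scenarios against an incremental cost-effectiveness ratio (ICER) threshold of $100,000 per life-year or quality-adjusted life-year gained. A minimal sketch of that calculation follows; the function names are illustrative, and the check assumes the new strategy is both more costly and more effective than the comparator, since a negative ratio is ambiguous.

```python
def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost per unit of health effect gained (e.g. per QALY)."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the effects are equal")
    return (cost_new - cost_old) / delta_effect


def is_cost_effective(ratio, threshold=100_000):
    """Compare an ICER against a willingness-to-pay threshold in dollars."""
    return ratio <= threshold
```

    With hypothetical inputs, a screening strategy costing $10,000 more and yielding 0.25 additional QALYs has an ICER of $40,000 per QALY, which is under the threshold.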

  19. An assessment of population structure in eight breeds of cattle using a whole genome SNP panel

    Directory of Open Access Journals (Sweden)

    Gao Chuan

    2008-05-01

    Full Text Available Abstract Background Analyses of population structure and breed diversity have provided insight into the origin and evolution of cattle. Previously, these studies have used a low density of microsatellite markers; however, with the large number of single nucleotide polymorphism markers that are now available, it is possible to perform genome wide population genetic analyses in cattle. In this study, we used a high-density panel of SNP markers to examine population structure and diversity among eight cattle breeds sampled from Bos indicus and Bos taurus. Results Two thousand six hundred and forty one single nucleotide polymorphisms (SNPs) spanning all of the bovine autosomal genome were genotyped in Angus, Brahman, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black, Limousin and Nelore cattle. Population structure was examined using the linkage model in the program STRUCTURE and Fst estimates were used to construct a neighbor-joining tree to represent the phylogenetic relationship among these breeds. Conclusion The whole-genome SNP panel identified several levels of population substructure in the set of examined cattle breeds. The greatest level of genetic differentiation was detected between the Bos taurus and Bos indicus breeds. When the Bos indicus breeds were excluded from the analysis, genetic differences among beef versus dairy and European versus Asian breeds were detected among the Bos taurus breeds. Exploration of the number of SNP loci required to differentiate between breeds showed that for 100 SNP loci, individuals could only be correctly clustered into breeds 50% of the time, thus a large number of SNP markers are required to replace the 30 microsatellite markers that are currently commonly used in genetic diversity studies.
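
    The abstract above uses Fst estimates between breeds but does not state which estimator was applied. The sketch below shows one simple heterozygosity-based form, Fst = (Ht - Hs) / Ht, for a single biallelic SNP in two populations of equal sample size. The estimator choice and the function name are assumptions for illustration.

```python
def fst_biallelic(p1, p2):
    """Fst for one biallelic SNP from allele frequencies in two populations.

    Hs is the mean expected heterozygosity within populations and Ht the
    expected heterozygosity of the pooled population (equal weights).
    """
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        # Same allele fixed in both populations: no variation to partition.
        return 0.0
    return (ht - hs) / ht
```

    Pairwise values of this kind, averaged over loci, could populate the distance matrix from which a neighbor-joining tree is built.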

  20. Flexible positions, managed hopes: the promissory bioeconomy of a whole genome sequencing cancer study.

    Science.gov (United States)

    Haase, Rachel; Michie, Marsha; Skinner, Debra

    2015-04-01

    Genomic research has rapidly expanded its scope and ambition over the past decade, promoted by both public and private sectors as having the potential to revolutionize clinical medicine. This promissory bioeconomy of genomic research and technology is generated by, and in turn generates, the hopes and expectations shared by investors, researchers and clinicians, patients, and the general public alike. Examinations of such bioeconomies have often focused on the public discourse, media representations, and capital investments that fuel these "regimes of hope," but also crucial are the more intimate contexts of small-scale medical research, and the private hopes, dreams, and disappointments of those involved. Here we examine one local site of production in a university-based clinical research project that sought to identify novel cancer predisposition genes through whole genome sequencing in individuals at high risk for cancer. In-depth interviews with 24 adults who donated samples to the study revealed an ability to shift flexibly between positioning themselves as research participants on the one hand, and as patients or as family members of patients, on the other. Similarly, interviews with members of the research team highlighted the dual nature of their positions as researchers and as clinicians. For both parties, this dual positioning shaped their investment in the project and valuing of its possible outcomes. In their narratives, all parties shifted between these different relational positions as they managed hopes and expectations for the research project. We suggest that this flexibility facilitated study implementation and participation in the face of potential and probable disappointment on one or more fronts, and acted as a key element in the resilience of this local promissory bioeconomy. We conclude that these multiple dimensions of relationality and positionality are inherent and essential in the creation of any complex economy, "bio" or otherwise. 

  1. Whole genome sequence and genome annotation of Colletotrichum acutatum, causal agent of anthracnose in pepper plants in South Korea

    Directory of Open Access Journals (Sweden)

    Joon-Hee Han

    2016-06-01

    Full Text Available Colletotrichum acutatum is a destructive fungal pathogen which causes anthracnose in a wide range of crops. Here we report the whole genome sequence and annotation of C. acutatum strain KC05, isolated from an infected pepper in Kangwon, South Korea. Genomic DNA from the KC05 strain was used for the whole genome sequencing using a PacBio sequencer and the MiSeq system. The KC05 genome was determined to be 52,190,760 bp in size with a G + C content of 51.73% in 27 scaffolds and to contain 13,559 genes with an average length of 1516 bp. Gene prediction and annotation were performed by incorporating RNA-Seq data. The genome sequence of the KC05 was deposited at DDBJ/ENA/GenBank under the accession number LUXP00000000.

  2. Implementation of Nationwide Real-time Whole-genome Sequencing to Enhance Listeriosis Outbreak Detection and Investigation.

    Science.gov (United States)

    Jackson, Brendan R; Tarr, Cheryl; Strain, Errol; Jackson, Kelly A; Conrad, Amanda; Carleton, Heather; Katz, Lee S; Stroika, Steven; Gould, L Hannah; Mody, Rajal K; Silk, Benjamin J; Beal, Jennifer; Chen, Yi; Timme, Ruth; Doyle, Matthew; Fields, Angela; Wise, Matthew; Tillman, Glenn; Defibaugh-Chavez, Stephanie; Kucerova, Zuzana; Sabol, Ashley; Roache, Katie; Trees, Eija; Simmons, Mustafa; Wasilenko, Jamie; Kubota, Kristy; Pouseele, Hannes; Klimke, William; Besser, John; Brown, Eric; Allard, Marc; Gerner-Smidt, Peter

    2016-08-01

    Listeria monocytogenes (Lm) causes severe foodborne illness (listeriosis). Previous molecular subtyping methods, such as pulsed-field gel electrophoresis (PFGE), were critical in detecting outbreaks that led to food safety improvements and declining incidence, but PFGE provides limited genetic resolution. A multiagency collaboration began performing real-time, whole-genome sequencing (WGS) on all US Lm isolates from patients, food, and the environment in September 2013, posting sequencing data into a public repository. Compared with the year before the project began, WGS, combined with epidemiologic and product trace-back data, detected more listeriosis clusters and solved more outbreaks (2 outbreaks in pre-WGS year, 5 in WGS year 1, and 9 in year 2). Whole-genome multilocus sequence typing and single nucleotide polymorphism analyses provided equivalent phylogenetic relationships relevant to investigations; results were most useful when interpreted in context of epidemiological data. WGS has transformed listeriosis outbreak surveillance and is being implemented for other foodborne pathogens. PMID:27090985

  3. Whole genome sequence and genome annotation of Colletotrichum acutatum, causal agent of anthracnose in pepper plants in South Korea.

    Science.gov (United States)

    Han, Joon-Hee; Chon, Jae-Kyung; Ahn, Jong-Hwa; Choi, Ik-Young; Lee, Yong-Hwan; Kim, Kyoung Su

    2016-06-01

    Colletotrichum acutatum is a destructive fungal pathogen which causes anthracnose in a wide range of crops. Here we report the whole genome sequence and annotation of C. acutatum strain KC05, isolated from an infected pepper in Kangwon, South Korea. Genomic DNA from the KC05 strain was used for the whole genome sequencing using a PacBio sequencer and the MiSeq system. The KC05 genome was determined to be 52,190,760 bp in size with a G + C content of 51.73% in 27 scaffolds and to contain 13,559 genes with an average length of 1516 bp. Gene prediction and annotation were performed by incorporating RNA-Seq data. The genome sequence of the KC05 was deposited at DDBJ/ENA/GenBank under the accession number LUXP00000000. PMID:27114908

  4. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes

    OpenAIRE

    Priya Singh, Param; Arora, Jatin; Isambert, Hervé

    2015-01-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to domina...

  5. “We don’t know her history, her background”: Adoptive parents’ perspectives on whole genome sequencing results

    OpenAIRE

    Crouch, Julia; Yu, Joon-Ho; Shankar, Aditi G.; Tabor, Holly K.

    2014-01-01

    Exome sequencing and whole genome sequencing (ES/WGS) can provide parents with a wide range of genetic information about their children, and adoptive parents may have unique issues to consider regarding possible access to this information. The few papers published on adoption and genetics have focused on targeted genetic testing of children in the pre-adoption context. There are no data on adoptive parent perspectives about pediatric ES/WGS, including their preferences about different kinds o...

  6. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing

    OpenAIRE

    Outhred, Alexander C.; Holmes, Nadine; Sadsad, Rosemarie; Martinez, Elena; Jelfs, Peter; Hill-Cawthorne, Grant A.; Gilbert, Gwendolyn L.; Marais, Ben J.; Sintchenko, Vitali

    2016-01-01

    Background Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. Methods We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed ...

  7. Arapan-S: a fast and highly accurate whole-genome assembly software for viruses and small genomes

    OpenAIRE

    Sahli Mohammed; Shibuya Tetsuo

    2012-01-01

    Abstract Background Genome assembly is considered to be a challenging problem in computational biology, and has been studied extensively by many researchers. It is extremely difficult to build a general assembler that is able to reconstruct the original sequence instead of many contigs. However, we believe that creating specific assemblers, for solving specific cases, will be much more fruitful than creating general assemblers. Findings In this paper, we present Arapan-S, a whole-genome assem...

  8. Implications of using whole genome sequencing to test unselected populations for high risk breast cancer genes: a modelling study

    OpenAIRE

    Warren-Gash, Charlotte; Kroese, Mark; Burton, Hilary; Pharoah, Paul

    2016-01-01

    Background The decision to test for high risk breast cancer gene mutations is traditionally based on risk scores derived from age, family and personal cancer history. Next generation sequencing technologies such as whole genome sequencing (WGS) make wider population testing more feasible. In the UK’s 100,000 Genomes Project, mutations in 16 genes including BRCA1 and BRCA2 are to be actively sought regardless of clinical presentation. The implications of deploying this approach at scale for pa...

  9. Whole-genome sequencing to establish relapse or re-infection with Mycobacterium tuberculosis : a retrospective observational study

    OpenAIRE

    Bryant, Josephine M; Harris, Simon R; Parkhill, Julian; Dawson, Rodney; Diacon, Andreas; van Helden, Paul; Pym, Alex; Ahmad Mahayiddin, Aziah; Chuchottaworn, C; Sanne, Ian M; Louw, Cheryl; Boeree, Martin J.; Hoelscher, Michael; Timothy D. McHugh; Bateson, Anna L C

    2013-01-01

    Summary Background Recurrence of tuberculosis after treatment makes management difficult and is a key factor for determining treatment efficacy. Two processes can cause recurrence: relapse of the primary infection or re-infection with an exogenous strain. Although re-infection can and does occur, its importance to tuberculosis epidemiology and its biological basis is still debated. We used whole-genome sequencing—which is more accurate than conventional typing used to date—to assess the frequ...

  10. Identification of antimicrobial resistance genes in multidrug-resistant clinical Bacteroides fragilis isolates by whole genome shotgun sequencing

    DEFF Research Database (Denmark)

    Sydenham, Thomas Vognbjerg; Sóki, József; Hasman, Henrik;

    2015-01-01

    Bacteroides fragilis constitutes the most frequent anaerobic bacterium causing bacteremia in humans. The genetic background for antimicrobial resistance in B. fragilis is diverse with some genes requiring insertion sequence (IS) elements inserted upstream for increased expression. To evaluate whole genome shotgun sequencing as a method for predicting antimicrobial resistance properties, one meropenem resistant and five multidrug-resistant blood culture isolates were sequenced and antimicrobial resistance genes and IS elements identified using ResFinder 2.1 (http...

  11. Whole genome RNA expression profiling for the identification of novel biomarkers in the diagnosis and prognosis of biliary tract cancer

    OpenAIRE

    Chapman, M H

    2011-01-01

    Biliary tract cancer (BTC) is difficult to diagnose, in part related to the lack of reliable tumour markers. The aim of this project was to use whole genome RNA expression profiling in order to identify novel biomarkers for diagnosis and prognosis in biliary tract cancer. Chapter 1 summarises clinical aspects of BTC as well as current diagnostic and prognostic tests. Chapter 2 addresses the identification of circulating tumour cells for the diagnosis of BTC. It includes d...

  12. Divergent Whole-Genome Methylation Maps of Human and Chimpanzee Brains Reveal Epigenetic Basis of Human Regulatory Evolution

    OpenAIRE

    Zeng, Jia; Konopka, Genevieve; Hunt, Brendan G.; Preuss, Todd M.; Geschwind, Dan; Yi, Soojin V.

    2012-01-01

    DNA methylation is a pervasive epigenetic DNA modification that strongly affects chromatin regulation and gene expression. To date, it remains largely unknown how patterns of DNA methylation differ between closely related species and whether such differences contribute to species-specific phenotypes. To investigate these questions, we generated nucleotide-resolution whole-genome methylation maps of the prefrontal cortex of multiple humans and chimpanzees. Levels and patterns of DNA methylatio...

  13. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    Science.gov (United States)

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000. PMID:27437183

  14. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    Directory of Open Access Journals (Sweden)

    Sooyeon Lim

    2016-09-01

    Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  15. PhyResSE: a Web Tool Delineating Mycobacterium tuberculosis Antibiotic Resistance and Lineage from Whole-Genome Sequencing Data

    OpenAIRE

    Feuerriegel, Silke; Schleusener, Viola; Beckert, Patrick; Kohl, Thomas A.; Miotto, Paolo; Cirillo, Daniela M; Cabibbe, Andrea M.; Niemann, Stefan; Fellenberg, Kurt

    2015-01-01

    Antibiotic-resistant tuberculosis poses a global threat, causing the deaths of hundreds of thousands of people annually. While whole-genome sequencing (WGS), with its unprecedented level of detail, promises to play an increasingly important role in diagnosis, data analysis is a daunting challenge. Here, we present a simple-to-use web service (free for academic use at http://phyresse.org). Delineating both lineage and resistance, it provides state-of-the-art methodology to life scientists and ...

  16. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing

    OpenAIRE

    Southey, Bruce R.; Ping Zhu; Carr-Markell, Morgan K.; Liang, Zhengzheng S.; Amro Zayed; Ruiqiang Li; Robinson, Gene E.; Rodriguez-Zas, Sandra L.

    2016-01-01

    Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruit...

  17. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India

    OpenAIRE

    Jigna H. Patel; Thaker, Vrinda S.

    2015-01-01

    A salt tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine ...

  18. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India.

    Science.gov (United States)

    Patel, Jigna H; Thaker, Vrinda S

    2015-12-01

    A salt tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine synthesis. PMID:26697321

  19. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India

    Directory of Open Access Journals (Sweden)

    Jigna H. Patel

    2015-12-01

    A salt tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine synthesis.

  20. Whole-genome sequencing of bladder cancers reveals somatic CDKN1A mutations and clinicopathological associations with mutation burden

    OpenAIRE

    Cazier, J.-B.; Rao, S. R.; Mclean, C. M.; A. L. Walker; Wright, B J; Jaeger, E. E. M.; Kartsonaki, C.; Marsden, L.; Yau, C; Camps, C.; Kaisaki, P.; Allan, Christopher; Attar, Moustafa; Bell, John

    2014-01-01

    Bladder cancers are a leading cause of death from malignancy. Molecular markers might predict disease progression and behaviour more accurately than the available prognostic factors. Here we use whole-genome sequencing to identify somatic mutations and chromosomal changes in 14 bladder cancers of different grades and stages. As well as detecting the known bladder cancer driver mutations, we report the identification of recurrent protein-inactivating mutations in CDKN1A and FAT1. The former ar...

  1. Whole-Genome Resequencing Analysis of Hanwoo and Yanbian Cattle to Identify Genome-Wide SNPs and Signatures of Selection

    OpenAIRE

    Choi, Jung-Woo; Choi, Bong-Hwan; Lee, Seung-Hwan; Lee, Seung-Soo; Kim, Hyeong-Cheol; Yu, Dayeong; Chung, Won-Hyong; Lee, Kyung-Tai; Chai, Han-Ha; Cho, Yong-Min; Lim, Dajeong

    2015-01-01

    Over the last 30 years, Hanwoo has been selectively bred to improve economically important traits. Hanwoo is currently the representative Korean native beef cattle breed, and it is believed that it shared an ancestor with a Chinese breed, Yanbian cattle, until the last century. However, these two breeds have experienced different selection pressures during recent decades. Here, we whole-genome sequenced 10 animals each of Hanwoo and Yanbian cattle (20 total) using the Illumina HiSeq 2000 sequ...

  2. Generation of whole genome sequences of new Cryptosporidium hominis and Cryptosporidium parvum isolates directly from stool samples

    OpenAIRE

    Hadfield, Stephen J.; Pachebat, Justin A; Swain, Martin T; Robinson, Guy; Cameron, Simon JS; Alexander, Jenna; Hegarty, Matthew J.; Elwin, Kristin; Chalmers, Rachel M.

    2015-01-01

    Background Whole genome sequencing (WGS) of Cryptosporidium spp. has previously relied on propagation of the parasite in animals to generate enough oocysts from which to extract DNA of sufficient quantity and purity for analysis. We have developed and validated a method for preparation of genomic Cryptosporidium DNA suitable for WGS directly from human stool samples and used it to generate 10 high-quality whole Cryptosporidium genome assemblies. Our method uses a combination of salt flotation...

  3. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak

    OpenAIRE

    Saelens, Joseph W.; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M.; Ana M Xet-Mull; Stout, Jason E.; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M.

    2015-01-01

    Summary Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from in...

  4. Identifying Gene Disruptions in Novel Balanced de novo Constitutional Translocations in Childhood Cancer Patients by Whole Genome Sequencing

    OpenAIRE

    Ritter, Deborah I.; Haines, Katherine; Cheung, Hannah; Davis, Caleb F.; Lau, Ching C.; Berg, Jonathan S.; Brown, Chester W.; Thompson, Patrick A.; Gibbs, Richard; Wheeler, David A.; Plon, Sharon E.

    2015-01-01

    Purpose We applied whole genome sequencing to children diagnosed with neoplasms and found to carry apparently balanced constitutional translocations, to discover novel genic disruptions. Methods We applied SV calling programs CREST, Break Dancer, SV-STAT and CGAP-CNV, and developed an annotative filtering strategy to achieve nucleotide resolution at the translocations. Results We identified the breakpoints for t(6;12) (p21.1;q24.31) disrupting HNF1A in a patient diagnosed with hepatic adenoma...

  5. Whole genome bisulfite sequencing of cell-free DNA and its cellular contributors uncovers placenta hypomethylated domains

    OpenAIRE

    Jensen, Taylor J.; Kim, Sung K; Zhu, Zhanyang; Chin, Christine; Gebhard, Claudia; Lu, Tim; Deciu, Cosmin; Van den Boom, Dirk; Ehrich, Mathias

    2015-01-01

    Background Circulating cell-free fetal DNA has enabled non-invasive prenatal fetal aneuploidy testing without direct discrimination of the maternal and fetal DNA. Testing may be improved by specifically enriching the sample material for fetal DNA. DNA methylation may allow for such a separation of DNA; however, this depends on knowledge of the methylomes of circulating cell-free DNA and its cellular contributors. Results We perform whole genome bisulfite sequencing on a set of unmatched sampl...

  6. Whole Genome Sequencing of Mycobacterium tuberculosis Reveals Slow Growth and Low Mutation Rates during Latent Infections in Humans

    OpenAIRE

    Roberto Colangeli; Vic L Arcus; Cursons, Ray T.; Ali Ruthe; Noel Karalus; Kathy Coley; Manning, Shannon D.; Soyeon Kim; Emily Marchiano; David Alland

    2014-01-01

    Very little is known about the growth and mutation rates of Mycobacterium tuberculosis during latent infection in humans. However, studies in rhesus macaques have suggested that latent infections have mutation rates that are higher than that observed during active tuberculosis disease. Elevated mutation rates are presumed risk factors for the development of drug resistance. Therefore, the investigation of mutation rates during human latency is of high importance. We performed whole genome mut...

  7. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    Directory of Open Access Journals (Sweden)

    Yookyung Lee

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  8. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.

    Science.gov (United States)

    Agrawal, Saumya; Ganley, Austen R D

    2016-01-01

    The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA. PMID:27576718

  9. Whole-genome sequencing reveals small genomic regions of introgression in an introduced crater lake population of threespine stickleback.

    Science.gov (United States)

    Yoshida, Kohta; Miyagi, Ryutaro; Mori, Seiichi; Takahashi, Aya; Makino, Takashi; Toyoda, Atsushi; Fujiyama, Asao; Kitano, Jun

    2016-04-01

    Invasive species pose a major threat to biological diversity. Although introduced populations often experience population bottlenecks, some invasive species are thought to have originated from hybridization between multiple populations or species, which can contribute to the maintenance of high genetic diversity. Recent advances in genome sequencing enable us to trace the evolutionary history of invasive species even at the whole-genome level and may help to identify a history of past hybridization that may be overlooked by traditional marker-based analysis. Here, we conducted whole-genome sequencing of eight threespine stickleback (Gasterosteus aculeatus) individuals, four from a recently introduced crater lake population and four from the putative source population. We found that both populations have several small genomic regions with high genetic diversity, which resulted from introgression from a closely related species (Gasterosteus nipponicus). The sizes of the regions were too small to be detected with traditional marker-based analysis or even some reduced-representation sequencing methods. Further amplicon sequencing revealed linkage disequilibrium around an introgression site, which suggests the possibility of a selective sweep at the introgression site. Thus, interspecies introgression might predate introduction and increase genetic variation in the source population. Whole-genome sequencing of even a small number of individuals can therefore provide higher resolution inference of the history of introduced populations. PMID:27069575

  10. Whole genome transcript profiling from fingerstick blood samples: a comparison and feasibility study

    Directory of Open Access Journals (Sweden)

    Williams Adam R

    2009-12-01

    Background Whole genome gene expression profiling has revolutionized research in the past decade especially with the advent of microarrays. Recently, there have been significant improvements in whole blood RNA isolation techniques which, through stabilization of RNA at the time of sample collection, avoid bias and artifacts introduced during sample handling. Despite these improvements, current human whole blood RNA stabilization/isolation kits are limited by the requirement of a venous blood sample of at least 2.5 mL. While fingerstick blood collection has been used for many different assays, there has yet to be a kit developed to isolate high quality RNA for use in gene expression studies from such small human samples. The clinical and field testing advantages of obtaining reliable and reproducible gene expression data from a fingerstick are many; it is less invasive, time saving, more mobile, and eliminates the need for a trained phlebotomist. Furthermore, this method could also be employed in small animal studies, i.e. mice, where larger sample collections often require sacrificing the animal. In this study, we offer a rapid and simple method to extract sufficient amounts of high quality total RNA from approximately 70 μL of whole blood collected via a fingerstick using a modified protocol of the commercially available Qiagen PAXgene RNA Blood Kit. Results From two sets of fingerstick collections, about 70 μL whole blood collected via finger lancet and capillary tube, we recovered an average of 252.6 ng total RNA with an average RIN of 9.3. The post-amplification yields for 50 ng of total RNA averaged 7.0 μg cDNA. The cDNA hybridized to Affymetrix HG-U133 Plus 2.0 GeneChips had an average % Present call of 52.5%. Both fingerstick collections were highly correlated with r² values ranging from 0.94 to 0.97. Similarly both fingerstick collections were highly correlated to the venous collection with r² values ranging from 0.88 to 0
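    The r² values quoted in this record measure how closely expression values from two collections agree. A minimal sketch of that calculation follows, assuming a plain Pearson correlation over paired per-probe signals; the function name and the toy numbers are illustrative and are not taken from the study.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length lists of signals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical probe intensities from two collections of the same donor.
collection_a = [1.0, 2.0, 3.0]
collection_b = [1.0, 3.0, 2.0]
r2 = pearson_r2(collection_a, collection_b)
```

    In practice the inputs would be normalized (often log-transformed) probe intensities, and the choice of transformation affects the resulting r².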

  11. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery

    Directory of Open Access Journals (Sweden)

    Stothard Paul

    2011-11-01

    Background One of the goals of livestock genomics research is to identify the genetic differences responsible for variation in phenotypic traits, particularly those of economic importance. Characterizing the genetic variation in livestock species is an important step towards linking genes or genomic regions with phenotypes. The completion of the bovine genome sequence and recent advances in DNA sequencing technology allow for in-depth characterization of the genetic variations present in cattle. Here we describe the whole-genome resequencing of two Bos taurus bulls from distinct breeds for the purpose of identifying and annotating novel forms of genetic variation in cattle. Results The genomes of a Black Angus bull and a Holstein bull were sequenced to 22-fold and 19-fold coverage, respectively, using the ABI SOLiD system. Comparisons of the sequences with the Btau4.0 reference assembly yielded 7 million single nucleotide polymorphisms (SNPs), 24% of which were identified in both animals. Of the total SNPs found in Holstein, Black Angus, and in both animals, 81%, 81%, and 75% respectively are novel. In-depth annotations of the data identified more than 16 thousand distinct non-synonymous SNPs (85% novel) between the two datasets. Alignments between the SNP-altered proteins and orthologues from numerous species indicate that many of the SNPs alter well-conserved amino acids. Several SNPs predicted to create or remove stop codons were also found. A comparison between the sequencing SNPs and genotyping results from the BovineHD high-density genotyping chip indicates a detection rate of 91% for homozygous SNPs and 81% for heterozygous SNPs. The false positive rate is estimated to be about 2% for both the Black Angus and Holstein SNP sets, based on follow-up genotyping of 422 and 427 SNPs, respectively. Comparisons of read depth between the two bulls along the reference assembly identified 790 putative copy-number variations (CNVs). Ten

  12. Whole Genome Sequencing Based Characterization of Extensively Drug-Resistant Mycobacterium tuberculosis Isolates from Pakistan

    KAUST Repository

    Ali, Asho

    2015-02-26

    Improved molecular diagnostic methods for detection of drug resistance in Mycobacterium tuberculosis (MTB) strains are required. Resistance to first- and second-line anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular genes. However, these SNPs can vary between MTB lineages; therefore local data are required to describe different strain populations. We used whole genome sequencing (WGS) to characterize 37 extensively drug-resistant (XDR) MTB isolates from Pakistan and investigated 40 genes associated with drug resistance. Rifampicin resistance was attributable to SNPs in the rpoB hot-spot region. Isoniazid resistance was most commonly associated with the katG codon 315 (92%) mutation followed by inhA S94A (8%); however, one strain did not have SNPs in katG, inhA or oxyR-ahpC. All strains were pyrazinamide resistant but only 43% had pncA SNPs. Ethambutol resistant strains predominantly had embB codon 306 (62%) mutations, but additional SNPs at embB codons 406, 378 and 328 were also present. Fluoroquinolone resistance was associated with gyrA codons 91-94 in 81% of strains; four strains had only gyrB mutations, while others did not have SNPs in either gyrA or gyrB. Streptomycin resistant strains had mutations in ribosomal RNA genes: rpsL codon 43 (42%), rrs 500 region (16%), and gidB (34%), while six strains did not have mutations in any of these genes. Amikacin/kanamycin/capreomycin resistance was associated with SNPs in rrs at nt1401 (78%) and nt1484 (3%), except in seven (19%) strains. We estimate that if only the common hot-spot region targets of current commercial assays were used, the concordance between phenotypic and genotypic testing for these XDR strains would vary between rifampicin (100%), isoniazid (92%), fluoroquinolones (81%), aminoglycoside (78%) and ethambutol (62%); while pncA sequencing would provide genotypic resistance in less than half the isolates. This work highlights the importance of expanded
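    The concordance figures in this record compare a phenotypic resistance call with a genotypic call per isolate. A minimal sketch of that comparison follows, assuming one boolean call per isolate in each list; the function name and the ten-isolate toy data are hypothetical and do not reproduce the study's isolates.

```python
def concordance_percent(phenotypic, genotypic):
    """Percentage of isolates whose genotypic call matches the phenotypic call."""
    if len(phenotypic) != len(genotypic) or not phenotypic:
        raise ValueError("need two non-empty call lists of equal length")
    matches = sum(1 for p, g in zip(phenotypic, genotypic) if p == g)
    return 100.0 * matches / len(phenotypic)


# Toy example: ten isolates, all phenotypically resistant (True),
# eight of which carry a hot-spot mutation targeted by the assay.
phenotype_calls = [True] * 10
genotype_calls = [True] * 8 + [False] * 2
agreement = concordance_percent(phenotype_calls, genotype_calls)
```

    Because every isolate in a set like this is phenotypically resistant, the concordance reduces to the share of isolates in which the assay target is mutated.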

  13. Whole genome duplications and expansion of the vertebrate GATA transcription factor gene family

    Directory of Open Access Journals (Sweden)

    Bowerman Bruce

    2009-08-01

    Background GATA transcription factors influence many developmental processes, including the specification of embryonic germ layers. The GATA gene family has significantly expanded in many animal lineages: whereas diverse cnidarians have only one GATA transcription factor, six GATA genes have been identified in many vertebrates, five in many insects, and eleven to thirteen in Caenorhabditis nematodes. All bilaterian animal genomes have at least one member each of two classes, GATA123 and GATA456. Results We have identified one GATA123 gene and one GATA456 gene from the genomic sequence of two invertebrate deuterostomes, a cephalochordate (Branchiostoma floridae) and a hemichordate (Saccoglossus kowalevskii). We also have confirmed the presence of six GATA genes in all vertebrate genomes, as well as additional GATA genes in teleost fish. Analyses of conserved sequence motifs and of changes to the exon-intron structure, and molecular phylogenetic analyses of these deuterostome GATA genes support their origin from two ancestral deuterostome genes, one GATA123 and one GATA456. Comparison of the conserved genomic organization across vertebrates identified eighteen paralogous gene families linked to multiple vertebrate GATA genes (GATA paralogons), providing the strongest evidence yet for expansion of vertebrate GATA gene families via genome duplication events. Conclusion From our analysis, we infer the evolutionary birth order and relationships among vertebrate GATA transcription factors, and define their expansion via multiple rounds of whole genome duplication events. As the genomes of four independent invertebrate deuterostome lineages contain single copy GATA123 and GATA456 genes, we infer that the 0R (pre-genome duplication) invertebrate deuterostome ancestor also had two GATA genes, one of each class. Synteny analyses identify duplications of paralogous chromosomal regions (paralogons), from single ancestral vertebrate GATA123 and GATA456

  14. Whole Genome Association Studies of Residual Feed Intake and Related Traits in the Pig.

    Directory of Open Access Journals (Sweden)

    Suneel K Onteru

    Residual feed intake (RFI), a measure of feed efficiency, is the difference between observed feed intake and the expected feed requirement predicted from growth and maintenance. Pigs with low RFI have reduced feed costs without compromising their growth. Identification of genes or genetic markers associated with RFI will be useful for marker-assisted selection at an early age of animals with improved feed efficiency. Whole genome association studies (WGAS) for RFI, average daily feed intake (ADFI), average daily gain (ADG), back fat (BF) and loin muscle area (LMA) were performed on 1,400 pigs from the divergently selected ISU-RFI lines, using the Illumina PorcineSNP60 BeadChip. Various statistical methods were applied to find SNPs and genomic regions associated with the traits, including a Bayesian approach using GenSel software, and frequentist approaches such as allele frequency differences between lines, single SNP and haplotype analyses using PLINK software. Single SNP and haplotype analyses showed no significant associations (except for LMA) after genomic control and FDR. Bayesian analyses found at least 2 associations for each trait at a false positive probability of 0.5. At generation 8, the RFI selection lines mainly differed in allele frequencies for SNPs near (<0.05 Mb) genes that regulate insulin release and leptin functions. The Bayesian approach identified associations of genomic regions containing insulin release genes (e.g., GLP1R, CDKAL, SGMS1) with RFI and ADFI, of regions with energy homeostasis (e.g., MC4R, PGM1, GPR81) and muscle growth related genes (e.g., TGFB1) with ADG, and of fat metabolism genes (e.g., ACOXL, AEBP1) with BF. Specifically, a very highly significantly associated QTL for LMA on SSC7 with skeletal myogenesis genes (e.g., KLHL31) was identified for subsequent fine mapping. Important genomic regions associated with RFI related traits were identified for future validation studies prior to their incorporation in marker
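    The definition of RFI given in this record (observed intake minus the intake expected from growth and maintenance) can be written as a short calculation. The sketch below assumes a linear prediction from average daily gain and metabolic body weight; the coefficients, variable names, and numbers are hypothetical placeholders, and in a real analysis the coefficients would be estimated by regression on the study population.

```python
def residual_feed_intake(observed_intake, daily_gain, metabolic_weight, coefficients):
    """Observed feed intake minus the intake expected from growth and maintenance.

    coefficients = (intercept, per-unit gain, per-unit metabolic weight);
    these are assumed inputs, not values from the study.
    """
    intercept, gain_slope, maintenance_slope = coefficients
    expected_intake = (
        intercept
        + gain_slope * daily_gain               # growth requirement
        + maintenance_slope * metabolic_weight  # maintenance requirement
    )
    return observed_intake - expected_intake


# Hypothetical pig: eats 2.5 kg/day, gains 0.8 kg/day, metabolic weight 20.
hypothetical_coefficients = (0.5, 1.5, 0.05)
rfi = residual_feed_intake(2.5, 0.8, 20.0, hypothetical_coefficients)
```

    A negative value means the animal ate less than predicted for its growth and size, which corresponds to the low-RFI (more efficient) pigs described above.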

  15. Expression profiling of five different xenobiotics using a Caenorhabditis elegans whole genome microarray.

    Science.gov (United States)

    Reichert, Kerstin; Menzel, Ralph

    2005-10-01

    The soil nematode Caenorhabditis elegans is frequently used in ecotoxicological studies due to its wide distribution in terrestrial habitats, its easy handling in the laboratory, and its sensitivity to different kinds of stress. Since its genome has been completely sequenced, more and more studies are investigating the functional relation of gene expression and phenotypic response. For these reasons C. elegans seems to be an attractive animal for the development of a new, genome based, ecotoxicological test system. In recent years, the DNA array technique has been established as a powerful tool to obtain distinct gene expression patterns in response to different experimental conditions. Using a C. elegans whole genome DNA microarray in this study, the effects of five different xenobiotics on the gene expression of the nematode were investigated. The exposure time for the five applied compounds, beta-NF (5 mg/l), Fla (0.5 mg/l), atrazine (25 mg/l), clofibrate (10 mg/l) and DES (0.5 mg/l), was 48±5 h. The analysis of the data showed a clear induction of 203 genes belonging to different families such as the cytochromes P450, UDP-glucuronosyltransferases (UDPGT), glutathione S-transferases (GST), carboxylesterases, collagens, C-type lectins and others. Under the applied conditions, fluoranthene was able to induce most of the inducible genes, followed by clofibrate, atrazine, beta-naphthoflavone and diethylstilbestrol. Decreased expression was shown for 153 genes, with atrazine having the strongest effect followed by fluoranthene, diethylstilbestrol, beta-naphthoflavone and clofibrate. The change in gene expression caused by the applied xenobiotics ranged from approximately 2.1- to 42.3-fold for upregulated genes and from approximately 2.1- to 6.6-fold for downregulated genes. The results confirm the applicability of gene expression for the development of an ecotoxicological test system. Compared to classical tests the main

  16. Preparation of a phage DNA fragment library for whole genome shotgun sequencing.

    Science.gov (United States)

    Summer, Elizabeth J

    2009-01-01

    The most efficient method to determine the genomic sequence of a dsDNA phage is to use a whole genome shotgun approach (WGSA). Preparation of a library where each genomic fragment has an equal chance of being represented is critical to the success of the WGSA. For many phages, there are regions of the genome likely to be under-represented in the shotgun library, which results in more gaps in the shotgun assembly than predicted by the Poisson distribution. However, as phage genomes are relatively small, this increased number of gaps does not present an insurmountable impediment to using the WGSA. This chapter will focus on construction of a high-quality random library and sequence analysis of this library in a 96-well format. Techniques are described for the mechanical fragmentation of genomic DNA into 2 kb average size fragments, preparation of the fragmented DNA for shotgun cloning, and advice on the choice of cloning vector for library preparation. Protocols for deepwell block culture, plasmid isolation, and sequencing in 96-well format are given. The rationale for determining the total number of random clones from a library to sequence for a 50 and 150 kb genome is explained. The steps involved in going from hundreds of shotgun sequencing traces to generating contigs will be outlined as well as how to close gaps in the sequence by primer walking on phage DNA and PCR-generated templates. Finally, examples will be given of how biological information about the phage genomic termini can be derived by analysis of the organization of individual clones in the shotgun sequence assembly. Specific examples are given for the circularly permuted termini of pac type phages, the direct terminal repeats found in most T7-like phages, variable host DNA at either end as in the Mu-like phages, and the 5' and 3' overhanging ends of cos type phages. The end result of these steps is the entire DNA sequence of a novel phage, ready for gene prediction. PMID:19082550
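    The reasoning about how many random clones to sequence, and how many gaps a Poisson model predicts, can be illustrated with a small calculation. The sketch below uses the standard idealised shotgun model (coverage c = reads x read length / genome size, uncovered fraction e^-c, expected contigs about N x e^-c, ignoring the minimum overlap needed to join reads). The 700 bp read length and 8-fold target coverage are assumptions chosen for illustration, not values from the chapter.

```python
import math


def reads_for_coverage(genome_bp, read_bp, coverage):
    """Smallest number of reads giving at least the requested fold coverage."""
    return math.ceil(coverage * genome_bp / read_bp)


def uncovered_fraction(coverage):
    """Poisson probability that a given base is covered by no read."""
    return math.exp(-coverage)


def expected_contigs(n_reads, coverage):
    """Idealised expected number of contigs (and so roughly of gaps)."""
    return n_reads * math.exp(-coverage)


ASSUMED_READ_BP = 700      # hypothetical usable Sanger read length
ASSUMED_COVERAGE = 8.0     # hypothetical target fold coverage

for genome_bp in (50_000, 150_000):
    n = reads_for_coverage(genome_bp, ASSUMED_READ_BP, ASSUMED_COVERAGE)
    print(genome_bp, n, round(expected_contigs(n, ASSUMED_COVERAGE), 2))
```

    Under these assumptions the model predicts well under one gap for either genome size, so gaps that do appear in practice point to the under-represented regions the abstract describes rather than to sampling chance.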

  17. Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)

    Directory of Open Access Journals (Sweden)

    Aslam Muhammad L

    2012-08-01

    whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey.

  18. Microarray analysis of serum mRNA in patients with head and neck squamous cell carcinoma at whole-genome scale

    Czech Academy of Sciences Publication Activity Database

    Čapková, M.; Šáchová, Jana; Strnad, Hynek; Kolář, Michal; Hroudová, Miluše; Chovanec, M.; Čada, Z.; Štefl, M.; Valach, J.; Kastner, J.; Smetana, K. Jr.; Plzák, J.

    -, April 23 (2014). ISSN 2314-6141 R&D Projects: GA MZd(CZ) NT13488 Institutional support: RVO:68378050 Keywords : Microarray Analysis * Head and Neck Squamous Cell Carcinoma * whole-genome scale Subject RIV: EB - Genetics ; Molecular Biology

  19. Whole Genome DNA Sequence Analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

    Directory of Open Access Journals (Sweden)

    Mark R Wilson

    Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shot-gun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts

  20. Whole Genome DNA Sequence Analysis of Salmonella subspecies enterica serotype Tennessee obtained from related peanut butter foodborne outbreaks.

    Science.gov (United States)

    Wilson, Mark R; Brown, Eric; Keys, Chris; Strain, Errol; Luo, Yan; Muruvanda, Tim; Grim, Christopher; Jean-Gilles Beaubrun, Junia; Jarvis, Karen; Ewing, Laura; Gopinath, Gopal; Hanes, Darcy; Allard, Marc W; Musser, Steven

    2016-01-01

    Establishing an association between possible food sources and clinical isolates requires discriminating the suspected pathogen from an environmental background, and distinguishing it from other closely-related foodborne pathogens. We applied whole genome sequencing (WGS) to Salmonella subspecies enterica serotype Tennessee (S. Tennessee) to describe genomic diversity across the serovar as well as among and within outbreak clades of strains associated with contaminated peanut butter. We analyzed 71 isolates of S. Tennessee from disparate food, environmental, and clinical sources and 2 other closely-related Salmonella serovars as outgroups (S. Kentucky and S. Cubana), which were also shot-gun sequenced. A whole genome single nucleotide polymorphism (SNP) analysis was performed using a maximum likelihood approach to infer phylogenetic relationships. Several monophyletic lineages of S. Tennessee with limited SNP variability were identified that recapitulated several food contamination events. S. Tennessee clades were separated from outgroup salmonellae by more than sixteen thousand SNPs. Intra-serovar diversity of S. Tennessee was small compared to the chosen outgroups (1,153 SNPs), suggesting recent divergence of some S. Tennessee clades. Analysis of all 1,153 SNPs structuring an S. Tennessee peanut butter outbreak cluster revealed that isolates from several food, plant, and clinical sources were very closely related, as they had only a few SNP differences between them. SNP-based cluster analyses linked specific food sources to several clinical S. Tennessee strains isolated in separate contamination events. Environmental and clinical isolates had very similar whole genome sequences; no markers were found that could be used to discriminate between these sources. Finally, we identified SNPs within variable S. Tennessee genes that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts during future

  1. Fatal Cases of Influenza A(H3N2) in Children: Insights from Whole Genome Sequence Analysis

    OpenAIRE

    Monica Galiano; Johnson, Benjamin F.; Richard Myers; Joanna Ellis; Rod Daniels; Maria Zambon

    2012-01-01

    During the Northern Hemisphere winter of 2003-2004 the emergence of a novel influenza antigenic variant, A/Fujian/411/2002-like(H3N2), was associated with an unusually high number of fatalities in children. Seventeen fatal cases in the UK were laboratory confirmed for Fujian/411-like viruses. To look for phylogenetic patterns and genetic markers that might be associated with increased virulence, sequencing and phylogenetic analysis of the whole genomes of 63 viruses isolated from fatal cases ...

  2. Enterobacter asburiae Strain L1: Complete Genome and Whole Genome Optical Mapping Analysis of a Quorum Sensing Bacterium

    OpenAIRE

    Yin Yin Lau; Wai-Fong Yin; Kok-Gan Chan

    2014-01-01

    Enterobacter asburiae L1 is a quorum sensing bacterium isolated from lettuce leaves. In this study, for the first time, the complete genome of E. asburiae L1 was sequenced using the single molecule real time sequencer (PacBio RSII) and the whole genome sequence was verified by using optical genome mapping (OpGen) technology. In our previous study, E. asburiae L1 has been reported to produce AHLs, suggesting the possibility of virulence factor regulation which is quorum sensing dependent. This...

  3. ReAS: Recovery of ancestral sequences for transposable elements from the unassembled reads of a whole genome shotgun.

    Directory of Open Access Journals (Sweden)

    Ruiqiang Li

    2005-09-01

    We describe an algorithm, ReAS, to recover ancestral sequences for transposable elements (TEs) from the unassembled reads of a whole genome shotgun. The main assumptions are that these TEs must exist at high copy numbers across the genome and must not be so old that they are no longer recognizable in comparison to their ancestral sequences. Tested on the japonica rice genome, ReAS was able to reconstruct all of the high copy sequences in the Repbase repository of known TEs, and increase the effectiveness of RepeatMasker in identifying TEs from genome sequences.

  4. Whole genome sequencing identifies circulating Beijing-lineage Mycobacterium tuberculosis strains in Guatemala and an associated urban outbreak.

    Science.gov (United States)

    Saelens, Joseph W; Lau-Bonilla, Dalia; Moller, Anneliese; Medina, Narda; Guzmán, Brenda; Calderón, Maylena; Herrera, Raúl; Sisk, Dana M; Xet-Mull, Ana M; Stout, Jason E; Arathoon, Eduardo; Samayoa, Blanca; Tobin, David M

    2015-12-01

    Limited data are available regarding the molecular epidemiology of Mycobacterium tuberculosis (Mtb) strains circulating in Guatemala. Beijing-lineage Mtb strains have gained prevalence worldwide and are associated with increased virulence and drug resistance, but there have been only a few cases reported in Central America. Here we report the first whole genome sequencing of Central American Beijing-lineage strains of Mtb. We find that multiple Beijing-lineage strains, derived from independent founding events, are currently circulating in Guatemala, but overall still represent a relatively small proportion of disease burden. Finally, we identify a specific Beijing-lineage outbreak centered on a poor neighborhood in Guatemala City. PMID:26542222

  5. Low-coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for phylogenetic marker development and gene discovery1

    Science.gov (United States)

    Gardner, Elliot M.; Johnson, Matthew G.; Ragone, Diane; Wickett, Norman J.; Zerega, Nyree J. C.

    2016-01-01

    Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes.

  6. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Alexander C Outhred

    Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster.
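
    The notion of a variant distance between cluster members can be sketched as a pairwise count of differing sites. This is an illustrative sketch only, not the study's pipelines: the isolate names and allele strings are invented toy data, and the code assumes variants have already been called and aligned to the same sites.

```python
from itertools import combinations

def snp_distance(a, b, missing="N-"):
    """Count sites where two isolates differ, skipping uncalled sites."""
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in missing and y not in missing
    )

def pairwise_distances(alleles):
    """Return {(isolate_i, isolate_j): distance} for every isolate pair."""
    return {
        (i, j): snp_distance(alleles[i], alleles[j])
        for i, j in combinations(sorted(alleles), 2)
    }

# Hypothetical alleles at six variant sites for three isolates.
alleles = {
    "iso1": "ACGTAC",
    "iso2": "ACGTAT",
    "iso3": "ATGTNT",
}

distances = pairwise_distances(alleles)
# The largest pairwise distance corresponds to "the most distant members".
cluster_diameter = max(distances.values())
print(distances, cluster_diameter)
```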

  7. Efficient Haplotype Inference Algorithms in One Whole Genome Scan for Pedigree Data with Non-genotyped Founders

    Institute of Scientific and Technical Information of China (English)

    Yongxi Cheng; Hadi Sabaa; Zhipeng Cai; Randy Goebel; Guohui Lin

    2009-01-01

    An efficient rule-based algorithm is presented for haplotype inference from general pedigree genotype data, with the assumption of no recombination. This algorithm generalizes previous algorithms to handle the cases where some pedigree founders are not genotyped, provided that for each nuclear family at least one parent is genotyped and each non-genotyped founder appears in exactly one nuclear family. The importance of this generalization lies in that such cases frequently happen in real data, because some founders may have passed away and their genotype data can no longer be collected. The algorithm runs in O(m^3 n^3) time, where m is the number of single nucleotide polymorphism (SNP) loci under consideration and n is the number of genotyped members in the pedigree. This zero-recombination haplotyping algorithm is extended to a maximum parsimony haplotyping algorithm in one whole genome scan to minimize the total number of breakpoint sites, or equivalently, the number of maximal zero-recombination chromosomal regions. We show that such a whole genome scan haplotyping algorithm can be implemented in O(m^3 n^3) time in a novel incremental fashion, where m denotes the total number of SNP loci along the chromosome.
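
    To give a flavour of what rule-based haplotype inference means, the sketch below applies the standard Mendelian rules to a single fully genotyped trio. It is not the algorithm of this paper, which also handles non-genotyped founders and whole pedigrees; the function name and the toy genotypes are hypothetical.

```python
def phase_child(father, mother, child):
    """Assign each child allele to its parent of origin where rules allow.

    Genotypes are lists of unordered allele pairs, one pair per SNP.
    Returns (paternal, maternal) haplotypes; None marks an ambiguous site.
    Assumes the trio is Mendelian-consistent at every site.
    """
    paternal, maternal = [], []
    for f, m, c in zip(father, mother, child):
        a, b = c
        if a == b:
            # Homozygous child: both transmitted alleles are known.
            pat, mat = a, a
        elif f[0] == f[1]:
            # Father is homozygous, so he transmitted that allele.
            pat = f[0]
            mat = b if pat == a else a
        elif m[0] == m[1]:
            # Mother is homozygous, so she transmitted that allele.
            mat = m[0]
            pat = b if mat == a else a
        else:
            # All three are heterozygous: this site alone is uninformative.
            pat = mat = None
        paternal.append(pat)
        maternal.append(mat)
    return paternal, maternal

father = [(0, 0), (0, 1), (0, 1), (1, 1)]
mother = [(0, 1), (1, 1), (0, 1), (0, 1)]
child = [(0, 1), (0, 1), (0, 1), (1, 1)]
paternal, maternal = phase_child(father, mother, child)
print(paternal, maternal)
```

    Sites left as None are where a pedigree-wide method, using other relatives under the zero-recombination assumption, would be needed.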

  8. Enterobacter asburiae Strain L1: Complete Genome and Whole Genome Optical Mapping Analysis of a Quorum Sensing Bacterium

    Directory of Open Access Journals (Sweden)

    Yin Yin Lau

    2014-07-01

    Enterobacter asburiae L1 is a quorum sensing bacterium isolated from lettuce leaves. In this study, for the first time, the complete genome of E. asburiae L1 was sequenced using the single molecule real time sequencer (PacBio RSII) and the whole genome sequence was verified by using optical genome mapping (OpGen) technology. In our previous study, E. asburiae L1 has been reported to produce AHLs, suggesting the possibility of virulence factor regulation which is quorum sensing dependent. This evoked our interest to study the genome of this bacterium and here we present the complete genome of E. asburiae L1, which carries the virulence factor gene virK, the N-acyl homoserine lactone-based QS transcriptional regulator gene luxR and the N-acyl homoserine lactone synthase gene which we firstly named easI. The availability of the whole genome sequence of E. asburiae L1 will pave the way for the study of the QS-mediated gene expression in this bacterium. Hence, the importance and functions of these signaling molecules can be further studied in the hope of elucidating the mechanisms of QS-regulation in E. asburiae. To the best of our knowledge, this is the first documentation of both a complete genome sequence and the establishment of the molecular basis of QS properties of E. asburiae.

  9. Enterobacter asburiae strain L1: complete genome and whole genome optical mapping analysis of a quorum sensing bacterium.

    Science.gov (United States)

    Lau, Yin Yin; Yin, Wai-Fong; Chan, Kok-Gan

    2014-01-01

    Enterobacter asburiae L1 is a quorum sensing bacterium isolated from lettuce leaves. In this study, for the first time, the complete genome of E. asburiae L1 was sequenced using the single molecule real time sequencer (PacBio RSII) and the whole genome sequence was verified by using optical genome mapping (OpGen) technology. In our previous study, E. asburiae L1 has been reported to produce AHLs, suggesting the possibility of virulence factor regulation which is quorum sensing dependent. This evoked our interest to study the genome of this bacterium and here we present the complete genome of E. asburiae L1, which carries the virulence factor gene virK, the N-acyl homoserine lactone-based QS transcriptional regulator gene luxR and the N-acyl homoserine lactone synthase gene which we firstly named easI. The availability of the whole genome sequence of E. asburiae L1 will pave the way for the study of the QS-mediated gene expression in this bacterium. Hence, the importance and functions of these signaling molecules can be further studied in the hope of elucidating the mechanisms of QS-regulation in E. asburiae. To the best of our knowledge, this is the first documentation of both a complete genome sequence and the establishment of the molecular basis of QS properties of E. asburiae. PMID:25196111

  10. An efficient procedure for plant organellar genome assembly, based on whole genome data from the 454 GS FLX sequencing platform

    Directory of Open Access Journals (Sweden)

    Zhang Tongwu

    2011-11-01

    Motivation: Complete organellar genome sequences (chloroplasts and mitochondria) provide valuable resources and information for studying plant molecular ecology and evolution. As high-throughput sequencing technology advances, it becomes the norm that a shotgun approach is used to obtain complete genome sequences. Therefore, to assemble organellar sequences from the whole genome, shotgun reads are inevitable. However, associated techniques are often cumbersome, time-consuming, and difficult, because true organellar DNA is difficult to separate efficiently from nuclear copies, which have been transferred to the nucleus through the course of evolution. Results: We report a new, rapid procedure for plant chloroplast and mitochondrial genome sequencing and assembly using the Roche/454 GS FLX platform. Plant cells can contain multiple copies of the organellar genomes, and there is a significant correlation between the depth of sequence reads in contigs and the number of copies of the genome. Without isolating organellar DNA from the mixture of nuclear and organellar DNA for sequencing, we retrospectively extracted assembled contigs of either chloroplast or mitochondrial sequences from the whole genome shotgun data. Moreover, the contig connection graph property of Newbler (a platform-specific sequence assembler) ensures an efficient final assembly. Using this procedure, we assembled both chloroplast and mitochondrial genomes of a resurrection plant, Boea hygrometrica, with high fidelity. We also present information and a minimal sequence dataset as a reference for the assembly of other plant organellar genomes.
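
    The depth-based idea in this abstract, that high-copy organellar contigs stand out from nuclear contigs by read depth, can be sketched as a simple filter. This is a rough illustration rather than the authors' procedure: the contig names, depths and the 5-fold threshold are hypothetical, and the sketch assumes most contigs in the assembly are nuclear so that the median depth approximates nuclear coverage.

```python
from statistics import median

def split_by_depth(contig_depth, fold=5.0):
    """Flag contigs whose read depth is far above the typical depth.

    contig_depth maps contig name -> mean read depth.
    fold is an illustrative threshold, not a published value.
    """
    # Median depth is used as a stand-in for nuclear genome coverage.
    baseline = median(contig_depth.values())
    high_copy = {
        name for name, depth in contig_depth.items()
        if depth >= fold * baseline
    }
    return baseline, high_copy

# Hypothetical mean depths for seven assembled contigs.
contig_depth = {
    "c1": 12, "c2": 10, "c3": 11, "c4": 9,
    "c5": 480, "c6": 95, "c7": 13,
}
baseline, candidates = split_by_depth(contig_depth)
print(baseline, sorted(candidates))
```

    Candidates found this way would still need to be checked against known organellar sequences, since nuclear repeats can also reach high depth.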

  11. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  12. Construction of whole genome radiation hybrid panels and map of chromosome 5A of wheat using asymmetric somatic hybridization.

    Directory of Open Access Journals (Sweden)

    Chuanen Zhou

    To explore the feasibility of constructing a whole genome radiation hybrid (WGRH) map in plant species with large genomes, asymmetric somatic hybridization between wheat (Triticum aestivum L.) and Bupleurum scorzonerifolium Willd. was performed. The protoplasts of wheat were irradiated with ultraviolet light (UV) and gamma-ray and rescued by protoplast fusion using B. scorzonerifolium as the recipient. Assessment of SSR markers showed that the radiation hybrids had an average marker retention frequency of 15.5%. Two RH panels (RHPWI and RHPWII) that contained 92 and 184 radiation hybrids, respectively, were developed and used for mapping of 68 SSR markers in chromosome 5A of wheat. A total of 1557 and 2034 breaks were detected in each panel. The RH map of chromosome 5A based on RHPWII was constructed. The distance of the comprehensive map was 2103 cR and the approximate resolution was estimated to be ∼501.6 kb/break. The RH panels evaluated in this study enabled us to order the ESTs in a single deletion bin or in multiple bins across the chromosome. These results demonstrated that RH mapping via protoplast fusion is feasible at the whole genome level for mapping purposes in wheat, and showed the potential value of this mapping approach for plant species with large genomes.

  13. Contamination-controlled high-throughput whole genome sequencing for influenza A viruses using the MiSeq sequencer.

    Science.gov (United States)

    Lee, Hong Kai; Lee, Chun Kiat; Tang, Julian Wei-Tze; Loh, Tze Ping; Koay, Evelyn Siew-Chuan

    2016-01-01

    Accurate full-length genomic sequences are important for viral phylogenetic studies. We developed a targeted high-throughput whole genome sequencing (HT-WGS) method for influenza A viruses, which utilized an enzymatic cleavage-based approach, the Nextera XT DNA library preparation kit, for library preparation. The entire library preparation workflow was adapted for the Sentosa SX101, a liquid handling platform, to automate this labor-intensive step. As the enzymatic cleavage-based approach generates low coverage reads at both ends of the cleaved products, we corrected this loss of sequencing coverage at the termini by introducing modified primers during the targeted amplification step to generate full-length influenza A sequences with even coverage across the whole genome. Another challenge of targeted HTS is the risk of specimen-to-specimen cross-contamination during the library preparation step that results in the calling of false-positive minority variants. We included an in-run, negative system control to capture contamination reads that may be generated during the liquid handling procedures. The upper limits of 99.99% prediction intervals of the contamination rate were adopted as cut-off values of contamination reads. Here, 148 influenza A/H3N2 samples were sequenced using the HTS protocol and were compared against a Sanger-based sequencing method. Our data showed that the rate of specimen-to-specimen cross-contamination was highly significant in HTS. PMID:27624998
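
    The cut-off described here, the upper limit of a 99.99% prediction interval on the contamination rate, can be sketched as follows. The abstract does not state the statistical model, so this sketch assumes normally distributed rates, a two-sided interval and a normal quantile (a t quantile would be more conservative for few controls); the control rates are invented numbers.

```python
import math
from statistics import NormalDist, mean, stdev

def contamination_cutoff(rates, level=0.9999):
    """Upper limit of a two-sided prediction interval for one new run.

    Assumes roughly normal rates; uses a normal rather than a t quantile,
    which is an approximation when few negative controls are available.
    """
    n = len(rates)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # sqrt(1 + 1/n) widens the interval to cover a future observation.
    return mean(rates) + z * stdev(rates) * math.sqrt(1 + 1 / n)

# Hypothetical fractions of contaminating reads in negative controls.
rates = [0.0002, 0.0005, 0.0003, 0.0004, 0.0006, 0.0003]
cutoff = contamination_cutoff(rates)
print(f"minority variants below {cutoff:.5f} would be treated as contamination")
```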

  14. Evaluation of a Single-reaction Method for Whole Genome Sequencing of Influenza A Virus using Next Generation Sequencing

    Institute of Scientific and Technical Information of China (English)

    ZOU Xiao Hui; CHEN Wen Bing; ZHAO Xiang; ZHU Wen Fei; YANG Lei; WANG Da Yan; SHU Yue Long

    2016-01-01

    Objective: To evaluate a single-reaction genome amplification method, the multisegment reverse transcription-PCR (M-RTPCR), for its sensitivity to full genome sequencing of influenza A virus, and the ability to differentiate mix-subtype virus, using the next generation sequencing (NGS) platform. Methods: Virus genome copies were quantified and serially diluted to different titers, followed by amplification with the M-RTPCR method and sequencing on the NGS platform. Furthermore, we manually mixed two subtype viruses at different titer ratios and amplified the mixed virus with the M-RTPCR protocol, followed by whole genome sequencing on the NGS platform. We also used clinical samples to test the method performance. Results: The M-RTPCR method obtained the complete genome of the test virus at 125 copies/reaction and determined the virus subtype at a titer of 25 copies/reaction. Moreover, the two subtypes in the mixed virus could be discriminated using this amplification protocol, even when the copy numbers of the two viruses differed by 200-fold. The sensitivity of this protocol determined using virus RNA was also confirmed with clinical samples containing low-titer virus. Conclusion: The M-RTPCR is a robust and sensitive amplification method for whole genome sequencing of influenza A virus using the NGS platform.

  15. Prognostic Impact of Array-based Genomic Profiles in Esophageal Squamous Cell Cancer

    International Nuclear Information System (INIS)

    Esophageal squamous cell carcinoma (ESCC) is a genetically complex tumor type and a major cause of cancer related mortality. Although distinct genetic alterations have been linked to ESCC development and prognosis, the genetic alterations have not gained clinical applicability. We applied array-based comparative genomic hybridization (aCGH) to obtain a whole genome copy number profile relevant for identifying deranged pathways and clinically applicable markers. A 32 k aCGH platform was used for high resolution mapping of copy number changes in 30 stage I-IV ESCC. Potential interdependent alterations and deranged pathways were identified and copy number changes were correlated to stage, differentiation and survival. Copy number alterations affected a median of 19% of the genome and included recurrent gains of chromosome regions 5p, 7p, 7q, 8q, 10q, 11q, 12p, 14q, 16p, 17p, 19p, 19q, and 20q and losses of 3p, 5q, 8p, 9p and 11q. High-level amplifications were observed in 30 regions and recurrently involved 7p11 (EGFR), 11q13 (MYEOV, CCND1, FGF4, FGF3, PPFIA, FAD, TMEM16A, CTTS and SHANK2) and 11q22 (PDFG). Gain of 7p22.3 predicted nodal metastases and gains of 1p36.32 and 19p13.3 independently predicted poor survival in multivariate analysis. aCGH profiling verified genetic complexity in ESCC and herein identified imbalances of multiple central tumorigenic pathways. Distinct gains correlate with clinicopathological variables and independently predict survival, suggesting clinical applicability of genomic profiling in ESCC.

  16. An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella

    Directory of Open Access Journals (Sweden)

    James B. Pettengill

    2014-10-01

    Comparative genomics based on whole genome sequencing (WGS) is increasingly being applied to investigate questions within evolutionary and molecular biology, as well as questions concerning public health (e.g., pathogen outbreaks). Given the impact that conclusions derived from such analyses may have, we have evaluated the robustness of clustering individuals based on WGS data to three key factors: (1) next-generation sequencing (NGS) platform (HiSeq, MiSeq, IonTorrent, 454, and SOLiD), (2) algorithms used to construct a SNP (single nucleotide polymorphism) matrix (reference-based and reference-free), and (3) phylogenetic inference method (FastTreeMP, GARLI, and RAxML). We carried out these analyses on 194 whole genome sequences representing 107 unique Salmonella enterica subsp. enterica ser. Montevideo strains. Reference-based approaches for identifying SNPs produced trees that were significantly more similar to one another than those produced under the reference-free approach. Topologies inferred using a core matrix (i.e., no missing data) were significantly more discordant than those inferred using a non-core matrix that allows for some missing data. However, allowing for too much missing data likely results in a high false discovery rate of SNPs. When analyzing the same SNP matrix, we observed that the more thorough inference methods implemented in GARLI and RAxML produced more similar topologies than FastTreeMP. Our results also confirm that reproducibility varies among NGS platforms where the MiSeq had the lowest number of pairwise differences among replicate runs. Our investigation into the robustness of clustering patterns illustrates the importance of carefully considering how data from different platforms are combined and analyzed. We found clear differences in the topologies inferred, and certain methods performed significantly better than others for discriminating between the highly clonal organisms investigated here. The methods supported by
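
    The distinction between a core and a non-core SNP matrix can be made concrete with a small filter over alignment columns. This sketch is only illustrative and is not the tooling used in the study: the sample names, sequences and the 25% missing-data allowance are hypothetical.

```python
def filter_sites(matrix, max_missing=0.0, missing="N-"):
    """Keep SNP sites whose fraction of missing calls is <= max_missing.

    matrix maps sample name -> string of alleles at each SNP site.
    max_missing=0.0 gives a core matrix (no missing data at any site).
    """
    names = list(matrix)
    n_sites = len(matrix[names[0]])
    keep = [
        i for i in range(n_sites)
        if sum(matrix[s][i] in missing for s in names) / len(names) <= max_missing
    ]
    return {s: "".join(matrix[s][i] for i in keep) for s in names}

# Hypothetical five-site SNP matrix for four samples.
matrix = {
    "A": "ACGTN",
    "B": "AC-TA",
    "C": "ATGTA",
    "D": "ATGCA",
}
core = filter_sites(matrix)                       # no missing data allowed
non_core = filter_sites(matrix, max_missing=0.25) # up to 1 of 4 samples missing
print(core, non_core)
```

    Raising max_missing retains more sites, which matches the trade-off the abstract describes between discarding signal and admitting unreliable SNPs.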

  17. Whole-genome sequences of influenza A(H3N2) viruses isolated from Brazilian patients with mild illness during the 2014 season

    Directory of Open Access Journals (Sweden)

    Paola Cristina Resende

    2015-02-01

    The influenza A(H3N2) virus has circulated worldwide for almost five decades and is the dominant subtype in most seasonal influenza epidemics, as occurred in the 2014 season in South America. In this study we evaluate five whole genome sequences of influenza A(H3N2) viruses detected in patients with mild illness collected from January-March 2014. To sequence the genomes, a next-generation sequencing (NGS) protocol was performed using the Ion Torrent PGM platform. In addition to analysing the common genes, haemagglutinin, neuraminidase and matrix, our work also comprised internal genes. This was the first report of a whole genome analysis with Brazilian influenza A(H3N2) samples. Considerable amino acid variability was encountered in all gene segments, demonstrating the importance of studying the internal genes. NGS of whole genomes in this study will facilitate deeper virus characterisation, contributing to the improvement of influenza strain surveillance in Brazil.

  18. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    Science.gov (United States)

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed. PMID:27100228

  19. Rapid and Easy In Silico Serotyping of Escherichia coli Isolates by Use of Whole-Genome Sequencing Data

    DEFF Research Database (Denmark)

    Joensen, Katrine Grimstrup; Tetzschner, Anna M. M.; Iguchi, Atsushi;

    2015-01-01

    typing and surveillance. The aim of this study was to establish a valid and publicly available tool for WGS-based in silico serotyping of E. coli applicable for routine typing and surveillance. A FASTA database of specific O-antigen processing system genes for O typing and flagellin genes for H typing was created as a component of the publicly available Web tools hosted by the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org). All E. coli isolates available with WGS data and conventional serotype information were subjected to WGS-based serotyping employing this specific SerotypeFinder CGE tool. SerotypeFinder was evaluated on 682 E. coli genomes, 108 of which were sequenced for this study, where both the whole genome and the serotype were available. In total, 601 and 509 isolates were included for O and H typing, respectively. The O-antigen genes wzx, wzy, wzm, and wzt and the...

  20. Genetic Diversity and Fingerprint Profiles of Commercial Lentinula edodes Cultivars Based on SSR Markers Developed from the Whole Genome Sequence

    Institute of Scientific and Technical Information of China (English)

    ZHANG Dan; SONG Chunyan; ZHANG Lujun; WU Ping; BAO Dapeng; SHANG Xiaodong; TAN Qi

    2014-01-01

    Lentinula edodes is an important cultivated mushroom in China, and accurate and reliable identification of individual cultivars is a prerequisite for successful cultivation and variety protection. In this study, the whole genome sequence of L. edodes was used to generate 200 simple sequence repeat (SSR) markers for delineating 25 commercial cultivars and for determining their genetic diversity. Our data revealed a relatively high level of genetic similarity among the cultivars, with average, minimum and maximum genetic similarity coefficient values of 0.776, 0.567 and 1.000, respectively. Seven SSR primer pairs delineated eleven of the cultivars (Cr-02, Minfeng-1, Xianggu 241-4, Senyuan-1, Senyuan-8404, Xiang-9, Guangxiang-51, Huaxiang-5, L952, L9319 and L808) based on their unique multilocus SSR fingerprint profiles.

  1. Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing

    DEFF Research Database (Denmark)

    Zankari, Ea; Hasman, Henrik; Kaas, Rolf Sommer;

    2013-01-01

    Objectives: Antimicrobial susceptibility testing of bacterial isolates is essential for clinical diagnosis, to detect emerging problems and to guide empirical treatment. Current phenotypic procedures are sometimes associated with mistakes and may require further genetic testing. Whole-genome sequencing (WGS) may soon be within reach even for routine surveillance and clinical diagnostics. The aim of this study was to evaluate WGS as a routine tool for surveillance of antimicrobial resistance compared with current phenotypic procedures. Methods: Antimicrobial susceptibility tests were performed on 200 isolates originating from Danish pigs, covering four bacterial species. Genomic DNA was purified from all isolates and sequenced as paired-end reads on the Illumina platform. The web servers ResFinder and MLST (www.genomicepidemiology.org) were used to identify acquired antimicrobial resistance...

  2. Transmission of methicillin-resistant Staphylococcus aureus infection through solid organ transplantation: confirmation via whole genome sequencing.

    Science.gov (United States)

    Wendt, J M; Kaul, D; Limbago, B M; Ramesh, M; Cohle, S; Denison, A M; Driebe, E M; Rasheed, J K; Zaki, S R; Blau, D M; Paddock, C D; McDougal, L K; Engelthaler, D M; Keim, P S; Roe, C C; Akselrod, H; Kuehnert, M J; Basavaraju, S V

    2014-11-01

    We describe two cases of donor-derived methicillin-resistant Staphylococcus aureus (MRSA) bacteremia that developed after transplantation of organs from a common donor who died from acute MRSA endocarditis. Both recipients developed recurrent MRSA infection despite appropriate antibiotic therapy, and required prolonged hospitalization and hospital readmission. Comparison of S. aureus whole genome sequence of DNA extracted from fixed donor tissue and recipients' isolates confirmed donor-derived transmission. Current guidelines emphasize the risk posed by donors with bacteremia from multidrug-resistant organisms. This investigation suggests that, particularly in the setting of donor endocarditis, even a standard course of prophylactic antibiotics may not be sufficient to prevent donor-derived infection. PMID:25250717

  3. Genome-wide association analyses based on whole-genome sequencing in Sardinia provide insights into regulation of hemoglobin levels.

    Science.gov (United States)

    Danjou, Fabrice; Zoledziewska, Magdalena; Sidore, Carlo; Steri, Maristella; Busonero, Fabio; Maschio, Andrea; Mulas, Antonella; Perseu, Lucia; Barella, Susanna; Porcu, Eleonora; Pistis, Giorgio; Pitzalis, Maristella; Pala, Mauro; Menzel, Stephan; Metrustry, Sarah; Spector, Timothy D; Leoni, Lidia; Angius, Andrea; Uda, Manuela; Moi, Paolo; Thein, Swee Lay; Galanello, Renzo; Abecasis, Gonçalo R; Schlessinger, David; Sanna, Serena; Cucca, Francesco

    2015-11-01

    We report genome-wide association study results for the levels of A1, A2 and fetal hemoglobins, analyzed for the first time concurrently. Integrating high-density array genotyping and whole-genome sequencing in a large general population cohort from Sardinia, we detected 23 associations at 10 loci. Five signals are due to variants at previously undetected loci: MPHOSPH9, PLTP-PCIF1, ZFPM1 (FOG1), NFIX and CCND3. Among the signals at known loci, ten are new lead variants and four are new independent signals. Half of all variants also showed pleiotropic associations with different hemoglobins, which further corroborated some of the detected associations and identified features of coordinated hemoglobin species production. PMID:26366553

  4. Evidence and evolutionary analysis of ancient whole-genome duplication in barley predating the divergence from rice

    Directory of Open Access Journals (Sweden)

    Grosse Ivo

    2009-08-01

    Background Well preserved genomic colinearity among agronomically important grass species such as rice, maize, Sorghum, wheat and barley provides access to whole-genome structure information even in species lacking a reference genome sequence. We investigated footprints of whole-genome duplication (WGD) in barley that shaped the cereal ancestor genome by analyzing shared synteny with rice using a ~2000 gene-based barley genetic map and the rice genome reference sequence. Results Based on a recent annotation of the rice genome, we reviewed the WGD in rice and identified 24 pairs of duplicated genomic segments involving 70% of the rice genome. Using 968 putative orthologous gene pairs, synteny covered 89% of the barley genetic map and 63% of the rice genome. We found strong evidence for seven shared segmental genome duplications, corresponding to more than 50% of the segmental genome duplications previously determined in rice. Analysis of synonymous substitution rates (Ks) suggested that shared duplications originated before the divergence of these two species. While major genome rearrangements affected the ancestral genome of both species, small paracentric inversions were found to be species specific. Conclusion We provide a thorough analysis of comparative genome evolution between barley and rice. A barley genetic map of approximately 2000 non-redundant EST sequences provided sufficient density to allow a detailed view of shared synteny with the rice genome. Using an indirect approach that included the localization of WGD-derived duplicated genome segments in the rice genome, we determined the current extent of shared WGD-derived genome duplications that occurred prior to species divergence.

  5. Whole-genome SNP association in the horse: identification of a deletion in myosin Va responsible for Lavender Foal Syndrome.

    Directory of Open Access Journals (Sweden)

    Samantha A Brooks

    2010-04-01

    Lavender Foal Syndrome (LFS) is a lethal inherited disease of horses with a suspected autosomal recessive mode of inheritance. LFS has been primarily diagnosed in a subgroup of the Arabian breed, the Egyptian Arabian horse. The condition is characterized by multiple neurological abnormalities and a dilute coat color. Candidate genes based on comparative phenotypes in mice and humans include the ras-associated protein RAB27a (RAB27A) and myosin Va (MYO5A). Here we report mapping of the locus responsible for LFS using a small set of 36 horses segregating for LFS. These horses were genotyped using a newly available single nucleotide polymorphism (SNP) chip containing 56,402 discriminatory elements. The whole genome scan identified an associated region containing these two functional candidate genes. Exon sequencing of the MYO5A gene from an affected foal revealed a single base deletion in exon 30 that changes the reading frame and introduces a premature stop codon. A PCR-based Restriction Fragment Length Polymorphism (PCR-RFLP) assay was designed and used to investigate the frequency of the mutant gene. All affected horses tested were homozygous for this mutation. Heterozygous carriers were detected in high frequency in families segregating for this trait, and the frequency of carriers in unrelated Egyptian Arabians was 10.3%. The mapping and discovery of the LFS mutation represents the first successful use of whole-genome SNP scanning in the horse for any trait. The RFLP assay can be used to assist breeders in avoiding carrier-to-carrier matings and thus in preventing the birth of affected foals.

  6. Mycobacterium tuberculosis whole genome sequencing and protein structure modelling provides insights into anti-tuberculosis drug resistance

    KAUST Repository

    Phelan, Jody

    2016-03-23

    Background Combating the spread of drug resistant tuberculosis is a global health priority. Whole genome association studies are being applied to identify genetic determinants of resistance to anti-tuberculosis drugs. Protein structure and interaction modelling are used to understand the functional effects of putative mutations and provide insight into the molecular mechanisms leading to resistance. Methods To investigate the potential utility of these approaches, we analysed the genomes of 144 Mycobacterium tuberculosis clinical isolates from The Special Programme for Research and Training in Tropical Diseases (TDR) collection sourced from 20 countries in four continents. A genome-wide approach was applied to 127 isolates to identify polymorphisms associated with minimum inhibitory concentrations for first-line anti-tuberculosis drugs. In addition, the effect of identified candidate mutations on protein stability and interactions was assessed quantitatively with well-established computational methods. Results The analysis revealed that mutations in the genes rpoB (rifampicin), katG (isoniazid), inhA-promoter (isoniazid), rpsL (streptomycin) and embB (ethambutol) were responsible for the majority of resistance observed. A subset of the mutations identified in rpoB and katG were predicted to affect protein stability. Further, a strong direct correlation was observed between the minimum inhibitory concentration values and the distance of the mutated residues in the three-dimensional structures of rpoB and katG to their respective drugs binding sites. Conclusions Using the TDR resource, we demonstrate the usefulness of whole genome association and convergent evolution approaches to detect known and potentially novel mutations associated with drug resistance. Further, protein structural modelling could provide a means of predicting the impact of polymorphisms on drug efficacy in the absence of phenotypic data. 
These approaches could ultimately lead to novel resistance

  7. Whole genome sequence of Treponema pallidum ssp. pallidum, strain Mexico A, suggests recombination between yaws and syphilis strains.

    Directory of Open Access Journals (Sweden)

    Helena Pětrošová

    BACKGROUND: Treponema pallidum ssp. pallidum (TPA), the causative agent of syphilis, and Treponema pallidum ssp. pertenue (TPE), the causative agent of yaws, are closely related spirochetes causing diseases with distinct clinical manifestations. The TPA Mexico A strain was isolated in 1953 from a male with primary syphilis living in Mexico. Attempts to cultivate the TPA Mexico A strain under in vitro conditions have revealed lower growth potential compared to other tested TPA strains. METHODOLOGY/PRINCIPAL FINDINGS: The complete genome sequence of the TPA Mexico A strain was determined using the Illumina sequencing technique. The genome sequence assembly was verified using the whole genome fingerprinting technique and the final sequence was annotated. The genome size of the Mexico A strain was determined to be 1,140,038 bp with 1,035 predicted ORFs. The Mexico A genome sequence was compared to the whole genome sequences of three TPA (Nichols, SS14 and Chicago) and three TPE (CDC-2, Samoa D and Gauthier) strains. No large rearrangements in the Mexico A genome were found and the identified nucleotide changes occurred most frequently in genes encoding putative virulence factors. Nevertheless, the genome of the Mexico A strain revealed two genes (TPAMA_0326 (tp92) and TPAMA_0488 (mcp2-1)) which combine TPA- and TPE-specific nucleotide sequences. Both genes were found to be under positive selection within TPA strains and also between TPA and TPE strains. CONCLUSIONS/SIGNIFICANCE: The observed mosaic character of the TPAMA_0326 and TPAMA_0488 loci is likely a result of inter-strain recombination between TPA and TPE strains during simultaneous infection of a single host, suggesting horizontal gene transfer between treponemal subspecies.

  8. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  9. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  10. Whole-Genome Sequencing Allows for Improved Identification of Persistent Listeria monocytogenes in Food-Associated Environments.

    Science.gov (United States)

    Stasiewicz, Matthew J; Oliver, Haley F; Wiedmann, Martin; den Bakker, Henk C

    2015-09-01

    While the food-borne pathogen Listeria monocytogenes can persist in food associated environments, there are no whole-genome sequence (WGS) based methods to differentiate persistent from sporadic strains. Whole-genome sequencing of 188 isolates from a longitudinal study of L. monocytogenes in retail delis was used to (i) apply single-nucleotide polymorphism (SNP)-based phylogenetics for subtyping of L. monocytogenes, (ii) use SNP counts to differentiate persistent from repeatedly reintroduced strains, and (iii) identify genetic determinants of L. monocytogenes persistence. WGS analysis revealed three prophage regions that explained differences between three pairs of phylogenetically similar populations with pulsed-field gel electrophoresis types that differed by ≤3 bands. WGS-SNP-based phylogenetics found that putatively persistent L. monocytogenes represent SNP patterns (i) unique to a single retail deli, supporting persistence within the deli (11 clades), (ii) unique to a single state, supporting clonal spread within a state (7 clades), or (iii) spanning multiple states (5 clades). Isolates that formed one of 11 deli-specific clades differed by a median of 10 SNPs or fewer. Isolates from 12 putative persistence events had significantly fewer SNPs (median, 2 to 22 SNPs) than between isolates of the same subtype from other delis (median up to 77 SNPs), supporting persistence of the strain. In 13 events, nearly indistinguishable isolates (0 to 1 SNP) were found across multiple delis. No individual genes were enriched among persistent isolates compared to sporadic isolates. Our data show that WGS analysis improves food-borne pathogen subtyping and identification of persistent bacterial pathogens in food associated environments. PMID:26116683

  11. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis

    Science.gov (United States)

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S.; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense. PMID:27031249

  12. Whole Genome Sequence of Treponema pallidum ssp. pallidum, Strain Mexico A, Suggests Recombination between Yaws and Syphilis Strains

    Science.gov (United States)

    Pětrošová, Helena; Zobaníková, Marie; Čejková, Darina; Mikalová, Lenka; Pospíšilová, Petra; Strouhal, Michal; Chen, Lei; Qin, Xiang; Muzny, Donna M.; Weinstock, George M.; Šmajs, David

    2012-01-01

    Background Treponema pallidum ssp. pallidum (TPA), the causative agent of syphilis, and Treponema pallidum ssp. pertenue (TPE), the causative agent of yaws, are closely related spirochetes causing diseases with distinct clinical manifestations. The TPA Mexico A strain was isolated in 1953 from a male with primary syphilis living in Mexico. Attempts to cultivate the TPA Mexico A strain under in vitro conditions have revealed lower growth potential compared to other tested TPA strains. Methodology/Principal Findings The complete genome sequence of the TPA Mexico A strain was determined using the Illumina sequencing technique. The genome sequence assembly was verified using the whole genome fingerprinting technique and the final sequence was annotated. The genome size of the Mexico A strain was determined to be 1,140,038 bp with 1,035 predicted ORFs. The Mexico A genome sequence was compared to the whole genome sequences of three TPA (Nichols, SS14 and Chicago) and three TPE (CDC-2, Samoa D and Gauthier) strains. No large rearrangements in the Mexico A genome were found and the identified nucleotide changes occurred most frequently in genes encoding putative virulence factors. Nevertheless, the genome of the Mexico A strain revealed two genes (TPAMA_0326 (tp92) and TPAMA_0488 (mcp2-1)) which combine TPA- and TPE-specific nucleotide sequences. Both genes were found to be under positive selection within TPA strains and also between TPA and TPE strains. Conclusions/Significance The observed mosaic character of the TPAMA_0326 and TPAMA_0488 loci is likely a result of inter-strain recombination between TPA and TPE strains during simultaneous infection of a single host, suggesting horizontal gene transfer between treponemal subspecies. PMID:23029591

  13. Whole genome sequencing of mutation accumulation lines reveals a low mutation rate in the social amoeba Dictyostelium discoideum.

    Directory of Open Access Journals (Sweden)

    Gerda Saxer

    Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76×10⁻⁹, with a Poisson confidence interval of 4.1×10⁻⁹ to 9.5×10⁻⁹, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9×10⁻¹¹, with a Poisson confidence interval ranging from 7.4×10⁻¹³ to 1.6×10⁻¹⁰, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.

  14. Rapid identification of genetic modifications in Bacillus anthracis using whole genome draft sequences generated by 454 pyrosequencing.

    Directory of Open Access Journals (Sweden)

    Peter E Chen

    BACKGROUND: The anthrax letter attacks of 2001 highlighted the need for rapid identification of biothreat agents not only for epidemiological surveillance of the intentional outbreak but also for implementing appropriate countermeasures, such as antibiotic treatment, in a timely manner to prevent further casualties. It is clear from the 2001 cases that survival may be markedly improved by administration of antimicrobial therapy during the early symptomatic phase of the illness; i.e., within 3 days of appearance of symptoms. Microbiological detection methods are feasible only for organisms that can be cultured in vitro and cannot detect all genetic modifications with the exception of antibiotic resistance. Currently available immuno or nucleic acid-based rapid detection assays utilize known, organism-specific proteins or genomic DNA signatures respectively. Hence, these assays lack the ability to detect novel natural variations or intentional genetic modifications that circumvent the targets of the detection assays or in the case of a biological attack using an antibiotic resistant or virulence enhanced Bacillus anthracis, to advise on therapeutic treatments. METHODOLOGY/PRINCIPAL FINDINGS: We show here that the Roche 454-based pyrosequencing can generate whole genome draft sequences of deep and broad enough coverage of a bacterial genome in less than 24 hours. Furthermore, using the unfinished draft sequences, we demonstrate that unbiased identification of known as well as heretofore-unreported genetic modifications that include indels and single nucleotide polymorphisms conferring antibiotic and phage resistances is feasible within the next 12 hours. CONCLUSIONS/SIGNIFICANCE: Second generation sequencing technologies have paved the way for sequence-based rapid identification of both known and previously undocumented genetic modifications in cultured, conventional and newly emerging biothreat agents. Our findings have significant implications in

  15. Lessons learned from the application of whole-genome analysis to the treatment of patients with advanced cancers

    Science.gov (United States)

    Laskin, Janessa; Jones, Steven; Aparicio, Samuel; Chia, Stephen; Ch'ng, Carolyn; Deyell, Rebecca; Eirew, Peter; Fok, Alexandra; Gelmon, Karen; Ho, Cheryl; Huntsman, David; Jones, Martin; Kasaian, Katayoon; Karsan, Aly; Leelakumari, Sreeja; Li, Yvonne; Lim, Howard; Ma, Yussanne; Mar, Colin; Martin, Monty; Moore, Richard; Mungall, Andrew; Mungall, Karen; Pleasance, Erin; Rassekh, S. Rod; Renouf, Daniel; Shen, Yaoqing; Schein, Jacqueline; Schrader, Kasmintan; Sun, Sophie; Tinker, Anna; Zhao, Eric; Yip, Stephen; Marra, Marco A.

    2015-01-01

    Given the success of targeted agents in specific populations it is expected that some degree of molecular biomarker testing will become standard of care for many, if not all, cancers. To facilitate this, cancer centers worldwide are experimenting with targeted “panel” sequencing of selected mutations. Recent advances in genomic technology enable the generation of genome-scale data sets for individual patients. Recognizing the risk, inherent in panel sequencing, of failing to detect meaningful somatic alterations, we sought to establish processes to integrate data from whole-genome analysis (WGA) into routine cancer care. Between June 2012 and August 2014, 100 adult patients with incurable cancers consented to participate in the Personalized OncoGenomics (POG) study. Fresh tumor and blood samples were obtained and used for whole-genome and RNA sequencing. Computational approaches were used to identify candidate driver mutations, genes, and pathways. Diagnostic and drug information were then sought based on these candidate “drivers.” Reports were generated and discussed weekly in a multidisciplinary team setting. Other multidisciplinary working groups were assembled to establish guidelines on the interpretation, communication, and integration of individual genomic findings into patient care. Of 78 patients for whom WGA was possible, results were considered actionable in 55 cases. In 23 of these 55 cases, the patients received treatments motivated by WGA. Our experience indicates that a multidisciplinary team of clinicians and scientists can implement a paradigm in which WGA is integrated into the care of late stage cancer patients to inform systemic therapy decisions. PMID:27148575

  16. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, ‘SCNU1154’, ‘Edisto47’, ‘MR-1’, and ‘PMR5’. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  17. Development and preliminary evaluation of an online educational video about whole-genome sequencing for research participants, patients, and the general public

    OpenAIRE

    Sanderson, Saskia C.; Suckiel, Sabrina A; Zweig, Micol; Bottinger, Erwin P.; Jabs, Ethylin Wang; Richardson, Lynne D.

    2015-01-01

    Background: As whole-genome sequencing (WGS) increases in availability, WGS educational aids are needed for research participants, patients, and the general public. Our aim was therefore to develop an accessible and scalable WGS educational aid. Genet Med 18 5, 501–512. Methods: We engaged multiple stakeholders in an iterative process over a 1-year period culminating in the production of a novel 10-minute WGS educational animated video, “Whole Genome Sequencing and You” (https://goo.gl/HV8ezJ...

  18. Analyses and interpretation of whole-genome gene expression from formalin-fixed paraffin-embedded tissue: an illustration with breast cancer tissues

    OpenAIRE

    Argos Maria; Paul-Brutus Rachelle M; Roy Shantanu; Jasmine Farzana; Kibriya Muhammad G; Ahsan Habibul

    2010-01-01

    Background: We evaluated (a) the feasibility of a whole-genome cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay on formalin-fixed paraffin-embedded (FFPE) tissue and (b) whether similar conclusions can be drawn by examining FFPE samples as proxies for fresh frozen (FF) tissues. We used a whole-genome DASL assay (addressing 18,391 genes) on a total of 72 samples from paired breast tumor and surrounding healthy tissues from both FF and FFPE samples. Results: Gene det...

  19. Whole genome evaluation of tandem repeat polymorphisms between two pathogenically similar strains of Xylella fastidiosa isolated from almond and grape in California

    Science.gov (United States)

    Whole genome tandem repeat polymorphisms were evaluated between two closely related Xylella fastidiosa strains, M23 and Temecula1, both of which cause almond leaf scorch disease (ALSD) and grape Pierce’s disease (PD) in California. Strain M23 was isolated from almond and the genome was sequenced in this stu...

  20. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping.

    Science.gov (United States)

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-04-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated the positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should also include the positions of all non-coding regulatory elements. Unfortunately, experimental data on the genomic positions of the many regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins, are very limited, especially at the whole-genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of a whole-genome map of hundreds of thousands of regulatory elements in several hundred tissue/cell types is presently far beyond our capabilities. A possible alternative to the whole-genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole-genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  1. Maps of cis-Regulatory Nodes in Megabase Long Genome Segments are an Inevitable Intermediate Step Toward Whole Genome Functional Mapping

    Science.gov (United States)

    Nikolaev, Lev G; Akopov, Sergey B; Chernov, Igor P; Sverdlov, Eugene D

    2007-01-01

    The availability of complete human and other metazoan genome sequences has greatly facilitated the positioning and analysis of various genomic functional elements, with initial emphasis on coding sequences. However, complete functional maps of sequenced eukaryotic genomes should also include the positions of all non-coding regulatory elements. Unfortunately, experimental data on the genomic positions of the many regulatory sequences, such as enhancers, silencers, insulators, transcription terminators, and replication origins, are very limited, especially at the whole-genome level. Since most genomic regulatory elements (e.g. enhancers) are generally gene-, tissue-, or cell-specific, the prediction of these elements by computational methods is difficult and often ambiguous. Therefore, the development of high-throughput experimental approaches for identifying and mapping genomic functional elements is highly desirable. At the same time, the creation of a whole-genome map of hundreds of thousands of regulatory elements in several hundred tissue/cell types is presently far beyond our capabilities. A possible alternative to the whole-genome approach is to concentrate efforts on individual genomic segments and then to integrate the data obtained into a whole-genome functional map. Moreover, the maps of polygenic fragments with functional cis-regulatory elements would provide valuable data on complex regulatory systems, including their variability and evolution. Here, we reviewed experimental approaches to the realization of these ideas, including our own developments of experimental techniques for selection of cis-acting functionally active DNA fragments from large (megabase-sized) segments of mammalian genomes. PMID:18660850

  2. DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly

    NARCIS (Netherlands)

    Scheinin, I.; Sie, D.; Bengtsson, H.; Wiel, M.A. van de; Olshen, A.B.; Thuijl, H.F. van; Essen, H.F. van; Eijk, P.P.; Rustenburg, F.; Meijer, G.A.; Reijneveld, J.C.; Wesseling, P.; Pinkel, D.; Albertson, D.G.; Ylstra, B.

    2014-01-01

    Detection of DNA copy number aberrations by shallow whole-genome sequencing (WGS) faces many challenges, including lack of completion and errors in the human reference genome, repetitive sequences, polymorphisms, variable sample quality, and biases in the sequencing procedures. Formalin-fixed paraff

  3. Whole-Genome Shotgun Sequence of Bacillus mojavensis Strain RRC101, an Endophytic Bacterium Antagonistic to the Mycotoxigenic Endophytic Fungus Fusarium verticillioides

    Science.gov (United States)

    Here we report the whole genome shotgun sequence of Bacillus mojavensis strain RRC101, isolated from a maize kernel. This strain is antagonistic to the mycotoxigenic plant pathogen Fusarium verticillioides, and grows within maize tissue, suggesting potential as an endophytic biocontrol agent....

  4. Whole-genome profiling and shotgun sequencing delivers an anchored, gene-decorated, physical map assembly of bread wheat chromosome 6A

    Czech Academy of Sciences Publication Activity Database

    Poursarebani, N.; Nussbaumer, T.; Šimková, Hana; Šafář, Jan; Witsenboer, H.; van Oeveren, J.; Doležel, Jaroslav; Mayer, K. F. X.; Stein, N.; Schnurbusch, T.

    2014-01-01

    Vol. 79, No. 2 (2014), pp. 334-347. ISSN 0960-7412 Institutional support: RVO:61389030 Keywords: bread wheat chromosome 6A * whole-genome profiling * LINEAR TOPOLOGICAL CONTIGS Subject RIV: EB - Genetics; Molecular Biology Impact factor: 5.972, year: 2014

  5. Whole-Genome Sequence of Leptospira interrogans Serovar Hardjo Subtype Hardjoprajitno Strain Norma, Isolated from Cattle in a Leptospirosis Outbreak in Brazil

    OpenAIRE

    Cosate, M. R. V.; Soares, S. C.; Mendes, T. A.; Raittz, R. T.; E.C. Moreira; Leite, R.; Fernandes, G. R.; J.P.A. Haddad; Ortega, J Miguel

    2015-01-01

    Leptospirosis is caused by pathogenic bacteria of the genus Leptospira. This neglected re-emergent disease has a global distribution and is relevant to veterinary production. Here, we report the whole-genome sequence and annotation of Leptospira interrogans serovar Hardjo subtype Hardjoprajitno strain Norma, isolated from cattle in a livestock leptospirosis outbreak in Brazil.

  6. Monitoring meticillin resistant Staphylococcus aureus and its spread in Copenhagen, Denmark, 2013, through routine whole genome sequencing

    DEFF Research Database (Denmark)

    Bartels, M D; Larner-Svensson, H; Meiniche, H; Kristoffersen, K; Schonning, K; Nielsen, J B; Rohde, S M; Christensen, L B; Skibsted, A W; Jarlov, J O; Johansen, H K; Andersen, L P; Petersen, I S; Crook, D W; Bowden, R; Boye, K; Worning, P; Westh, H

    2015-01-01

    Typing of meticillin resistant Staphylococcus aureus (MRSA) by whole genome sequencing (WGS) has been performed routinely in Copenhagen since January 2013. We describe the relatedness, based on WGS data and epidemiological data, of 341 MRSA isolates. These comprised all MRSA (n = 300) identified in...

  7. Determination of evolutionary relationships of outbreak-associated Listeria monocytogenes strains of serotypes 1/2a and 1/2b by whole-genome sequencing

    Science.gov (United States)

    We used whole-genome sequencing to determine evolutionary relationships among 20 outbreak-associated clinical isolates of Listeria monocytogenes serotypes 1/2a and 1/2b. Isolates from 6 of 11 outbreaks fell outside the clonal groups or “epidemic clones” that have been previously associated with outb...

  8. Whole-Genome Shotgun Sequence of Bacillus mojavensis Strain RRC101, an Endophytic Bacterium Antagonistic to the Mycotoxigenic Endophytic Fungus Fusarium verticillioides.

    Science.gov (United States)

    Gold, S E; Blacutt, A A; Meinersmann, R J; Bacon, C W

    2014-01-01

    Here, we report the whole-genome shotgun sequence of Bacillus mojavensis strain RRC101, isolated from a maize kernel. This strain is antagonistic to the mycotoxigenic plant pathogen Fusarium verticillioides and grows within maize tissue, suggesting potential as an endophytic biocontrol agent. PMID:25359909

  9. Whole-Genome Shotgun Sequence of Bacillus mojavensis Strain RRC101, an Endophytic Bacterium Antagonistic to the Mycotoxigenic Endophytic Fungus Fusarium verticillioides

    OpenAIRE

    Gold, S. E.; Blacutt, A. A.; Meinersmann, R. J.; Bacon, C W

    2014-01-01

    Here, we report the whole-genome shotgun sequence of Bacillus mojavensis strain RRC101, isolated from a maize kernel. This strain is antagonistic to the mycotoxigenic plant pathogen Fusarium verticillioides and grows within maize tissue, suggesting potential as an endophytic biocontrol agent.

  10. De Novo Whole-Genome Sequence of Micromonospora carbonacea JXNU-1 with Broad-Spectrum Antimicrobial Activity, Isolated from Soil Samples

    OpenAIRE

    Jiang, Yun; Huang, Yun-hong; Long, Zhong-er

    2015-01-01

    Micromonospora carbonacea JXNU-1 is an actinomycete with broad-spectrum antimicrobial activity, isolated from soil samples from the farmland in the area of Yaohu Lake in Nanchang, China. Here, we report the whole-genome sequence of M. carbonacea JXNU-1.

  11. Single site suppressors of a fission yeast temperature-sensitive mutant in cdc48 identified by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Irina N Marinova

    The protein called p97 in mammals and Cdc48 in budding and fission yeast is a homo-hexameric, ring-shaped, ubiquitin-dependent ATPase complex involved in a range of cellular functions, including protein degradation, vesicle fusion, DNA repair, and cell division. The cdc48+ gene is essential for viability in fission yeast, and point mutations in the human orthologue have been linked to disease. To analyze the function of p97/Cdc48 further, we performed a screen for cold-sensitive suppressors of the temperature-sensitive cdc48-353 fission yeast strain. In total, 29 independent pseudo-revertants that had lost the temperature-sensitive growth defect of the cdc48-353 strain were isolated. Of these, 28 had instead acquired a cold-sensitive phenotype. Since the suppressors were all spontaneous mutants, and not the result of mutagenesis induced by chemicals or UV irradiation, we reasoned that the genome sequences of the 29 independent cdc48-353 suppressors were most likely identical with the exception of the acquired suppressor mutations. This prompted us to test whether a whole genome sequencing approach would allow us to map the mutations. Indeed, genome sequencing unambiguously revealed that the cold-sensitive suppressors were all second-site intragenic cdc48 mutants. Projecting these onto the Cdc48 structure revealed that while the original temperature-sensitive G338D mutation is positioned near the central pore in the hexameric ring, the suppressor mutations locate to subunit-subunit and inter-domain boundaries. This suggests that Cdc48-353 is structurally compromised at the restrictive temperature, but re-established in the suppressor mutants. The last suppressor was an extragenic frameshift mutation in the ufd1 gene, which encodes a known Cdc48 co-factor. In conclusion, we show, using a novel whole genome sequencing approach, that Cdc48-353 is structurally compromised at the restrictive temperature, but stabilized in the suppressors.

  12. Copy number and loss of heterozygosity detected by SNP array of formalin-fixed tissues using whole-genome amplification.

    Directory of Open Access Journals (Sweden)

    Angela Stokes

    The requirement for large amounts of good quality DNA for whole-genome applications prohibits their use for small, laser capture micro-dissected (LCM), and/or rare clinical samples, which are also often formalin-fixed and paraffin-embedded (FFPE). Whole-genome amplification of DNA from these samples could, potentially, overcome these limitations. However, little is known about the artefacts introduced by amplification of FFPE-derived DNA with regard to genotyping, and subsequent copy number and loss of heterozygosity (LOH) analyses. Using a ligation adaptor amplification method, we present data from a total of 22 Affymetrix SNP 6.0 experiments, using matched paired amplified and non-amplified DNA from 10 LCM FFPE normal and dysplastic oral epithelial tissues, and an internal method control. An average of 76.5% of SNPs were called in both matched amplified and non-amplified DNA samples, and concordance was a promising 82.4%. Paired analysis for copy number, LOH, and both combined showed that: copy number changes were reduced in amplified DNA but were 99.5% concordant when detected; amplifications were the changes most likely to be 'missed'; only 30% of non-amplified LOH changes were identified in amplified pairs; and, when copy number and LOH are combined, ∼50% of gene changes detected in the unamplified DNA were also detected in the amplified DNA, and within these changes 86.5% were concordant for both copy number and LOH status. However, changes are also introduced, as ∼20% of changes in the amplified DNA are not detected in the non-amplified DNA. An integrative network biology approach revealed that changes in amplified DNA of dysplastic oral epithelium localize to topologically critical regions of the human protein-protein interaction network, suggesting their functional implication in the pathobiology of this disease. Taken together, our results support the use of amplification of FFPE-derived DNA, provided sufficient samples are used

  13. Effect of Wortmannin on the repair profiles of DNA double-strand breaks in the whole genome and in interstitial telomeric sequences of Chinese hamster cells

    International Nuclear Information System (INIS)

    The DNA breakage detection-fluorescence in situ hybridization (DBD-FISH) procedure was applied to analyze the effect of Wortmannin (WM) on the rejoining kinetics of ionizing radiation-induced DNA double-strand breaks (DSBs) in the whole genome and in the long interstitial telomeric repeat sequence (ITRS) blocks from Chinese hamster cell lines. The results indicate that the ITRS blocks from wild-type Chinese hamster cell lines, CHO9 and V79B, exhibit a slower initial rejoining rate of ionizing radiation-induced DSBs than the genome overall. Neither Rad51C nor the catalytic subunit of DNA-dependent protein kinase (DNA-PKcs) activities, involved in the homologous recombination (HR) and non-homologous end-joining (NHEJ) pathways of DSB repair respectively, influenced the rejoining kinetics within ITRS, in contrast to DNA sequences in the whole genome. Nevertheless, the DSB removal rate within ITRS was decreased in the absence of Ku86 activity, though to a lesser extent than in the whole genome, thus homogenizing both rejoining kinetics rates. WM treatment slowed down the DSB rejoining kinetics rate in ITRS, this effect being more pronounced in the whole genome, resulting in a pattern similar to that of the Ku86 deficient cells. In fact, no WM effect was detected in the Ku86 deficient Chinese hamster cells, so WM probably does not impair DSB rejoining beyond the impairment that results from the absence of Ku activity. The same slowing effect was also observed after treatment of Rad51C and DNA-PKcs defective hamster cells with WM, suggesting that: (1) there is no potentiation of HR when NHEJ is impaired by WM, either in the whole genome or in the ITRS, and (2) this impairment probably involves more targets than DNA-PKcs. These results suggest that there is an intragenomic heterogeneity in DSB repair, as well as in the effect of WM on this process.

  14. Array-based techniques for fingerprinting medicinal herbs

    Directory of Open Access Journals (Sweden)

    Xue Charlie

    2011-05-01

    Poor quality control of medicinal herbs has led to instances of toxicity, poisoning and even deaths. The fundamental step in quality control of herbal medicine is accurate identification of herbs. Array-based techniques have recently been adapted to authenticate or identify herbal plants. This article reviews the current array-based techniques, e.g. oligonucleotide microarrays, gene-based probe microarrays, Suppression Subtractive Hybridization (SSH)-based arrays, Diversity Array Technology (DArT) and Subtracted Diversity Array (SDA). We further compare these techniques according to important parameters such as markers, polymorphism rates, restriction enzymes and sample type. The applicability of the array-based methods for fingerprinting depends on the availability of genomic and genetic information for the species to be fingerprinted. For species with little genome sequence information but high polymorphism rates, SDA techniques are particularly recommended because they require less labour and lower material cost.

  15. CVTree3 Web Server for Whole-genome-based and Alignment-free Prokaryotic Phylogeny and Taxonomy

    Institute of Scientific and Technical Information of China (English)

    Guanghong Zuo; Bailin Hao

    2015-01-01

    A faithful phylogeny and an objective taxonomy for prokaryotes should agree with each other and ultimately follow the genome data. With the number of sequenced genomes reaching tens of thousands, both tree inference and detailed comparison with taxonomy are great challenges. We now provide one solution in the latest Release 3.0 of the alignment-free and whole-genome-based web server CVTree3. The server resides in a cluster of 64 cores and is equipped with an interactive, collapsible, and expandable tree display. It is capable of comparing the tree branching order with prokaryotic classification at all taxonomic ranks from domains down to species and strains. CVTree3 allows for inquiry by taxon names and trial on lineage modifications. In addition, it reports a summary of monophyletic and non-monophyletic taxa at all ranks as well as produces print-quality subtree figures. After giving an overview of retrospective verification of the CVTree approach, the power of the new server is described for the mega-classification of prokaryotes and determination of taxonomic placement of some newly-sequenced genomes. A few discrepancies between CVTree and 16S rRNA analyses are also summarized with regard to possible taxonomic revisions. CVTree3 is freely accessible to all users at http://tlife.fudan.edu.cn/cvtree3/ without login requirements.
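
    To illustrate what "alignment-free" comparison can mean in practice, the sketch below compares two sequences by their k-mer composition instead of aligning them. It is a deliberately simplified illustration and not the CVTree3 algorithm: the function names, the choice of k = 3 and the plain cosine-based distance are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Relative frequencies of the overlapping k-mers in seq (hypothetical helper)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def composition_distance(p, q):
    """(1 - cosine similarity) / 2 between two sparse k-mer profiles.

    Returns 0 for identical profiles and 0.5 when the two profiles
    share no k-mer at all (frequencies are never negative here).
    """
    dot = sum(v * q.get(kmer, 0.0) for kmer, v in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return (1.0 - dot / (norm_p * norm_q)) / 2.0


# Usage: toy sequences, not real genomes.
a = kmer_profile("ACGTACGTACGT")
b = kmer_profile("AAAAAAAAAAAA")
print(composition_distance(a, a), composition_distance(a, b))
```

    A matrix of such pairwise distances could then be passed to any distance-based tree-building method; which method the server itself uses is not stated in the abstract.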

  16. Monodisperse Picoliter Droplets for Low-Bias and Contamination-Free Reactions in Single-Cell Whole Genome Amplification.

    Directory of Open Access Journals (Sweden)

    Yohei Nishikawa

    Whole genome amplification (WGA) is essential for obtaining genome sequences from single bacterial cells because the quantity of template DNA contained in a single cell is very low. Multiple displacement amplification (MDA), using Phi29 DNA polymerase and random primers, is the most widely used method for single-cell WGA. However, single-cell MDA usually results in uneven genome coverage because of amplification bias, background amplification of contaminating DNA, and formation of chimeras by linking of non-contiguous chromosomal regions. Here, we present a novel MDA method, termed droplet MDA, that minimizes amplification bias and amplification of contaminants by using picoliter-sized droplets for compartmentalized WGA reactions. Extracted DNA fragments from a lysed cell in MDA mixture are divided into 10^5 droplets (67 pL) within minutes via flow through simple microfluidic channels. Compartmentalized genome fragments can be individually amplified in these droplets without the risk of encountering reagent-borne or environmental contaminants. Following quality assessment of WGA products from single Escherichia coli cells, we showed that droplet MDA minimized unexpected amplification and improved the percentage of genome recovery from 59% to 89%. Our results demonstrate that microfluidic-generated droplets show potential as an efficient tool for effective amplification of low-input DNA for single-cell genomics and greatly reduce the cost and labor investment required for determination of nearly complete genome sequences of uncultured bacteria from environmental samples.
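
    The droplet figures in this abstract can be sanity-checked with a little arithmetic. The sketch below assumes the droplet count is 10^5 (the exponent appears to have been flattened to "105" in extraction) and that 67 pL is the volume of each droplet; the function name is hypothetical.

```python
def total_reaction_volume_ul(droplet_count, droplet_volume_pl):
    """Total emulsion volume in microliters (1 uL = 1e6 pL)."""
    return droplet_count * droplet_volume_pl / 1e6


# Assumed reading: 10^5 droplets of 67 pL each.
volume_ul = total_reaction_volume_ul(10**5, 67)

# Genome recovery figures quoted in the abstract, in percent.
recovery_gain_points = 89 - 59

print(f"{volume_ul:.1f} uL total, +{recovery_gain_points} percentage points recovery")
```

    Under that reading the compartmentalized reaction totals a few microliters, which is a plausible bench-scale MDA volume; read literally as 105 droplets, the total would be only about 7 nL.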

  17. Whole genome amplification and microsatellite genotyping of herbarium DNA revealed the identity of an ancient grapevine cultivar

    Science.gov (United States)

    Malenica, Nenad; Šimon, Silvio; Besendorfer, Višnja; Maletić, Edi; Karoglan Kontić, Jasminka; Pejić, Ivan

    2011-09-01

    Reconstruction of the grapevine cultivation history has advanced tremendously during the last decade. Identification of grapevine cultivars by using microsatellite DNA markers has largely become routine. The parentage of several renowned grapevine cultivars, like Cabernet Sauvignon and Chardonnay, has been elucidated. However, the assembly of a complete grapevine genealogy is not yet possible because missing links might no longer be in cultivation or are even extinct. This problem could be overcome by analyzing ancient DNA from grapevine herbarium specimens and other historical remnants of once cultivated varieties. Here, we present the first successful genotyping of a grapevine herbarium specimen and the identification of the corresponding grapevine cultivar. Using a set of nine grapevine microsatellite markers, in combination with a whole genome amplification procedure, we found the 90-year-old Tribidrag herbarium specimen to display the same microsatellite profile as the popular American cultivar Zinfandel. This work, together with information from several historical documents, provides a new clue to Zinfandel cultivation in Croatia as early as the beginning of the fifteenth century, under the native name Tribidrag. Moreover, it emphasizes the substantial information potential of existing grapevine and other herbarium collections worldwide.

  18. The roles of whole-genome and small-scale duplications in the functional specialization of Saccharomyces cerevisiae genes.

    Directory of Open Access Journals (Sweden)

    Mario A Fares

    Researchers have long been enthralled with the idea that gene duplication can generate novel functions, crediting this process with great evolutionary importance. Empirical data show that whole-genome duplications (WGDs) are more likely to be retained than small-scale duplications (SSDs), though their relative contribution to the functional fate of duplicates remains unexplored. Using the map of genetic interactions and the re-sequencing of 27 Saccharomyces cerevisiae genomes evolving for 2,200 generations, we show that SSD-duplicates lead to neo-functionalization while WGD-duplicates partition ancestral functions. This conclusion is supported by: (a) SSD-duplicates establish more genetic interactions than singletons and WGD-duplicates; (b) SSD-duplicate copies share more interaction partners than WGD-duplicate copies; (c) WGD-duplicate interaction partners are more functionally related than SSD-duplicate partners; (d) SSD-duplicate gene copies are more functionally divergent from one another, while keeping more overlapping functions, and diverge in their sub-cellular locations more than WGD-duplicate copies; and (e) SSD-duplicates complement their functions to a greater extent than WGD-duplicates. We propose a novel model that uncovers the complexity of evolution after gene duplication.

  19. Whole-genome sequencing in newborn screening? A statement on the continued importance of targeted approaches in newborn screening programmes.

    Science.gov (United States)

    Howard, Heidi Carmen; Knoppers, Bartha Maria; Cornel, Martina C; Wright Clayton, Ellen; Sénécal, Karine; Borry, Pascal

    2015-12-01

    The advent and refinement of sequencing technologies has resulted in a decrease in both the cost and time needed to generate data on the entire sequence of the human genome. This has increased the accessibility of using whole-genome sequencing and whole-exome sequencing approaches for analysis in both the research and clinical contexts. The expectation is that more services based on these and other high-throughput technologies will become available to patients and the wider population. Some authors predict that sequencing will be performed once in a lifetime, namely, shortly after birth. The Public and Professional Policy Committee of the European Society of Human Genetics, the Human Genome Organisation Committee on Ethics, Law and Society, the PHG Foundation and the P3G International Paediatric Platform address herein the important issues and challenges surrounding the potential use of sequencing technologies in publicly funded newborn screening (NBS) programmes. This statement presents the relevant issues and culminates in a set of recommendations to help inform and guide scientists and clinicians, as well as policy makers regarding the necessary considerations for the use of genome sequencing technologies and approaches in NBS programmes. The primary objective of NBS should be the targeted analysis and identification of gene variants conferring a high risk of preventable or treatable conditions, for which treatment has to start in the newborn period or in early childhood. PMID:25626707

  20. Fatal cases of influenza A(H3N2) in children: insights from whole genome sequence analysis.

    Directory of Open Access Journals (Sweden)

    Monica Galiano

    During the Northern Hemisphere winter of 2003-2004 the emergence of a novel influenza antigenic variant, A/Fujian/411/2002-like (H3N2), was associated with an unusually high number of fatalities in children. Seventeen fatal cases in the UK were laboratory confirmed for Fujian/411-like viruses. To look for phylogenetic patterns and genetic markers that might be associated with increased virulence, sequencing and phylogenetic analysis of the whole genomes of 63 viruses isolated from fatal cases and non-fatal "control" cases was undertaken. The analysis revealed the circulation of two main genetic groups, I and II, both of which contained viruses from fatal cases. No associated amino acid substitutions could be linked with an exclusive or higher occurrence in fatal cases. The Fujian/411-like viruses in genetic groups I and II completely displaced other A(H3N2) viruses, but they disappeared after 2004. This study shows that two A(H3N2) virus genotypes circulated exclusively during the winter of 2003-2004 in the UK and caused an unusually high number of deaths in children. Host factors related to immune state and differences in genetic background between patients may also play important roles in determining the outcome of an influenza infection.

  1. Should the Affordable Care Act's preventive services coverage provision be used to widely disseminate whole genome sequencing to Americans?

    Science.gov (United States)

    Payne, Perry W

    2014-02-01

    I argue that the provision of the Patient Protection and Affordable Care Act (ACA) of 2010, which eliminates cost sharing for preventive services, should be utilized as a pathway for reimbursing whole genome sequencing (WGS) and making it widely available to most Americans. This act provides multiple routes for determining which preventive services receive this designation. Three of these routes should be considered as pathways for reimbursing WGS, including approval by the United States Preventive Services Task Force, inclusion in the guidelines of the American Academy of Pediatrics Bright Futures Project, and classification as a preventive service for women by the Institute of Medicine. There are valid arguments against the expansion of this technology, including inadequate national and state laws prohibiting genetic discrimination, informed consent limitations, and potentially expensive genome interpretations. These concerns should not inhibit the wide dissemination of this technology, as current efforts by the NIH and industry to expand the use of genome sequencing demonstrate. The ACA should be used as a tool to prevent disparities in access to genome information in the United States and avoid the development of a two-tiered health system based on those with and without genome sequence data. PMID:24193604

  2. Facile mutant identification via a single parental backcross method and application of whole genome sequencing based mapping pipelines

    Directory of Open Access Journals (Sweden)

    Robert Silas Allen

    2013-09-01

    Forward genetic screens have identified numerous genes involved in development and metabolism, and remain a cornerstone of biological research. However, to locate a causal mutation, the practice of crossing to a polymorphic background to generate a mapping population can be problematic if the mutant phenotype is difficult to recognise in the hybrid F2 progeny, or dependent on parental specific traits. Here, in a screen for leaf hyponasty mutants, we have performed a single backcross of an ethyl methanesulphonate (EMS)-generated hyponastic mutant to its parent. Whole genome deep sequencing of a bulked homozygous F2 population and analysis via the Next Generation EMS mutation mapping pipeline (NGM) unambiguously determined the causal mutation to be a single nucleotide polymorphism (SNP) residing in HASTY, a previously characterised gene involved in microRNA biogenesis. We have evaluated the feasibility of this backcross approach using three additional SNP mapping pipelines: SHOREmap, the GATK pipeline, and the samtools pipeline. Although there was variance in the identification of EMS SNPs, all returned the same outcome in clearly identifying the causal mutation in HASTY. The simplicity of performing a single parental backcross and genome sequencing a small pool of segregating mutants has great promise for identifying mutations that may be difficult to map using conventional approaches.
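
    The logic behind mapping from a bulked backcross pool can be sketched in a few lines. In a pool of homozygous mutant F2 plants, the causal allele (and variants tightly linked to it) should appear in nearly all reads, while unlinked EMS-induced variants segregate at intermediate frequencies; EMS also induces predominantly G-to-A and C-to-T transitions. The sketch below is an illustration under those assumptions only. It does not reproduce NGM, SHOREmap, GATK or samtools, and the `Variant` record, thresholds and coordinates are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical per-site read counts from the bulked F2 pool."""
    chrom: str
    pos: int
    ref: str
    alt: str
    ref_reads: int
    alt_reads: int


# Canonical EMS-type transitions (G/C to A/T).
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def candidate_causal_snps(variants, min_depth=10, min_alt_fraction=0.9):
    """Keep EMS-type SNPs whose alternate allele is near fixation in the pool.

    The thresholds are illustrative defaults, not values from the study.
    """
    hits = []
    for v in variants:
        depth = v.ref_reads + v.alt_reads
        if depth < min_depth:
            continue  # too few reads to estimate allele frequency
        if (v.ref, v.alt) not in EMS_TRANSITIONS:
            continue  # not a typical EMS-induced change
        if v.alt_reads / depth >= min_alt_fraction:
            hits.append(v)
    return hits


# Usage with made-up sites.
pool = [
    Variant("chr3", 1_200_345, "G", "A", 1, 39),   # near fixation: candidate
    Variant("chr1", 5_000_010, "C", "T", 22, 18),  # segregating, unlinked
    Variant("chr3", 1_450_900, "A", "G", 0, 30),   # fixed but not EMS-type
    Variant("chr5", 80_123, "G", "A", 0, 4),       # too shallow
]
print(candidate_causal_snps(pool))
```

    In practice the surviving candidates would still need annotation (for example, whether they alter a coding sequence) before one could be nominated as causal.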

  3. Whole Genome Re-Sequencing Identifies a Quantitative Trait Locus Repressing Carbon Reserve Accumulation during Optimal Growth in Chlamydomonas reinhardtii.

    Science.gov (United States)

    Goold, Hugh Douglas; Nguyen, Hoa Mai; Kong, Fantao; Beyly-Adriano, Audrey; Légeret, Bertrand; Billon, Emmanuelle; Cuiné, Stéphan; Beisson, Fred; Peltier, Gilles; Li-Beisson, Yonghua

    2016-01-01

    Microalgae have emerged as a promising source for biofuel production. Massive oil and starch accumulation in microalgae is possible, but occurs mostly when biomass growth is impaired. The molecular networks underlying the negative correlation between growth and reserve formation are not known. Thus isolation of strains capable of accumulating carbon reserves during optimal growth would be highly desirable. To this end, we screened an insertional mutant library of Chlamydomonas reinhardtii for alterations in oil content. A mutant accumulating five times more oil and twice more starch than wild-type during optimal growth was isolated and named constitutive oil accumulator 1 (coa1). Growth in photobioreactors under highly controlled conditions revealed that the increase in oil and starch content in coa1 was dependent on light intensity. Genetic analysis and DNA hybridization pointed to a single insertional event responsible for the phenotype. Whole genome re-sequencing identified in coa1 a >200 kb deletion on chromosome 14 containing 41 genes. This study demonstrates that, 1), the generation of algal strains accumulating higher reserve amount without compromising biomass accumulation is feasible; 2), light is an important parameter in phenotypic analysis; and 3), a chromosomal region (Quantitative Trait Locus) acts as suppressor of carbon reserve accumulation during optimal growth. PMID:27141848

  4. The first detection and whole genome characterization of the G6P[15] group A rotavirus strain from roe deer.

    Science.gov (United States)

    Jamnikar-Ciglenecki, Urska; Kuhar, Urska; Sturm, Sabina; Kirbis, Andrej; Racki, Nejc; Steyer, Andrej

    2016-08-15

    Although rotaviruses have been detected in a variety of host species, there are only limited records of their occurrence in deer, where their role is unknown. In this study, group A rotavirus was identified in roe deer during a study of enteric viruses in game animals. 102 samples of intestinal content were collected from roe deer (56), wild boars (29), chamois (10), red deer (6) and mouflon (1), but only one sample from roe deer was positive. Following whole genome sequence analysis, the rotavirus strain D38/14 was characterized by next generation sequencing. The genotype constellation, comprising 11 genome segments, was G6-P[15]-I2-R2-C2-M2-A3-N2-T6-E2-H3. Phylogenetic analysis of the VP7 genome segment showed that the D38/14 rotavirus strain is closely related to the various G6 zoonotic rotavirus strains of bovine-like origin frequently detected in humans. In the VP4 segment, this strain showed high variation compared to that in the P[15] strain found in sheep and in a goat. This finding suggests that rotaviruses from deer are similar to those in other DS-1 rotavirus groups and could constitute a source of zoonotically transmitted rotaviruses. The epidemiological status of group A rotaviruses in deer should be further investigated. PMID:27374907

  5. Association analysis for feet and legs disorders with whole-genome sequence variants in 3 dairy cattle breeds.

    Science.gov (United States)

    Wu, Xiaoping; Guldbrandtsen, Bernt; Lund, Mogens Sandø; Sahana, Goutam

    2016-09-01

    Identification of genetic variants associated with feet and legs disorders (FLD) will aid in the genetic improvement of these traits by providing knowledge on genes that influence trait variations. In Denmark, FLD in cattle has been recorded since the 1990s. In this report, we used deregressed breeding values as response variables for a genome-wide association study. Bulls (5,334 Danish Holstein, 4,237 Nordic Red Dairy Cattle, and 1,180 Danish Jersey) with deregressed estimated breeding values were genotyped with the Illumina Bovine 54k single nucleotide polymorphism (SNP) genotyping array. Genotypes were imputed to whole-genome sequence variants, and then 22,751,039 SNP on 29 autosomes were used for an association analysis. A modified linear mixed-model approach (efficient mixed-model association eXpedited, EMMAX) and a linear mixed model were used for association analysis. We identified 5 (3,854 SNP), 3 (13,642 SNP), and 0 quantitative trait locus (QTL) regions associated with the FLD index in Danish Holstein, Nordic Red Dairy Cattle, and Danish Jersey populations, respectively. We did not identify any QTL that were common among the 3 breeds. In a meta-analysis of the 3 breeds, 4 QTL regions were significant, but no additional QTL region was identified compared with within-breed analyses. Comparison between top SNP locations within these QTL regions and known genes suggested that RASGRP1, LCORL, MOS, and MITF may be candidate genes for FLD in dairy cattle. PMID:27344389

  6. Whole-Genome Sequencing Suggests Schizophrenia Risk Mechanisms in Humans with 22q11.2 Deletion Syndrome

    Science.gov (United States)

    Merico, Daniele; Zarrei, Mehdi; Costain, Gregory; Ogura, Lucas; Alipanahi, Babak; Gazzellone, Matthew J.; Butcher, Nancy J.; Thiruvahindrapuram, Bhooma; Nalpathamkalam, Thomas; Chow, Eva W. C.; Andrade, Danielle M.; Frey, Brendan J.; Marshall, Christian R.; Scherer, Stephen W.; Bassett, Anne S.

    2015-01-01

    Chromosome 22q11.2 microdeletions impart a high but incomplete risk for schizophrenia. Possible mechanisms include genome-wide effects of DGCR8 haploinsufficiency. In a proof-of-principle study to assess the power of this model, we used high-quality, whole-genome sequencing of nine individuals with 22q11.2 deletions and extreme phenotypes (schizophrenia, or no psychotic disorder at age >50 years). The schizophrenia group had a greater burden of rare, damaging variants impacting protein-coding neurofunctional genes, including genes involved in neuron projection (nominal P = 0.02, joint burden of three variant types). Variants in the intact 22q11.2 region were not major contributors. Restricting to genes affected by a DGCR8 mechanism tended to amplify between-group differences. Damaging variants in highly conserved long intergenic noncoding RNA genes also were enriched in the schizophrenia group (nominal P = 0.04). The findings support the 22q11.2 deletion model as a threshold-lowering first hit for schizophrenia risk. If applied to a larger and thus better-powered cohort, this appears to be a promising approach to identify genome-wide rare variants in coding and noncoding sequence that perturb gene networks relevant to idiopathic schizophrenia. Similarly designed studies exploiting genetic models may prove useful to help delineate the genetic architecture of other complex phenotypes. PMID:26384369

  7. Divergent whole-genome methylation maps of human and chimpanzee brains reveal epigenetic basis of human regulatory evolution.

    Science.gov (United States)

    Zeng, Jia; Konopka, Genevieve; Hunt, Brendan G; Preuss, Todd M; Geschwind, Dan; Yi, Soojin V

    2012-09-01

    DNA methylation is a pervasive epigenetic DNA modification that strongly affects chromatin regulation and gene expression. To date, it remains largely unknown how patterns of DNA methylation differ between closely related species and whether such differences contribute to species-specific phenotypes. To investigate these questions, we generated nucleotide-resolution whole-genome methylation maps of the prefrontal cortex of multiple humans and chimpanzees. Levels and patterns of DNA methylation vary across individuals within species according to the age and the sex of the individuals. We also found extensive species-level divergence in patterns of DNA methylation and that hundreds of genes exhibit significantly lower levels of promoter methylation in the human brain than in the chimpanzee brain. Furthermore, we investigated the functional consequences of methylation differences in humans and chimpanzees by integrating data on gene expression generated with next-generation sequencing methods, and we found a strong relationship between differential methylation and gene expression. Finally, we found that differentially methylated genes are strikingly enriched with loci associated with neurological disorders, psychological disorders, and cancers. Our results demonstrate that differential DNA methylation might be an important molecular mechanism driving gene-expression divergence between human and chimpanzee brains and might potentially contribute to the evolution of disease vulnerabilities. Thus, comparative studies of humans and chimpanzees stand to identify key epigenomic modifications underlying the evolution of human-specific traits. PMID:22922032

  8. Effect of long real space flight on the whole genome mRNA expression properties in medaka Oryzias latipes

    Science.gov (United States)

    Kozlova, Olga; Gusev, Oleg; Levinskikh, Margarita; Sychev, Vladimir; Poddubko, Svetlana

    The current study addresses the complex analysis of whole-genome mRNA expression profiles and the formation of splicing variants in different organs of medaka fish exposed to prolonged space flight within the joint Russia-Japan research program “Aquarium-AQH”. The fish were kept in the AQH joint-aquariums system in October-December 2013, followed by fixation in RNA-preserving buffers and freezing during the space flight. The samples were returned to Earth frozen in March 2013, and mRNAs from four fish were sequenced in an organ-specific manner using the Illumina HiSeq sequencing platform. Ground-group fish treated in the same way were used as a control. The comparison between the groups revealed a space group-specific mRNA expression pattern. More than 50 genes (including several types of myosins) were down-regulated in the space group. Moreover, we found evidence for the formation of space group-specific splicing variants of mRNA. Taken together, the data suggest that, in spite of the aquatic environment, space flight-associated factors have a strong effect on the activity of the fish genome. This work was supported in part by a subsidy of the Russian Government to support the Program of competitive growth of Kazan Federal University among world class academic centres and universities.

  9. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Bruce R Southey

    Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruits and scouts was 10.01 and 10.7 X, respectively. Representation of bacterial species among the unmapped reads reflected a more diverse microbiome in scouts than recruits. Overall, 1,412,705 polymorphic positions were analyzed for associations with scouting behavior, and 212 significant (p-value 1000 bp apart from each other. A number of these variants were mapped to ncRNA LOC100578102, solute carrier family 12 member 6-like gene, and LOC100576965 (meprin and TRAF-C homology domain containing gene). Functional categories represented among the genes corresponding to significant variants included: neuronal function, exoskeleton, immune response, salivary gland development, and enzymatic food processing. These categories offer a glimpse into the molecular support to the behaviors of scouts and recruits. The level of association between genomic variants and scouting behavior observed in this study may be linked to the honey bee's genomic plasticity and fluidity of transition between castes.

  10. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing.

    Science.gov (United States)

    Southey, Bruce R; Zhu, Ping; Carr-Markell, Morgan K; Liang, Zhengzheng S; Zayed, Amro; Li, Ruiqiang; Robinson, Gene E; Rodriguez-Zas, Sandra L

    2016-01-01

    Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruits and scouts was 10.01 and 10.7 X, respectively. Representation of bacterial species among the unmapped reads reflected a more diverse microbiome in scouts than recruits. Overall, 1,412,705 polymorphic positions were analyzed for associations with scouting behavior, and 212 significant (p-value 1000 bp apart from each other. A number of these variants were mapped to ncRNA LOC100578102, solute carrier family 12 member 6-like gene, and LOC100576965 (meprin and TRAF-C homology domain containing gene). Functional categories represented among the genes corresponding to significant variants included: neuronal function, exoskeleton, immune response, salivary gland development, and enzymatic food processing. These categories offer a glimpse into the molecular support to the behaviors of scouts and recruits. The level of association between genomic variants and scouting behavior observed in this study may be linked to the honey bee's genomic plasticity and fluidity of transition between castes. PMID:26784945

  11. Comparative genomic analysis of Acidithiobacillus ferrooxidans strains using the A. ferrooxidans ATCC 23270 whole-genome oligonucleotide microarray.

    Science.gov (United States)

    Luo, Hailang; Shen, Li; Yin, Huaqun; Li, Qian; Chen, Qijiong; Luo, Yanjie; Liao, Liqin; Qiu, Guanzhou; Liu, Xueduan

    2009-05-01

    Acidithiobacillus ferrooxidans is an important microorganism used in biomining operations for metal recovery. Whole-genomic diversity analysis based on the oligonucleotide microarray was used to analyze the gene content of 12 strains of A. ferrooxidans purified from various mining areas in China. Among the 3100 open reading frames (ORFs) on the slides, 1235 ORFs were absent in at least 1 strain of bacteria and 1385 ORFs were conserved in all strains. The hybridization results showed that these strains were highly diverse from a genomic perspective. The hybridization results of 4 major functional gene categories, namely electron transport, carbon metabolism, extracellular polysaccharides, and detoxification, were analyzed. Based on the hybridization signals obtained, a phylogenetic tree was built to analyze the evolution of the 12 tested strains, which indicated that the geographic distribution was the main factor influencing the diversity of these strains. Based on the hybridization signals of genes associated with bioleaching, another phylogenetic tree showed an evolutionary relationship from which the correlation between the clustering of specific genes and geochemistry could be observed. The results revealed that the main factor was geochemistry, among which the following 6 factors were the most important: pH, Mg, Cu, S, Fe, and Al. PMID:19483787

  12. Whole-genome pyrosequencing of an epidemic multidrug-resistant Acinetobacter baumannii strain belonging to the European clone II group

    DEFF Research Database (Denmark)

    Iacono, M.; Villa, L.; Fortini, D.; Bordoni, R.; Imperi, F.; Bonnal, R.J.P.; Sicheritz-Pontén, Thomas; De Bellis, G.; Visca, P.; Cassone, A.; Carattoli, A.

    2008-01-01

    The whole-genome sequence of an epidemic, multidrug-resistant Acinetobacter baumannii strain (strain ACICU) belonging to the European clone II group and carrying the plasmid-mediated bla(OXA-58) carbapenem resistance gene was determined. The A. baumannii ACICU genome was compared with the genomes...... of A. baumannii ATCC 17978 and Acinetobacter baylyi ADP1, with the aim of identifying novel genes related to virulence and drug resistance. A. baumannii ACICU has a single chromosome of 3,904,116 bp (which is predicted to contain 3,758 genes) and two plasmids, pACICU1 and pACICU2, of 28,279 and 64...... than in ATCC 17978 and ADP1 (76.2, 57.2, and 62.5 transporters per Mb of genome, respectively). An antibiotic resistance island, AbaR2, was identified in ACICU and had plausibly evolved by reductive evolution from the AbaR1 island previously described in multiresistant strain A. baumannii AYE. Moreover...

  13. Whole-genome sequence of a flatfish provides insights into ZW sex chromosome evolution and adaptation to a benthic lifestyle.

    Science.gov (United States)

    Chen, Songlin; Zhang, Guojie; Shao, Changwei; Huang, Quanfei; Liu, Geng; Zhang, Pei; Song, Wentao; An, Na; Chalopin, Domitille; Volff, Jean-Nicolas; Hong, Yunhan; Li, Qiye; Sha, Zhenxia; Zhou, Heling; Xie, Mingshu; Yu, Qiulin; Liu, Yang; Xiang, Hui; Wang, Na; Wu, Kui; Yang, Changgeng; Zhou, Qian; Liao, Xiaolin; Yang, Linfeng; Hu, Qiaomu; Zhang, Jilin; Meng, Liang; Jin, Lijun; Tian, Yongsheng; Lian, Jinmin; Yang, Jingfeng; Miao, Guidong; Liu, Shanshan; Liang, Zhuo; Yan, Fang; Li, Yangzhen; Sun, Bin; Zhang, Hong; Zhang, Jing; Zhu, Ying; Du, Min; Zhao, Yongwei; Schartl, Manfred; Tang, Qisheng; Wang, Jun

    2014-03-01

    Genetic sex determination by W and Z chromosomes has developed independently in different groups of organisms. To better understand the evolution of sex chromosomes and the plasticity of sex-determination mechanisms, we sequenced the whole genomes of a male (ZZ) and a female (ZW) half-smooth tongue sole (Cynoglossus semilaevis). In addition to insights into adaptation to a benthic lifestyle, we find that the sex chromosomes of these fish are derived from the same ancestral vertebrate protochromosome as the avian W and Z chromosomes. Notably, the same gene on the Z chromosome, dmrt1, which is the male-determining gene in birds, showed convergent evolution of features that are compatible with a similar function in tongue sole. Comparison of the relatively young tongue sole sex chromosomes with those of mammals and birds identified events that occurred during the early phase of sex-chromosome evolution. Pertinent to the current debate about heterogametic sex-chromosome decay, we find that massive gene loss occurred in the wake of sex-chromosome 'birth'. PMID:24487278

  14. High-resolution Whole-Genome Analysis of Skull Base Chordomas Implicates FHIT Loss in Chordoma Pathogenesis

    Science.gov (United States)

    Diaz, Roberto Jose; Guduk, Mustafa; Romagnuolo, Rocco; Smith, Christian A; Northcott, Paul; Shih, David; Berisha, Fitim; Flanagan, Adrienne; Munoz, David G; Cusimano, Michael D; Pamir, M Necmettin; Rutka, James T

    2012-01-01

    Chordoma is a rare tumor arising in the sacrum, clivus, or vertebrae. It is often not completely resectable and shows a high incidence of recurrence and progression with shortened patient survival and impaired quality of life. Chemotherapeutic options are limited to investigational therapies at present. Therefore, adjuvant therapy for control of tumor recurrence and progression is of great interest, especially in skull base lesions where complete tumor resection is often not possible because of the proximity of cranial nerves. To understand the extent of genetic instability and associated chromosomal and gene losses or gains in skull base chordoma, we undertook whole-genome single-nucleotide polymorphism microarray analysis of flash frozen surgical chordoma specimens, 21 from the clivus and 1 from C1 to C2 vertebrae. We confirm the presence of a deletion at 9p involving CDKN2A, CDKN2B, and MTAP but at a much lower rate (22%) than previously reported for sacral chordoma. At a similar frequency (21%), we found aneuploidy of chromosome 3. Tissue microarray immunohistochemistry demonstrated absent or reduced fragile histidine triad (FHIT) protein expression in 98% of sacral chordomas and 67% of skull base chordomas. Our data suggest that chromosome 3 aneuploidy and epigenetic regulation of FHIT contribute to loss of the FHIT tumor suppressor in chordoma. The finding that FHIT is lost in a majority of chordomas provides new insight into chordoma pathogenesis and points to a potential new therapeutic target for this challenging neoplasm. PMID:23019410

  15. High-resolution Whole-Genome Analysis of Skull Base Chordomas Implicates FHIT Loss in Chordoma Pathogenesis

    Directory of Open Access Journals (Sweden)

    Roberto Jose Diaz

    2012-09-01

    Chordoma is a rare tumor arising in the sacrum, clivus, or vertebrae. It is often not completely resectable and shows a high incidence of recurrence and progression with shortened patient survival and impaired quality of life. Chemotherapeutic options are limited to investigational therapies at present. Therefore, adjuvant therapy for control of tumor recurrence and progression is of great interest, especially in skull base lesions where complete tumor resection is often not possible because of the proximity of cranial nerves. To understand the extent of genetic instability and associated chromosomal and gene losses or gains in skull base chordoma, we undertook whole-genome single-nucleotide polymorphism microarray analysis of flash frozen surgical chordoma specimens, 21 from the clivus and 1 from C1 to C2 vertebrae. We confirm the presence of a deletion at 9p involving CDKN2A, CDKN2B, and MTAP but at a much lower rate (22%) than previously reported for sacral chordoma. At a similar frequency (21%), we found aneuploidy of chromosome 3. Tissue microarray immunohistochemistry demonstrated absent or reduced fragile histidine triad (FHIT) protein expression in 98% of sacral chordomas and 67% of skull base chordomas. Our data suggest that chromosome 3 aneuploidy and epigenetic regulation of FHIT contribute to loss of the FHIT tumor suppressor in chordoma. The finding that FHIT is lost in a majority of chordomas provides new insight into chordoma pathogenesis and points to a potential new therapeutic target for this challenging neoplasm.

  16. High-resolution whole-genome analysis of skull base chordomas implicates FHIT loss in chordoma pathogenesis.

    Science.gov (United States)

    Diaz, Roberto Jose; Guduk, Mustafa; Romagnuolo, Rocco; Smith, Christian A; Northcott, Paul; Shih, David; Berisha, Fitim; Flanagan, Adrienne; Munoz, David G; Cusimano, Michael D; Pamir, M Necmettin; Rutka, James T

    2012-09-01

    Chordoma is a rare tumor arising in the sacrum, clivus, or vertebrae. It is often not completely resectable and shows a high incidence of recurrence and progression with shortened patient survival and impaired quality of life. Chemotherapeutic options are limited to investigational therapies at present. Therefore, adjuvant therapy for control of tumor recurrence and progression is of great interest, especially in skull base lesions where complete tumor resection is often not possible because of the proximity of cranial nerves. To understand the extent of genetic instability and associated chromosomal and gene losses or gains in skull base chordoma, we undertook whole-genome single-nucleotide polymorphism microarray analysis of flash frozen surgical chordoma specimens, 21 from the clivus and 1 from C1 to C2 vertebrae. We confirm the presence of a deletion at 9p involving CDKN2A, CDKN2B, and MTAP but at a much lower rate (22%) than previously reported for sacral chordoma. At a similar frequency (21%), we found aneuploidy of chromosome 3. Tissue microarray immunohistochemistry demonstrated absent or reduced fragile histidine triad (FHIT) protein expression in 98% of sacral chordomas and 67% of skull base chordomas. Our data suggest that chromosome 3 aneuploidy and epigenetic regulation of FHIT contribute to loss of the FHIT tumor suppressor in chordoma. The finding that FHIT is lost in a majority of chordomas provides new insight into chordoma pathogenesis and points to a potential new therapeutic target for this challenging neoplasm. PMID:23019410

  17. Whole Genome Sequencing Identifies a Missense Mutation in HES7 Associated with Short Tails in Asian Domestic Cats.

    Science.gov (United States)

    Xu, Xiao; Sun, Xin; Hu, Xue-Song; Zhuang, Yan; Liu, Yue-Chen; Meng, Hao; Miao, Lin; Yu, He; Luo, Shu-Jin

    2016-01-01

    Domestic cats exhibit abundant variations in tail morphology and serve as an excellent model to study the development and evolution of vertebrate tails. Cats with shortened and kinked tails were first recorded in the Malayan archipelago by Charles Darwin in 1868 and remain quite common today in Southeast and East Asia. To elucidate the genetic basis of short tails in Asian cats, we built a pedigree of 13 cats segregating at the trait with a founder from southern China and performed linkage mapping based on whole genome sequencing data from the pedigree. The short-tailed trait was mapped to a 5.6 Mb region of Chr E1, within which the substitution c. 5T > C in the somite segmentation-related gene HES7 was identified as the causal mutation resulting in a missense change (p.V2A). Validation in 245 unrelated cats confirmed the correlation between HES7-c. 5T > C and Chinese short-tailed feral cats as well as the Japanese Bobtail breed, indicating a common genetic basis of the two. In addition, some of our sampled kinked-tailed cats could not be explained by either HES7 or the Manx-related T-box, suggesting at least three independent events in the evolution of domestic cats giving rise to short-tailed traits. PMID:27560986

  18. Whole Genome Re-Sequencing Identifies a Quantitative Trait Locus Repressing Carbon Reserve Accumulation during Optimal Growth in Chlamydomonas reinhardtii

    Science.gov (United States)

    Goold, Hugh Douglas; Nguyen, Hoa Mai; Kong, Fantao; Beyly-Adriano, Audrey; Légeret, Bertrand; Billon, Emmanuelle; Cuiné, Stéphan; Beisson, Fred; Peltier, Gilles; Li-Beisson, Yonghua

    2016-01-01

    Microalgae have emerged as a promising source for biofuel production. Massive oil and starch accumulation in microalgae is possible, but occurs mostly when biomass growth is impaired. The molecular networks underlying the negative correlation between growth and reserve formation are not known. Thus isolation of strains capable of accumulating carbon reserves during optimal growth would be highly desirable. To this end, we screened an insertional mutant library of Chlamydomonas reinhardtii for alterations in oil content. A mutant accumulating five times more oil and twice more starch than wild-type during optimal growth was isolated and named constitutive oil accumulator 1 (coa1). Growth in photobioreactors under highly controlled conditions revealed that the increase in oil and starch content in coa1 was dependent on light intensity. Genetic analysis and DNA hybridization pointed to a single insertional event responsible for the phenotype. Whole genome re-sequencing identified in coa1 a >200 kb deletion on chromosome 14 containing 41 genes. This study demonstrates that, 1), the generation of algal strains accumulating higher reserve amount without compromising biomass accumulation is feasible; 2), light is an important parameter in phenotypic analysis; and 3), a chromosomal region (Quantitative Trait Locus) acts as suppressor of carbon reserve accumulation during optimal growth. PMID:27141848

  19. Hyperlipidemia-associated gene variations and expression patterns revealed by whole-genome and transcriptome sequencing of rabbit models.

    Science.gov (United States)

    Wang, Zhen; Zhang, Jifeng; Li, Hong; Li, Junyi; Niimi, Manabu; Ding, Guohui; Chen, Haifeng; Xu, Jie; Zhang, Hongjiu; Xu, Ze; Dai, Yulin; Gui, Tuantuan; Li, Shengdi; Liu, Zhi; Wu, Sujuan; Cao, Mushui; Zhou, Lu; Lu, Xingyu; Wang, Junxia; Yang, Jing; Fu, Yunhe; Yang, Dongshan; Song, Jun; Zhu, Tianqing; Li, Shen; Ning, Bo; Wang, Ziyun; Koike, Tomonari; Shiomi, Masashi; Liu, Enqi; Chen, Luonan; Fan, Jianglin; Chen, Y Eugene; Li, Yixue

    2016-01-01

    The rabbit (Oryctolagus cuniculus) is an important experimental animal for studying human diseases, such as hypercholesterolemia and atherosclerosis. Despite this, genetic information and RNA expression profiling of laboratory rabbits are lacking. Here, we characterized the whole-genome variants of three breeds of the most popular experimental rabbits, New Zealand White (NZW), Japanese White (JW) and Watanabe heritable hyperlipidemic (WHHL) rabbits. Although the genetic diversity of WHHL rabbits was relatively low, they accumulated a large proportion of high-frequency deleterious mutations due to the small population size. Some of the deleterious mutations were associated with the pathophysiology of WHHL rabbits in addition to the LDLR deficiency. Furthermore, we conducted transcriptome sequencing of different organs of both WHHL and cholesterol-rich diet (Chol)-fed NZW rabbits. We found that gene expression profiles of the two rabbit models were essentially similar in the aorta, even though they exhibited different types of hypercholesterolemia. In contrast, Chol-fed rabbits, but not WHHL rabbits, exhibited pronounced inflammatory responses and abnormal lipid metabolism in the liver. These results provide valuable insights into identifying therapeutic targets of hypercholesterolemia and atherosclerosis with rabbit models. PMID:27245873

  20. Evolutionary Dynamics of Local Pandemic H1N1/2009 Influenza Virus Lineages Revealed by Whole-Genome Analysis

    Science.gov (United States)

    Baillie, Gregory J.; Galiano, Monica; Agapow, Paul-Michael; Myers, Richard; Chiam, Rachael; Gall, Astrid; Palser, Anne L.; Watson, Simon J.; Hedge, Jessica; Underwood, Anthony; Platt, Steven; McLean, Estelle; Pebody, Richard G.; Rambaut, Andrew; Green, Jonathan; Daniels, Rod; Pybus, Oliver G.; Zambon, Maria

    2012-01-01

    Virus gene sequencing and phylogenetics can be used to study the epidemiological dynamics of rapidly evolving viruses. With complete genome data, it becomes possible to identify and trace individual transmission chains of viruses such as influenza virus during the course of an epidemic. Here we sequenced 153 pandemic influenza H1N1/09 virus genomes from United Kingdom isolates from the first (127 isolates) and second (26 isolates) waves of the 2009 pandemic and used their sequences, dates of isolation, and geographical locations to infer the genetic epidemiology of the epidemic in the United Kingdom. We demonstrate that the epidemic in the United Kingdom was composed of many cocirculating lineages, among which at least 13 were exclusively or predominantly United Kingdom clusters. The estimated divergence times of two of the clusters predate the detection of pandemic H1N1/09 virus in the United Kingdom, suggesting that the pandemic H1N1/09 virus was already circulating in the United Kingdom before the first clinical case. Crucially, three clusters contain isolates from the second wave of infections in the United Kingdom, two of which represent chains of transmission that appear to have persisted within the United Kingdom between the first and second waves. This demonstrates that whole-genome analysis can track in fine detail the behavior of individual influenza virus lineages during the course of a single epidemic or pandemic. PMID:22013031

  1. Comparison of whole genome sequencing typing results and epidemiological contact information from outbreaks of Salmonella Dublin in Swedish cattle herds

    Science.gov (United States)

    Ågren, Estelle C. C.; Wahlström, Helene; Vesterlund-Carlson, Catrin; Lahti, Elina; Melin, Lennart; Söderlund, Robert

    2016-01-01

    Background: Whole genome sequencing (WGS) is becoming a routine tool for infectious disease outbreak investigations. The Swedish situation provides an excellent opportunity to test the usefulness of WGS for investigation of outbreaks with Salmonella Dublin (S. Dublin) as epidemiological investigations are always performed when Salmonella is detected in livestock production, and index isolates from all detected herds are stored and therefore available for analysis. This study was performed to evaluate WGS as a tool in forward and backward tracings from herds infected with S. Dublin. Material and methods: In this study, 28 isolates from 26 cattle herds were analysed and the WGS results were compared with results from the epidemiological investigations, for example, information on contacts between herds. The isolates originated from herds in three different outbreaks separated geographically and to some extent also in time, and from the only region in Sweden where S. Dublin is endemic (Öland). Results: The WGS results of isolates from the three non-endemic regions were reliably separated from each other and from the endemic isolates. Within the outbreaks, herds with known epidemiological contacts generally showed smaller differences between isolates as compared to when there were no known epidemiological contacts. Conclusion: The results indicate that WGS can provide valuable supplemental information in S. Dublin outbreak investigations. The resolution of the WGS was sufficient to distinguish isolates from the different outbreaks and provided additional information to the investigations within an outbreak. PMID:27396609

  2. Quantification of read species behavior within whole genome sequencing of cancer genomes for the stratification and visualization of genomic variation.

    Science.gov (United States)

    Hibsh, Dror; Buetow, Kenneth H; Yaari, Gur; Efroni, Sol

    2016-05-19

    The cancer genome is an abnormal genome, and the ability to monitor its sequence has undergone a technological revolution. Yet prognosis and diagnosis remain expert-based decisions, with only limited ability to provide machine-based decisions. We introduce a heterogeneity-based method for stratifying and visualizing whole-genome sequencing (WGS) reads. This method uses the heterogeneity within WGS reads to markedly reduce the dimensionality of next-generation sequencing data; it is available through the tool HiBS (Heterogeneity-Based Subclassification) that allows cancer sample classification. We validated HiBS using >200 WGS samples from nine different cancer types from The Cancer Genome Atlas (TCGA). With HiBS, we show progress with two WGS related issues: (i) differentiation between normal (NB) and tumor (TP) samples based solely on the information structure of their WGS data, and (ii) identification of specific regions of chromosomal amplification/deletion and their association with tumor stage. By comparing results to those obtained through available WGS analysis tools, we demonstrate some of the novelties obtained by the approach implemented in HiBS and also show nearly perfect normal/tumor classification, used to identify known and unknown chromosomal aberrations. Finally, the HiBS index has been associated with breast cancer tumor stage. PMID:26809676

  3. Reconstruction of thermotolerant yeast by one-point mutation identified through whole-genome analyses of adaptively-evolved strains.

    Science.gov (United States)

    Satomura, Atsushi; Miura, Natsuko; Kuroda, Kouichi; Ueda, Mitsuyoshi

    2016-01-01

    Saccharomyces cerevisiae is used as a host strain in bioproduction because of its rapid growth, ease of genetic manipulation, and high reducing capacity. However, the heat produced during the fermentation processes inhibits the biological activities and growth of the yeast cells. We performed whole-genome sequencing of 19 intermediate strains previously obtained during adaptation experiments under heat stress; 49 mutations were found in the adaptation steps. A phylogenetic tree revealed at least five events in which these strains had acquired mutations in the CDC25 gene. Reconstructed CDC25 point mutants based on a parental strain had acquired thermotolerance without any growth defects. These mutations led to the downregulation of the cAMP-dependent protein kinase (PKA) signaling pathway, which controls a variety of processes such as cell-cycle progression and stress tolerance. The one-point mutations in CDC25 were involved in the global transcriptional regulation through the cAMP/PKA pathway. Additionally, the mutations enabled efficient ethanol fermentation at 39 °C, suggesting that the one-point mutations in CDC25 may contribute to bioproduction. PMID:26984760

  4. Evolutionary dynamics of local pandemic H1N1/2009 influenza virus lineages revealed by whole-genome analysis.

    Science.gov (United States)

    Baillie, Gregory J; Galiano, Monica; Agapow, Paul-Michael; Myers, Richard; Chiam, Rachael; Gall, Astrid; Palser, Anne L; Watson, Simon J; Hedge, Jessica; Underwood, Anthony; Platt, Steven; McLean, Estelle; Pebody, Richard G; Rambaut, Andrew; Green, Jonathan; Daniels, Rod; Pybus, Oliver G; Kellam, Paul; Zambon, Maria

    2012-01-01

    Virus gene sequencing and phylogenetics can be used to study the epidemiological dynamics of rapidly evolving viruses. With complete genome data, it becomes possible to identify and trace individual transmission chains of viruses such as influenza virus during the course of an epidemic. Here we sequenced 153 pandemic influenza H1N1/09 virus genomes from United Kingdom isolates from the first (127 isolates) and second (26 isolates) waves of the 2009 pandemic and used their sequences, dates of isolation, and geographical locations to infer the genetic epidemiology of the epidemic in the United Kingdom. We demonstrate that the epidemic in the United Kingdom was composed of many cocirculating lineages, among which at least 13 were exclusively or predominantly United Kingdom clusters. The estimated divergence times of two of the clusters predate the detection of pandemic H1N1/09 virus in the United Kingdom, suggesting that the pandemic H1N1/09 virus was already circulating in the United Kingdom before the first clinical case. Crucially, three clusters contain isolates from the second wave of infections in the United Kingdom, two of which represent chains of transmission that appear to have persisted within the United Kingdom between the first and second waves. This demonstrates that whole-genome analysis can track in fine detail the behavior of individual influenza virus lineages during the course of a single epidemic or pandemic. PMID:22013031

  5. Giardia spp. Are Commonly Found in Mixed Assemblages in Surface Water, as Revealed by Molecular and Whole-Genome Characterization.

    Science.gov (United States)

    Prystajecky, Natalie; Tsui, Clement K-M; Hsiao, William W L; Uyaguari-Diaz, Miguel I; Ho, Jordan; Tang, Patrick; Isaac-Renton, Judith

    2015-07-01

    Giardia is the most common parasitic cause of gastrointestinal infections worldwide, with transmission through surface water playing an important role in various parts of the world. Giardia duodenalis (synonyms: G. intestinalis and G. lamblia), a multispecies complex, has two zoonotic subtypes, assemblages A and B. When British Columbia (BC), a western Canadian province, experienced several waterborne giardiasis outbreaks due to unfiltered surface drinking water in the late 1980s, collection of isolates from surface water, as well as from humans and beavers (Castor canadensis), throughout the province was carried out. To better understand Giardia in surface water, 71 isolates, including 29 from raw surface water samples, 29 from human giardiasis cases, and 13 from beavers in watersheds from this historical library were characterized by PCR. Study isolates also included isolates from waterborne giardiasis outbreaks. Both assemblages A and B were identified in surface water, human, and beaver samples, including a mixture of both assemblages A and B in waterborne outbreaks. PCR results were confirmed by whole-genome sequencing (WGS) for one waterborne outbreak and supported the clustering of human, water, and beaver isolates within both assemblages. We concluded that contamination of surface water by Giardia is complex, that the majority of our surface water isolates were assemblage B, and that both assemblages A and B may cause waterborne outbreaks. The higher-resolution data provided by WGS warrant further study to better understand the spread of Giardia. PMID:25956776

  6. Whole Genome Sequencing Identifies a Missense Mutation in HES7 Associated with Short Tails in Asian Domestic Cats

    Science.gov (United States)

    Xu, Xiao; Sun, Xin; Hu, Xue-Song; Zhuang, Yan; Liu, Yue-Chen; Meng, Hao; Miao, Lin; Yu, He; Luo, Shu-Jin

    2016-01-01

    Domestic cats exhibit abundant variations in tail morphology and serve as an excellent model to study the development and evolution of vertebrate tails. Cats with shortened and kinked tails were first recorded in the Malayan archipelago by Charles Darwin in 1868 and remain quite common today in Southeast and East Asia. To elucidate the genetic basis of short tails in Asian cats, we built a pedigree of 13 cats segregating for the trait with a founder from southern China and performed linkage mapping based on whole genome sequencing data from the pedigree. The short-tailed trait was mapped to a 5.6 Mb region of Chr E1, within which the substitution c.5T>C in the somite segmentation-related gene HES7 was identified as the causal mutation resulting in a missense change (p.V2A). Validation in 245 unrelated cats confirmed the correlation between HES7 c.5T>C and Chinese short-tailed feral cats as well as the Japanese Bobtail breed, indicating a common genetic basis of the two. In addition, some of our sampled kinked-tailed cats could not be explained by either HES7 or the Manx-related T-box, suggesting at least three independent events in the evolution of domestic cats giving rise to short-tailed traits. PMID:27560986

  7. Whole genome sequencing of field isolates reveals a common duplication of the Duffy binding protein gene in Malagasy Plasmodium vivax strains.

    Directory of Open Access Journals (Sweden)

    Didier Menard

    2013-11-01

    BACKGROUND: Plasmodium vivax is the most prevalent human malaria parasite, causing serious public health problems in malaria-endemic countries. Until recently the Duffy-negative blood group phenotype was considered to confer resistance to vivax malaria for most African ethnicities. We and others have reported that P. vivax strains in African countries from Madagascar to Mauritania display capacity to cause clinical vivax malaria in Duffy-negative people. New insights must now explain Duffy-independent P. vivax invasion of human erythrocytes. METHODS/PRINCIPAL FINDINGS: Through recent whole genome sequencing we obtained ≥ 70× coverage of the P. vivax genome from five field isolates, resulting in ≥ 93% of the Sal I reference sequenced at coverage greater than 20×. Combined with sequences from one additional Malagasy field isolate and from five monkey-adapted strains, we describe here identification of DNA sequence rearrangements in the P. vivax genome, including discovery of a duplication of the P. vivax Duffy binding protein (PvDBP) gene. A survey of Malagasy patients infected with P. vivax showed that the PvDBP duplication was present in numerous locations in Madagascar and found in over 50% of infected patients evaluated. Extended geographic surveys showed that the PvDBP duplication was detected frequently in vivax patients living in East Africa and in some residents of non-African P. vivax-endemic countries. Additionally, the PvDBP duplication was observed in travelers seeking treatment of vivax malaria upon returning home. PvDBP duplication prevalence was highest in west-central Madagascar sites where the highest frequencies of P. vivax-infected, Duffy-negative people were reported. CONCLUSIONS/SIGNIFICANCE: The highly conserved nature of the sequence involved in the PvDBP duplication suggests that it has occurred in a recent evolutionary time frame. These data suggest that PvDBP, a merozoite surface protein involved in red cell adhesion
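
    The coverage figures quoted in this record (mean depth, and the share of the reference covered above 20×) can be illustrated with a minimal Python sketch. The function names and the per-position depth list are hypothetical; they stand in for whatever a depth-reporting tool would produce and are not taken from the study.

```python
def mean_depth(depths):
    """Average read depth over all reference positions."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)


def breadth_of_coverage(depths, min_depth=20):
    """Fraction of reference positions with depth strictly greater than min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d > min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    # Made-up per-position depths for a ten-base toy reference.
    toy_depths = [0, 5, 21, 70, 85, 90, 20, 100, 75, 66]
    print(mean_depth(toy_depths), breadth_of_coverage(toy_depths))
```

    The strict inequality mirrors the record's wording ("coverage greater than 20×"); a pipeline that reports "at least 20×" would use `>=` instead.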

  8. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology, are readily amenable to string matching and pattern recognition algorithms. The growing need to analyse whole genome sequences more efficiently and thoroughly has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures that are well known in many other areas and are highly suited to sequence analysis too. Here we report an improvement to the design of construction of suffix arrays. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for the coverage of the possible peptide space, which indicates that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. Besides, distinct patterns begin to emerge for the counts of particular tetra and higher peptides, indicative of a ‘meaning’ for tetra and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv, have been found to contain a repeating sequence of 300 amino acids.
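
    To make the suffix-array idea concrete, here is a minimal Python sketch of n-gram counting and repeat finding on a toy sequence. It uses a naive sort-based construction, not the improved construction reported in this record, and all function names are hypothetical.

```python
from collections import Counter


def build_suffix_array(text):
    """Start positions of all suffixes of `text`, in lexicographic order.

    Naive construction by sorting suffixes; adequate for a short demo only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_ngrams(text, n):
    """Count n-grams by walking the suffix array (equal n-grams sit next to each other)."""
    counts = Counter()
    for start in build_suffix_array(text):
        gram = text[start:start + n]
        if len(gram) == n:  # skip suffixes shorter than n
            counts[gram] += 1
    return counts


def longest_repeat(text):
    """Longest substring occurring at least twice (occurrences may overlap)."""
    sa = build_suffix_array(text)
    best = ""
    for a, b in zip(sa, sa[1:]):
        # Longest common prefix of two neighbouring suffixes.
        k = 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        if k > len(best):
            best = text[a:a + k]
    return best


if __name__ == "__main__":
    seq = "ACGTACGTAC"  # toy sequence, not real data
    print(count_ngrams(seq, 2))
    print(longest_repeat(seq))
```

    Because sorted suffixes place identical prefixes side by side, both the n-gram counts and the repeats fall out of a single pass over neighbouring entries; a whole-genome implementation would replace the naive sort with a linear or near-linear construction.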

  9. Identification of Ohnolog Genes Originating from Whole Genome Duplication in Early Vertebrates, Based on Synteny Comparison across Multiple Genomes.

    Science.gov (United States)

    Singh, Param Priya; Arora, Jatin; Isambert, Hervé

    2015-07-01

    Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGDs, that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also coined 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understand the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined, significance criteria at http://ohnologs.curie.fr/. In the light of the importance of WGD on the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights on vertebrate evolution and genetic diseases. PMID:26181593

  10. Arapan-S: a fast and highly accurate whole-genome assembly software for viruses and small genomes

    Directory of Open Access Journals (Sweden)

    Sahli Mohammed

    2012-05-01

    Background: Genome assembly is considered to be a challenging problem in computational biology, and has been studied extensively by many researchers. It is extremely difficult to build a general assembler that is able to reconstruct the original sequence instead of many contigs. However, we believe that creating specific assemblers, for solving specific cases, will be much more fruitful than creating general assemblers. Findings: In this paper, we present Arapan-S, a whole-genome assembly program dedicated to handling small genomes. It provides only one contig (along with the reverse complement of this contig) in many cases. Although genomes consist of a number of segments, the implemented algorithm can detect all the segments, as we demonstrate for Influenza Virus A. The Arapan-S program is based on the de Bruijn graph. We have implemented a very sophisticated and fast method to reconstruct the original sequence and neglect erroneous k-mers. The method explores the graph by using neither the shortest nor the longest path, but rather a specific and reliable path based on the coverage level or k-mers’ lengths. Arapan-S uses short reads, and it was tested on raw data downloaded from the NCBI Trace Archive. Conclusions: Our findings show that the accuracy of the assembly was very high; the result was checked against the European Bioinformatics Institute (EBI) database using the NCBI BLAST Sequence Similarity Search. The identity and the genome coverage were more than 99%. We also compared the efficiency of Arapan-S with other well-known assemblers. In dealing with small genomes, the accuracy of Arapan-S is significantly higher than the accuracy of other assemblers. The assembly process is very fast and requires only a few seconds. Arapan-S is available for free to the public. The binary files for Arapan-S are available through http://sourceforge.net/projects/dnascissor/files/.
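
    The general technique named in this record, a de Bruijn graph with low-coverage k-mers discarded and a coverage-guided walk, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (error-free overlaps, a known start node, no reverse complements), not the Arapan-S algorithm; every name below is hypothetical.

```python
from collections import Counter, defaultdict


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads (k-mer coverage)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_graph(counts, min_coverage):
    """De Bruijn graph: nodes are (k-1)-mers, edges are k-mers kept above a coverage floor."""
    graph = defaultdict(list)
    for kmer, cov in counts.items():
        if cov >= min_coverage:  # drop likely erroneous k-mers
            graph[kmer[:-1]].append((kmer[1:], cov))
    return graph


def greedy_contig(graph, start):
    """Walk from `start`, always taking the unused outgoing edge with the highest coverage."""
    contig, node, used = start, start, set()
    while graph.get(node):
        candidates = [(cov, nxt) for nxt, cov in graph[node] if (node, nxt) not in used]
        if not candidates:
            break
        _, nxt = max(candidates)
        used.add((node, nxt))
        contig += nxt[-1]
        node = nxt
    return contig


if __name__ == "__main__":
    # Toy reads; the last one carries a single-base error.
    reads = ["ATGGCGT", "GGCGTGCA", "ATGGCGT", "GGCGTGCA", "ATGGAGT"]
    graph = build_graph(kmer_counts(reads, 4), min_coverage=2)
    print(greedy_contig(graph, "ATG"))
```

    The coverage floor removes the k-mers introduced by the erroneous read before the walk begins, which is one simple way to "neglect erroneous k-mers"; real assemblers add tip and bubble removal on top of this.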

  11. Regulatory divergence of homeologous Atlantic salmon elovl5 genes following the salmonid-specific whole-genome duplication.

    Science.gov (United States)

    Carmona-Antoñanzas, Greta; Zheng, Xiaozhong; Tocher, Douglas R; Leaver, Michael J

    2016-10-10

    Fatty acyl elongase 5 (elovl5) is a critical enzyme in the vertebrate biosynthetic pathway which produces the physiologically essential long-chain polyunsaturated fatty acids (LC-PUFA), docosahexaenoic acid (DHA), and eicosapentaenoic acid (EPA) from 18-carbon fatty acid precursors. In contrast to most other vertebrates, Atlantic salmon possess two copies of elovl5 (elovl5a and elovl5b) as a result of a whole-genome duplication (WGD) which occurred at the base of the salmonid lineage. WGDs have had a major influence on vertebrate evolution, providing extra genetic material, enabling neofunctionalization to accelerate adaptation and speciation. However, little is known about the mechanisms by which such duplicated homeologous genes diverge. Here we show that homeologous Atlantic salmon elovl5a and elovl5b genes have been asymmetrically colonised by transposon-like elements. Identical locations and identities of insertions are also present in the rainbow trout duplicate elovl5 genes, but not in the nearest extant representative preduplicated teleost, the northern pike. Both elovl5 salmon duplicates possessed conserved regulatory elements that promoted Srebp1- and Srebp2-dependent transcription, and differences in the magnitude of Srebp response between promoters could be attributed to a tandem duplication of SRE and NF-Y cofactor binding sites in elovl5b. Furthermore, an insertion in the promoter region of elovl5a confers responsiveness to Lxr/Rxr transcriptional activation. Our results indicate that most, but not all, transposon mobilisation into elovl5 genes occurred after the split from the common ancestor of pike and salmon, but before more recent salmonid speciations, and that divergence of elovl5 regulatory regions has enabled neofunctionalization by promoting differential expression of these homeologous genes. PMID:27374149

  12. Genetic characterization of 2006-2008 isolates of Chikungunya virus from Kerala, South India, by whole genome sequence analysis.

    Science.gov (United States)

    Sreekumar, E; Issac, Aneesh; Nair, Sajith; Hariharan, Ramkumar; Janki, M B; Arathy, D S; Regu, R; Mathew, Thomas; Anoop, M; Niyas, K P; Pillai, M R

    2010-02-01

    Chikungunya virus (CHIKV), a positive-stranded alphavirus, causes epidemic febrile infections characterized by severe and prolonged arthralgia. In the present study, six CHIKV isolates (2006 RGCB03, RGCB05; 2007 RGCB80, RGCB120; 2008 RGCB355, RGCB356) from three consecutive Chikungunya outbreaks in Kerala, South India, were analyzed for genetic variations by sequencing the 11798 bp whole genome of the virus. A total of 37 novel mutations were identified and they were predominant in the 2007 and 2008 isolates among the six isolates studied. The previously identified E1 A226V critical mutation, which enhances mosquito adaptability, was present in the 2007 and 2008 samples. An important observation was the presence of two coding region substitutions, leading to nsP2 L539S and E2 K252Q changes. These were identified in three isolates (2007 RGCB80 and RGCB120; 2008 RGCB355) by full-genome analysis, and also in 13 of the 31 additional samples (42%), obtained from various parts of the state, by sequencing the corresponding genomic regions. These mutations showed 100% co-occurrence in all these samples. In phylogenetic analysis, formation of a new genetic clade by these isolates within the East, Central and South African (ECSA) genotypes was observed. Homology modeling followed by mapping revealed that at least 20 of the identified mutations fall into functionally significant domains of the viral proteins and are predicted to affect protein structure. Eighteen of the identified mutations in structural proteins, including the E2 K252Q change, are predicted to disrupt T-cell epitope immunogenicity. Our study reveals that CHIKV strains with novel genetic changes were present in the severe Chikungunya outbreaks in 2007 and 2008 in South India. PMID:19851853

  13. A whole-genome massively parallel sequencing analysis of BRCA1 mutant oestrogen receptor negative and positive breast cancers

    Science.gov (United States)

    Weigelt, Britta; Wilkerson, Paul M; Manie, Elodie; Grigoriadis, Anita; A’Hern, Roger; van der Groep, Petra; Kozarewa, Iwanka; Popova, Tatiana; Mariani, Odette; Turaljic, Samra; Furney, Simon J; Marais, Richard; Rodruigues, Daniel-Nava; Flora, Adriana C; Wai, Patty; Pawar, Vidya; McDade, Simon; Carroll, Jason; Stoppa-Lyonnet, Dominique; Green, Andrew R; Ellis, Ian O; Swanton, Charles; van Diest, Paul; Delattre, Olivier; Lord, Christopher J; Foulkes, William D; Vincent-Salomon, Anne; Ashworth, Alan; Stern, Marc Henri; Reis-Filho, Jorge S

    2016-01-01

    BRCA1 encodes a tumour suppressor protein that plays pivotal roles in homologous recombination (HR) DNA repair, cell-cycle checkpoints, and transcriptional regulation. BRCA1 germline mutations confer a high risk of early-onset breast and ovarian cancer. In >80% of cases, tumours arising in BRCA1 germline mutation carriers are oestrogen receptor (ER)-negative; however, up to 15% are ER-positive. It has been suggested that BRCA1 ER-positive breast cancers constitute sporadic cancers arising in the context of a BRCA1 germline mutation rather than being causally related to BRCA1 loss-of-function. Whole-genome massively parallel sequencing of ER-positive and ER-negative BRCA1 breast cancers, and their respective germline DNAs, was used to characterise the genetic landscape of BRCA1 cancers at base-pair resolution. Only BRCA1 germline mutations and somatic loss of the wild-type allele, and TP53 somatic mutations were recurrently found in the index cases. BRCA1 breast cancers displayed a mutational signature consistent with that caused by lack of HR DNA repair in both ER-positive and ER-negative cases. Sequencing analysis of independent cohorts of hereditary BRCA1 and sporadic non-BRCA1 breast cancers for the presence of recurrent pathogenic mutations and/or homozygous deletions found in the index cases revealed that DAPK3, TMEM135, KIAA1797, PDE4D and GATA4 are potential additional drivers of breast cancers. This study demonstrates that BRCA1 pathogenic germline mutations coupled with somatic loss of the wild-type allele are not sufficient for hereditary breast cancers to display an ER-negative phenotype, and has led to the identification of three potential novel breast cancer genes (i.e. DAPK3, TMEM135 and GATA4). PMID:22362584

  14. Whole-Genome Transcriptional Analysis of Chemolithoautotrophic Thiosulfate Oxidation by Thiobacillus denitrificans Under Aerobic vs. Denitrifying Conditions

    Energy Technology Data Exchange (ETDEWEB)

    Beller, H R; Letain, T E; Chakicherla, A; Kane, S R; Legler, T C; Coleman, M A

    2006-04-22

    Thiobacillus denitrificans is one of the few known obligate chemolithoautotrophic bacteria capable of energetically coupling thiosulfate oxidation to denitrification as well as aerobic respiration. As very little is known about the differential expression of genes associated with key chemolithoautotrophic functions (such as sulfur-compound oxidation and CO2 fixation) under aerobic versus denitrifying conditions, we conducted whole-genome, cDNA microarray studies to explore this topic systematically. The microarrays identified 277 genes (approximately ten percent of the genome) as differentially expressed using Robust Multi-array Average statistical analysis and a 2-fold cutoff. Genes upregulated (ca. 6- to 150-fold) under aerobic conditions included a cluster of genes associated with iron acquisition (e.g., siderophore-related genes), a cluster of cytochrome cbb3 oxidase genes, cbbL and cbbS (encoding the large and small subunits of form I ribulose 1,5-bisphosphate carboxylase/oxygenase, or RubisCO), and multiple molecular chaperone genes. Genes upregulated (ca. 4- to 95-fold) under denitrifying conditions included nar, nir, and nor genes (associated respectively with nitrate reductase, nitrite reductase, and nitric oxide reductase, which catalyze successive steps of denitrification), cbbM (encoding form II RubisCO), and genes involved with sulfur-compound oxidation (including two physically separated but highly similar copies of sulfide:quinone oxidoreductase and of dsrC, associated with dissimilatory sulfite reductase). Among genes associated with denitrification, relative expression levels (i.e., degree of upregulation with nitrate) tended to decrease in the order nar > nir > nor > nos. Reverse transcription, quantitative PCR analysis was used to validate these trends.
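
    A 2-fold cutoff of the kind mentioned in this record is commonly applied to log2-scale expression values, where a 2-fold change corresponds to a difference of 1. The Python sketch below assumes such log2-scale summaries as input; the function name, gene labels, and numbers are invented for illustration and are not results from the study.

```python
import math


def fold_change_calls(aerobic, denitrifying, cutoff=2.0):
    """Label genes whose expression differs by at least `cutoff`-fold between conditions.

    Both inputs map gene name -> log2-scale expression value (an assumption of this sketch).
    """
    threshold = math.log2(cutoff)
    calls = {}
    for gene in aerobic.keys() & denitrifying.keys():
        diff = aerobic[gene] - denitrifying[gene]  # log2 ratio, aerobic over denitrifying
        if diff >= threshold:
            calls[gene] = "up_aerobic"
        elif diff <= -threshold:
            calls[gene] = "up_denitrifying"
    return calls


if __name__ == "__main__":
    # Hypothetical log2 expression values for three placeholder genes.
    aerobic = {"geneA": 10.0, "geneB": 7.0, "geneC": 8.2}
    denitrifying = {"geneA": 7.0, "geneB": 9.5, "geneC": 8.0}
    print(fold_change_calls(aerobic, denitrifying))
```

    Genes below the cutoff in both directions are simply left out of the result, so the size of the returned dictionary is the count of differentially expressed genes under this rule.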

  15. Whole-Genome Identification, Phylogeny, and Evolution of the Cytochrome P450 Family 2 (CYP2) Subfamilies in Birds.

    Science.gov (United States)

    Almeida, Daniela; Maldonado, Emanuel; Khan, Imran; Silva, Liliana; Gilbert, M Thomas P; Zhang, Guojie; Jarvis, Erich D; O'Brien, Stephen J; Johnson, Warren E; Antunes, Agostinho

    2016-01-01

    The cytochrome P450 (CYP) superfamily defends organisms from endogenous and noxious environmental compounds, and thus is crucial for survival. However, beyond mammals the molecular evolution of CYP2 subfamilies is poorly understood. Here, we characterized the CYP2 family across 48 avian whole genomes representing all major extant bird clades. Overall, 12 CYP2 subfamilies were identified, including the first description of the CYP2F, CYP2G, and several CYP2AF genes in avian genomes. Some of the CYP2 genes previously described as being lineage-specific, such as CYP2K and CYP2W, are ubiquitous to all avian groups. Furthermore, we identified a large number of CYP2J copies, which have been associated previously with water reabsorption. We detected positive selection in the avian CYP2C, CYP2D, CYP2H, CYP2J, CYP2K, and CYP2AC subfamilies. Moreover, we identified new substrate recognition sites (SRS0, SRS2_SRS3, and SRS3.1) and heme binding areas of functional importance, which influence CYP2 structure and function, as being under significant positive selection. Some of the positively selected sites in avian CYP2D are located within the same SRS1 region that was previously linked with the metabolism of plant toxins. Additionally, we find that selective constraint variations in some avian CYP2 subfamilies are consistently associated with different feeding habits (CYP2H and CYP2J), habitats (CYP2D, CYP2H, CYP2J, and CYP2K), and migratory behaviors (CYP2D, CYP2H, and CYP2J). Overall, our findings indicate that there has been active enzyme site selection on CYP2 subfamilies and differential selection associated with different life history traits among birds. PMID:26979796

  16. Whole Genome Sequencing of Mycobacterium africanum Strains from Mali Provides Insights into the Mechanisms of Geographic Restriction

    Science.gov (United States)

    Maiga, Mamoudou; Abeel, Thomas; Shea, Terrance; Desjardins, Christopher A.; Diarra, Bassirou; Baya, Bocar; Sanogo, Moumine; Diallo, Souleymane; Earl, Ashlee M.; Bishai, William R.

    2016-01-01

    Background: Mycobacterium africanum, made up of lineages 5 and 6 within the Mycobacterium tuberculosis complex (MTC), causes up to half of all tuberculosis cases in West Africa, but is rarely found outside of this region. The reasons for this geographical restriction remain unknown. Possible reasons include a geographically restricted animal reservoir, a unique preference for hosts of West African ethnicity, and an inability to compete with other lineages outside of West Africa. These latter two hypotheses could be caused by loss of fitness or altered interactions with the host immune system. Methodology/Principal Findings: We sequenced 92 MTC clinical isolates from Mali, including two lineage 5 and 24 lineage 6 strains. Our genome sequencing assembly, alignment, phylogeny and average nucleotide identity analyses enabled us to identify features that typify lineages 5 and 6 and made clear that these lineages do not constitute a distinct species within the MTC. We found that in Mali, lineage 6 and lineage 4 strains have similar levels of diversity and evolve drug resistance through similar mechanisms. In the process, we identified a putative novel streptomycin resistance mutation. In addition, we found evidence of person-to-person transmission of lineage 6 isolates and showed that lineage 6 is not enriched for mutations in virulence-associated genes. Conclusions: This is the largest collection of lineage 5 and 6 whole genome sequences to date, and our assembly and alignment data provide valuable insights into what distinguishes these lineages from other MTC lineages. Lineages 5 and 6 do not appear to be geographically restricted due to an inability to transmit between West African hosts or to an elevated number of mutations in virulence-associated genes. However, lineage-specific mutations, such as mutations in cell wall structure, secretion systems and cofactor biosynthesis, provide alternative mechanisms that may lead to host specificity. PMID:26751217

  17. Whole-Genome Sequencing Analysis of Serially Isolated Multi-Drug and Extensively Drug Resistant Mycobacterium tuberculosis from Thai Patients.

    Science.gov (United States)

    Faksri, Kiatichai; Tan, Jun Hao; Disratthakit, Areeya; Xia, Eryu; Prammananan, Therdsak; Suriyaphol, Prapat; Khor, Chiea Chuen; Teo, Yik-Ying; Ong, Rick Twee-Hee; Chaiprasert, Angkana

    2016-01-01

    Multi-drug and extensively drug-resistant tuberculosis (MDR and XDR-TB) are problems that threaten public health worldwide. Only some genetic markers associated with drug-resistant TB are known. Whole-genome sequencing (WGS) is a promising tool for distinguishing between re-infection and persistent infection in isolates taken at different times from a single patient, but has not yet been applied in MDR and XDR-TB. We aim to detect genetic markers associated with drug resistance and distinguish between reinfection and persistent infection from MDR and XDR-TB patients based on WGS analysis. Samples of Mycobacterium tuberculosis (n = 7), serially isolated from 2 MDR cases and 1 XDR-TB case, were retrieved from Siriraj Hospital, Bangkok. The WGS analysis used an Illumina MiSeq sequencer. In cases of persistent infection, MDR-TB isolates differed at an average of 2 SNPs across the span of 2-9 months whereas in the case of reinfection, isolates differed at 61 SNPs across 2 years. Known genetic markers associated with resistance were detected from strains susceptible to streptomycin (2/7 isolates), p-aminosalicylic acid (3/7 isolates) and fluoroquinolone drugs. Among fluoroquinolone drugs, ofloxacin had the highest phenotype-genotype concordance (6/7 isolates), whereas gatifloxacin had the lowest (3/7 isolates). A putative candidate SNP in Rv2477c associated with kanamycin and amikacin resistance was suggested for further validation. WGS provided comprehensive results regarding molecular epidemiology, distinguishing between persistent infection and reinfection in M/XDR-TB and potentially can be used for detection of novel mutations associated with drug resistance. PMID:27518818

  18. Novel degenerate PCR method for whole genome amplification applied to Peru Margin (ODP Leg 201) subsurface samples

    Directory of Open Access Journals (Sweden)

    Amanda Martino

    2012-01-01

    Full Text Available A degenerate PCR-based method of whole-genome amplification, designed to work fluidly with 454 sequencing technology, was developed and tested for use on deep marine subsurface DNA samples. The method, which we have called Random Amplification Metagenomic PCR (RAMP), involves the use of specific primers from Roche 454 amplicon sequencing, modified by the addition of a degenerate region at the 3' end. It utilizes a PCR reaction, which resulted in no amplification from blanks, even after 50 cycles of PCR. After efforts to optimize experimental conditions, the method was tested with DNA extracted from cultured E. coli cells, and genome coverage was estimated after sequencing on three different occasions. Coverage did not vary greatly with the different experimental conditions tested, and was around 62% with a sequencing effort equivalent to a theoretical genome coverage of 14.10X. The GC content of the sequenced amplification product was within 2% of the predicted values for this strain of E. coli. The method was also applied to DNA extracted from marine subsurface samples from ODP Leg 201 site 1229 (Peru Margin), and results of a taxonomic analysis revealed microbial communities dominated by Proteobacteria, Chloroflexi, Firmicutes, Euryarchaeota, and Crenarchaeota, among others. These results were similar to those obtained previously for those samples; however, variations in the proportions of taxa show that community analysis can be sensitive to both the amplification technique used and the method of assigning sequences to taxonomic groups. Overall, we find that RAMP represents a valid methodology for amplifying metagenomes from low biomass samples.

  19. Whole genome sequence of Klebsiella pneumoniae U25, a hypermucoviscous, multidrug resistant, biofilm producing isolate from India

    Directory of Open Access Journals (Sweden)

    Zumaana Rafiq

    2016-02-01

    Full Text Available Klebsiella pneumoniae U25 is a multidrug resistant strain isolated from a tertiary care hospital in Chennai, India. Here, we report the complete annotated genome sequence of strain U25 obtained using PacBio RSII. This is the first report of the whole genome of K. pneumoniae species from Chennai. It consists of a single circular chromosome of size 5,491,870 bp and two plasmids of size 211,813 and 172,619 bp. The genes associated with multidrug resistance were identified. The chromosome of U25 was found to have eight antibiotic resistance genes [blaOXA-1, blaSHV-28, aac(6')-Ib-cr, catB3, oqxAB, dfrA1]. The plasmid pMGRU25-001 was found to have only one resistance gene (catA1) while plasmid pMGRU25-002 had 20 resistance genes [strAB, aadA1, aac(6')-Ib, aac(3)-IId, sul1,2, blaTEM-1A,1B, blaOXA-9, blaCTX-M-15, blaSHV-11, cmlA1, erm(B), mph(A)]. A mutation in the porin OmpK36 was identified which is likely to be associated with the intermediate resistance to carbapenems in the absence of carbapenemase genes. U25 is one of the few K. pneumoniae strains to harbour clustered regularly interspaced short palindromic repeats (CRISPR) systems. Two CRISPR arrays corresponding to Cas3 family helicase were identified in the genome. When compared to K. pneumoniae NTUH-K2044, a transposase gene InsH of IS5-13 was found inserted.

  20. Fast homozygosity mapping and identification of a zebrafish ENU-induced mutation by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Marianne L Voz

    Full Text Available Forward genetics using zebrafish is a powerful tool for studying vertebrate development through large-scale mutagenesis. Nonetheless, the identification of the molecular lesion is still laborious and involves time-consuming genetic mapping. Here, we show that high-throughput sequencing of the whole zebrafish genome can directly locate the interval carrying the causative mutation and at the same time pinpoint the molecular lesion. The feasibility of this approach was validated by sequencing the m1045 mutant line that displays a severe hypoplasia of the exocrine pancreas. We generated 13 Gb of sequence, equivalent to an eightfold genomic coverage, from a pool of 50 mutant embryos obtained from a map-cross between the AB mutant carrier and the WIK polymorphic strain. The chromosomal region carrying the causal mutation was localized based on its unique property to display high levels of homozygosity among sequence reads as it derives exclusively from the initial AB mutated allele. We developed an algorithm identifying such a region by calculating a homozygosity score along all chromosomes. This highlighted an 8-Mb window on chromosome 5 with a score close to 1 in the m1045 mutants. The sequence analysis of all genes within this interval revealed a nonsense mutation in the snapc4 gene. Knockdown experiments confirmed the assertion that snapc4 is the gene whose mutation leads to exocrine pancreas hypoplasia. In conclusion, this study constitutes a proof-of-concept that whole-genome sequencing is a fast and effective alternative to the classical positional cloning strategies in zebrafish.
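    The windowed homozygosity scan described in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' scoring formula, so the score below (the fraction of known polymorphic sites in a window whose pooled reads all support a single allele) is an assumption, as are the function name `homozygosity_scores`, the `(position, ref_reads, alt_reads)` input layout, and the `min_depth` filter.

```python
from typing import List, Tuple


def homozygosity_scores(
    sites: List[Tuple[int, int, int]],  # (position, ref_reads, alt_reads) at known polymorphic sites
    window: int,
    step: int,
    min_depth: int = 4,  # hypothetical depth filter, not taken from the abstract
) -> List[Tuple[int, float]]:
    """Return (window_start, score) pairs along one chromosome.

    The score is the fraction of sufficiently covered sites in the window
    where every pooled read supports the same allele. A linked region in a
    mutant pool is expected to score close to 1.
    """
    if not sites:
        return []
    sites = sorted(sites)
    last_pos = sites[-1][0]
    scores = []
    start = 0
    while start <= last_pos:
        end = start + window
        homozygous = total = 0
        for pos, ref, alt in sites:
            if start <= pos < end and ref + alt >= min_depth:
                total += 1
                if ref == 0 or alt == 0:
                    homozygous += 1
        if total:  # skip windows with no usable sites
            scores.append((start, homozygous / total))
        start += step
    return scores


# Toy data: the second window contains only single-allele sites.
toy_sites = [(10, 5, 5), (20, 0, 8), (110, 0, 9), (120, 7, 0)]
print(homozygosity_scores(toy_sites, window=100, step=100))
```

    In practice the window and step would be on the megabase scale, and the candidate interval would be the run of windows with scores near 1.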

  1. Implementation of exon arrays: alternative splicing during T-cell proliferation as determined by whole genome analysis

    Directory of Open Access Journals (Sweden)

    Whistler Toni

    2010-09-01

    Full Text Available Abstract Background The contribution of alternative splicing and isoform expression to cellular response is emerging as an area of considerable interest, and the newly developed exon arrays allow for systematic study of these processes. We use this pilot study to report on the feasibility of exon array implementation looking to replace the 3' in vitro transcription expression arrays in our laboratory. One of the most widely studied models of cellular response is T-cell activation from exogenous stimulation. Microarray studies have contributed to our understanding of key pathways activated during T-cell stimulation. We use this system to examine whole genome transcription and alternate exon usage events that are regulated during lymphocyte proliferation in an attempt to evaluate the exon arrays. Results Peripheral blood mononuclear cells from healthy donors were activated using phytohemagglutinin, IL2 and ionomycin and harvested at 5 points over a 7 day period. Flow cytometry measured cell cycle events and the Affymetrix exon array platform was used to identify the gene expression and alternate exon usage changes. Gene expression changes were noted in a total of 2105 transcripts, and alternate exon usage identified in 472 transcript clusters. There was an overlap of 263 transcripts which showed both differential expression and alternate exon usage over time. Gene ontology enrichment analysis showed a broader range of biological changes in biological processes for the differentially expressed genes, which include cell cycle, cell division, cell proliferation, chromosome segregation, cell death, component organization and biogenesis and metabolic process ontologies. The alternate exon usage ontological enrichments are in metabolism and component organization and biogenesis. We focus on alternate exon usage changes in the transcripts of the spliceosome complex. The real-time PCR validation rates were 86% for transcript expression and 71% for

  2. Whole-Genome Identification, Phylogeny, and Evolution of the Cytochrome P450 Family 2 (CYP2) Subfamilies in Birds

    Science.gov (United States)

    Almeida, Daniela; Maldonado, Emanuel; Khan, Imran; Silva, Liliana; Gilbert, M. Thomas P.; Zhang, Guojie; Jarvis, Erich D.; O’Brien, Stephen J.; Johnson, Warren E.; Antunes, Agostinho

    2016-01-01

    The cytochrome P450 (CYP) superfamily defends organisms from endogenous and noxious environmental compounds, and thus is crucial for survival. However, beyond mammals the molecular evolution of CYP2 subfamilies is poorly understood. Here, we characterized the CYP2 family across 48 avian whole genomes representing all major extant bird clades. Overall, 12 CYP2 subfamilies were identified, including the first description of the CYP2F, CYP2G, and several CYP2AF genes in avian genomes. Some of the CYP2 genes previously described as being lineage-specific, such as CYP2K and CYP2W, are ubiquitous to all avian groups. Furthermore, we identified a large number of CYP2J copies, which have been associated previously with water reabsorption. We detected positive selection in the avian CYP2C, CYP2D, CYP2H, CYP2J, CYP2K, and CYP2AC subfamilies. Moreover, we identified new substrate recognition sites (SRS0, SRS2_SRS3, and SRS3.1) and heme binding areas that influence CYP2 structure and function of functional importance as under significant positive selection. Some of the positively selected sites in avian CYP2D are located within the same SRS1 region that was previously linked with the metabolism of plant toxins. Additionally, we find that selective constraint variations in some avian CYP2 subfamilies are consistently associated with different feeding habits (CYP2H and CYP2J), habitats (CYP2D, CYP2H, CYP2J, and CYP2K), and migratory behaviors (CYP2D, CYP2H, and CYP2J). Overall, our findings indicate that there has been active enzyme site selection on CYP2 subfamilies and differential selection associated with different life history traits among birds. PMID:26979796

  3. Whole genome sequencing of Mycobacterium tuberculosis reveals slow growth and low mutation rates during latent infections in humans.

    Directory of Open Access Journals (Sweden)

    Roberto Colangeli

    Full Text Available Very little is known about the growth and mutation rates of Mycobacterium tuberculosis during latent infection in humans. However, studies in rhesus macaques have suggested that latent infections have mutation rates that are higher than that observed during active tuberculosis disease. Elevated mutation rates are presumed risk factors for the development of drug resistance. Therefore, the investigation of mutation rates during human latency is of high importance. We performed whole genome mutation analysis of M. tuberculosis isolates from a multi-decade tuberculosis outbreak of the New Zealand Rangipo strain. We used epidemiological and phylogenetic analysis to identify four cases of tuberculosis acquired from the same index case. Two of the tuberculosis cases occurred within two years of exposure and were classified as recently transmitted tuberculosis. Two other cases occurred more than 20 years after exposure and were classified as reactivation of latent M. tuberculosis infections. Mutation rates were compared between the two recently transmitted pairs versus the two latent pairs. Mean mutation rates assuming 20 hour generation times were 5.5 × 10^-10 mutations/bp/generation for recently transmitted tuberculosis and 7.3 × 10^-11 mutations/bp/generation for latent tuberculosis. Generation time versus mutation rate curves were also significantly higher for recently transmitted tuberculosis across all replication rates (p = 0.006). Assuming identical replication and mutation rates among all isolates in the final two years before disease reactivation, the μ20hr mutation rate attributable to the remaining latent period was 1.6 × 10^-11 mutations/bp/generation, or approximately 30 fold less than that calculated during the two years immediately before disease. Mutations attributable to oxidative stress as might be caused by bacterial exposure to the host immune system were not increased in latent infections. In conclusion, we did not find any

  4. Whole genome sequencing of Mycobacterium tuberculosis reveals slow growth and low mutation rates during latent infections in humans.

    Science.gov (United States)

    Colangeli, Roberto; Arcus, Vic L; Cursons, Ray T; Ruthe, Ali; Karalus, Noel; Coley, Kathy; Manning, Shannon D; Kim, Soyeon; Marchiano, Emily; Alland, David

    2014-01-01

    Very little is known about the growth and mutation rates of Mycobacterium tuberculosis during latent infection in humans. However, studies in rhesus macaques have suggested that latent infections have mutation rates that are higher than that observed during active tuberculosis disease. Elevated mutation rates are presumed risk factors for the development of drug resistance. Therefore, the investigation of mutation rates during human latency is of high importance. We performed whole genome mutation analysis of M. tuberculosis isolates from a multi-decade tuberculosis outbreak of the New Zealand Rangipo strain. We used epidemiological and phylogenetic analysis to identify four cases of tuberculosis acquired from the same index case. Two of the tuberculosis cases occurred within two years of exposure and were classified as recently transmitted tuberculosis. Two other cases occurred more than 20 years after exposure and were classified as reactivation of latent M. tuberculosis infections. Mutation rates were compared between the two recently transmitted pairs versus the two latent pairs. Mean mutation rates assuming 20 hour generation times were 5.5 × 10^-10 mutations/bp/generation for recently transmitted tuberculosis and 7.3 × 10^-11 mutations/bp/generation for latent tuberculosis. Generation time versus mutation rate curves were also significantly higher for recently transmitted tuberculosis across all replication rates (p = 0.006). Assuming identical replication and mutation rates among all isolates in the final two years before disease reactivation, the μ20hr mutation rate attributable to the remaining latent period was 1.6 × 10^-11 mutations/bp/generation, or approximately 30 fold less than that calculated during the two years immediately before disease. Mutations attributable to oxidative stress as might be caused by bacterial exposure to the host immune system were not increased in latent infections. In conclusion, we did not find any evidence to suggest
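    The unit "mutations/bp/generation" used in this abstract implies a simple conversion from an observed SNP count to a per-generation rate. The sketch below shows that arithmetic only; it is not the authors' statistical procedure. The function name `mutation_rate` and the example inputs (2 SNPs over 2 years) are hypothetical, and the genome size of roughly 4.4 Mb for M. tuberculosis is supplied here as an assumption rather than taken from the abstract.

```python
def mutation_rate(snps: int, genome_bp: int, years: float,
                  generation_hours: float = 20.0) -> float:
    """Mutations per base pair per generation.

    generations = elapsed time / assumed generation time
    rate        = SNPs / (genome size * generations)
    """
    if snps < 0 or genome_bp <= 0 or years <= 0 or generation_hours <= 0:
        raise ValueError("inputs must be positive (snps may be zero)")
    elapsed_hours = years * 365.25 * 24
    generations = elapsed_hours / generation_hours
    return snps / (genome_bp * generations)


# Hypothetical example: 2 SNPs accumulated over 2 years in a ~4.4 Mb genome,
# assuming a 20 hour generation time.
print(f"{mutation_rate(2, 4_400_000, 2.0):.2e} mutations/bp/generation")
```

    Because the number of generations scales inversely with the assumed generation time, the same SNP count yields a higher per-generation rate when a longer generation time is assumed, which is why the abstract reports rates conditional on a 20 hour generation time.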

  5. Next-Gen phylogeography of rainforest trees: exploring landscape-level cpDNA variation from whole-genome sequencing.

    Science.gov (United States)

    van der Merwe, M; McPherson, H; Siow, J; Rossetto, M

    2014-01-01

    Standardized phylogeographic studies across codistributed taxa can identify important refugia and biogeographic barriers, and potentially uncover how changes in adaptive constraints through space and time impact on the distribution of genetic diversity. The combination of next-generation sequencing and methodologies that enable uncomplicated analysis of the full chloroplast genome may provide an invaluable resource for such studies. Here, we assess the potential of a shotgun-based method across twelve nonmodel rainforest trees sampled from two evolutionarily distinct regions. Whole genomic shotgun sequencing libraries consisting of pooled individuals were used to assemble species-specific chloroplast references (in silico). For each species, the pooled libraries allowed for the detection of variation within and between data sets (each representing a geographic region). The potential use of nuclear rDNA as an additional marker from the NGS libraries was investigated by mapping reads against available references. We successfully obtained phylogeographically informative sequence data from a range of previously unstudied rainforest trees. Greater levels of diversity were found in northern refugial rainforests than in southern expansion areas. The genetic signatures of varying evolutionary histories were detected, and interesting associative patterns between functional characteristics and genetic diversity were identified. This approach can suit a wide range of landscape-level studies. As the key laboratory-based steps do not require prior species-specific knowledge and can be easily outsourced, the techniques described here are even suitable for researchers without access to wet-laboratory facilities, making evolutionary ecology questions increasingly accessible to the research community. PMID:24119022

  6. Evaluation of an Optimal Epidemiological Typing Scheme for Legionella pneumophila with Whole-Genome Sequence Data Using Validation Guidelines.

    Science.gov (United States)

    David, Sophia; Mentasti, Massimo; Tewolde, Rediat; Aslett, Martin; Harris, Simon R; Afshar, Baharak; Underwood, Anthony; Fry, Norman K; Parkhill, Julian; Harrison, Timothy G

    2016-08-01

    Sequence-based typing (SBT), analogous to multilocus sequence typing (MLST), is the current "gold standard" typing method for investigation of legionellosis outbreaks caused by Legionella pneumophila. However, as common sequence types (STs) cause many infections, some investigations remain unresolved. In this study, various whole-genome sequencing (WGS)-based methods were evaluated according to published guidelines, including (i) a single nucleotide polymorphism (SNP)-based method, (ii) extended MLST using different numbers of genes, (iii) determination of gene presence or absence, and (iv) a kmer-based method. L. pneumophila serogroup 1 isolates (n = 106) from the standard "typing panel," previously used by the European Society for Clinical Microbiology Study Group on Legionella Infections (ESGLI), were tested together with another 229 isolates. Over 98% of isolates were considered typeable using the SNP- and kmer-based methods. Percentages of isolates with complete extended MLST profiles ranged from 99.1% (50 genes) to 86.8% (1,455 genes), while only 41.5% produced a full profile with the gene presence/absence scheme. Replicates demonstrated that all methods offer 100% reproducibility. Indices of discrimination range from 0.972 (ribosomal MLST) to 0.999 (SNP based), and all values were higher than that achieved with SBT (0.940). Epidemiological concordance is generally inversely related to discriminatory power. We propose that an extended MLST scheme with ∼50 genes provides optimal epidemiological concordance while substantially improving the discrimination offered by SBT and can be used as part of a hierarchical typing scheme that should maintain backwards compatibility and increase discrimination where necessary. This analysis will be useful for the ESGLI to design a scheme that has the potential to become the new gold standard typing method for L. pneumophila. PMID:27280420
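    The abstract reports indices of discrimination without stating the formula. A common choice in typing-scheme evaluation is Simpson's index of diversity in the Hunter-Gaston form, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of isolates and n_j the number assigned to type j. The sketch below assumes that formula; the function name `discrimination_index` and the toy type labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def discrimination_index(types: Iterable[Hashable]) -> float:
    """Simpson's index of diversity (Hunter-Gaston form).

    Interpreted as the probability that two isolates drawn at random,
    without replacement, are assigned to different types.
    """
    labels = list(types)
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Toy example: four isolates, two of which share a type.
print(round(discrimination_index(["ST1", "ST1", "ST23", "ST47"]), 3))
```

    A scheme that places every isolate in its own type scores 1, and one that cannot separate any isolates scores 0, which is why the finer-grained SNP-based method sits at the top of the reported range.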

  7. Detection of genetic variation affecting milk coagulation properties in Danish Holstein dairy cattle by analyses of pooled whole-genome sequences from phenotypically extreme samples (pool-seq)

    DEFF Research Database (Denmark)

    Bertelsen, H P; Gregersen, V R; Poulsen, Nina Aagaard;

    2016-01-01

    differences between pooled whole-genome sequences of phenotypically extreme samples (pool-seq). Curd-firming rate and raw milk pH were measured for 415 Danish Holstein cows, and each animal was sequenced at low coverage. Pools were created containing whole genome sequence reads from samples with "extreme...... located on chromosome 6. A total of 9 significant SNP, which were selected based on the possible function of proximal candidate genes, were genotyped in the entire sample set (n = 415) to test for an association. The most significant SNP was located proximal to , explaining 33% of the phenotypic variance....... , coding for κ-casein, is the most studied in relation to milk coagulation due to its position on the surface of the casein micelles and the direct involvement in milk coagulation. Three additional SNP located on chromosome 6 showed significant associations explaining 7, 3.6, and 1.3% of the phenotypic...

  8. Whole Genome Sequences of Three Treponema pallidum ssp. pertenue Strains: Yaws and Syphilis Treponemes Differ in Less than 0.2% of the Genome Sequence

    OpenAIRE

    Čejková, Darina; Zobaníková, Marie; Chen, Lei; Pospíšilová, Petra; Strouhal, Michal; Qin, Xiang; Mikalová, Lenka; Norris, Steven J.; Muzny, Donna M; Gibbs, Richard A.; Fulton, Lucinda L.; Sodergren, Erica; Weinstock, George M.; Šmajs, David

    2012-01-01

    Background The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents. Methodology/Principal Findings To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strain...

  9. Whole genome and transcriptome analyses of environmental antibiotic sensitive and multi-resistant Pseudomonas aeruginosa isolates exposed to waste water and tap water

    OpenAIRE

    Schwartz, Thomas; Armant, Olivier; Bretschneider, Nancy; Hahn, Alexander; Kirchen, Silke; Seifert, Martin; Dötsch, Andreas

    2014-01-01

    The fitness of sensitive and resistant Pseudomonas aeruginosa in different aquatic environments depends on genetic capacities and transcriptional regulation. Therefore, an antibiotic-sensitive isolate PA30 and a multi-resistant isolate PA49 originating from waste waters were compared via whole genome and transcriptome Illumina sequencing after exposure to municipal waste water and tap water. A number of different genomic islands (e.g. PAGIs, PAPIs) were identified in the two environmental is...

  10. Whole-genome profiling and shotgun sequencing delivers an anchored, gene-decorated, physical map assembly of bread wheat chromosome 6A

    OpenAIRE

    Poursarebani, N.; Nussbaumer, T.; Šimková, H. (Hana); Šafář, J.; Witsenboer, H.; van Oeveren, J.; Doležel, J. (Jaroslav); Mayer, K. F. X.; N. Stein; Schnurbusch, T.

    2014-01-01

    Bread wheat (Triticum aestivum L.) is the most important staple food crop for 35% of the world's population. International efforts are underway to facilitate an increase in wheat production, of which the International Wheat Genome Sequencing Consortium (IWGSC) plays an important role. As part of this effort, we have developed a sequence-based physical map of wheat chromosome 6A using whole-genome profiling (WGP (TM)). The bacterial artificial chromosome (BAC) contig assembly tools FINGERPRINT...

  11. Defining and Evaluating a Core Genome Multilocus Sequence Typing Scheme for Whole-Genome Sequence-Based Typing of Listeria monocytogenes

    OpenAIRE

    Ruppitsch, Werner; Pietzka, Ariane; Prior, Karola; Bletz, Stefan; Fernandez, Haizpea Lasa; Allerberger, Franz; Harmsen, Dag; Mellmann, Alexander

    2015-01-01

    Whole-genome sequencing (WGS) has emerged today as an ultimate typing tool to characterize Listeria monocytogenes outbreaks. However, data analysis and interlaboratory comparability of WGS data are still challenging for most public health laboratories. Therefore, we have developed and evaluated a new L. monocytogenes typing scheme based on genome-wide gene-by-gene comparisons (core genome multilocus sequence typing [cgMLST]) to allow for a unique typing nomenclature. Initially, we determi...

  12. Comparison of Whole Genome Amplification Methods for Analysis of DNA Extracted from Microdissected Early Breast Lesions in Formalin-Fixed Paraffin-Embedded Tissue

    OpenAIRE

    Nona Arneson; Juan Moreno; Vladimir Iakovlev; Arezou Ghazani; Keisha Warren; David McCready; Igor Jurisica; Done, Susan J.

    2012-01-01

    To understand cancer progression, it is desirable to study the earliest stages of its development, which are often microscopic lesions. Array comparative genomic hybridization (aCGH) is a valuable high-throughput molecular approach for discovering DNA copy number changes; however, it requires a relatively large amount of DNA, which is difficult to obtain from microdissected lesions. Whole genome amplification (WGA) methods were developed to increase DNA quantity; however their reproducibility...

  13. Whole genome sequencing for deciphering the resistome of Chryseobacterium indologenes, an emerging multidrug-resistant bacterium isolated from a cystic fibrosis patient in Marseille, France

    OpenAIRE

    T. Cimmino; J.-M. Rolain

    2016-01-01

    We decipher the resistome of Chryseobacterium indologenes MARS15, an emerging multidrug-resistant clinical strain, using the whole genome sequencing strategy. The bacterium was isolated from the sputum of a hospitalized patient with cystic fibrosis in the Timone Hospital in Marseille, France. Genome sequencing was done with Illumina MiSeq using a paired-end strategy. The in silico analysis was done by RAST, the resistome by the ARG-ANNOT database and detection of polyketide synthase (PKS) by ...

  14. Performance Evaluation of NIPT in Detection of Chromosomal Copy Number Variants Using Low-Coverage Whole-Genome Sequencing of Plasma DNA.

    Directory of Open Access Journals (Sweden)

    Hongtai Liu

    Full Text Available The aim of this study was to assess the performance of noninvasive prenatal testing (NIPT) for fetal copy number variants (CNVs) in clinical samples, using a whole-genome sequencing method. A total of 919 archived maternal plasma samples with karyotyping/microarray results, including 33 CNV samples and 886 normal samples collected from September 1, 2011 to May 31, 2013, were enrolled in this study. The samples were randomly rearranged and blindly sequenced by low-coverage (about 7M reads) whole-genome sequencing of plasma DNA. Fetal CNVs were detected by Fetal Copy-number Analysis through Maternal Plasma Sequencing (FCAPS) and compared to the karyotyping/microarray results. Sensitivity and specificity were evaluated. Thirty-three samples with deletions/duplications ranging from 1 to 129 Mb were detected, with CNV size and location consistent with the karyotyping/microarray results. Ten false positive results and two false negative results were obtained. The sensitivity and specificity for detecting deletions/duplications were 84.21% and 98.42%, respectively. Whole-genome sequencing-based NIPT has high performance in detecting genome-wide CNVs, in particular >10 Mb CNVs, using the current FCAPS algorithm. It is possible to implement the current method in NIPT to screen prenatally for fetal CNVs.
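    For reference, sensitivity and specificity as reported in this abstract are ratios taken from a confusion matrix. The sketch below shows the standard definitions only. The abstract does not give the full matrix or state whether counts are per sample or per CNV, so the counts in the example are hypothetical and are not the study's data; the function name `sensitivity_specificity` is likewise an assumption.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN)   fraction of true positives that are detected
    specificity = TN / (TN + FP)   fraction of true negatives that are called negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for illustration only (not taken from the study).
sens, spec = sensitivity_specificity(tp=3, fp=1, tn=9, fn=1)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```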

  15. Array-based approaches to bacterial transcriptome analysis

    OpenAIRE

    Mäder, Ulrike; Nicolas, Pierre

    2012-01-01

    Microarray technology has been extensively used to compare or quantify genome-wide mRNA levels, a key factor in the adaptive response of bacteria to the environment. Classical gene expression arrays based on an existing genome annotation with relatively few probes for each gene, are well suited to assess the expression levels of all annotated transcripts under many different conditions. Newer genomic tiling arrays that cover both strands of a genome by overlapping probes and, more recently, R...

  16. Multiwalled Carbon Nanotubes for Amperometric Array-Based Biosensors

    OpenAIRE

    Taurino, Irene; De Micheli, Giovanni; Carrara, Sandro

    2012-01-01

    For diagnostic and therapeutic purposes an accurate determination of multiple metabolites is often required. Amperometric devices are attractive tools to quantify biological compounds due to the direct conversion of a biochemical event to a current. This review addresses recent developments in the use of multiwalled carbon nanotubes to enhance detection capability of amperometric array-based biosensors. More specifically, the principal techniques for multiwalled carbon nanotube incorporatio...

  17. Linear Microbolometric Array Based on VOx Thin Film

    Science.gov (United States)

    Chen, Xi-Qu

    2010-05-01

    In this paper, a linear microbolometric array based on VOx thin film is proposed. The linear microbolometric array is fabricated using micromachining technology, and its thermo-sensitive VOx thin film has an excellent infrared response spectrum and TCR characteristics. An experimental prototype of a monolithic linear microbolometric array, integrated with a CMOS circuit, is designed and fabricated. The test results show that the responsivity of the linear array can approach 18 kV/W and that the array has potential for infrared imaging systems.

  18. Whole genome sequencing identifies a deletion in protein phosphatase 2A that affects its stability and localization in Chlamydomonas reinhardtii.

    Directory of Open Access Journals (Sweden)

    Huawen Lin

    Full Text Available Whole genome sequencing is a powerful tool in the discovery of single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) among mutant strains, which simplifies forward genetics approaches. However, identification of the causative mutation among a large number of non-causative SNPs in a mutant strain remains a big challenge. In the unicellular biflagellate green alga Chlamydomonas reinhardtii, we generated a SNP/indel library that contains over 2 million polymorphisms from four wild-type strains, one highly polymorphic strain that is frequently used in meiotic mapping, ten mutant strains that have flagellar assembly or motility defects, and one mutant strain, imp3, which has a mating defect. A comparison of polymorphisms in the imp3 strain and the other 15 strains allowed us to identify a deletion of the last three amino acids, Y313F314L315, in a protein phosphatase 2A catalytic subunit (PP2A3) in the imp3 strain. Introduction of a wild-type HA-tagged PP2A3 rescues the mutant phenotype, but mutant HA-PP2A3 at Y313 or L315 fails to rescue. Our immunoprecipitation results indicate that the Y313, L315, or YFLΔ mutations do not affect the binding of PP2A3 to the scaffold subunit, PP2A-2r. In contrast, the Y313, L315, or YFLΔ mutations affect both the stability and the localization of PP2A3. The PP2A3 protein is less abundant in these mutants and fails to accumulate in the basal body area as observed in transformants with either wild-type HA-PP2A3 or a HA-PP2A3 with a V310T change. The accumulation of HA-PP2A3 in the basal body region disappears in mated dikaryons, which suggests that the localization of PP2A3 may be essential to the mating process. Overall, our results demonstrate that the terminal YFL tail of PP2A3 is important in the regulation of Chlamydomonas mating.

  19. Genesis of the vertebrate FoxP subfamily member genes occurred during two ancestral whole genome duplication events.

    Science.gov (United States)

    Song, Xiaowei; Tang, Yezhong; Wang, Yajun

    2016-08-22

    The vertebrate FoxP subfamily genes play important roles in the construction of essential functional modules involved in physiological and developmental processes. To explore the adaptive evolution of functional modules associated with the FoxP subfamily member genes, it is necessary to study the gene duplication process. We detected four member genes of the FoxP subfamily in sea lampreys (a representative species of jawless vertebrates) through genome screenings and phylogenetic analyses. Reliable paralogons (i.e. paralogous chromosome segments) have rarely been detected in scaffolds of FoxP subfamily member genes in sea lampreys due to the considerable existence of HTH_Tnp_Tc3_2 transposases. However, these transposases did not alter gene numbers of the FoxP subfamily in sea lampreys. The coincidence between the "1-4" gene duplication pattern of FoxP subfamily genes from invertebrates to vertebrates and two rounds of ancestral whole genome duplication (1R- and 2R-WGD) events reveals that the FoxP subfamily of vertebrates was quadruplicated in the 1R- and 2R-WGD events. Furthermore, we deduced that a synchronous gene duplication process occurred for the FoxP subfamily and for three linked gene families/subfamilies (i.e. MIT family, mGluR group III and PLXNA subfamily) in the 1R- and 2R-WGD events using phylogenetic analyses and mirror-dendrogram methods (i.e. algorithms to test protein-protein interactions). Specifically, the ancestor of FoxP1 and FoxP3 and the ancestor of FoxP2 and FoxP4 were generated in the 1R-WGD event. In the subsequent 2R-WGD event, these two ancestral genes were changed into FoxP1, FoxP2, FoxP3 and FoxP4. The elucidation of these gene duplication processes sheds light on the phylogenetic relationships between functional modules of the FoxP subfamily member genes. PMID:27188254

  20. Whole genome sequencing versus traditional genotyping for investigation of a Mycobacterium tuberculosis outbreak: a longitudinal molecular epidemiological study.

    Directory of Open Access Journals (Sweden)

    Andreas Roetzer

    Full Text Available BACKGROUND: Understanding Mycobacterium tuberculosis (Mtb) transmission is essential to guide efficient tuberculosis control strategies. Traditional strain typing lacks sufficient discriminatory power to resolve large outbreaks. Here, we tested the potential of using next generation genome sequencing for identification of outbreak-related transmission chains. METHODS AND FINDINGS: During long-term (1997 to 2010) prospective population-based molecular epidemiological surveillance comprising a total of 2,301 patients, we identified a large outbreak caused by an Mtb strain of the Haarlem lineage. The main performance outcome measure of whole genome sequencing (WGS) analyses was the degree of correlation of the WGS analyses with contact tracing data and the spatio-temporal distribution of the outbreak cases. WGS analyses of the 86 isolates revealed 85 single nucleotide polymorphisms (SNPs), subdividing the outbreak into seven genome clusters (two to 24 isolates each), plus 36 unique SNP profiles. WGS results showed that the first outbreak isolates detected in 1997 were falsely clustered by classical genotyping. In 1998, one clone (termed "Hamburg clone") started expanding, apparently independently from differences in the social environment of early cases. Genome-based clustering patterns were in better accordance with contact tracing data and the geographical distribution of the cases than clustering patterns based on classical genotyping. A maximum of three SNPs were identified in eight confirmed human-to-human transmission chains, involving 31 patients. We estimated the Mtb genome evolutionary rate at 0.4 mutations per genome per year. This rate suggests that Mtb grows in its natural host with a doubling time of approximately 22 h (400 generations per year).
Based on the genome variation discovered, emergence of the Hamburg clone was dated back to a period between 1993 and 1997, hence shortly before the discovery of the outbreak through epidemiological
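
The arithmetic linking the figures quoted in this record (a 22 h doubling time, about 400 generations per year, and 0.4 mutations per genome per year) can be checked with a short sketch. The function names are hypothetical and a 365-day year is assumed; this only illustrates the unit conversion and is not the authors' estimation method.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year


def generations_per_year(doubling_time_hours):
    """Number of doublings that fit into one year at the given doubling time."""
    return HOURS_PER_YEAR / doubling_time_hours


def mutations_per_generation(mutations_per_year, doubling_time_hours):
    """Per-genome mutation rate per generation implied by a yearly rate."""
    return mutations_per_year / generations_per_year(doubling_time_hours)


gens = generations_per_year(22)              # roughly 400, as quoted in the abstract
per_gen = mutations_per_generation(0.4, 22)  # roughly 0.001 mutations per genome per generation
print(round(gens), round(per_gen, 4))
```

Under these assumptions a 22 h doubling time gives just under 400 generations per year, which is consistent with the rounded value in the abstract.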

  1. Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori sigma(80) family of promoter signals.

    Science.gov (United States)

    Vanet, A; Marsan, L; Labigne, A; Sagot, M F

    2000-03-24

    Helicobacter pylori is adapted to life in a unique niche, the gastric epithelium of primates. Its promoters may therefore be different from those of other bacteria. Here, we determine motifs possibly involved in the recognition of such promoter sequences by the RNA polymerase using a new motif identification method. An important feature of this method is that the motifs are sought with the least possible assumptions about what they may look like. The method starts by considering the whole genome of H. pylori and attempts to infer directly from it a description for a family of promoters. Thus, this approach differs from searching for such promoters with a previously established description. The two algorithms are based on the idea of inferring motifs by flexibly comparing words in the sequences with an external object, instead of between themselves. The first algorithm infers single motifs, the second a combination of two motifs separated from one another by strictly defined, sterically constrained distances. Besides independently finding motifs known to be present in other bacteria, such as the Shine-Dalgarno sequence and the TATA-box, this approach suggests the existence in H. pylori of a new, combined motif, TTAAGC, followed optimally 21 bp downstream by TATAAT. Between these two motifs, there is in some cases another, TTTTAA or, less frequently, a repetition of TTAAGC separated optimally from the TATA-box by 12 bp. The combined motif TTAAGCx(21+/-2)TATAAT is present with no errors immediately upstream from the only two copies of the ribosomal 23 S-5 S RNA genes in H. pylori, and with one error upstream from the only two copies of the ribosomal 16 S RNA genes. The operons of both ribosomal RNA molecules are strongly expressed, representing an encouraging sign of the pertinence of the motifs found by the algorithms. 
In 25 cases out of a possible 30, the combined motif is found with no more than three substitutions immediately upstream from ribosomal proteins, or
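
As an illustration of what the combined motif TTAAGCx(21+/-2)TATAAT describes, the sketch below scans a sequence for exact occurrences using a regular expression. It is not the inference algorithm from the record, which discovers motifs without a prior description; it only searches for a motif that is already known. It assumes that "x(21+/-2)" means a spacer of 19 to 23 arbitrary bases between the two boxes, allows no substitutions, and uses a hypothetical function name.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
# Assumption: the spacer is 19-23 bases (21 +/- 2) of any nucleotide.
COMBINED_MOTIF = re.compile(r"(?=(TTAAGC[ACGT]{19,23}TATAAT))")


def find_combined_motif(sequence):
    """Return (start, matched_text) for each exact occurrence of the motif."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in COMBINED_MOTIF.finditer(seq)]


# Toy sequence (made up for illustration): a 21-base spacer between the boxes.
toy = "GG" + "TTAAGC" + "A" * 21 + "TATAAT" + "CC"
for start, hit in find_combined_motif(toy):
    print(start, len(hit))
```

Tolerating one to three substitutions, as the abstract does for some promoters, would need an approximate-matching approach rather than a plain regular expression.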

  2. Whole genome sequencing-based characterization of extensively drug resistant (XDR) strains of Mycobacterium tuberculosis from Pakistan

    KAUST Repository

    Hasan, Zahra

    2015-03-01

    Objectives: The global increase in drug resistance in Mycobacterium tuberculosis (MTB) strains increases the focus on improved molecular diagnostics for MTB. Extensively drug-resistant (XDR) - TB is caused by MTB strains resistant to rifampicin, isoniazid, fluoroquinolone and aminoglycoside antibiotics. Resistance to anti-tuberculous drugs has been associated with single nucleotide polymorphisms (SNPs) in particular MTB genes. However, there is regional variation between MTB lineages and the SNPs associated with resistance. Therefore, there is a need to identify common resistance conferring SNPs so that effective molecular-based diagnostic tests for MTB can be developed. This study used whole genome sequencing (WGS) to characterize 37 XDR MTB isolates from Pakistan and investigated SNPs related to drug resistance. Methods: XDR-TB strains were selected. DNA was extracted from MTB strains, and samples underwent WGS with 76-base-paired end fragment sizes using Illumina paired end HiSeq2000 technology. Raw sequence data were mapped uniquely to H37Rv reference genome. The mappings allowed SNPs and small indels to be called using SAMtools/BCFtools. Results: This study found that in all XDR strains, rifampicin resistance was attributable to SNPs in the rpoB RDR region. Isoniazid resistance-associated mutations were primarily related to katG codon 315 followed by inhA S94A. Fluoroquinolone resistance was attributable to gyrA 91-94 codons in most strains, while one did not have SNPs in either gyrA or gyrB. Aminoglycoside resistance was mostly associated with SNPs in rrs, except in 6 strains. Ethambutol resistant strains had embB codon 306 mutations, but many strains did not have this present. The SNPs were compared with those present in commercial assays such as LiPA Hain MDRTBsl, and the sensitivity of the assays for these strains was evaluated. Conclusions: If common drug resistance associated with SNPs evaluated the concordance between phenotypic and

  3. Increased Proportion of Variance Explained and Prediction Accuracy of Survival of Breast Cancer Patients with Use of Whole-Genome Multiomic Profiles.

    Science.gov (United States)

    Vazquez, Ana I; Veturi, Yogasudha; Behring, Michael; Shrestha, Sadeep; Kirst, Matias; Resende, Marcio F R; de Los Campos, Gustavo

    2016-07-01

    Whole-genome multiomic profiles hold valuable information for the analysis and prediction of disease risk and progression. However, integrating high-dimensional multilayer omic data into risk-assessment models is statistically and computationally challenging. We describe a statistical framework, the Bayesian generalized additive model (BGAM), and present software for integrating multilayer high-dimensional inputs into risk-assessment models. We used BGAM and data from The Cancer Genome Atlas for the analysis and prediction of survival after diagnosis of breast cancer. We developed a sequence of studies to (1) compare predictions based on single omics with those based on clinical covariates commonly used for the assessment of breast cancer patients (COV), (2) evaluate the benefits of combining COV and omics, (3) compare models based on (a) COV and gene expression profiles from oncogenes with (b) COV and whole-genome gene expression (WGGE) profiles, and (4) evaluate the impacts of combining multiple omics and their interactions. We report that (1) WGGE profiles and whole-genome methylation (METH) profiles offer more predictive power than any of the COV commonly used in clinical practice (e.g., subtype and stage), (2) adding WGGE or METH profiles to COV increases prediction accuracy, (3) the predictive power of WGGE profiles is considerably higher than that based on expression from large-effect oncogenes, and (4) the gain in prediction accuracy when combining multiple omics is consistent. Our results show the feasibility of omic integration and highlight the importance of WGGE and METH profiles in breast cancer, achieving gains of up to 7 points area under the curve (AUC) over the COV in some cases. PMID:27129736

  4. Whole-Genome Sequencing of Measles Virus Genotypes H1 and D8 During Outbreaks of Infection Following the 2010 Olympic Winter Games Reveals Viral Transmission Routes.

    Science.gov (United States)

    Gardy, Jennifer L; Naus, Monika; Amlani, Ashraf; Chung, Walter; Kim, Hochan; Tan, Malcolm; Severini, Alberto; Krajden, Mel; Puddicombe, David; Sahni, Vanita; Hayden, Althea S; Gustafson, Reka; Henry, Bonnie; Tang, Patrick

    2015-11-15

    We used whole-genome sequencing to investigate a dual-genotype outbreak of measles occurring after the XXI Olympic Winter Games in Vancouver, Canada. By sequencing 27 complete genomes from H1 and D8 genotype measles viruses isolated from outbreak cases, we estimated the virus mutation rate, determined that person-to-person transmission is typically associated with 0 mutations between isolates, and established that a single introduction of H1 virus led to the expansion of the outbreak beyond Vancouver. This is the largest measles genomics project to date, revealing novel aspects of measles virus genetics and providing new insights into transmission of this reemerging viral pathogen. PMID:26153409

  5. Whole genome sequencing as a tool to investigate a cluster of seven cases of listeriosis in Austria and Germany, 2011–2013

    OpenAIRE

    Schmid, D.; Allerberger, F.; Huhulescu, S; Pietzka, A; Amar, C.; Kleta, S; Prager, R; Preußel, K; Aichinger, E.; Mellmann, A.; Raoult, D.

    2014-01-01

    A cluster of seven human cases of listeriosis occurred in Austria and in Germany between April 2011 and July 2013. The Listeria monocytogenes serovar (SV) 1/2b isolates shared pulsed-field gel electrophoresis (PFGE) and fluorescent amplified fragment length polymorphism (fAFLP) patterns indistinguishable from those from five food producers. The seven human isolates, a control strain with a different PFGE/fAFLP profile and ten food isolates were subjected to whole genome sequencing (WGS) in a ...

  6. Analyses of Twelve New Whole Genome Sequences of Cassava Brown Streak Viruses and Ugandan Cassava Brown Streak Viruses from East Africa: Diversity, Supercomputing and Evidence for Further Speciation

    OpenAIRE

    Ndunguru, Joseph; Sseruwagi, Peter; Tairo, Fred; Stomeo, Francesca; Maina, Solomon; Djinkeng, Appolinaire; Kehoe, Monica; Boykin, Laura M

    2015-01-01

    Cassava brown streak disease is caused by two devastating viruses, Cassava brown streak virus (CBSV) and Ugandan cassava brown streak virus (UCBSV) which are frequently found infecting cassava, one of sub-Saharan Africa’s most important staple food crops. Each year these viruses cause losses of up to $100 million USD and can leave entire families without their primary food source, for an entire year. Twelve new whole genomes, including seven of CBSV and five of UCBSV were uncovered in this re...

  7. Development of Genome-wide Simple Sequence Repeat Markers Using Whole-genome Shotgun Sequences of Sorghum (Sorghum bicolor (L.) Moench)

    OpenAIRE

    Yonemaru, Jun-ichi; Ando, Tsuyu; Mizubayashi, Tatsumi; Kasuga, Shigemitsu; Matsumoto, Takashi; Yano, Masahiro

    2009-01-01

    Simple sequence repeat (SSR) markers with a high degree of polymorphism contribute to the molecular dissection of agriculturally important traits in sorghum (Sorghum bicolor (L.) Moench). We designed 5599 non-redundant SSR markers, including regions flanking the SSRs, in whole-genome shotgun sequences of sorghum line ATx623. (AT/TA)n repeats constituted 26.1% of all SSRs, followed by (AG/TC)n at 20.5%, (AC/TG)n at 13.7% and (CG/GC)n at 11.8%. The chromosomal locations of 5012 SSR markers ...

  8. Small Area Array-Based LED Luminaire Design

    Energy Technology Data Exchange (ETDEWEB)

    Thomas Yuan

    2008-01-09

    This report contains a summary of technical achievements during a three-year project to demonstrate high efficiency LED luminaire designs based on small area array-based gallium nitride diodes. Novel GaN-based LED array designs are described, specifically addressing the thermal, optical, electrical and mechanical requirements for the incorporation of such arrays into viable solid-state LED luminaires. This work resulted in the demonstration of an integrated luminaire prototype of 1000 lumens cool white light output with reflector shaped beams and efficacy of 89.4 lm/W at CCT of 6000 K and CRI of 73; and performance of 903 lumens warm white light output with reflector shaped beams and efficacy of 63.0 lm/W at CCT of 2800 K and CRI of 82. In addition, up to 1275 lumens cool white light output at 114.2 lm/W and 1156 lumens warm white light output at 76.5 lm/W were achieved if the reflector was not used. The integration of small area array-based LED designs, while addressing thermal, optical, electrical and mechanical requirements, was clearly achieved in these luminaire prototypes, which showed outstanding performance and high efficiency.

  9. Whole genome sequencing for deciphering the resistome of Chryseobacterium indologenes, an emerging multidrug-resistant bacterium isolated from a cystic fibrosis patient in Marseille, France

    Directory of Open Access Journals (Sweden)

    T. Cimmino

    2016-07-01

    Full Text Available We decipher the resistome of Chryseobacterium indologenes MARS15, an emerging multidrug-resistant clinical strain, using the whole genome sequencing strategy. The bacterium was isolated from the sputum of a hospitalized patient with cystic fibrosis in the Timone Hospital in Marseille, France. Genome sequencing was done with Illumina MiSeq using a paired-end strategy. The in silico analysis was done by RAST, the resistome by the ARG-ANNOT database and detection of polyketide synthase (PKS) by ANTISMAH. The genome size of C. indologenes MARS15 is 4 972 580 bp with 36.4% GC content. This multidrug-resistant bacterium was resistant to all β-lactams, including imipenem, and also to colistin. The resistome of C. indologenes MARS15 includes Ambler class A and B β-lactams encoding blaCIA and blaIND-2 genes and MBL (metallo-β-lactamase) genes, the CAT (chloramphenicol acetyltransferase) gene and the multidrug efflux pump AcrB. Specific features include the presence of an urease operon, an intact prophage and a carotenoid biosynthesis pathway. Interestingly, we report for the first time in C. indologenes a PKS cluster that might be responsible for secondary metabolite biosynthesis, similar to erythromycin. The whole genome sequence analysis provides insight into the resistome and the discovery of new details, such as the PKS cluster.

  10. Whole genome sequencing for deciphering the resistome of Chryseobacterium indologenes, an emerging multidrug-resistant bacterium isolated from a cystic fibrosis patient in Marseille, France.

    Science.gov (United States)

    Cimmino, T; Rolain, J-M

    2016-07-01

    We decipher the resistome of Chryseobacterium indologenes MARS15, an emerging multidrug-resistant clinical strain, using the whole genome sequencing strategy. The bacterium was isolated from the sputum of a hospitalized patient with cystic fibrosis in the Timone Hospital in Marseille, France. Genome sequencing was done with Illumina MiSeq using a paired-end strategy. The in silico analysis was done by RAST, the resistome by the ARG-ANNOT database and detection of polyketide synthase (PKS) by ANTISMAH. The genome size of C. indologenes MARS15 is 4 972 580 bp with 36.4% GC content. This multidrug-resistant bacterium was resistant to all β-lactams, including imipenem, and also to colistin. The resistome of C. indologenes MARS15 includes Ambler class A and B β-lactams encoding bla CIA and bla IND-2 genes and MBL (metallo-β-lactamase) genes, the CAT (chloramphenicol acetyltransferase) gene and the multidrug efflux pump AcrB. Specific features include the presence of an urease operon, an intact prophage and a carotenoid biosynthesis pathway. Interestingly, we report for the first time in C. indologenes a PKS cluster that might be responsible for secondary metabolite biosynthesis, similar to erythromycin. The whole genome sequence analysis provides insight into the resistome and the discovery of new details, such as the PKS cluster. PMID:27222716

  11. Genetic characterisation of Malawian pneumococci prior to the roll-out of the PCV13 vaccine using a high-throughput whole genome sequencing approach.

    Directory of Open Access Journals (Sweden)

    Dean B Everett

    Full Text Available BACKGROUND: Malawi commenced the introduction of the 13-valent pneumococcal conjugate vaccine (PCV13) into the routine infant immunisation schedule in November 2011. Here we have tested the utility of high throughput whole genome sequencing to provide a high-resolution view of pre-vaccine pneumococcal epidemiology and population evolutionary trends to predict potential future change in population structure post introduction. METHODS: One hundred and twenty seven (127) archived pneumococcal isolates from randomly selected adults and children presenting to the Queen Elizabeth Central Hospital, Blantyre, Malawi underwent whole genome sequencing. RESULTS: The pneumococcal population was dominated by serotype 1 (20.5% of invasive isolates) prior to vaccine introduction. PCV13 is likely to protect against 62.9% of all circulating invasive pneumococci (78.3% in under-5-year-olds). Several Pneumococcal Molecular Epidemiology Network (PMEN) clones are now in circulation in Malawi which were previously undetected but the pandemic multidrug resistant PMEN1 lineage was not identified. Genome analysis identified a number of novel sequence types and serotype switching. CONCLUSIONS: High throughput genome sequencing is now feasible and has the capacity to simultaneously elucidate serotype and sequence type as well as detailed genetic information. It enables population level characterization, providing a detailed picture of population structure and genome evolution relevant to disease control. Post-vaccine introduction surveillance supported by genome sequencing is essential to providing a comprehensive picture of the impact of PCV13 on pneumococcal population structure and informing future public health interventions.

  12. Whole genomic analysis of human and bovine G8P[1] rotavirus strains isolated in Nigeria provides evidence for direct bovine-to-human interspecies transmission.

    Science.gov (United States)

    Komoto, Satoshi; Adah, Mohammed Ignatius; Ide, Tomihiko; Yoshikawa, Tetsushi; Taniguchi, Koki

    2016-09-01

    Bovine group A rotavirus (RVA) G8P[1] strains have been rarely detected in humans. Two Nigerian G8P[1] strains, HMG035 (RVA/Human-tc/NGA/HMG035/1999/G8P[1]) and NGRBg8 (RVA/Cow-tc/NGA/NGRBg8/1998/G8P[1]), were previously suggested to have the VP7, VP4, and NSP1 genes of bovine origin. In order to obtain precise information on the origin and evolution of these G8P[1] strains, the complete nucleotide sequences of the whole genomes of strains HMG035 and NGRBg8 were determined and analyzed in the present study. On whole genomic analysis, strains HMG035 and NGRBg8 were found to be very closely related to each other in all the 11 segments, and were found to have a bovine RVA-like genotype constellation (G8-P[1]-I2-R2-C2-M2-A11-N2-T6-E2-H3). Furthermore, on phylogenetic analysis, each of the 11 genes of strains HMG035 and NGRBg8 appeared to be of bovine origin. Thus, strains HMG035 and NGRBg8 were suggested to be derived from a common origin, and strain NGRBg8 was assumed to represent an example of bovine RVA strains that were transmitted to humans. Our findings provide clear evidence for direct bovine-to-human interspecies transmission of RVA strains. PMID:27302094

  13. Planar patterned stretchable electrode arrays based on flexible printed circuits

    International Nuclear Information System (INIS)

    For stretchable electronics to achieve broad industrial application, they must be reliable to manufacture and must perform robustly while undergoing large deformations. We present a new strategy for creating planar stretchable electronics and demonstrate one such device, a stretchable microelectrode array based on flex circuit technology. Stretchability is achieved through novel, rationally designed perforations that provide islands of low strain and continuous low-strain pathways for conductive traces. This approach enables the device to maintain constant electrical properties and planarity while undergoing applied strains up to 15%. Materials selection is not limited to polyimide composite devices and can potentially be implemented with either soft or hard substrates and can incorporate standard metals or new nano-engineered conductors. By using standard flex circuit technology, our planar microelectrode device achieved constant resistances for strains up to 20% with less than a 4% resistance offset over 120 000 cycles at 10% strain. (paper)

  14. Tin Oxide Nanorod Array-Based Electrochemical Hydrogen Peroxide Biosensor

    Directory of Open Access Journals (Sweden)

    Liu Jinping

    2010-01-01

    Full Text Available SnO2 nanorod array grown directly on alloy substrate has been employed as the working electrode of H2O2 biosensor. Single-crystalline SnO2 nanorods provide not only low isoelectric point and enough void spaces for facile horseradish peroxidase (HRP) immobilization but also numerous conductive channels for electron transport to and from current collector, thus leading to direct electrochemistry of HRP. The nanorod array-based biosensor demonstrates high H2O2 sensing performance in terms of excellent sensitivity (379 μA mM−1 cm−2), low detection limit (0.2 μM) and high selectivity with the apparent Michaelis–Menten constant estimated to be as small as 33.9 μM. Our work further demonstrates the advantages of ordered array architecture in electrochemical device application and sheds light on the construction of other high-performance enzymatic biosensors.

  15. Whole genome sequence of the emerging oomycete pathogen Pythium insidiosum strain CDC-B5653 isolated from an infected human in the USA.

    Science.gov (United States)

    Ascunce, Marina S; Huguet-Tapia, Jose C; Braun, Edward L; Ortiz-Urquiza, Almudena; Keyhani, Nemat O; Goss, Erica M

    2016-03-01

    Pythium insidiosum ATCC 200269 strain CDC-B5653, an isolate from necrotizing lesions on the mouth and eye of a 2-year-old boy in Memphis, Tennessee, USA, was sequenced using a combination of Illumina MiSeq (300 bp paired-end, 14 million reads) and PacBio (10 Kb fragment library, 356,001 reads). The sequencing data were assembled using SPAdes version 3.1.0, yielding a total genome size of 45.6 Mb contained in 8992 contigs, N50 of 13 Kb, 57% G + C content, and 17,867 putative protein-coding genes. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JRHR00000000. PMID:26981361
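
For readers unfamiliar with the N50 statistic quoted in this record, the sketch below shows one common way to compute it: sort contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are hypothetical; they are not taken from the P. insidiosum assembly, and assemblers may differ in details such as minimum contig length filters.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:  # reached half of the total assembly size
            return length


# Toy example (made-up lengths): total is 20, and 6 + 5 = 11 covers half of it.
print(n50([2, 3, 4, 5, 6]))
```

A larger N50 relative to genome size generally indicates a less fragmented assembly.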

  16. Whole genome sequence of the emerging oomycete pathogen Pythium insidiosum strain CDC-B5653 isolated from an infected human in the USA

    Directory of Open Access Journals (Sweden)

    Marina S. Ascunce

    2016-03-01

    Full Text Available Pythium insidiosum ATCC 200269 strain CDC-B5653, an isolate from necrotizing lesions on the mouth and eye of a 2-year-old boy in Memphis, Tennessee, USA, was sequenced using a combination of Illumina MiSeq (300 bp paired-end, 14 million reads) and PacBio (10 Kb fragment library, 356,001 reads). The sequencing data were assembled using SPAdes version 3.1.0, yielding a total genome size of 45.6 Mb contained in 8992 contigs, N50 of 13 Kb, 57% G + C content, and 17,867 putative protein-coding genes. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JRHR00000000.

  17. Analyses of Twelve New Whole Genome Sequences of Cassava Brown Streak Viruses and Ugandan Cassava Brown Streak Viruses from East Africa: Diversity, Supercomputing and Evidence for Further Speciation.

    Directory of Open Access Journals (Sweden)

    Joseph Ndunguru

    Full Text Available Cassava brown streak disease is caused by two devastating viruses, Cassava brown streak virus (CBSV) and Ugandan cassava brown streak virus (UCBSV), which are frequently found infecting cassava, one of sub-Saharan Africa's most important staple food crops. Each year these viruses cause losses of up to $100 million USD and can leave entire families without their primary food source for an entire year. Twelve new whole genomes, including seven of CBSV and five of UCBSV, were uncovered in this research, doubling the genomic sequences available in the public domain for these viruses. These new sequences disprove the assumption that the viruses are limited by agro-ecological zones, show that current diagnostic primers are insufficient to provide confident diagnosis of these viruses and give rise to the possibility that there may be as many as four distinct species of virus. Utilizing NGS sequencing technologies and proper phylogenetic practices will rapidly increase the solution to sustainable cassava production.

  18. Phylogeny and Taxonomy of Archaea: A Comparison of the Whole-Genome-Based CVTree Approach with 16S rRNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Guanghong Zuo

    2015-03-01

    Full Text Available A tripartite comparison of Archaea phylogeny and taxonomy at and above the rank order is reported: (1) the whole-genome-based and alignment-free CVTree using 179 genomes; (2) the 16S rRNA analysis exemplified by the All-Species Living Tree with 366 archaeal sequences; and (3) the Second Edition of Bergey’s Manual of Systematic Bacteriology complemented by some current literature. A high degree of agreement is reached at these ranks. From the newly proposed archaeal phyla, Korarchaeota, Thaumarchaeota, Nanoarchaeota and Aigarchaeota, to the recent suggestion to divide the class Halobacteria into three orders, all gain substantial support from CVTree. In addition, the CVTree helped to determine the taxonomic position of some newly sequenced genomes without proper lineage information. A few discrepancies between the CVTree and the 16S rRNA approaches call for further investigation.

  19. Evolution of Extensively Drug-Resistant Tuberculosis over Four Decades: Whole Genome Sequencing and Dating Analysis of Mycobacterium tuberculosis Isolates from KwaZulu-Natal.

    Directory of Open Access Journals (Sweden)

    Keira A Cohen

    2015-09-01

    Full Text Available The continued advance of antibiotic resistance threatens the treatment and control of many infectious diseases. This is exemplified by the largest global outbreak of extensively drug-resistant (XDR) tuberculosis (TB) identified in Tugela Ferry, KwaZulu-Natal, South Africa, in 2005 that continues today. It is unclear whether the emergence of XDR-TB in KwaZulu-Natal was due to recent inadequacies in TB control in conjunction with HIV or other factors. Understanding the origins of drug resistance in this fatal outbreak of XDR will inform the control and prevention of drug-resistant TB in other settings. In this study, we used whole genome sequencing and dating analysis to determine if XDR-TB had emerged recently or had ancient antecedents. We performed whole genome sequencing and drug susceptibility testing on 337 clinical isolates of Mycobacterium tuberculosis collected in KwaZulu-Natal from 2008 to 2013, in addition to three historical isolates, collected from patients in the same province and including an isolate from the 2005 Tugela Ferry XDR outbreak, a multidrug-resistant (MDR) isolate from 1994, and a pansusceptible isolate from 1995. We utilized an array of whole genome comparative techniques to assess the relatedness among strains, to establish the order of acquisition of drug resistance mutations, including the timing of acquisitions leading to XDR-TB in the LAM4 spoligotype, and to calculate the number of independent evolutionary emergences of MDR and XDR. Our sequencing and analysis revealed a 50-member clone of XDR M. tuberculosis that was highly related to the Tugela Ferry XDR outbreak strain.
We estimated that mutations conferring isoniazid and streptomycin resistance in this clone were acquired 50 y prior to the Tugela Ferry outbreak (katG S315T [isoniazid]; gidB 130 bp deletion [streptomycin]; 1957 [95% highest posterior density (HPD): 1937-1971]), with the subsequent emergence of MDR and XDR occurring 20 y (rpoB L452P [rifampicin]; pnc

  20. De novo 7p partial trisomy characterized by subtelomeric FISH and whole-genome array in a girl with mental retardation

    Directory of Open Access Journals (Sweden)

    N Chandra

    2011-10-01

    Full Text Available Chromosome rearrangements involving telomeres have been established as one of the major causes of idiopathic mental retardation/developmental delay. This case of 7p partial trisomy syndrome in a 3-year-old female child presenting with developmental delay emphasizes the clinical relevance of cytogenetic diagnosis in the better management of genetic disorders. Application of subtelomeric FISH technique revealed the presence of interstitial telomeres and led to the ascertainment of partial trisomy for the distal 7p segment localized on the telomeric end of the short arm of chromosome 19. Whole-genome cytogenetic microarray-based analysis showed a mosaic 3.5 Mb gain at Xq21.1 besides the approximately 24.5 Mb gain corresponding to 7p15.3->pter. The possible mechanisms of origin of the chromosomal rearrangement and the clinical relevance of trisomy for the genes lying in the critical regions are discussed.